RetinaFace-R50 is a medium-size model with a ResNet50 backbone for face localization. It outputs face bounding boxes and five facial landmarks in a single forward pass. More details are provided in the paper and repository.

| Metric | Value |
| --- | --- |
| AP (WIDER) | 87.30% |
| GFLOPs | 100.8478 |
| MParams | 29.4276 |
| Source framework | MXNet* |

Average Precision (AP) is defined as the area under the precision/recall curve. All numbers were evaluated taking into account only faces bigger than 64 x 64 pixels.
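The definition above can be sketched in a few lines. This is only an illustration: the integration scheme used for the reported AP is not stated here, so the trapezoidal rule and the helper name `average_precision` are assumptions, and the input values in the usage note are made up.

```python
import numpy as np


def average_precision(precision, recall):
    """Area under a precision/recall curve (trapezoidal rule, an assumption).

    `recall` is expected to be sorted in ascending order, with `precision`
    holding the precision measured at each recall point.
    """
    precision = np.asarray(precision, dtype=float)
    recall = np.asarray(recall, dtype=float)
    if precision.shape != recall.shape or precision.ndim != 1:
        raise ValueError("precision and recall must be 1-D arrays of equal length")
    # Width of each recall interval times the mean precision over that interval.
    widths = np.diff(recall)
    heights = (precision[1:] + precision[:-1]) / 2.0
    return float(np.sum(widths * heights))
```

For example, `average_precision([1.0, 0.5], [0.0, 1.0])` integrates a straight line falling from 1.0 to 0.5 over the full recall range.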
The accuracy validation approach differs from the one described in the original repository. For details about the original WIDER results, see [https://github.com/deepinsight/insightface/tree/master/RetinaFace](https://github.com/deepinsight/insightface/tree/master/RetinaFace).
Image, name: `data`, shape: `1,3,640,640`, format: `B,C,H,W`, where:

- `B` - batch size
- `C` - channel
- `H` - height
- `W` - width

Channel order is `RGB`.

Image, name: `data`, shape: `1,3,640,640`, format: `B,C,H,W`, where:

- `B` - batch size
- `C` - channel
- `H` - height
- `W` - width

Channel order is `BGR`.
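
The two input descriptions above share the same `B,C,H,W` layout and differ only in channel order. A minimal sketch of preparing such a tensor with NumPy follows. The helper name `to_model_input` is hypothetical, and the sketch assumes the image is already resized to 640 x 640, that `float32` is an acceptable input type, and that no mean or scale normalization is needed. None of these are stated in this description.

```python
import numpy as np


def to_model_input(image_hwc, reverse_channels=False):
    """Convert an H,W,C image of size 640x640 into a 1,3,640,640 tensor.

    Set `reverse_channels=True` to swap RGB <-> BGR before the layout change.
    The float32 output type is an assumption of this sketch.
    """
    if image_hwc.shape != (640, 640, 3):
        raise ValueError(f"expected shape (640, 640, 3), got {image_hwc.shape}")
    if reverse_channels:
        # Reverse the last axis: RGB becomes BGR and vice versa.
        image_hwc = image_hwc[:, :, ::-1]
    # H,W,C -> C,H,W, then add the batch dimension B = 1.
    chw = np.transpose(image_hwc, (2, 0, 1))
    return np.ascontiguousarray(chw[np.newaxis], dtype=np.float32)
```

Calling `to_model_input(rgb_image, reverse_channels=True)` on an RGB image would yield a tensor in BGR order.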
Model outputs are floating-point tensors:

1. `face_rpn_cls_prob_reshape_stride32`, shape: `1,4,20,20`, format: `[B, Ax2, H, W]`, represents detection scores from the Feature Pyramid Network (FPN) level with stride 32 for 2 classes: background and face.
2. `face_rpn_bbox_stride32`, shape: `1,8,20,20`, format: `[B, Ax4, H, W]`, represents detection box deltas from the FPN level with stride 32.
3. `face_rpn_landmark_pred_stride32`, shape: `1,20,20,20`, format: `[B, Ax10, H, W]`, represents facial landmarks from the FPN level with stride 32.
4. `face_rpn_cls_prob_reshape_stride16`, shape: `1,4,40,40`, format: `[B, Ax2, H, W]`, represents detection scores from the FPN level with stride 16 for 2 classes: background and face.
5. `face_rpn_bbox_stride16`, shape: `1,8,40,40`, format: `[B, Ax4, H, W]`, represents detection box deltas from the FPN level with stride 16.
6. `face_rpn_landmark_pred_stride16`, shape: `1,20,40,40`, format: `[B, Ax10, H, W]`, represents facial landmarks from the FPN level with stride 16.
7. `face_rpn_cls_prob_reshape_stride8`, shape: `1,4,80,80`, format: `[B, Ax2, H, W]`, represents detection scores from the FPN level with stride 8 for 2 classes: background and face.
8. `face_rpn_bbox_stride8`, shape: `1,8,80,80`, format: `[B, Ax4, H, W]`, represents detection box deltas from the FPN level with stride 8.
9. `face_rpn_landmark_pred_stride8`, shape: `1,20,80,80`, format: `[B, Ax10, H, W]`, represents facial landmarks from the FPN level with stride 8.

For each output format:

- `B` - batch size
- `A` - number of anchors
- `H` - feature height
- `W` - feature width

Detection box deltas have format `[dx, dy, dh, dw]`, where:

- `(dx, dy)` - regression for the upper-left corner of the bounding box
- `(dh, dw)` - regression by height and width of the bounding box

Facial landmarks have format `[x1, y1, x2, y2, x3, y3, x4, y4, x5, y5]`, where:

- `(x1, y1)` - coordinates of the left eye
- `(x2, y2)` - coordinates of the right eye
- `(x3, y3)` - coordinates of the nose
- `(x4, y4)` - coordinates of the left mouth corner
- `(x5, y5)` - coordinates of the right mouth corner

The converted model has the same parameters as the original model.
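
The output shapes listed above follow from the 640 x 640 input: each FPN level has a feature map of size 640 / stride, and the channel counts 4, 8 and 20 imply `A = 2` anchors per location. The sketch below reproduces that arithmetic and splits a landmark tensor into per-anchor points. The helper names are hypothetical, and the channel ordering (anchor-major, then `x1, y1, ..., x5, y5`) is an assumption read from the `Ax10` notation, not something this description confirms.

```python
import numpy as np

INPUT_SIZE = 640
NUM_ANCHORS = 2  # A, inferred from channel counts 4 = A*2, 8 = A*4, 20 = A*10


def expected_output_shapes(strides=(32, 16, 8)):
    """Return the expected score, box and landmark shapes for each FPN stride."""
    shapes = {}
    for stride in strides:
        size = INPUT_SIZE // stride  # feature map height and width
        shapes[stride] = {
            "scores": (1, NUM_ANCHORS * 2, size, size),
            "boxes": (1, NUM_ANCHORS * 4, size, size),
            "landmarks": (1, NUM_ANCHORS * 10, size, size),
        }
    return shapes


def total_anchors(strides=(32, 16, 8)):
    """Count anchors over all FPN levels (A per feature-map location)."""
    return sum(NUM_ANCHORS * (INPUT_SIZE // s) ** 2 for s in strides)


def split_landmarks(landmarks):
    """Reshape [B, A*10, H, W] into [B, A, 5, 2, H, W].

    Assumes anchor-major channel order with (x, y) pairs per landmark.
    """
    b, c, h, w = landmarks.shape
    if c % 10 != 0:
        raise ValueError("channel count must be a multiple of 10")
    return landmarks.reshape(b, c // 10, 5, 2, h, w)
```

Under these assumptions, index `[:, a, 2, :, y, x]` of the reshaped tensor would hold the nose coordinates for anchor `a` at feature location `(y, x)`.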
The original model is distributed under the following license: