RetinaFace-R50 is a medium-size model with a ResNet50 backbone for face localization. It outputs face bounding boxes and five facial landmarks in a single forward pass. More details are provided in the paper and the repository.

| Metric | Value |
|---|---|
| AP (WIDER) | 87.30% |
| GFLOPs | 100.8478 |
| MParams | 29.4276 |
| Source framework | MXNet* |

Average Precision (AP) is defined as the area under the precision/recall curve. All numbers were evaluated taking into account only faces bigger than 64 x 64 pixels.
The accuracy validation approach differs from the one described in the original repository. For details about the original WIDER results, see [https://github.com/deepinsight/insightface/tree/master/detection/RetinaFace](https://github.com/deepinsight/insightface/tree/master/detection/RetinaFace).
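
To make the AP definition above concrete, here is a minimal sketch of computing the area under a precision/recall curve with NumPy. It is not the evaluation script used for the reported number. The function names, the all-point interpolation scheme, the `[x_min, y_min, x_max, y_max]` box layout, and the reading of "bigger than 64 x 64" as both sides exceeding 64 pixels are all assumptions made for illustration.

```python
import numpy as np


def faces_larger_than(boxes, min_size=64):
    """Boolean mask of boxes whose width and height both exceed min_size.

    Assumes boxes are rows of [x_min, y_min, x_max, y_max] in pixels.
    """
    boxes = np.asarray(boxes, dtype=float)
    widths = boxes[:, 2] - boxes[:, 0]
    heights = boxes[:, 3] - boxes[:, 1]
    return (widths > min_size) & (heights > min_size)


def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision/recall curve (all-point interpolation).

    scores: confidence of each detection.
    is_true_positive: 1 if the detection matched a ground-truth face, else 0.
    num_ground_truth: number of ground-truth faces kept for evaluation.
    """
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = np.asarray(is_true_positive, dtype=float)[order]

    true_pos = np.cumsum(hits)
    false_pos = np.cumsum(1.0 - hits)
    recall = true_pos / num_ground_truth
    precision = true_pos / (true_pos + false_pos)

    # Pad the curve so it spans recall 0..1.
    recall = np.concatenate(([0.0], recall, [1.0]))
    precision = np.concatenate(([0.0], precision, [0.0]))

    # Replace each precision value with the maximum precision to its right,
    # which makes the curve monotonically non-increasing.
    precision = np.maximum.accumulate(precision[::-1])[::-1]

    # Sum rectangle areas wherever recall changes.
    steps = np.where(recall[1:] != recall[:-1])[0]
    return float(np.sum((recall[steps + 1] - recall[steps]) * precision[steps + 1]))


if __name__ == "__main__":
    print(average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 1], num_ground_truth=4))
```

Other integration schemes (for example, 11-point interpolation) give slightly different values, so numbers are only comparable when the same scheme and the same face-size filter are used.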
Original model:

Image, name: `data`, shape: `1,3,640,640`, format: `B,C,H,W`, where:

- `B` - batch size
- `C` - channel
- `H` - height
- `W` - width

Channel order is `RGB`.

Converted model:

Image, name: `data`, shape: `1,3,640,640`, format: `B,C,H,W`, where:

- `B` - batch size
- `C` - channel
- `H` - height
- `W` - width

Channel order is `BGR`.
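
The input layout described above can be produced from an ordinary `H x W x 3` image array as sketched below. This is an illustration under stated assumptions, not the reference preprocessing: the helper names are hypothetical, the resize is a plain nearest-neighbour stretch that ignores aspect ratio, and no mean or scale normalization is applied because none is specified here.

```python
import numpy as np


def swap_channel_order(image_hwc):
    """Reverse the channel axis, turning RGB into BGR or BGR into RGB."""
    return image_hwc[:, :, ::-1]


def to_network_input(image_hwc, size=(640, 640)):
    """Convert an H x W x 3 image into a 1 x 3 x 640 x 640 float32 array.

    The channel order of the input image is preserved; call
    swap_channel_order first if the image is in the other order.
    """
    height, width = image_hwc.shape[:2]
    target_h, target_w = size

    # Nearest-neighbour resize done with integer index arithmetic
    # (an assumption made to keep the sketch NumPy-only).
    rows = np.arange(target_h) * height // target_h
    cols = np.arange(target_w) * width // target_w
    resized = image_hwc[rows[:, None], cols[None, :]]

    # H,W,C -> C,H,W, then add the batch dimension B = 1.
    chw = resized.transpose(2, 0, 1)[np.newaxis]
    return np.ascontiguousarray(chw, dtype=np.float32)


if __name__ == "__main__":
    rgb = np.zeros((480, 320, 3), dtype=np.uint8)
    blob = to_network_input(swap_channel_order(rgb))
    print(blob.shape, blob.dtype)
```

An image decoded in RGB order would need `swap_channel_order` before being fed to a model that expects BGR, and vice versa.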
Model outputs are floating-point tensors:

1. `face_rpn_cls_prob_reshape_stride32`, shape: `1,4,20,20`, format: `[B, Ax2, H, W]`, represents detection scores from the Feature Pyramid Network (FPN) level with stride 32 for 2 classes: background and face.
2. `face_rpn_bbox_stride32`, shape: `1,8,20,20`, format: `[B, Ax4, H, W]`, represents detection box deltas from the FPN level with stride 32.
3. `face_rpn_landmark_pred_stride32`, shape: `1,20,20,20`, format: `[B, Ax10, H, W]`, represents facial landmarks from the FPN level with stride 32.
4. `face_rpn_cls_prob_reshape_stride16`, shape: `1,4,40,40`, format: `[B, Ax2, H, W]`, represents detection scores from the FPN level with stride 16 for 2 classes: background and face.
5. `face_rpn_bbox_stride16`, shape: `1,8,40,40`, format: `[B, Ax4, H, W]`, represents detection box deltas from the FPN level with stride 16.
6. `face_rpn_landmark_pred_stride16`, shape: `1,20,40,40`, format: `[B, Ax10, H, W]`, represents facial landmarks from the FPN level with stride 16.
7. `face_rpn_cls_prob_reshape_stride8`, shape: `1,4,80,80`, format: `[B, Ax2, H, W]`, represents detection scores from the FPN level with stride 8 for 2 classes: background and face.
8. `face_rpn_bbox_stride8`, shape: `1,8,80,80`, format: `[B, Ax4, H, W]`, represents detection box deltas from the FPN level with stride 8.
9. `face_rpn_landmark_pred_stride8`, shape: `1,20,80,80`, format: `[B, Ax10, H, W]`, represents facial landmarks from the FPN level with stride 8.

For each output format:

- `B` - batch size
- `A` - number of anchors
- `H` - feature height
- `W` - feature width

Detection box deltas have format `[dx, dy, dh, dw]`, where:

- `(dx, dy)` - regression for the upper-left corner of the bounding box
- `(dh, dw)` - regression for the height and width of the bounding box

Facial landmarks have format `[x1, y1, x2, y2, x3, y3, x4, y4, x5, y5]`, where:

- `(x1, y1)` - coordinates of the left eye
- `(x2, y2)` - coordinates of the right eye
- `(x3, y3)` - coordinates of the nose
- `(x4, y4)` - coordinates of the left mouth corner
- `(x5, y5)` - coordinates of the right mouth corner

The converted model has the same parameters as the original model.
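
The output formats above can be unpacked into one row per anchor, as in the following sketch. It runs on dummy tensors that only match the documented shapes, and it rests on assumptions that should be checked against the reference post-processing: tensor names follow the `stride{N}` suffix pattern, box and landmark channels are grouped anchor by anchor, and in the score tensors the first `A` channels are background while the last `A` are face. All helper names are hypothetical. Turning deltas into pixel coordinates additionally requires the anchor boxes, which are not described here and are therefore left out.

```python
import numpy as np

INPUT_SIZE = 640        # network input height and width, from the input shape
STRIDES = (32, 16, 8)   # FPN levels listed in the output description


def split_per_anchor(tensor, values_per_anchor):
    """Reshape [B, A*K, H, W] into [B, H*W*A, K].

    Assumes channel index = anchor * K + value (anchor-major layout).
    """
    batch, channels, height, width = tensor.shape
    anchors = channels // values_per_anchor
    grouped = tensor.reshape(batch, anchors, values_per_anchor, height, width)
    grouped = grouped.transpose(0, 3, 4, 1, 2)  # B, H, W, A, K
    return grouped.reshape(batch, height * width * anchors, values_per_anchor)


def face_scores(cls_prob):
    """Reshape [B, A*2, H, W] into [B, H*W*A] face probabilities.

    Assumes the first A channels are background and the last A are face.
    """
    batch, channels, height, width = cls_prob.shape
    anchors = channels // 2
    face = cls_prob[:, anchors:]  # B, A, H, W
    return face.transpose(0, 2, 3, 1).reshape(batch, height * width * anchors)


def unpack_level(outputs, stride):
    """Return (scores, box deltas, landmarks) for one FPN level."""
    scores = face_scores(outputs[f"face_rpn_cls_prob_reshape_stride{stride}"])
    deltas = split_per_anchor(outputs[f"face_rpn_bbox_stride{stride}"], 4)
    landmarks = split_per_anchor(outputs[f"face_rpn_landmark_pred_stride{stride}"], 10)
    return scores, deltas, landmarks


def make_dummy_outputs(anchors=2):
    """Random tensors with the documented shapes, standing in for real outputs."""
    rng = np.random.default_rng(0)
    outputs = {}
    for stride in STRIDES:
        side = INPUT_SIZE // stride  # 20, 40, 80
        for name, per_anchor in (("cls_prob_reshape", 2), ("bbox", 4), ("landmark_pred", 10)):
            shape = (1, anchors * per_anchor, side, side)
            outputs[f"face_rpn_{name}_stride{stride}"] = rng.random(shape, dtype=np.float32)
    return outputs


if __name__ == "__main__":
    outputs = make_dummy_outputs()
    total = 0
    for stride in STRIDES:
        scores, deltas, landmarks = unpack_level(outputs, stride)
        print(stride, scores.shape, deltas.shape, landmarks.shape)
        total += scores.shape[1]
    print("candidates per image:", total)
```

With `A = 2` anchors per location, as implied by the channel counts `4 = Ax2`, `8 = Ax4`, and `20 = Ax10`, the three levels yield 800, 3200, and 12800 candidates, or 16800 in total, which are then typically filtered by score and non-maximum suppression.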
You can download models and, if necessary, convert them into Inference Engine format using the Model Downloader and other automation tools, as shown in the examples below.
An example of using the Model Downloader:
An example of using the Model Converter:
The original model is distributed under the following license: