This is a reimplemented and retrained version of the YOLO v2 object detection network trained with the VOC2012 training dataset.
| Metric | Value |
|---|---|
| Mean Average Precision (mAP) | 63.9% |
For Average Precision metric description, see The PASCAL Visual Object Classes (VOC) Challenge. Tested on the VOC 2012 validation dataset.
The net takes an input image blob with the shape 1, 3, 416, 416 in the format B, C, H, W, where:
B - batch size
C - number of channels
H - image height
W - image width
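As an illustration of the B, C, H, W layout, the following is a minimal sketch that rearranges an image array into that shape. It assumes the image is already resized to 416 x 416 and stored as a NumPy array in H, W, C order, which is how many image libraries return pixels. The function name `to_input_blob` is hypothetical, and the sketch does not address channel order or value scaling.

```python
import numpy as np

INPUT_H, INPUT_W, INPUT_C = 416, 416, 3


def to_input_blob(image_hwc):
    """Rearrange an (H, W, C) image array into a (1, C, H, W) blob.

    Assumes the image is already 416 x 416 with 3 channels; no resizing,
    channel reordering, or scaling is done here.
    """
    image_hwc = np.asarray(image_hwc)
    if image_hwc.shape != (INPUT_H, INPUT_W, INPUT_C):
        raise ValueError(f"expected shape (416, 416, 3), got {image_hwc.shape}")
    # Move the channel axis to the front: (H, W, C) -> (C, H, W).
    chw = np.transpose(image_hwc, (2, 0, 1))
    # Add the batch axis: (C, H, W) -> (1, C, H, W).
    return chw[np.newaxis, ...]


# Dummy data stands in for a real image.
dummy_image = np.zeros((416, 416, 3), dtype=np.uint8)
blob = to_input_blob(dummy_image)
print(blob.shape)  # (1, 3, 416, 416)
```

The pixel at row `y`, column `x`, channel `c` of the source image ends up at `blob[0, c, y, x]`.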
Expected color order is
The net outputs a blob with the shape 1, 21125, which can be reshaped to 5, 25, 13, 13, where the numbers correspond, in order, to num_anchors (5), cls_reg_obj_params (25), and the two spatial grid dimensions (13, 13):
num_anchors: number of anchor boxes; each spatial location specified by x_loc has five anchors
cls_reg_obj_params: parameters for classification and regression. The 25 values are made up of box regression parameters (4), an objectness score (1), and class scores (20, one per VOC class).
x_loc: spatial location of each grid
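The reshape described above can be sketched as follows. This is a minimal example that assumes a row-major (C-order) reshape, which is the NumPy default. The helper names `reshape_output` and `params_at` are hypothetical. Because the order of the values inside cls_reg_obj_params is not stated here, the sketch returns all 25 values for one anchor and grid location without splitting them further.

```python
import numpy as np

NUM_ANCHORS = 5   # anchor boxes per spatial location
NUM_PARAMS = 25   # cls_reg_obj_params values per anchor
GRID = 13         # spatial grid size along each axis


def reshape_output(blob):
    """Reshape a (1, 21125) output blob to (5, 25, 13, 13)."""
    blob = np.asarray(blob)
    expected = NUM_ANCHORS * NUM_PARAMS * GRID * GRID  # 21125
    if blob.size != expected:
        raise ValueError(f"expected {expected} values, got {blob.size}")
    return blob.reshape(NUM_ANCHORS, NUM_PARAMS, GRID, GRID)


def params_at(grid, anchor, loc0, loc1):
    """Return the 25 parameters for one anchor at one grid location.

    loc0 and loc1 index the two spatial axes in the order they appear
    in the reshaped blob.
    """
    return grid[anchor, :, loc0, loc1]


# Dummy data stands in for a real network output.
dummy_output = np.zeros((1, 21125), dtype=np.float32)
grid = reshape_output(dummy_output)
print(grid.shape)  # (5, 25, 13, 13)
```

With this layout, `params_at(grid, 2, 6, 6)` yields the 25 values predicted by the third anchor at the center of the 13 x 13 grid.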
[*] Same as the original implementation.
[**] Other names and brands may be claimed as the property of others.