RegionYolo

Versioned name: RegionYolo-1

Category: Object detection

Short description: RegionYolo computes the coordinates of regions with probability for each class.

Detailed description: This operation is directly mapped to the original YOLO layer. Reference

Attributes:

  • anchors
    • Description: anchors codes a flattened list of pairs [width, height] that codes prior box sizes. This attribute is not used in output computation, but it is required for post-processing to restore real box coordinates.
    • Range of values: list of any length of positive floating point number
    • Type: float[]
    • Default value: None
    • Required: no
  • axis
    • Description: starting axis index in the input tensor data shape that will be flattened in the output; the end of flattened range is defined by end_axis attribute.
    • Range of values: -rank(data) .. rank(data)-1
    • Type: int
    • Default value: None
    • Required: yes
  • coords
    • Description: coords is the number of coordinates for each region.
    • Range of values: an integer
    • Type: int
    • Default value: None
    • Required: yes
  • classes
    • Description: classes is the number of classes for each region.
    • Range of values: an integer
    • Type: int
    • Default value: None
    • Required: yes
  • end_axis
    • Description: ending axis index in the input tensor data shape that will be flattened in the output; the beginning of the flattened range is defined by axis attribute.
    • Range of values: -rank(data)..rank(data)-1
    • Type: int
    • Default value: None
    • Required: yes
  • num
    • Description: num is the number of regions.
    • Range of values: an integer
    • Type: int
    • Default value: None
    • Required: yes
  • do_softmax
    • Description: do_softmax is a flag that specifies the inference method and affects how the number of regions is determined. It also affects output shape. If it is 0, then output shape is 4D, and 2D otherwise.
    • Range of values:
      • false - do not perform softmax
      • true - perform softmax
    • Type: boolean
    • Default value: true
    • Required: no
  • mask
    • Description: mask specifies the number of regions. Use this attribute instead of num when do_softmax is equal to 0.
    • Range of values: a list of integers
    • Type: int[]
    • Default value: []
    • Required: no

Inputs:

  • 1: data - 4D input tensor with floating point elements and shape [N, C, H, W]. Required.

Outputs:

  • 1: output tensor of rank 4 or less that codes detected regions. Refer to the original YOLO paper to decode the output as boxes. anchors should be used to decode real box coordinates. If do_softmax is set to 0, then the output shape is [N, (classes + coords + 1)*len(mask), H, W]. If do_softmax is set to 1, then output shape is partially flattened and defined in the following way:

    flat_dim = data.shape[axis] * data.shape[axis+1] * ... * data.shape[end_axis] output.shape = [data.shape[0], ..., data.shape[axis-1], flat_dim, data.shape[end_axis + 1], ...]

Example

<!-- YOLO V3 example -->
<layer type="RegionYolo" ... >
<data anchors="10,14,23,27,37,58,81,82,135,169,344,319" axis="1" classes="80" coords="4" do_softmax="0" end_axis="3" mask="0,1,2" num="6"/>
<input>
<port id="0">
<dim>1</dim>
<dim>255</dim>
<dim>26</dim>
<dim>26</dim>
</port>
</input>
<output>
<port id="0">
<dim>1</dim>
<dim>255</dim>
<dim>26</dim>
<dim>26</dim>
</port>
</output>
</layer>
<!-- YOLO V2 Example -->
<layer type="RegionYolo" ... >
<data anchors="1.08,1.19,3.42,4.41,6.63,11.38,9.42,5.11,16.62,10.52" axis="1" classes="20" coords="4" do_softmax="1" end_axis="3" num="5"/>
<input>
<port id="0">
<dim>1</dim>
<dim>125</dim>
<dim>13</dim>
<dim>13</dim>
</port>
</input>
<output>
<port id="0">
<dim>1</dim>
<dim>21125</dim>
</port>
</output>
</layer>