ROIAlign

Versioned name: ROIAlign-3

Category: Object detection

Short description: ROIAlign is a pooling layer that is applied to feature map regions of non-uniform size and outputs a feature map of a fixed size for each region.

Detailed description: Reference.

ROIAlign performs the following for each Region of Interest (ROI) for each input feature map:

  1. Multiply box coordinates by spatial_scale to produce box coordinates relative to the input feature map size.
  2. Divide the box into bins according to the sampling_ratio attribute.
  3. Apply bilinear interpolation with 4 points in each bin, then apply maximum or average pooling, based on the mode attribute, to produce an output feature map element.
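
The three steps above can be sketched in plain Python for one ROI on a single-channel feature map. This is a minimal illustration under stated assumptions, not the reference implementation. The function names `bilinear` and `roi_align_2d` are hypothetical. The sketch assumes that a box is laid out as [x1, y1, x2, y2], that the box is split into pooled_h x pooled_w bins with a sampling_ratio x sampling_ratio grid of sample points per bin, that a scaled box is at least 1 unit wide and high, that sampling_ratio = 0 selects an adaptive grid size, and that sample points outside the feature map are clamped to its border.

```python
import math


def bilinear(fmap, y, x):
    """Bilinearly interpolate the 2D list `fmap` at a real-valued point (y, x)."""
    h, w = len(fmap), len(fmap[0])
    # Assumption: clamp the sample point to the valid index range.
    y = min(max(y, 0.0), h - 1)
    x = min(max(x, 0.0), w - 1)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    ly, lx = y - y0, x - x0
    # Weighted sum of the 4 neighbouring feature map values.
    return (fmap[y0][x0] * (1 - ly) * (1 - lx)
            + fmap[y0][x1] * (1 - ly) * lx
            + fmap[y1][x0] * ly * (1 - lx)
            + fmap[y1][x1] * ly * lx)


def roi_align_2d(fmap, box, pooled_h, pooled_w,
                 spatial_scale, sampling_ratio, mode="avg"):
    """Pool one ROI of a single-channel map into a pooled_h x pooled_w grid."""
    # Step 1: scale box corners (assumed [x1, y1, x2, y2]) to feature map units.
    x1, y1, x2, y2 = (c * spatial_scale for c in box)
    roi_w = max(x2 - x1, 1.0)  # assumption: minimum ROI size of 1
    roi_h = max(y2 - y1, 1.0)

    # Step 2: split the box into bins and pick the sample grid inside each bin.
    bin_h, bin_w = roi_h / pooled_h, roi_w / pooled_w
    # Assumption: sampling_ratio == 0 means an adaptive grid of ceil(bin size).
    n_y = sampling_ratio if sampling_ratio > 0 else math.ceil(bin_h)
    n_x = sampling_ratio if sampling_ratio > 0 else math.ceil(bin_w)

    # Step 3: interpolate every sample point, then pool the samples per bin.
    out = []
    for ph in range(pooled_h):
        row = []
        for pw in range(pooled_w):
            samples = [
                bilinear(fmap,
                         y1 + ph * bin_h + (iy + 0.5) * bin_h / n_y,
                         x1 + pw * bin_w + (ix + 0.5) * bin_w / n_x)
                for iy in range(n_y)
                for ix in range(n_x)
            ]
            if mode == "max":
                row.append(max(samples))
            else:
                row.append(sum(samples) / len(samples))
        out.append(row)
    return out
```

For a 4x4 map whose value equals its column index, `roi_align_2d(fmap, [0, 0, 2, 2], 2, 2, 1.0, 2)` yields 0.5 and 1.5 in each output row, because bilinear interpolation reproduces a linear ramp exactly and each bin averages samples placed symmetrically around its center.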

Attributes

Inputs:

Outputs:

Types

Example

<layer ... type="ROIAlign" ... >
    <data pooled_h="6" pooled_w="6" spatial_scale="16.0" sampling_ratio="2" mode="avg"/>
    <input>
        <port id="0">
            <dim>7</dim>
            <dim>256</dim>
            <dim>200</dim>
            <dim>200</dim>
        </port>
        <port id="1">
            <dim>1000</dim>
            <dim>4</dim>
        </port>
        <port id="2">
            <dim>1000</dim>
        </port>
    </input>
    <output>
        <port id="3" precision="FP32">
            <dim>1000</dim>
            <dim>256</dim>
            <dim>6</dim>
            <dim>6</dim>
        </port>
    </output>
</layer>
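
The port shapes in the example above are related in a simple way, which the following sketch makes explicit. It is an illustration inferred from the example only. The helper name `roi_align_output_shape` is hypothetical, and the reading of the ports (port 0 as feature maps [N, C, H, W], port 1 as boxes [NUM_ROIS, 4], port 2 as one batch index per box) is an assumption.

```python
def roi_align_output_shape(feature_shape, rois_shape, batch_idx_shape,
                           pooled_h, pooled_w):
    """Return the assumed output shape [NUM_ROIS, C, pooled_h, pooled_w]."""
    _, channels, _, _ = feature_shape      # assumed layout: [N, C, H, W]
    num_rois, num_coords = rois_shape      # assumed layout: [NUM_ROIS, 4]
    if num_coords != 4:
        raise ValueError("each ROI is expected to have 4 coordinates")
    if tuple(batch_idx_shape) != (num_rois,):
        raise ValueError("expected one batch index per ROI")
    # The spatial size of the input does not affect the output shape.
    return (num_rois, channels, pooled_h, pooled_w)


# Shapes taken from the example layer above.
shape = roi_align_output_shape((7, 256, 200, 200), (1000, 4), (1000,), 6, 6)
```

With the shapes from the example, `shape` is `(1000, 256, 6, 6)`, which matches the output port.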