PSROIPooling

Versioned name : PSROIPooling-1

Category : Object detection

Short description : PSROIPooling computes position-sensitive pooling on regions of interest specified by input.

Detailed description : Reference.

PSROIPooling operation takes two input blobs: with feature maps and with regions of interests (box coordinates). The latter is specified as five element tuples: [batch_id, x_1, y_1, x_2, y_2]. ROIs coordinates are specified in absolute values for the average mode and in normalized values (to [0,1] interval) for bilinear interpolation.

Attributes

  • output_dim

    • Description : output_dim is a pooled output channel number.

    • Range of values : a positive integer

    • Type : int

    • Required : yes

  • group_size

    • Description : group_size is the number of groups to encode position-sensitive score maps.

    • Range of values : a positive integer

    • Type : int

    • Default value : 1

    • Required : no

  • spatial_scale

    • Description : spatial_scale is a multiplicative spatial scale factor to translate ROI coordinates from their input scale to the scale used when pooling.

    • Range of values : a positive floating-point number

    • Type : float

    • Required : yes

  • mode

    • Description : mode specifies mode for pooling.

    • Range of values :

      • average - perform average pooling

      • bilinear - perform pooling with bilinear interpolation

    • Type : string

    • Default value : average

    • Required : no

  • spatial_bins_x

    • Description : spatial_bins_x specifies numbers of bins to divide the input feature maps over width. Used for “bilinear” mode only.

    • Range of values : a positive integer

    • Type : int

    • Default value : 1

    • Required : no

  • spatial_bins_y

    • Description : spatial_bins_y specifies numbers of bins to divide the input feature maps over height. Used for “bilinear” mode only.

    • Range of values : a positive integer

    • Type : int

    • Default value : 1

    • Required : no

Inputs :

  • 1 : 4D input tensor with shape [N, C, H, W] and type T with feature maps. Required.

  • 2 : 2D input tensor with shape [num_boxes, 5]. It contains a list of five element tuples that describe a region of interest: [batch_id, x_1, y_1, x_2, y_2]. Required. Batch indices must be in the range of [0, N-1].

Outputs :

  • 1 : 4D output tensor with areas copied and interpolated from the 1st input tensor by coordinates of boxes from the 2nd input.

Types

  • T : any supported floating-point type.

Example

<layer ... type="PSROIPooling" ... >
    <data group_size="6" mode="bilinear" output_dim="360" spatial_bins_x="3" spatial_bins_y="3" spatial_scale="1"/>
    <input>
        <port id="0">
            <dim>1</dim>
            <dim>3240</dim>
            <dim>38</dim>
            <dim>38</dim>
        </port>
        <port id="1">
            <dim>100</dim>
            <dim>5</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>100</dim>
            <dim>360</dim>
            <dim>6</dim>
            <dim>6</dim>
        </port>
    </output>
</layer>