NonMaxSuppression
Versioned name: NonMaxSuppression-5
Category: Sorting and maximization
Short description: NonMaxSuppression performs non-maximum suppression of the boxes with predicted scores.
Detailed description: NonMaxSuppression performs the non-maximum suppression algorithm as described below:
1. Let B = [b_0,...,b_n] be the list of initial detection boxes, S = [s_0,...,s_n] be the list of corresponding scores.
2. Let D = [] be an initial collection of resulting boxes.
3. If B is empty then go to step 8.
4. Take the box with the highest score. Suppose that it is the box b with the score s.
5. Delete b from B.
6. If the score s is greater than or equal to score_threshold then add b to D, else go to step 8.
7. For each input box b_i from B and the corresponding score s_i, set s_i = s_i * func(IOU(b_i, b)) and go to step 3.
8. Return D, a collection of the corresponding scores S, and the number of elements in D.
Here, when soft_nms_sigma == 0, func(iou) = 1 if iou <= iou_threshold and 0 otherwise. When soft_nms_sigma != 0, func(iou) = exp(-0.5 * iou * iou / soft_nms_sigma) if iou <= iou_threshold and 0 otherwise.
This algorithm is applied independently to each class of each batch element. The total number of output boxes for each
class must not exceed max_output_boxes_per_class.
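The steps above can be sketched for a single class of a single batch element as follows. This is a minimal illustration in plain Python, not the implementation used by any plugin. It assumes boxes in "corner" encoding, and the helper names iou and nms_single_class are hypothetical. Following the steps literally, a suppressed box has score 0, which still passes a score_threshold of 0, so the usage example uses a positive threshold.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as [y1, x1, y2, x2].

    The two corners may be any diagonal pair, so they are ordered first.
    """
    ay1, ay2 = min(a[0], a[2]), max(a[0], a[2])
    ax1, ax2 = min(a[1], a[3]), max(a[1], a[3])
    by1, by2 = min(b[0], b[2]), max(b[0], b[2])
    bx1, bx2 = min(b[1], b[3]), max(b[1], b[3])
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter = inter_h * inter_w
    union = (ay2 - ay1) * (ax2 - ax1) + (by2 - by1) * (bx2 - bx1) - inter
    return inter / union if union > 0 else 0.0


def func(iou_value, iou_threshold, soft_nms_sigma):
    """Score multiplier as defined in the description above."""
    if iou_value > iou_threshold:
        return 0.0
    if soft_nms_sigma == 0:
        return 1.0
    return math.exp(-0.5 * iou_value * iou_value / soft_nms_sigma)


def nms_single_class(boxes, scores, max_output_boxes_per_class,
                     iou_threshold, score_threshold, soft_nms_sigma=0.0):
    """Return (selected, count); selected holds (box_index, score) pairs."""
    remaining = list(range(len(boxes)))  # B, stored as box indices
    s = list(scores)                     # working copy of S
    selected = []                        # D
    while remaining and len(selected) < max_output_boxes_per_class:
        best = max(remaining, key=lambda i: s[i])  # step 4
        remaining.remove(best)                     # step 5
        if s[best] < score_threshold:              # step 6
            break
        selected.append((best, s[best]))
        for i in remaining:                        # step 7
            s[i] *= func(iou(boxes[i], boxes[best]),
                         iou_threshold, soft_nms_sigma)
    return selected, len(selected)


# Box 1 overlaps box 0 heavily and is suppressed; box 2 is disjoint.
boxes = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.9], [2.0, 2.0, 3.0, 3.0]]
scores = [0.9, 0.8, 0.7]
selected, count = nms_single_class(boxes, scores, 10, 0.5, 0.1)
print(selected, count)
```

A full operation would run this routine once per class of each batch element and collect the results as [batch_index, class_index, box_index] triplets.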
Attributes:
box_encoding
Description: box_encoding specifies the format of boxes data encoding.
Range of values: “corner” or “center”
corner - the box data is supplied as [y1, x1, y2, x2], where (y1, x1) and (y2, x2) are the coordinates of any diagonal pair of box corners.
center - the box data is supplied as [x_center, y_center, width, height].
Type: string
Default value: “corner”
Required: no
sort_result_descending
Description: sort_result_descending is a flag that specifies whether it is necessary to sort selected boxes across batches or not.
Range of values: true or false
true - sort selected boxes across batches.
false - do not sort selected boxes across batches (boxes are sorted per class).
Type: boolean
Default value: true
Required: no
output_type
Description: the output tensor type
Range of values: “i64” or “i32”
Type: string
Default value: “i64”
Required: no
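The relation between the two box_encoding layouts can be illustrated with a small conversion. This is a sketch only: the function names center_to_corner and normalize_corner are hypothetical, and it assumes that width and height are non-negative.

```python
def center_to_corner(box):
    """Convert [x_center, y_center, width, height] to [y1, x1, y2, x2]."""
    x_center, y_center, width, height = box
    return [
        y_center - height / 2,  # y1
        x_center - width / 2,   # x1
        y_center + height / 2,  # y2
        x_center + width / 2,   # x2
    ]


def normalize_corner(box):
    """Reorder any diagonal pair of corners so that y1 <= y2 and x1 <= x2."""
    y1, x1, y2, x2 = box
    return [min(y1, y2), min(x1, x2), max(y1, y2), max(x1, x2)]


corner_box = center_to_corner([2.0, 3.0, 4.0, 2.0])
print(corner_box)
```

Note that the coordinate order differs between the layouts: "corner" lists y before x, while "center" lists x before y.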
Inputs:
1: boxes - tensor of type T and shape [num_batches, num_boxes, 4] with box coordinates. Required.
2: scores - tensor of type T and shape [num_batches, num_classes, num_boxes] with box scores. Required.
3: max_output_boxes_per_class - scalar or 1D tensor with 1 element of type T_MAX_BOXES specifying the maximum number of boxes to be selected per class. Optional with default value 0, meaning select no boxes.
4: iou_threshold - scalar or 1D tensor with 1 element of type T_THRESHOLDS specifying the intersection over union threshold. Optional with default value 0, meaning keep all boxes.
5: score_threshold - scalar or 1D tensor with 1 element of type T_THRESHOLDS specifying the minimum score for a box to be considered for processing. Optional with default value 0.
6: soft_nms_sigma - scalar or 1D tensor with 1 element of type T_THRESHOLDS specifying the sigma parameter for Soft-NMS; see Bodla et al. Optional with default value 0.
Outputs:
1: selected_indices - tensor of type T_IND and shape [number of selected boxes, 3] containing information about selected boxes as triplets [batch_index, class_index, box_index].
2: selected_scores - tensor of type T_THRESHOLDS and shape [number of selected boxes, 3] containing information about scores for each selected box as triplets [batch_index, class_index, box_score].
3: valid_outputs - 1D tensor with 1 element of type T_IND representing the total number of selected boxes.
Plugins that do not support dynamic output tensors produce selected_indices and selected_scores tensors of shape [min(num_boxes, max_output_boxes_per_class) * num_batches * num_classes, 3], which is an upper bound on the number of boxes that can be selected. Output tensor elements following the actually selected boxes are filled with the value -1.
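The upper bound and the -1 padding described above can be checked with a short calculation. This is a hedged sketch: the helper names static_output_rows and pad_selected are hypothetical and are not part of any plugin API.

```python
def static_output_rows(num_batches, num_classes, num_boxes,
                       max_output_boxes_per_class):
    """Upper bound on the number of rows in the static output tensors."""
    return min(num_boxes, max_output_boxes_per_class) * num_batches * num_classes


def pad_selected(rows, total_rows, fill=-1):
    """Append rows of three fill values after the actually selected boxes."""
    padded = [list(row) for row in rows]
    padded += [[fill] * 3 for _ in range(total_rows - len(padded))]
    return padded


# 3 batches, 5 classes, 100 boxes, at most 10 boxes per class.
rows = static_output_rows(3, 5, 100, 10)
print(rows)

# One selected triplet padded to a static shape of [3, 3].
print(pad_selected([[0, 1, 7]], 3))
```

With dimensions like these, valid_outputs tells the consumer how many leading rows hold real selections; the remaining rows contain only the fill value.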
Types
T: floating-point type.
T_MAX_BOXES: integer type.
T_THRESHOLDS: floating-point type.
T_IND: int64 or int32.
Example
<layer ... type="NonMaxSuppression" ... >
    <data box_encoding="corner" sort_result_descending="1" output_type="i64"/>
    <input>
        <port id="0">
            <dim>3</dim>
            <dim>100</dim>
            <dim>4</dim>
        </port>
        <port id="1">
            <dim>3</dim>
            <dim>5</dim>
            <dim>100</dim>
        </port>
        <port id="2"/> <!-- 10 -->
        <port id="3"/>
        <port id="4"/>
    </input>
    <output>
        <port id="5" precision="I64">
            <dim>150</dim> <!-- min(100, 10) * 3 * 5 -->
            <dim>3</dim>
        </port>
        <port id="6" precision="FP32">
            <dim>150</dim> <!-- min(100, 10) * 3 * 5 -->
            <dim>3</dim>
        </port>
        <port id="7" precision="I64">
            <dim>1</dim>
        </port>
    </output>
</layer>