Operation Set `opset1` Specification

This specification document describes the opset1 operation set supported in OpenVINO. Support for each particular operation from the list below depends on the capabilities of an inference plugin and may vary among different hardware platforms and devices. Examples of operation instances are expressed as IR V10 XML snippets. Such IR is generated by the Model Optimizer. The semantics match the corresponding nGraph operation classes declared in the namespace opset1.


Sigmoid


Category: Activation function

Short description: Sigmoid element-wise activation function.

Attributes: Sigmoid operation has no attributes.

Inputs:

Outputs:

Mathematical Formulation

For each element of the input tensor, the operation calculates the corresponding element of the output tensor using the following formula:

\[ sigmoid( x ) = \frac{1}{1+e^{-x}} \]
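A scalar reference sketch of the formula in plain Python (the helper name is illustrative):

```python
import math

def sigmoid(x):
    # reference element-wise sigmoid: 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0))  # 0.5
```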


Tanh


Category: Activation function

Short description: Tanh element-wise activation function.

Attributes: Tanh operation has no attributes.

Inputs:

Outputs:

Detailed description

For each element of the input tensor, the operation calculates the corresponding element of the output tensor using the following formula:

\[ tanh ( x ) = \frac{2}{1+e^{-2x}} - 1 = 2sigmoid(2x) - 1 \]
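The sigmoid identity in the formula can be checked numerically in plain Python (the function name is illustrative):

```python
import math

def tanh_via_sigmoid(x):
    # identity from the formula above: tanh(x) = 2 * sigmoid(2x) - 1
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

print(round(tanh_via_sigmoid(0.7), 6), round(math.tanh(0.7), 6))
```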


Elu


Category: Activation function

Short description: Exponential linear unit element-wise activation function.

Detailed Description

For each element of the input tensor, the operation calculates the corresponding element of the output tensor using the following formula:

\[ elu(x) = \left\{\begin{array}{ll} alpha(e^{x} - 1) \quad \mbox{if } x < 0 \\ x \quad \mbox{if } x \geq 0 \end{array}\right. \]
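A scalar sketch of the formula in plain Python (the default alpha value here is illustrative; in IR alpha is an attribute of the layer):

```python
import math

def elu(x, alpha=1.0):
    # alpha * (e^x - 1) for negative inputs, identity otherwise
    return alpha * (math.exp(x) - 1.0) if x < 0 else x
```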

Attributes

Inputs:

Outputs:


Erf


Category: Arithmetic unary operation

Short description: Erf calculates the Gauss error function element-wise on a given tensor.

Detailed Description

For each element of the input tensor, the operation calculates the corresponding element of the output tensor using the following formula:

\[ erf(x) = \frac{2}{\sqrt{\pi}} \int_{0}^{x} e^{-t^2} dt \]
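The integral can be checked against Python's built-in `math.erf` with a simple midpoint-rule approximation (the function name is illustrative):

```python
import math

def erf_numeric(x, n=10000):
    # midpoint-rule integration of (2 / sqrt(pi)) * e^(-t^2) over [0, x]
    h = x / n
    s = sum(math.exp(-((i + 0.5) * h) ** 2) for i in range(n))
    return 2.0 / math.sqrt(math.pi) * s * h

print(round(erf_numeric(1.0), 6), round(math.erf(1.0), 6))
```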

Attributes:

No attributes available.

Inputs

Outputs

Types

Examples

Example 1

<layer ... type="Erf">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Selu


Category: Arithmetic unary operation

Short description: Selu calculates the SELU activation function (https://arxiv.org/abs/1706.02515) element-wise on a given tensor.

Detailed Description

For each element of the input tensor, the operation calculates the corresponding element of the output tensor using the following formula:

\[ selu(x) = \lambda \left\{\begin{array}{ll} \alpha(e^{x} - 1) \quad \mbox{if } x \le 0 \\ x \quad \mbox{if } x > 0 \end{array}\right. \]
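A scalar sketch of the formula in plain Python. The alpha and lambda constants below are the values from the referenced paper and are used here only as illustrative defaults; in the IR they arrive as the second and third inputs of the layer:

```python
import math

ALPHA = 1.6732632423543772   # illustrative: alpha from arXiv 1706.02515
LAMBDA = 1.0507009873554805  # illustrative: lambda from arXiv 1706.02515

def selu(x, alpha=ALPHA, lam=LAMBDA):
    # lambda * (alpha * (e^x - 1)) for x <= 0, lambda * x otherwise
    return lam * (alpha * (math.exp(x) - 1.0) if x <= 0 else x)
```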

Attributes:

No attributes available.

Inputs

Outputs

Types

Examples

Example 1

<layer ... type="Selu">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>1</dim>
</port>
<port id="2">
<dim>1</dim>
</port>
</input>
<output>
<port id="3">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

FloorMod


Category: Arithmetic binary operation

Short description: FloorMod returns an element-wise division remainder for two given tensors, applying multi-directional broadcast rules. The result is consistent with a flooring divide (as in the Python programming language): floor(x / y) * y + mod(x, y) = x. The sign of the result is equal to the sign of the divisor.
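The flooring-divide identity can be sketched for scalars in plain Python, whose own `%` operator has the same semantics (the function name is illustrative):

```python
import math

def floor_mod(x, y):
    # flooring-division remainder: floor(x / y) * y + floor_mod(x, y) == x
    return x - math.floor(x / y) * y

# the result sign follows the divisor, matching Python's % operator
print(floor_mod(7, -3), 7 % -3)  # -2 -2
```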

Attributes:

Inputs

Outputs

Types

Examples

Example 1

<layer ... type="FloorMod">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="FloorMod">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

Mod


Category: Arithmetic binary operation

Short description: Mod returns an element-wise division remainder for two given tensors, applying multi-directional broadcast rules. The result is consistent with a truncated divide (as in the C programming language): truncated(x / y) * y + truncated_mod(x, y) = x. The sign of the result is equal to the sign of the dividend.
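The truncated-divide semantics can be sketched for scalars in plain Python (the function name is illustrative); note how the result sign differs from FloorMod for the same inputs:

```python
import math

def trunc_mod(x, y):
    # truncating-division remainder (C semantics): sign follows the dividend
    return x - math.trunc(x / y) * y

print(trunc_mod(7, -3), trunc_mod(-7, 3))  # 1 -1
```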

Attributes:

Inputs

Outputs

Types

Examples

Example 1

<layer ... type="Mod">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="Mod">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

HardSigmoid


Category: Activation function

Short description: HardSigmoid calculates the hard sigmoid function y(x) = max(0, min(1, alpha * x + beta)) element-wise on a given tensor.
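A scalar sketch of the function in plain Python. The alpha and beta defaults below are purely illustrative; in the IR they arrive as the scalar inputs on ports 1 and 2:

```python
def hard_sigmoid(x, alpha=0.2, beta=0.5):
    # y = max(0, min(1, alpha * x + beta))
    return max(0.0, min(1.0, alpha * x + beta))
```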

Attributes:

No attributes available.

Inputs

Outputs

Types

Examples

<layer ... type="HardSigmoid">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1"/>
<port id="2"/>
</input>
<output>
<port id="3">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

ShuffleChannels


Name: ShuffleChannels

Category: Layer

Short description: ShuffleChannels permutes data in the channel dimension of the input tensor.

Attributes:

Inputs:

Outputs:

Mathematical Formulation

The operation is equivalent to the following transformation of the input tensor x of shape [N, C, H, W]:

x' = reshape(x, [N, group, C / group, H * W])
x'' = transpose(x', [0, 2, 1, 3])
y = reshape(x'', [N, C, H, W])

where group is the layer attribute described above and axis = 1.
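The reshape-transpose-reshape sequence reduces to a simple interleaving of channel groups, sketched here in plain Python on a 1-D list of channel indices (the function name is illustrative):

```python
def shuffle_channels(channels, group):
    # reshape [C] -> [group, C/group], transpose, then flatten back
    c = len(channels)
    rows = [channels[i * (c // group):(i + 1) * (c // group)]
            for i in range(group)]
    return [rows[g][j] for j in range(c // group) for g in range(group)]

print(shuffle_channels([0, 1, 2, 3, 4, 5], 2))  # [0, 3, 1, 4, 2, 5]
```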

Example

<layer ... type="ShuffleChannels" ...>
<data group="3" axis="1"/>
<input>
<port id="0">
<dim>5</dim>
<dim>12</dim>
<dim>200</dim>
<dim>400</dim>
</port>
</input>
<output>
<port id="1">
<dim>5</dim>
<dim>12</dim>
<dim>200</dim>
<dim>400</dim>
</port>
</output>
</layer>

NonMaxSuppression


Short description: NonMaxSuppression performs non maximum suppression of the boxes with predicted scores.

Detailed description: NonMaxSuppression layer performs non maximum suppression algorithm as described below:

  1. Take the box with the highest score. If the score is less than score_threshold, stop. Otherwise, add the box to the output and continue to the next step.
  2. For each input box, calculate the IOU (intersection over union) with the box added during the previous step. If the value is greater than iou_threshold, remove the input box from further consideration.
  3. Return to step 1.

This algorithm is applied independently to each class of each batch element. The total number of output boxes for each class must not exceed max_output_boxes_per_class.
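The steps above can be sketched as a greedy loop in plain Python. This sketch follows the standard formulation in which a candidate is compared against all previously kept boxes (the function names are illustrative):

```python
def iou(a, b):
    # intersection-over-union of two (x1, y1, x2, y2) corner boxes
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold, score_threshold, max_boxes):
    # greedy suppression for a single class of a single batch element
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if scores[i] < score_threshold or len(keep) == max_boxes:
            break
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```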

Attributes:

Inputs:

Outputs:

Example

<layer ... type="NonMaxSuppression" ... >
<data box_encoding="corner" sort_result_descending="1"/>
<input>
<port id="0">
<dim>1</dim>
<dim>1000</dim>
<dim>4</dim>
</port>
<port id="1">
<dim>1</dim>
<dim>1</dim>
<dim>1000</dim>
</port>
<port id="2"/>
<port id="3"/>
<port id="4"/>
</input>
<output>
<port id="5" precision="I32">
<dim>1000</dim>
<dim>3</dim>
</port>
</output>
</layer>

Equal


Category: Comparison binary operation

Short description: Equal performs element-wise comparison operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description Before performing the comparison operation, input tensors a and b are broadcast if their shapes are different and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting Equal does the following with the input tensors a and b:

\[ o_{i} = a_{i} == b_{i} \]
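The multi-directional (numpy-style) broadcast of the two input shapes can be sketched in plain Python; the shapes below are taken from Example 2 of this section (the function name is illustrative):

```python
from itertools import zip_longest

def broadcast_shape(a, b):
    # numpy-style multi-directional broadcasting of two shapes,
    # aligned from the right-most dimension
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError("shapes are not broadcastable")
        out.append(max(x, y))
    return tuple(reversed(out))

print(broadcast_shape((8, 1, 6, 1), (7, 1, 5)))  # (8, 7, 6, 5)
```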

Examples

Example 1

<layer ... type="Equal">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="Equal">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

Clamp


Category: Activation function

Short description: Clamp operation represents clipping activation function.

Attributes:

Inputs:

Outputs:

Detailed description:

Clamp does the following with the input tensor element-wise:

\[ clamp( x )=\left\{\begin{array}{ll} max\_value \quad \mbox{if } \quad x > max\_value \\ min\_value \quad \mbox{if } \quad x < min\_value \\ x \quad \mbox{otherwise} \end{array}\right. \]
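A scalar sketch in plain Python, using the min/max values from the example in this section (the function name is illustrative):

```python
def clamp(x, min_value, max_value):
    # clip x into the [min_value, max_value] interval
    return max(min_value, min(x, max_value))

print(clamp(60, 10, 50), clamp(5, 10, 50), clamp(30, 10, 50))  # 50 10 30
```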

Example

<layer ... type="Clamp" ... >
<data min="10" max="50" />
<input> ... </input>
<output> ... </output>
</layer>

Constant


Category: Infrastructure

Short description: Constant operation produces a tensor with content read from binary file by offset and size.

Attributes

Example

<layer ... type="Constant">
<data offset="1000" size="256" element_type="f32" shape="8,8"/>
<output>
<port id="1">
<dim>8</dim>
<dim>8</dim>
</port>
</output>
</layer>

Concat


Category: Data movement operation

Short description: Concatenates arbitrary number of input tensors to a single output tensor along one axis.

Attributes:

Inputs:

Outputs:

Example

<layer id="1" type="Concat">
<data axis="1" />
<input>
<port id="0">
<dim>1</dim>
<dim>8</dim>
<dim>50</dim>
<dim>50</dim>
</port>
<port id="1">
<dim>1</dim>
<dim>16</dim>
<dim>50</dim>
<dim>50</dim>
</port>
<port id="2">
<dim>1</dim>
<dim>32</dim>
<dim>50</dim>
<dim>50</dim>
</port>
</input>
<output>
<port id="0">
<dim>1</dim>
<dim>56</dim>
<dim>50</dim>
<dim>50</dim>
</port>
</output>
</layer>

Convolution


Category: Convolution

Short description: Reference

Detailed description: Reference

Attributes

Inputs:

Example

<layer type="Convolution" ...>
<data dilations="1,1" pads_begin="2,2" pads_end="2,2" strides="1,1"/>
<input>
<port id="0">
<dim>1</dim>
<dim>3</dim>
<dim>224</dim>
<dim>224</dim>
</port>
<port id="1">
<dim>64</dim>
<dim>3</dim>
<dim>5</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2" precision="FP32">
<dim>1</dim>
<dim>64</dim>
<dim>224</dim>
<dim>224</dim>
</port>
</output>
</layer>

ConvolutionBackpropData


Category: Convolution

Short description: Computes the gradients of a Convolution operation with respect to the input. Also known as a Deconvolution or a Transposed Convolution.

Detailed description:

ConvolutionBackpropData takes the input tensor, weights tensor and output shape and computes the output tensor of a given shape. The shape of the output can be specified as an input 1D integer tensor explicitly or determined by other attributes implicitly. If output shape is specified as an explicit input, shape of the output exactly matches the specified size and required amount of padding is computed.

ConvolutionBackpropData accepts the same set of attributes as a regular Convolution operation, but they are interpreted in a "backward way", so they are applied to the output of ConvolutionBackpropData, but not to the input. Refer to a regular Convolution operation for detailed description of each attribute.

When specified as an input, output_shape defines only spatial dimensions. No batch or channel dimension should be passed along with H, W, or other spatial dimensions. If output_shape is omitted, then pads_begin, pads_end, or auto_pad are used to determine the output spatial shape [Y_1, Y_2, ..., Y_D] from the input spatial shape [X_1, X_2, ..., X_D] in the following way:

Y_i = stride[i] * (X_i - 1) + ((K_i - 1) * dilations[i] + 1) - pads_begin[i] - pads_end[i] + output_padding[i]

where K_i is the filter kernel dimension along spatial axis i.
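The formula can be evaluated directly in plain Python; the values below reproduce the 224 -> 447 spatial shape of the IR example in this section (the function name is illustrative):

```python
def deconv_out_dim(x, stride, kernel, dilation, pad_begin, pad_end,
                   output_padding=0):
    # Y_i = stride * (X_i - 1) + ((K_i - 1) * dilation + 1)
    #       - pads_begin - pads_end + output_padding
    return (stride * (x - 1) + ((kernel - 1) * dilation + 1)
            - pad_begin - pad_end + output_padding)

print(deconv_out_dim(224, stride=2, kernel=3, dilation=1,
                     pad_begin=1, pad_end=1))  # 447
```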

If output_shape is specified, neither pads_begin nor pads_end should be specified, but auto_pad defines how to distribute the padding amount around the tensor. In this case, pads are determined based on the following formulas to correctly align input and output tensors (similar to the ONNX definition at https://github.com/onnx/onnx/blob/master/docs/Operators.md#convtranspose):

total_padding[i] = stride[i] * (X_i - 1) + ((K_i - 1) * dilations[i] + 1) - output_shape[i] + output_padding[i]
if auto_pads != SAME_UPPER:
pads_begin[i] = total_padding[i] // 2
pads_end[i] = total_padding[i] - pads_begin[i]
else:
pads_end[i] = total_padding[i] // 2
pads_begin[i] = total_padding[i] - pads_end[i]

Attributes

Inputs:

Outputs:

Example

<layer id="5" name="upsampling_node" type="ConvolutionBackpropData">
<data dilations="1,1" pads_begin="1,1" pads_end="1,1" strides="2,2"/>
<input>
<port id="0">
<dim>1</dim>
<dim>20</dim>
<dim>224</dim>
<dim>224</dim>
</port>
<port id="1">
<dim>20</dim>
<dim>10</dim>
<dim>3</dim>
<dim>3</dim>
</port>
</input>
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>10</dim>
<dim>447</dim>
<dim>447</dim>
</port>
</output>
</layer>

GRN


Category: Normalization

Short description: GRN is the Global Response Normalization with L2 norm (across channels only).

Detailed description:

GRN computes the L2 norm across channels for an input tensor with shape [N, C, ...]. GRN does the following with the input tensor:

output[i0, i1, ..., iN] = x[i0, i1, ..., iN] / sqrt(sum[j = 0..C-1](x[i0, j, ..., iN]**2) + bias)
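For a single spatial position, the formula reduces to normalizing the channel vector by its L2 norm, sketched here in plain Python (the function name is illustrative):

```python
import math

def grn_pixel(x, bias):
    # normalize one spatial position's channel vector by its L2 norm
    norm = math.sqrt(sum(v * v for v in x) + bias)
    return [v / norm for v in x]

print(grn_pixel([3.0, 4.0], bias=0.0))  # [0.6, 0.8]
```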

Attributes:

Inputs

Outputs

Example

<layer id="5" name="normalization" type="GRN">
<data bias="1e-4"/>
<input>
<port id="0">
<dim>1</dim>
<dim>20</dim>
<dim>224</dim>
<dim>224</dim>
</port>
</input>
<output>
<port id="0" precision="f32">
<dim>1</dim>
<dim>20</dim>
<dim>224</dim>
<dim>224</dim>
</port>
</output>
</layer>

GroupConvolution


Category: Convolution

Short description: Reference

Detailed description: Reference

Attributes

Inputs:

Mathematical Formulation

Example

<layer type="GroupConvolution" ...>
<data dilations="1,1" pads_begin="2,2" pads_end="2,2" strides="1,1"/>
<input>
<port id="0">
<dim>1</dim>
<dim>12</dim>
<dim>224</dim>
<dim>224</dim>
</port>
<port id="1">
<dim>4</dim>
<dim>1</dim>
<dim>3</dim>
<dim>5</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2" precision="FP32">
<dim>1</dim>
<dim>4</dim>
<dim>224</dim>
<dim>224</dim>
</port>
</output>
</layer>

GroupConvolutionBackpropData


Category: Convolution

Short description: Computes the gradients of a GroupConvolution operation with respect to the input. Also known as Deconvolution or Transposed Convolution.

Detailed description:

GroupConvolutionBackpropData is similar to ConvolutionBackpropData but also specifies the group processing in a way similar to how GroupConvolution extends behavior of a regular Convolution operation.

GroupConvolutionBackpropData takes input tensor, weights tensor and output shape and computes output tensor of a given shape. The shape of the output can be specified as an input 1D integer tensor explicitly or determined according to other attributes implicitly. If the output shape is specified as an explicit input, shape of the output exactly matches the specified size and required amount of padding is computed.

GroupConvolutionBackpropData accepts the same set of attributes as a regular GroupConvolution operation, but they are interpreted in a "backward way", so they are applied to the output of GroupConvolutionBackpropData, but not to the input. Refer to a regular GroupConvolution operation for detailed description of each attribute.

When specified as an input, output_shape defines only spatial dimensions. No batch or channel dimension should be passed along with H, W, or other spatial dimensions. If output_shape is omitted, then pads_begin, pads_end, or auto_pad are used to determine the output spatial shape [Y_1, Y_2, ..., Y_D] from the input spatial shape [X_1, X_2, ..., X_D] in the following way:

Y_i = stride[i] * (X_i - 1) + ((K_i - 1) * dilations[i] + 1) - pads_begin[i] - pads_end[i] + output_padding[i]

where K_i is the filter kernel dimension along spatial axis i.

If output_shape is specified, neither pads_begin nor pads_end should be specified, but auto_pad defines how to distribute the padding amount around the tensor. In this case, pads are determined based on the following formulas to correctly align input and output tensors (similar to the ONNX definition at https://github.com/onnx/onnx/blob/master/docs/Operators.md#convtranspose):

total_padding[i] = stride[i] * (X_i - 1) + ((K_i - 1) * dilations[i] + 1) - output_shape[i] + output_padding[i]
if auto_pads != SAME_UPPER:
pads_begin[i] = total_padding[i] // 2
pads_end[i] = total_padding[i] - pads_begin[i]
else:
pads_end[i] = total_padding[i] // 2
pads_begin[i] = total_padding[i] - pads_end[i]

Attributes

Inputs:

Outputs:

Example

<layer id="5" name="upsampling_node" type="GroupConvolutionBackpropData">
<data dilations="1,1" pads_begin="1,1" pads_end="1,1" strides="2,2"/>
<input>
<port id="0">
<dim>1</dim>
<dim>20</dim>
<dim>224</dim>
<dim>224</dim>
</port>
<port id="1">
<dim>4</dim>
<dim>5</dim>
<dim>2</dim>
<dim>3</dim>
<dim>3</dim>
</port>
</input>
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>8</dim>
<dim>447</dim>
<dim>447</dim>
</port>
</output>
</layer>

MatMul


Category: Matrix multiplication

Short description: Generalized matrix multiplication

Detailed description

MatMul operation takes two tensors and performs the usual matrix-matrix multiplication, matrix-vector multiplication, or vector-matrix multiplication depending on the argument shapes. Input tensors can have any rank >= 1. The two right-most axes in each tensor are interpreted as matrix row and column dimensions, while all left-most axes (if present) are interpreted as a multi-dimensional batch: [BATCH_DIM_1, BATCH_DIM_2,..., BATCH_DIM_K, ROW_INDEX_DIM, COL_INDEX_DIM]. The operation supports the usual broadcast semantics for batch dimensions, which enables multiplication of a batch of pairs of matrices in a single shot.

Before matrix multiplication, there is an implicit shape alignment for input arguments. It consists of the following steps:

  1. If the rank of an input is less than 2, it is unsqueezed to a 2D tensor by adding an axis of size 1 to the left of the shape. For example, an input of shape [S] is reshaped to [1, S]. This is applied to each input independently.
  2. The transpositions specified by the optional transpose_a and transpose_b attributes are applied.
  3. If the ranks of the input arguments differ after steps 1 and 2, the lower-rank tensor is unsqueezed from the left side of the shape by the necessary number of axes to make both shapes the same rank.
  4. Usual broadcasting rules are applied to the batch dimensions.

The two attributes, transpose_a and transpose_b, specify embedded transposition of the two right-most dimensions of the first and the second input tensors respectively. Transposition implies swapping of ROW_INDEX_DIM and COL_INDEX_DIM in the corresponding input tensor. Batch dimensions are not affected by these attributes.
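The shape-alignment steps can be sketched as a shape-inference routine in plain Python; the test shapes come from the examples in this section (the function name is illustrative):

```python
from itertools import zip_longest

def matmul_output_shape(a, b, transpose_a=False, transpose_b=False):
    a, b = list(a), list(b)
    # step 1: unsqueeze 1-D inputs to 2-D from the left
    if len(a) == 1:
        a = [1] + a
    if len(b) == 1:
        b = [1] + b
    # step 2: embedded transposition of the two right-most dimensions
    if transpose_a:
        a[-2], a[-1] = a[-1], a[-2]
    if transpose_b:
        b[-2], b[-1] = b[-1], b[-2]
    # steps 3-4: align ranks and broadcast the batch dimensions
    batch = [max(x, y) for x, y in
             zip_longest(reversed(a[:-2]), reversed(b[:-2]), fillvalue=1)]
    assert a[-1] == b[-2], "reduction dimensions must match"
    return list(reversed(batch)) + [a[-2], b[-1]]

print(matmul_output_shape([5, 10, 1024], [1024, 1000]))  # [5, 10, 1000]
```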

Attributes

Inputs:

Example

Vector-matrix multiplication

<layer ... type="MatMul">
<input>
<port id="0">
<dim>1024</dim>
</port>
<port id="1">
<dim>1024</dim>
<dim>1000</dim>
</port>
</input>
<output>
<port id="2">
<dim>1</dim>
<dim>1000</dim>
</port>
</output>
</layer>

Matrix-matrix multiplication (like FullyConnected with batch size 1)

<layer ... type="MatMul">
<input>
<port id="0">
<dim>1</dim>
<dim>1024</dim>
</port>
<port id="1">
<dim>1024</dim>
<dim>1000</dim>
</port>
</input>
<output>
<port id="2">
<dim>1</dim>
<dim>1000</dim>
</port>
</output>
</layer>

Matrix-vector multiplication with embedded transposition of the second matrix

<layer ... type="MatMul">
<data transpose_b="true"/>
<input>
<port id="0">
<dim>1</dim>
<dim>1024</dim>
</port>
<port id="1">
<dim>1000</dim>
<dim>1024</dim>
</port>
</input>
<output>
<port id="2">
<dim>1</dim>
<dim>1000</dim>
</port>
</output>
</layer>

Matrix-matrix multiplication (like FullyConnected with batch size 10)

<layer ... type="MatMul">
<input>
<port id="0">
<dim>10</dim>
<dim>1024</dim>
</port>
<port id="1">
<dim>1024</dim>
<dim>1000</dim>
</port>
</input>
<output>
<port id="2">
<dim>10</dim>
<dim>1000</dim>
</port>
</output>
</layer>

Multiplication of batch of 5 matrices by a one matrix with broadcasting

<layer ... type="MatMul">
<input>
<port id="0">
<dim>5</dim>
<dim>10</dim>
<dim>1024</dim>
</port>
<port id="1">
<dim>1024</dim>
<dim>1000</dim>
</port>
</input>
<output>
<port id="2">
<dim>5</dim>
<dim>10</dim>
<dim>1000</dim>
</port>
</output>
</layer>

DetectionOutput


Category: Object detection

Short description: DetectionOutput performs non-maximum suppression to generate the detection output using information on location and confidence predictions.

Detailed description: Reference. The layer has 3 mandatory inputs: a tensor with box logits, a tensor with confidence predictions, and a tensor with box coordinates (proposals). It can have 2 additional inputs with additional confidence predictions and box coordinates described in the article. The 5-input version of the layer is supported by the Myriad plugin only. The output tensor contains information about filtered detections described with 7-element tuples: [batch_id, class_id, confidence, x_1, y_1, x_2, y_2]. The first tuple with batch_id equal to -1 indicates the end of the output.

At each feature map cell, DetectionOutput predicts the offsets relative to the default box shapes in the cell, as well as the per-class scores that indicate the presence of a class instance in each of those boxes. Specifically, for each box out of k at a given location, DetectionOutput computes class scores and the four offsets relative to the original default box shape. This results in a total of $(c + 4)k$ filters that are applied around each location in the feature map, yielding $(c + 4)kmn$ outputs for an $m \times n$ feature map.

Attributes:

Inputs

Example

<layer ... type="DetectionOutput" ... >
<data num_classes="21" share_location="1" background_label_id="0" nms_threshold="0.450000" top_k="400" input_height="1" input_width="1" code_type="caffe.PriorBoxParameter.CENTER_SIZE" variance_encoded_in_target="0" keep_top_k="200" confidence_threshold="0.010000"/>
<input> ... </input>
<output> ... </output>
</layer>

LRN


Category: Normalization

Short description: Local response normalization.

Attributes:

Inputs

Outputs

Detailed description: Reference

Here is an example for 4D data input tensor and axes = [1]:

sqr_sum[a, b, c, d] =
    sum(input[a, b - local_size : b + local_size + 1, c, d] ** 2)
output = input / (bias + alpha * sqr_sum) ** beta
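The pseudocode above can be transcribed for a single channel position of a 1-D channel vector in plain Python (the function name is illustrative):

```python
def lrn_ref(x, b, local_size, alpha, beta, bias):
    # direct 1-D transcription of the pseudocode above:
    # square-sum over a window of neighbouring channels, then normalize
    window = x[max(0, b - local_size): b + local_size + 1]
    sqr_sum = sum(v * v for v in window)
    return x[b] / (bias + alpha * sqr_sum) ** beta
```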

Example

<layer id="1" type="LRN" ...>
<data alpha="1.0e-04" beta="0.75" size="5" bias="1"/>
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="1">
<dim>1</dim>
</port>
</input>
<output>
<port id="2">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
</output>
</layer>

MaxPool


Category: Pooling

Short description: Reference

Detailed description: Reference

Attributes: Pooling attributes are specified in the data node, which is a child of the layer node.

Inputs:

Mathematical Formulation

\[ output_{j} = MAX\{ x_{0}, ... x_{i}\} \]

Example

<layer ... type="MaxPool" ... >
<data auto_pad="same_upper" kernel="3,3" pads_begin="0,0" pads_end="1,1" strides="2,2"/>
<input> ... </input>
<output> ... </output>
</layer>

AvgPool


Category: Pooling

Short description: Reference

Detailed description: Reference

Attributes: Pooling attributes are specified in the data node, which is a child of the layer node.

Inputs:

Mathematical Formulation

\[ output_{j} = \frac{\sum_{i = 0}^{n}x_{i}}{n} \]

Example

<layer ... type="AvgPool" ... >
<data auto_pad="same_upper" exclude_pad="true" kernel="3,3" pads_begin="0,0" pads_end="1,1" strides="2,2"/>
<input> ... </input>
<output> ... </output>
</layer>

PriorBox


Category: Object detection

Short description: PriorBox operation generates prior boxes of specified sizes and aspect ratios across all dimensions.

Attributes:

Inputs:

Outputs:

Detailed description:

PriorBox computes coordinates of prior boxes as follows:

  1. First calculates center_x and center_y of prior box:

    \[ W \equiv Width \quad Of \quad Image \]

    \[ H \equiv Height \quad Of \quad Image \]

    • If step equals 0:

      \[ center_x=(w+0.5) \]

      \[ center_y=(h+0.5) \]

    • else:

      \[ center_x=(w+offset)*step \]

      \[ center_y=(h+offset)*step \]

      \[ w \subset \left( 0, W \right ) \]

      \[ h \subset \left( 0, H \right ) \]

  2. Then, for each $ s \subset \left( 0, min\_sizes \right ) $ calculates coordinates of prior boxes:

    \[ xmin = \frac{center_x - \frac{s}{2}}{W} \]

    \[ ymin = \frac{center_y - \frac{s}{2}}{H} \]

    \[ xmax = \frac{center_x + \frac{s}{2}}{W} \]

    \[ ymax = \frac{center_y + \frac{s}{2}}{H} \]
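A small sketch of the box-corner computation in plain Python (the function name is illustrative; it assumes the reading center ± s/2, normalized by the image size, consistent with the PriorBoxClustered formulas later in this document):

```python
def prior_box_corners(center_x, center_y, s, W, H):
    # normalized (xmin, ymin, xmax, ymax) of one square prior of size s
    return ((center_x - s / 2) / W, (center_y - s / 2) / H,
            (center_x + s / 2) / W, (center_y + s / 2) / H)

print(prior_box_corners(8.0, 8.0, 16.0, W=32, H=32))  # (0.0, 0.0, 0.5, 0.5)
```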

Example

<layer type="PriorBox" ...>
<data aspect_ratio="2.0" clip="0" density="" fixed_ratio="" fixed_size="" flip="1" max_size="38.46" min_size="16.0" offset="0.5" step="16.0" variance="0.1,0.1,0.2,0.2"/>
<input>
<port id="0">
<dim>2</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2">
<dim>2</dim>
<dim>16128</dim>
</port>
</output>
</layer>

PriorBoxClustered


Category: Object detection

Short description: PriorBoxClustered operation generates prior boxes of specified sizes normalized to the input image size.

Attributes

Inputs:

Outputs:

Detailed description

PriorBoxClustered computes coordinates of prior boxes as follows:

  1. Calculates the center_x and center_y of prior box:

    \[ W \equiv Width \quad Of \quad Image \]

    \[ H \equiv Height \quad Of \quad Image \]

    \[ center_x=(w+offset)*step \]

    \[ center_y=(h+offset)*step \]

    \[ w \subset \left( 0, W \right ) \]

    \[ h \subset \left( 0, H \right ) \]

  2. For each $s \subset \left( 0, W \right )$ calculates the prior boxes coordinates:

    \[ xmin = \frac{center_x - \frac{width_s}{2}}{W} \]

    \[ ymin = \frac{center_y - \frac{height_s}{2}}{H} \]

    \[ xmax = \frac{center_x + \frac{width_s}{2}}{W} \]

    \[ ymax = \frac{center_y + \frac{height_s}{2}}{H} \]

    If clip is defined, the coordinates of prior boxes are recalculated with the formula: $coordinate = \min(\max(coordinate,0), 1)$

Example

<layer type="PriorBoxClustered" ... >
<data clip="0" flip="1" height="44.0,10.0,30.0,19.0,94.0,32.0,61.0,53.0,17.0" offset="0.5" step="16.0" variance="0.1,0.1,0.2,0.2" width="86.0,13.0,57.0,39.0,68.0,34.0,142.0,50.0,23.0"/>
<input>
<port id="0">
<dim>2</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2">
<dim>2</dim>
<dim>6840</dim>
</port>
</output>
</layer>

ReLU


Category: Activation

Short description: Reference

Detailed description: Reference

Attributes: ReLU operation has no attributes.

Mathematical Formulation

\[ Y_{i}^{( l )} = max(0, Y_{i}^{( l - 1 )}) \]

Inputs:

Example

<layer ... type="ReLU">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Reshape


Category: Shape manipulation operations

Short description: Reshape operation changes dimensions of the input tensor according to the specified order. Input tensor volume is equal to output tensor volume, where volume is the product of dimensions.

Detailed description:

Reshape layer takes two input tensors: the tensor to be resized and the output tensor shape. The values in the second tensor can be -1, 0, or any positive integer. The two special values -1 and 0 have the following meaning: -1 means that this dimension is calculated automatically so that the total number of elements stays equal to that of the input tensor; 0 copies the corresponding dimension of the input tensor when the special_zero attribute is set to true (otherwise 0 is treated as a literal dimension of size zero).
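A sketch of the output-shape computation in plain Python; the test reproduces the second IR example of this section, assuming its shape input holds [0, -1, 4] (the function name is illustrative):

```python
def reshape_output_shape(input_shape, target, special_zero):
    out = list(target)
    if special_zero:
        # 0 copies the dimension at the same index of the input shape
        out = [input_shape[i] if d == 0 else d for i, d in enumerate(out)]
    if -1 in out:
        # -1 absorbs whatever remains of the total element count
        known = 1
        for d in out:
            if d != -1:
                known *= d
        total = 1
        for d in input_shape:
            total *= d
        out[out.index(-1)] = total // known
    return out

print(reshape_output_shape([2, 5, 5, 24], [0, -1, 4], special_zero=True))
```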

Attributes:

Inputs:

Outputs:

Examples

<layer ... type="Reshape" ...>
<data special_zero="false"/>
<input>
<port id="0">
<dim>2</dim>
<dim>5</dim>
<dim>5</dim>
<dim>0</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2">
<dim>0</dim>
<dim>4</dim>
</port>
</output>
</layer>
<layer ... type="Reshape" ...>
<data special_zero="true"/>
<input>
<port id="0">
<dim>2</dim>
<dim>5</dim>
<dim>5</dim>
<dim>24</dim>
</port>
<port id="1">
<dim>3</dim>
</port>
</input>
<output>
<port id="2">
<dim>2</dim>
<dim>150</dim>
<dim>4</dim>
</port>
</output>
</layer>

Parameter


Category: Infrastructure

Short description: Parameter layer specifies input to the model.

Attributes:

Example

<layer ... type="Parameter" ...>
<data element_type="f32" shape="1,3,224,224"/>
<output>
<port id="0">
<dim>1</dim>
<dim>3</dim>
<dim>224</dim>
<dim>224</dim>
</port>
</output>
</layer>

Add


Category: Arithmetic binary operation

Short description: Add performs element-wise addition operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description Before performing the arithmetic operation, input tensors a and b are broadcast if their shapes are different and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting Add does the following with the input tensors a and b:

\[ o_{i} = a_{i} + b_{i} \]

Examples

Example 1

<layer ... type="Add">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="Add">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

Multiply


Category: Arithmetic binary operation

Short description: Multiply performs element-wise multiplication operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description Before performing the arithmetic operation, input tensors a and b are broadcast if their shapes are different and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting Multiply does the following with the input tensors a and b:

\[ o_{i} = a_{i} * b_{i} \]

Examples

Example 1

<layer ... type="Multiply">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="Multiply">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

MVN


Category: Normalization

Short description: Reference

Detailed description

MVN subtracts the mean value from the input blob:

\[ o_{i} = i_{i} - \frac{\sum{i_{k}}}{C * H * W} \]

If normalize_variance is set to 1, the output blob is divided by the variance:

\[ o_{i}=\frac{o_{i}}{\sqrt {\sum o_{k}^2}+\epsilon} \]
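The mean-subtraction and variance-normalization steps of this section can be sketched for a 1-D vector in plain Python (the function name is illustrative):

```python
import math

def mvn(x, eps=1e-9, normalize_variance=True):
    # subtract the mean, then divide by the L2 norm of the centered values
    mean = sum(x) / len(x)
    o = [v - mean for v in x]
    if normalize_variance:
        denom = math.sqrt(sum(v * v for v in o)) + eps
        o = [v / denom for v in o]
    return o
```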

Attributes

Inputs

Outputs

Example

<layer ... type="MVN">
<data across_channels="true" eps="1e-9" normalize_variance="true"/>
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
</input>
<output>
<port id="2">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
</output>
</layer>

Power


Category: Arithmetic binary operation

Short description: Power performs element-wise power operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description: Before performing the arithmetic operation, input tensors a and b are broadcast if their shapes differ and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting Power does the following with the input tensors a and b:

\[ o_{i} = {a_{i}}^{b_{i}} \]

Examples

Example 1

<layer ... type="Power">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="Power">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

Exp

Back to top

Category: Activation function

Short description: Exponential element-wise activation function.

Attributes: operation has no attributes.

Inputs:

Outputs:


ShapeOf

Back to top

Category: Shape manipulation operations

Short description: ShapeOf produces a 1D tensor with the input tensor shape.

Attributes: operation has no attributes.

Inputs:

Outputs:

Example

<layer ... type="ShapeOf">
<input>
<port id="0">
<dim>2</dim>
<dim>3</dim>
<dim>224</dim>
<dim>224</dim>
</port>
</input>
<output>
<port id="1">
<dim>4</dim>
</port>
</output>
</layer>

SoftMax

Back to top

Category: Activation

Short description: SoftMax computes the normalized exponential function along the axis dimension.

Attributes

Inputs:

Outputs:

Detailed description

\[ y_{c} = \frac{e^{Z_{c}}}{\sum_{d=1}^{C}e^{Z_{d}}} \]

where $C$ is the size of the tensor along the axis dimension.
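The formula above can be sketched for a single 1-D slice along the axis dimension (a minimal Python sketch; subtracting the maximum before exponentiation is a standard numerical-stability trick that does not change the mathematical result):

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; the result is unchanged
    # because the factor e^{-m} cancels between numerator and denominator.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]
```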

Example

<layer ... type="SoftMax" ... >
<data axis="1" />
<input> ... </input>
<output> ... </output>
</layer>

PReLU

Back to top

Category: Activation function

Short description: PReLU performs element-wise parametric ReLU operation with negative slope defined by the second input.

Attributes: operation has no attributes.

Inputs

Outputs

Types

Detailed description: Before performing the operation, input tensor 2 with slope values is broadcast to input 1. The broadcasting rules are aligned with ONNX Broadcasting; a description is available in the ONNX docs.

After broadcasting PReLU does the following for each input 1 element x:

\[ PReLU(x) = \left\{\begin{array}{ll} slope \cdot x \quad \mbox{if } x < 0 \\ x \quad \mbox{if } x \geq 0 \end{array}\right. \]
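The elementwise rule can be sketched in Python (`prelu` is a hypothetical helper; a scalar slope is assumed here, while the operation's second input may in general carry per-channel slopes):

```python
def prelu(values, slope):
    # slope multiplies only the negative elements; non-negative elements
    # pass through unchanged.
    return [v if v >= 0 else slope * v for v in values]
```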

Interpolate

Back to top

Category: Image processing

Short description: Interpolate layer performs interpolation of independent slices in input tensor by specified dimensions and attributes.

Attributes

Inputs

Outputs

Example

<layer ... type="Interpolate" ...>
<data axes="2,3" align_corners="0" pads_begin="0,0" pads_end="0,0" mode="linear"/>
<input>
<port id="0">
<dim>1</dim>
<dim>2</dim>
<dim>48</dim>
<dim>80</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2">
<dim>1</dim>
<dim>2</dim>
<dim>50</dim>
<dim>60</dim>
</port>
</output>
</layer>

Less

Back to top

Category: Comparison binary operation

Short description: Less performs element-wise comparison operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description: Before performing the comparison operation, input tensors a and b are broadcast if their shapes differ and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting Less does the following with the input tensors a and b:

\[ o_{i} = a_{i} < b_{i} \]

Examples

Example 1

<layer ... type="Less">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="Less">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

LessEqual

Back to top

Category: Comparison binary operation

Short description: LessEqual performs element-wise comparison operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description: Before performing the comparison operation, input tensors a and b are broadcast if their shapes differ and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting LessEqual does the following with the input tensors a and b:

\[ o_{i} = a_{i} \leq b_{i} \]

Examples

Example 1

<layer ... type="LessEqual">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="LessEqual">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

PSROIPooling

Back to top

Category: Object detection

Short description: PSROIPooling computes position-sensitive pooling on regions of interest specified by input.

Detailed description: Reference.

PSROIPooling operation takes two input blobs: one with feature maps and one with regions of interest (box coordinates). The latter is specified as a set of five-element tuples: [batch_id, x_1, y_1, x_2, y_2]. ROI coordinates are specified in absolute values for the average mode and in normalized values (in the [0, 1] interval) for bilinear interpolation.

Attributes

Inputs:

Outputs:

Example

<layer ... type="PSROIPooling" ... >
<data group_size="6" mode="bilinear" output_dim="360" spatial_bins_x="3" spatial_bins_y="3" spatial_scale="1"/>
<input>
<port id="0">
<dim>1</dim>
<dim>3240</dim>
<dim>38</dim>
<dim>38</dim>
</port>
<port id="1">
<dim>100</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>100</dim>
<dim>360</dim>
<dim>6</dim>
<dim>6</dim>
</port>
</output>
</layer>

Select

Back to top

Category: Conditions

Short description: Select returns a tensor filled with the elements from the second or the third inputs, depending on the condition (the first input) value.

Detailed description

Select takes elements from the then input tensor or the else input tensor, depending on the condition mask provided in the first input cond. Before performing selection, the then and else input tensors are broadcast to each other if their shapes differ and the auto_broadcast attribute is not none. Then the cond tensor is one-way broadcast to the resulting shape of the broadcast then and else. Broadcasting is performed according to the auto_broadcast value.
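After broadcasting, the elementwise selection can be sketched over flat Python lists (`select` is a hypothetical helper for illustration):

```python
def select(cond, then_vals, else_vals):
    # Take from then_vals where the condition is True, else from else_vals.
    return [t if c else e for c, t, e in zip(cond, then_vals, else_vals)]
```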

Attributes

Inputs:

Outputs:

Example

<layer ... type="Select">
<input>
<port id="0">
<dim>3</dim>
<dim>2</dim>
</port>
<port id="1">
<dim>3</dim>
<dim>2</dim>
</port>
<port id="2">
<dim>3</dim>
<dim>2</dim>
</port>
</input>
<output>
<port id="3">
<dim>3</dim>
<dim>2</dim>
</port>
</output>
</layer>

DeformableConvolution

Back to top

Category: Convolution

Detailed description: Reference

Attributes

Inputs:

Example

<layer ... type="DeformableConvolution" ... >
<data dilations="1,1" pads_begin="2,2" pads_end="3,3" strides="2,2"/>
<input> ... </input>
<output> ... </output>
</layer>

DeformablePSROIPooling

Back to top

Category: Object detection

Short description: DeformablePSROIPooling computes position-sensitive pooling on regions of interest specified by input.

Detailed description: Reference.

DeformablePSROIPooling operation takes two or three input tensors: one with feature maps, one with regions of interest (box coordinates), and an optional tensor with transformation values. The box coordinates are specified as five-element tuples: [batch_id, x_1, y_1, x_2, y_2] in absolute values.

Attributes

Inputs:

Outputs:

Example

<layer ... type="DeformablePSROIPooling" ... >
<data group_size="7" mode="bilinear_deformable" no_trans="False" output_dim="8" part_size="7" pooled_height="7" pooled_width="7" spatial_bins_x="4" spatial_bins_y="4" spatial_scale="0.0625" trans_std="0.1"/>
<input>
<port id="0">
<dim>1</dim>
<dim>392</dim>
<dim>38</dim>
<dim>63</dim>
</port>
<port id="1">
<dim>300</dim>
<dim>5</dim>
</port>
<port id="2">
<dim>300</dim>
<dim>2</dim>
<dim>7</dim>
<dim>7</dim>
</port>
</input>
<output>
<port id="3" precision="FP32">
<dim>300</dim>
<dim>8</dim>
<dim>7</dim>
<dim>7</dim>
</port>
</output>
</layer>

FakeQuantize

Back to top

Category: Quantization

Short description: FakeQuantize is element-wise linear quantization of floating-point input values into a discrete set of floating-point values.

Detailed description: Input and output ranges, as well as the number of quantization levels, are specified by dedicated inputs and attributes. The limits can differ for each element or for groups of elements (channels) of the input tensor, or a single limit can apply to all elements; which case applies depends on the shapes of the inputs that specify the limits, with regular broadcasting rules applied to them. The output of the operation is a floating-point number of the same type as the input tensor. In general, four values specify quantization for each element: input_low, input_high, output_low, output_high. input_low and input_high specify the input range of quantization; all input values outside this range are clipped to the range before actual quantization. output_low and output_high specify the minimum and maximum quantized values at the output.

"Fake" in FakeQuantize means that the output tensor is of the same floating-point type as the input tensor, not an integer type.

Each element of the output is defined as the result of the following expression:

if x <= min(input_low, input_high):
    output = output_low
elif x > max(input_low, input_high):
    output = output_high
else:
    # input_low < x <= input_high
    output = round((x - input_low) / (input_high - input_low) * (levels-1)) / (levels-1) * (output_high - output_low) + output_low
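The expression above can be wrapped into a runnable scalar sketch (`fake_quantize` is a hypothetical helper; note that Python's built-in round uses banker's rounding, which may differ from a plugin's rounding mode):

```python
def fake_quantize(x, input_low, input_high, output_low, output_high, levels):
    # Clip values outside the input range to the range limits.
    if x <= min(input_low, input_high):
        return output_low
    if x > max(input_low, input_high):
        return output_high
    # Quantize to one of `levels` buckets, then map to the output range.
    q = round((x - input_low) / (input_high - input_low) * (levels - 1))
    return q / (levels - 1) * (output_high - output_low) + output_low
```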

Attributes

Inputs:

Outputs:

Example

<layer type="FakeQuantize">
<data levels="2"/>
<input>
<port id="0">
<dim>1</dim>
<dim>64</dim>
<dim>56</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>1</dim>
<dim>64</dim>
<dim>1</dim>
<dim>1</dim>
</port>
<port id="2">
<dim>1</dim>
<dim>64</dim>
<dim>1</dim>
<dim>1</dim>
</port>
<port id="3">
<dim>1</dim>
<dim>1</dim>
<dim>1</dim>
<dim>1</dim>
</port>
<port id="4">
<dim>1</dim>
<dim>1</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</input>
<output>
<port id="5">
<dim>1</dim>
<dim>64</dim>
<dim>56</dim>
<dim>56</dim>
</port>
</output>
</layer>

BinaryConvolution

Back to top

Category: Convolution

Short description: BinaryConvolution is a convolution with binary weights, binary input, and integer output.

Attributes:

The operation has the same attributes as a regular Convolution layer and several unique attributes that are listed below:

Inputs:

Outputs:


ReverseSequence

Back to top

Category: Data movement operation

Short description: ReverseSequence reverses variable length slices of data.

Detailed description: ReverseSequence slices input along the dimension specified in the batch_axis, and for each slice i, reverses the first lengths[i] (the second input) elements along the dimension specified in the seq_axis.
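The slicing-and-reversal described above can be sketched in Python (`reverse_sequence` is a hypothetical helper; batch_axis=0 and seq_axis=1 are assumed, with each batch item given as a flat list):

```python
def reverse_sequence(batch, lengths):
    # For each batch item i, reverse its first lengths[i] elements and
    # leave the remaining tail untouched.
    return [seq[:n][::-1] + seq[n:] for seq, n in zip(batch, lengths)]
```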

Attributes

Inputs:

Example

<layer ... type="ReverseSequence">
<data batch_axis="0" seq_axis="1"/>
<input>
<port id="0">
<dim>3</dim>
<dim>10</dim>
<dim>100</dim>
<dim>200</dim>
</port>
<port id="1">
<dim>3</dim>
</port>
</input>
<output>
<port id="2">
<dim>3</dim>
<dim>10</dim>
<dim>100</dim>
<dim>200</dim>
</port>
</output>
</layer>

Reverse

Back to top

Category: Data movement operation

Short description: Reverse operation reverses specified axes of an input tensor.

Detailed description: Reverse produces a tensor with the same shape as the first input tensor, with elements reversed along the dimensions specified in the second input tensor. The axes can be represented either by dimension indices or as a mask; the interpretation of the second input is determined by the mode attribute. If index mode is used, the second tensor should contain indices of the axes to be reversed. The length of the second tensor should be in the range from 0 to the rank of the 1st input tensor.

If mask mode is used, the length of the second input tensor should be equal to the rank of the 1st input, and each value should be a boolean True or False: True means the corresponding axis should be reversed, False means it should be left untouched.

If no axis is specified, which means the second input is empty in index mode or contains only False elements in mask mode, then Reverse passes the source tensor to the output without any data movement.

Attributes

Example

<layer ... type="Reverse">
<data mode="index"/>
<input>
<port id="0">
<dim>3</dim>
<dim>10</dim>
<dim>100</dim>
<dim>200</dim>
</port>
<port id="1">
<dim>1</dim>
</port>
</input>
<output>
<port id="2">
<dim>3</dim>
<dim>10</dim>
<dim>100</dim>
<dim>200</dim>
</port>
</output>
</layer>

RNNCell

Back to top

Category: Sequence processing

Short description: RNNCell represents a single RNN cell that computes the output using the formula described in the article.

Attributes

Inputs

Outputs


ROIPooling

Back to top

Category: Object detection

Short description: ROIPooling is a pooling layer used over feature maps of non-uniform input sizes and outputs a feature map of a fixed size.

Detailed description: deepsense.io reference

Attributes

Inputs:

Outputs:

Example

<layer ... type="ROIPooling" ... >
<data pooled_h="6" pooled_w="6" spatial_scale="0.062500"/>
<input> ... </input>
<output> ... </output>
</layer>

Proposal

Back to top

Category: Object detection

Short description: Proposal operation filters bounding boxes and outputs only those with the highest prediction confidence.

Detailed description

Proposal has three inputs: a tensor with probabilities of whether a particular bounding box corresponds to background or foreground, a tensor with deltas for each of the bounding boxes, and a tensor with the input image size in the [image_height, image_width, scale_height_and_width] or [image_height, image_width, scale_height, scale_width] format. The produced tensor has two dimensions: [batch_size * post_nms_topn, 5]. The Proposal layer does the following with the input tensors:

  1. Generates initial anchor boxes. Left top corner of all boxes is at (0, 0). Width and height of boxes are calculated from base_size with scale and ratio attributes.
  2. For each point in the first input tensor:
    • pins anchor boxes to the image according to the second input tensor that contains four deltas for each box: for x and y of center, for width and for height
    • finds out score in the first input tensor
  3. Filters out boxes with size less than min_size
  4. Sorts all proposals (box, score) by score from highest to lowest
  5. Takes top pre_nms_topn proposals
  6. Calculates intersections for boxes and filters out all boxes with $intersection/union > nms\_thresh$
  7. Takes top post_nms_topn proposals
  8. Returns top proposals

Inputs:

Outputs:

Example

<layer ... type="Proposal" ... >
<data base_size="16" feat_stride="16" min_size="16" nms_thresh="0.6" post_nms_topn="200" pre_nms_topn="6000"
ratio="2.67" scale="4.0,6.0,9.0,16.0,24.0,32.0"/>
<input> ... </input>
<output> ... </output>
</layer>

Broadcast

Back to top

Category: Data movement

Short description: Broadcast replicates data on the first input to fit a given shape on the second input.

Detailed description:

Broadcast takes the first tensor data and, following the broadcasting rules specified by the mode attribute and the 3rd input axes_mapping, builds a new tensor with shape matching the 2nd input tensor target_shape. The target_shape input is a 1D integer tensor that represents the required shape of the output.

The mode attribute and the 3rd input axes_mapping are relevant when the rank of the input data tensor doesn't match the size of the target_shape input. They both define how axes from the data shape are mapped to the output axes. If mode is set to numpy, the standard one-directional numpy broadcasting rules are applied. They are similar to the rules applied in all binary element-wise operations when the auto_broadcast attribute is set to numpy, with only one-directional broadcasting: the input tensor data is broadcast to target_shape, but not vice versa.

If mode is set to explicit, the 3rd input axes_mapping comes into play. It contains a list of axis indices; each index maps an axis from the 1st input tensor data to an axis in the output. The size of axes_mapping should match the rank of the input data tensor, so that all axes from the data tensor are mapped to axes of the output.

For example, axes_mapping = [1] enables broadcasting of a tensor with shape [C] to shape [N,C,H,W] by replicating the initial tensor along dimensions 0, 2 and 3. Another example is broadcasting a tensor with shape [H,W] to shape [N,H,W,C] with axes_mapping = [1, 2]. Both examples require mode set to explicit and the mentioned axes_mapping input, because such operations cannot be expressed with mode set to numpy.
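The explicit-mode shape rule described above can be sketched as a small validator (`explicit_broadcast_shape` is a hypothetical helper; it checks an axes_mapping against a target shape and returns the output shape):

```python
def explicit_broadcast_shape(data_shape, target_shape, axes_mapping):
    # Each entry of axes_mapping sends one axis of data to an output axis;
    # mapped dimensions must match target_shape, the rest are replicated.
    if len(axes_mapping) != len(data_shape):
        raise ValueError("axes_mapping must cover every axis of data")
    for dim, axis in zip(data_shape, axes_mapping):
        if dim != target_shape[axis]:
            raise ValueError("mapped dimension does not match target_shape")
    return list(target_shape)
```

For the [C] to [N,C,H,W] example, `explicit_broadcast_shape([16], [1, 16, 50, 50], [1])` returns `[1, 16, 50, 50]`.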

Attributes:

Inputs:

Outputs:

Example

<layer ... type="Broadcast" ...>
<data mode="numpy"/>
<input>
<port id="0">
<dim>16</dim>
<dim>1</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>4</dim>
</port>
</input>
<output>
<port id="2">
<dim>1</dim>
<dim>16</dim>
<dim>50</dim>
<dim>50</dim>
</port>
</output>
</layer>
<layer ... type="Broadcast" ...>
<data mode="explicit"/>
<input>
<port id="0">
<dim>16</dim>
</port>
<port id="1">
<dim>4</dim>
</port>
<port id="2">
<dim>1</dim>
</port>
</input>
<output>
<port id="3">
<dim>1</dim>
<dim>16</dim>
<dim>50</dim>
<dim>50</dim>
</port>
</output>
</layer>
<layer ... type="Broadcast" ...>
<data mode="explicit"/>
<input>
<port id="0">
<dim>50</dim>
<dim>50</dim>
</port>
<port id="1">
<dim>4</dim>
</port>
<port id="2">
<dim>2</dim>
</port>
</input>
<output>
<port id="3">
<dim>1</dim>
<dim>50</dim>
<dim>50</dim>
<dim>16</dim>
</port>
</output>
</layer>

CTCGreedyDecoder

Back to top

Category: Sequence processing

Short description: CTCGreedyDecoder performs greedy decoding on the logits given in input (best path).

Detailed description:

This operation is similar to the one described in the reference.

Given an input sequence $X$ of length $T$, CTCGreedyDecoder assumes the probability of a length $T$ character sequence $C$ is given by

\[ p(C|X) = \prod_{t=1}^{T} p(c_{t}|X) \]

Sequences in the batch can have different lengths. The lengths of sequences are coded as values 1 and 0 in the second input tensor sequence_mask. The value sequence_mask[j, i] specifies whether there is a sequence symbol at position j in the sequence i of the batch: if there is no symbol at the j-th position, sequence_mask[j, i] = 0, and sequence_mask[j, i] = 1 otherwise. Starting from j = 0, the values sequence_mask[j, i] are equal to 1 up to a particular index j = last_sequence_symbol, which is defined independently for each sequence i. For j > last_sequence_symbol, the values in sequence_mask[j, i] are all zeros.

Attributes

Inputs

Output

Example

<layer ... type="CTCGreedyDecoder" ...>
<input>
<port id="0">
<dim>20</dim>
<dim>8</dim>
<dim>128</dim>
</port>
<port id="1">
<dim>20</dim>
<dim>8</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>20</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</output>
</layer>

Divide

Back to top

Category: Arithmetic binary operation

Short description: Divide performs element-wise division operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description: Before performing the arithmetic operation, input tensors a and b are broadcast if their shapes differ and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting Divide does the following with the input tensors a and b:

\[ o_{i} = a_{i} / b_{i} \]

Examples

Example 1

<layer ... type="Divide">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="Divide">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

Gather

Back to top

Category: Data movement operations

Short description: Gather operation takes slices of data from the 1st input tensor according to the indices specified in the 2nd input tensor, along the axis from the 3rd input.

Detailed description

output[:, ... ,:, i, ... , j,:, ... ,:] = input1[:, ... ,:, input2[i, ... ,j],:, ... ,:]

Where axis is the value from the 3rd input.
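The indexing rule can be sketched for the simplest case, a 1-D data tensor gathered along axis 0 (`gather` is a hypothetical helper; the real operation applies the same rule along an arbitrary axis):

```python
def gather(data, indices, axis=0):
    # 1-D sketch along axis 0: output[i] = data[indices[i]].
    assert axis == 0, "only axis 0 is sketched here"
    return [data[i] for i in indices]
```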

Attributes: Gather has no attributes

Inputs

Outputs

Example

<layer id="1" type="Gather" ...>
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="1">
<dim>15</dim>
<dim>4</dim>
<dim>20</dim>
<dim>28</dim>
</port>
<port id="2"/>
</input>
<output>
<port id="3">
<dim>6</dim>
<dim>15</dim>
<dim>4</dim>
<dim>20</dim>
<dim>28</dim>
<dim>10</dim>
<dim>24</dim>
</port>
</output>
</layer>

GatherTree

Back to top

Category: Beam search post-processing

Short description: Generates the complete beams from the ids per each step and the parent beam ids.

Detailed description

GatherTree operation implements the same algorithm as the GatherTree operation in TensorFlow; see the TensorFlow documentation for the complete description.

Pseudo code:

for batch in range(BATCH_SIZE):
    for beam in range(BEAM_WIDTH):
        max_sequence_in_beam = min(MAX_TIME, max_seq_len[batch])
        parent = parent_idx[max_sequence_in_beam - 1, batch, beam]
        for level in reversed(range(max_sequence_in_beam - 1)):
            final_idx[level, batch, beam] = step_idx[level, batch, parent]
            parent = parent_idx[level, batch, parent]

Element data types for all input tensors should match each other.

Attributes: GatherTree has no attributes

Inputs

Outputs

Types

Example

<layer type="GatherTree" ...>
<input>
<port id="0">
<dim>100</dim>
<dim>1</dim>
<dim>10</dim>
</port>
<port id="1">
<dim>100</dim>
<dim>1</dim>
<dim>10</dim>
</port>
<port id="2">
<dim>1</dim>
</port>
<port id="3">
</port>
</input>
<output>
<port id="4">
<dim>100</dim>
<dim>1</dim>
<dim>10</dim>
</port>
</output>
</layer>

Greater

Back to top

Category: Comparison binary operation

Short description: Greater performs element-wise comparison operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description: Before performing the comparison operation, input tensors a and b are broadcast if their shapes differ and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting Greater does the following with the input tensors a and b:

\[ o_{i} = a_{i} > b_{i} \]

Examples

Example 1

<layer ... type="Greater">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="Greater">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

GreaterEqual

Back to top

Category: Comparison binary operation

Short description: GreaterEqual performs element-wise comparison operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description: Before performing the comparison operation, input tensors a and b are broadcast if their shapes differ and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting GreaterEqual does the following with the input tensors a and b:

\[ o_{i} = a_{i} \geq b_{i} \]

Examples

Example 1

<layer ... type="GreaterEqual">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="GreaterEqual">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

LSTMCell

Back to top

Category: Sequence processing

Short description: LSTMCell operation represents a single LSTM cell. It computes the output using the formula described in the original paper Long Short-Term Memory.

Detailed description

Formula:
* - matrix mult
(.) - eltwise mult
[,] - concatenation
sigm - 1/(1 + e^{-x})
tanh - (e^{2x} - 1)/(e^{2x} + 1)
f = sigm(Wf*[Hi, X] + Bf)
i = sigm(Wi*[Hi, X] + Bi)
c = tanh(Wc*[Hi, X] + Bc)
o = sigm(Wo*[Hi, X] + Bo)
Co = f (.) Ci + i (.) c
Ho = o (.) tanh(Co)

Attributes

Inputs

Outputs

Example

<layer ... type="LSTMCell" ... >
<input> ... </input>
<output> ... </output>
</layer>

Maximum

Back to top

Category: Arithmetic binary operation

Short description: Maximum performs element-wise maximum operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description: Before performing the arithmetic operation, input tensors a and b are broadcast if their shapes differ and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting Maximum does the following with the input tensors a and b:

\[ o_{i} = max(a_{i}, b_{i}) \]

Examples

Example 1

<layer ... type="Maximum">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="Maximum">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

Minimum

Back to top

Category: Arithmetic binary operation

Short description: Minimum performs element-wise minimum operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description: Before performing the arithmetic operation, input tensors a and b are broadcast if their shapes differ and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting Minimum does the following with the input tensors a and b:

\[ o_{i} = min(a_{i}, b_{i}) \]

Examples

Example 1

<layer ... type="Minimum">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="Minimum">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

NormalizeL2

Back to top

Category: Normalization

Short description: NormalizeL2 operation performs L2 normalization of the 1st input tensor in slices specified by the 2nd input.

Attributes

Inputs

Outputs

Detailed Description

Each element in the output is the result of division of corresponding element from the data input tensor by the result of L2 reduction along dimensions specified by the axes input:

output[i0, i1, ..., iN] = x[i0, i1, ..., iN] / sqrt(eps_mode(sum[j0,..., jN](x[j0, ..., jN]**2), eps))

Where indices i0, ..., iN run through all valid indices for the 1st input, and the summation sum[j0, ..., jN] has jk = ik for those dimensions k that are not in the set of indices specified by the axes input of the operation. One corner case is when axes is an empty list: then each input element is divided by itself, resulting in a value of 1 for all non-zero elements. Another corner case is when the axes input contains all dimensions of the data tensor: then a single L2 reduction value is calculated for the entire input tensor and each input element is divided by that value.

eps_mode selects how the reduction value and eps are combined and can be either max or add, according to the eps_mode attribute value.
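For a 1-D input reduced over all axes, the computation can be sketched in Python (`normalize_l2` is a hypothetical helper covering both eps_mode variants):

```python
import math

def normalize_l2(values, eps=1e-8, eps_mode="add"):
    # Divide each element by the L2 norm of the slice; eps is either added
    # to the squared sum or used as a floor for it, per eps_mode.
    sq_sum = sum(v * v for v in values)
    combined = sq_sum + eps if eps_mode == "add" else max(sq_sum, eps)
    denom = math.sqrt(combined)
    return [v / denom for v in values]
```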

Example

<layer id="1" type="NormalizeL2" ...>
<data eps="1e-8" eps_mode="add"/>
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
</output>
</layer>

NotEqual

Back to top

Category: Comparison binary operation

Short description: NotEqual performs element-wise comparison operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description: Before performing the comparison operation, input tensors a and b are broadcast if their shapes differ and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting NotEqual does the following with the input tensors a and b:

\[ o_{i} = a_{i} \neq b_{i} \]

Examples

Example 1

<layer ... type="NotEqual">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="NotEqual">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

Pad

Back to top

Category: Data movement operations

Short description: Pad operation extends an input tensor on edges. The amount and value of padded elements are defined by inputs and attributes.

Attributes

Inputs

Outputs

Detailed Description

The attributes specify a number of elements to add along each axis and a rule by which new element values are generated: for example, whether they are filled with a given constant or generated based on the input tensor content.

The following examples illustrate how the output tensor is generated for the Pad layer from a given input tensor:

INPUT =
[[ 1 2 3 4 ]
[ 5 6 7 8 ]
[ 9 10 11 12 ]]

with the following attributes:

pads_begin = [0, 1]
pads_end = [2, 3]

depending on the pad_mode.
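For a single axis, the effect of pads_begin/pads_end can be sketched in Python (`pad_1d` is a hypothetical helper; only the constant and edge modes are sketched, as an assumption about the available pad_mode values):

```python
def pad_1d(row, pads_begin, pads_end, pad_mode="constant", value=0):
    # constant: fill the added elements with `value`;
    # edge: repeat the nearest border element of the input.
    if pad_mode == "constant":
        return [value] * pads_begin + row + [value] * pads_end
    if pad_mode == "edge":
        return [row[0]] * pads_begin + row + [row[-1]] * pads_end
    raise ValueError(f"mode {pad_mode!r} not sketched")
```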

Example

<layer ... type="Pad" ...>
<data pad_mode="constant"/>
<input>
<port id="0">
<dim>1</dim>
<dim>3</dim>
<dim>32</dim>
<dim>40</dim>
</port>
<port id="1">
<dim>4</dim>
</port>
<port id="2">
<dim>4</dim>
</port>
<port id="3">
</port>
</input>
<output>
<port id="4">
<dim>2</dim>
<dim>8</dim>
<dim>37</dim>
<dim>48</dim>
</port>
</output>
</layer>

ReduceSum

Back to top

Category: Reduction

Short description: ReduceSum operation performs reduction with addition of the 1st input tensor in slices specified by the 2nd input.

Attributes

Inputs

Outputs

Detailed Description

Each element in the output is the result of reduction with addition operation along dimensions specified by the 2nd input:

output[i0, i1, ..., iN] = sum[j0, ..., jN](x[j0, ..., jN])

Where indices i0, ..., iN run through all valid indices for the 1st input, and the summation sum[j0, ..., jN] has jk = ik for those dimensions k that are not in the set of indices specified by the 2nd input of the operation. Corner cases:

  1. When the 2nd input is an empty list, this operation does nothing; it is an identity.
  2. When the 2nd input contains all dimensions of the 1st input, a single reduction value is calculated for the entire input tensor.
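A NumPy sketch of these semantics, using the shapes from the example below (the 2nd input holds axes (2, 3) and keep_dims is true):

```python
import numpy as np

x = np.ones((6, 12, 10, 24))
axes = (2, 3)  # values of the 2nd input

y = np.sum(x, axis=axes, keepdims=True)  # keep_dims="True"
print(y.shape)        # (6, 12, 1, 1)
print(y[0, 0, 0, 0])  # 240.0 = sum of 10 * 24 ones

# Corner case 1: an empty axes list is an identity
assert np.array_equal(np.sum(x, axis=()), x)
# Corner case 2: reducing over all axes yields a single value
print(np.sum(x, axis=(0, 1, 2, 3)))  # 17280.0
```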

Example

<layer id="1" type="ReduceSum" ...>
<data keep_dims="True" />
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2">
<dim>6</dim>
<dim>12</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</output>
</layer>

ReduceProd

Back to top

Category: Reduction

Short description: ReduceProd operation performs reduction with multiplication of the 1st input tensor in slices specified by the 2nd input.

Attributes

Inputs

Outputs

Detailed Description

Each element in the output is the result of reduction with multiplication operation along dimensions specified by the 2nd input:

output[i0, i1, ..., iN] = prod[j0, ..., jN](x[j0, ..., jN])

Where indices i0, ..., iN run through all valid indices for the 1st input, and the product prod[j0, ..., jN] has jk = ik for those dimensions k that are not in the set of indices specified by the 2nd input of the operation. Corner cases:

  1. When the 2nd input is an empty list, this operation does nothing; it is an identity.
  2. When the 2nd input contains all dimensions of the 1st input, a single reduction value is calculated for the entire input tensor.

Example

<layer id="1" type="ReduceProd" ...>
<data keep_dims="True" />
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2">
<dim>6</dim>
<dim>12</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</output>
</layer>

TopK

Back to top

Category: Sorting and maximization

Short description: TopK computes indices and values of the k maximum/minimum values for each slice along specified axis.

Attributes

Inputs:

Outputs:

Detailed Description

The output tensor is populated by values computed in the following way:

output[i1, ..., i(axis-1), j, i(axis+1), ..., iN] = top_k(input[i1, ..., i(axis-1), :, i(axis+1), ..., iN], k, sort, mode)

So for each slice input[i1, ..., i(axis-1), :, i(axis+1), ..., iN], which represents a 1D array, the top_k value is computed individually. Sorting and the minimum/maximum choice are controlled by the sort and mode attributes.

Example

<layer ... type="TopK" ... >
<data axis="1" mode="max" sort="value"/>
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="1">
</port>
</input>
<output>
<port id="2">
<dim>6</dim>
<dim>3</dim>
<dim>10</dim>
<dim>24</dim>
</port>
</output>
</layer>

LSTMSequence

Back to top

Category: Sequence processing

Short description: LSTMSequence operation represents a series of LSTM cells. Each cell is implemented as LSTMCell operation.

Detailed description

A single cell in the sequence is implemented in the same way as in the LSTMCell operation. LSTMSequence represents a sequence of LSTM cells. The cells can be connected differently depending on the direction attribute, which specifies the direction of traversing the input data along the sequence dimension or specifies that the sequence is bidirectional. Most of the attributes are in sync with the specification of the ONNX LSTM operator and with LSTMCell.

Attributes

Inputs

Outputs


StridedSlice

Category: Data movement operation

Short description: StridedSlice extracts a strided slice of a tensor. It is similar to generalized array indexing in Python*.

Attributes

Inputs:

Example

<layer ... type="StridedSlice" ...>
<data begin_mask="1,0,1,1,1" ellipsis_mask="0,0,0,0,0" end_mask="1,0,1,1,1" new_axis_mask="0,0,0,0,0" shrink_axis_mask="0,1,0,0,0"/>
<input>
<port id="0">
<dim>1</dim>
<dim>2</dim>
<dim>384</dim>
<dim>640</dim>
<dim>8</dim>
</port>
<port id="1">
<dim>5</dim>
</port>
<port id="2">
<dim>5</dim>
</port>
<port id="3">
<dim>5</dim>
</port>
</input>
<output>
<port id="4">
<dim>1</dim>
<dim>384</dim>
<dim>640</dim>
<dim>8</dim>
</port>
</output>
</layer>
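With the masks shown above, shrink_axis_mask removes axis 1 after indexing it with a scalar, so the example behaves like ordinary NumPy indexing (a sketch, assuming the begin/end/strides inputs select the full range on the remaining axes):

```python
import numpy as np

x = np.zeros((1, 2, 384, 640, 8))

# shrink_axis_mask="0,1,0,0,0": axis 1 is indexed with a scalar and removed;
# the remaining axes keep their full range
y = x[:, 0, :, :, :]
print(y.shape)  # (1, 384, 640, 8)
```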

Subtract

Back to top

Category: Arithmetic binary operation

Short description: Subtract performs element-wise subtraction operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description Before performing the arithmetic operation, input tensors a and b are broadcasted if their shapes are different and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting Subtract does the following with the input tensors a and b:

\[ o_{i} = a_{i} - b_{i} \]
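The broadcast example below can be sketched directly in NumPy, whose broadcasting implements the same multi-directional rules:

```python
import numpy as np

# Shapes from the broadcast example below
a = np.full((8, 1, 6, 1), 5.0)
b = np.full((7, 1, 5), 2.0)

o = a - b  # broadcast to the union shape, then subtract element-wise
print(o.shape)        # (8, 7, 6, 5)
print(o[0, 0, 0, 0])  # 3.0
```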

Examples

Example 1

<layer ... type="Subtract">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="Subtract">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

Squeeze

Category: Reshaping

Short description: Squeeze removes the dimensions of the first input tensor that are specified by the second input and equal to 1. If the second input is omitted, all dimensions equal to 1 are removed. If a specified dimension is not equal to one, an error is raised.
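The semantics can be sketched with NumPy's squeeze:

```python
import numpy as np

x = np.zeros((1, 3, 1, 2))
print(np.squeeze(x, axis=(0, 2)).shape)  # (3, 2): axes given by the 2nd input
print(np.squeeze(x).shape)               # (3, 2): 2nd input omitted, all 1s removed

try:
    np.squeeze(x, axis=(1,))  # dimension 1 has size 3, not 1
except ValueError as e:
    print("error raised:", e)
```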

Attributes: Squeeze operation doesn't have attributes.

Inputs:

Example

Example 1:

<layer ... type="Squeeze">
<input>
<port id="0">
<dim>1</dim>
<dim>3</dim>
<dim>1</dim>
<dim>2</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2">
<dim>3</dim>
<dim>2</dim>
</port>
</output>
</layer>

Example 2: squeeze 1D tensor with 1 element to a 0D tensor (constant)

<layer ... type="Squeeze">
<input>
<port id="0">
<dim>1</dim>
</port>
<port id="1">
<dim>1</dim>
</port>
</input>
<output>
<port id="2">
</port>
</output>
</layer>

Unsqueeze

Category: Reshaping

Short description: Unsqueeze adds dimensions of size 1 to the first input tensor. The second input value specifies a list of dimensions that will be inserted. Indices specify dimensions in the output tensor.
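A NumPy sketch using expand_dims; the axis values refer to positions in the output shape, as in Example 1 below:

```python
import numpy as np

x = np.zeros((2, 3))
y = np.expand_dims(x, axis=(0, 3))  # 2nd input = [0, 3]
print(y.shape)  # (1, 2, 3, 1)
```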

Attributes: Unsqueeze operation doesn't have attributes.

Inputs:

Example

Example 1:

<layer ... type="Unsqueeze">
<input>
<port id="0">
<dim>2</dim>
<dim>3</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2">
<dim>1</dim>
<dim>2</dim>
<dim>3</dim>
<dim>1</dim>
</port>
</output>
</layer>

Example 2: (unsqueeze 0D tensor (constant) to 1D tensor)

<layer ... type="Unsqueeze">
<input>
<port id="0">
</port>
<port id="1">
<dim>1</dim>
</port>
</input>
<output>
<port id="2">
<dim>1</dim>
</port>
</output>
</layer>

DepthToSpace

Back to top

Category: Data movement

Short description: DepthToSpace operation rearranges data from the depth dimension of the input tensor into spatial dimensions of the output tensor.

Attributes

Inputs

Outputs

Detailed description

DepthToSpace operation permutes elements from the input tensor with shape [N, C, D1, D2, ..., DK] to the output tensor, where values from the input depth dimension (features) C are moved to spatial blocks in D1, ..., DK. Refer to the ONNX* specification for an example of the 4D input tensor case.

The operation is equivalent to the following transformation of the input tensor data with K spatial dimensions of shape [N, C, D1, D2, ..., DK] into the output tensor y. If mode = blocks_first:

x' = reshape(data, [N, block_size, block_size, ..., block_size, C / (block_size ^ K), D1, D2, ..., DK])

x'' = transpose(x', [0,  K + 1,  K + 2, 1, K + 3, 2, K + 4, 3, ..., K + (K + 1), K])

y = reshape(x'', [N, C / (block_size ^ K), D1 * block_size, D2 * block_size, D3 * block_size, ..., DK * block_size])

If mode = depth_first:

x' = reshape(data, [N, C / (block_size ^ K), block_size, block_size, ..., block_size, D1, D2, ..., DK])

x'' = transpose(x', [0,  1,  K + 2, 2, K + 3, 3, K + 4, 4, ..., K + (K + 1), K + 1])

y = reshape(x'', [N, C / (block_size ^ K), D1 * block_size, D2 * block_size, D3 * block_size, ..., DK * block_size])
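For the common 4D case (K = 2), the reshape/transpose/reshape sequence above can be sketched in NumPy:

```python
import numpy as np

def depth_to_space(data, block_size, mode="blocks_first"):
    # 4D (K = 2) sketch of the transformation above
    n, c, h, w = data.shape
    b = block_size
    if mode == "blocks_first":
        x = data.reshape(n, b, b, c // (b * b), h, w)
        x = x.transpose(0, 3, 4, 1, 5, 2)
    else:  # depth_first
        x = data.reshape(n, c // (b * b), b, b, h, w)
        x = x.transpose(0, 1, 4, 2, 5, 3)
    return x.reshape(n, c // (b * b), h * b, w * b)

x = np.arange(5 * 28 * 2 * 3).reshape(5, 28, 2, 3)
print(depth_to_space(x, 2).shape)  # (5, 7, 4, 6), as in the example below
```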

Example

<layer type="DepthToSpace" ...>
<data block_size="2" mode="blocks_first"/>
<input>
<port id="0">
<dim>5</dim>
<dim>28</dim>
<dim>2</dim>
<dim>3</dim>
</port>
</input>
<output>
<port id="1">
<dim>5</dim>
<dim>7</dim>
<dim>4</dim>
<dim>6</dim>
</port>
</output>
</layer>

SpaceToDepth

Back to top

Category: Data movement

Short description: SpaceToDepth operation rearranges data from the spatial dimensions of the input tensor into depth dimension of the output tensor.

Attributes

Inputs

Outputs

Detailed description

SpaceToDepth operation permutes elements from the input tensor with shape [N, C, D1, D2, ..., DK] to the output tensor, where values from the input spatial dimensions D1, D2, ..., DK are moved to the new depth dimension. Refer to the ONNX* specification for an example of the 4D input tensor case.

The operation is equivalent to the following transformation of the input tensor data with K spatial dimensions of shape [N, C, D1, D2, ..., DK] into the output tensor y. If mode = blocks_first:

x' = reshape(data, [N, C, D1/block_size, block_size, D2/block_size, block_size, ... , DK/block_size, block_size])

x'' = transpose(x',  [0,  3, 5, ..., K + (K + 1), 1,  2, 4, ..., K + K])

y = reshape(x'', [N, C * (block_size ^ K), D1 / block_size, D2 / block_size, ... , DK / block_size])

If mode = depth_first:

x' = reshape(data, [N, C, D1/block_size, block_size, D2/block_size, block_size, ..., DK/block_size, block_size])

x'' = transpose(x', [0,  1, 3, 5, ..., K + (K + 1),  2, 4, ..., K + K])

y = reshape(x'', [N, C * (block_size ^ K), D1 / block_size, D2 / block_size, ..., DK / block_size])
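For the 4D case (K = 2), the reshape/transpose/reshape sequence above can be sketched in NumPy:

```python
import numpy as np

def space_to_depth(data, block_size, mode="blocks_first"):
    # 4D (K = 2) sketch of the transformation above
    n, c, h, w = data.shape
    b = block_size
    x = data.reshape(n, c, h // b, b, w // b, b)
    if mode == "blocks_first":
        x = x.transpose(0, 3, 5, 1, 2, 4)
    else:  # depth_first
        x = x.transpose(0, 1, 3, 5, 2, 4)
    return x.reshape(n, c * b * b, h // b, w // b)

x = np.arange(5 * 7 * 4 * 6).reshape(5, 7, 4, 6)
print(space_to_depth(x, 2).shape)  # (5, 28, 2, 3), as in the example below
```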

Example

<layer type="SpaceToDepth" ...>
<data block_size="2" mode="blocks_first"/>
<input>
<port id="0">
<dim>5</dim>
<dim>7</dim>
<dim>4</dim>
<dim>6</dim>
</port>
</input>
<output>
<port id="1">
<dim>5</dim>
<dim>28</dim>
<dim>2</dim>
<dim>3</dim>
</port>
</output>
</layer>

OneHot

Back to top

Category: Sequence processing

Short description: OneHot sets the elements in the output tensor with specified indices to on_value and fills all other locations with off_value.

Detailed description

Taking a tensor with rank N as the first input indices, OneHot produces a tensor with rank N+1 by extending the original tensor with a new dimension at the axis position in the shape. The output tensor is populated with two scalar values: on_value that comes from the 3rd input and off_value that comes from the 4th input. Population is made in the following way:

output[:, ... ,:, i, :, ... ,:] = on_value if (indices[:, ..., :, :, ..., :] == i) else off_value

where i is at axis position in output shape and has values from range [0, ..., depth-1].

When an index element from indices is greater than or equal to depth, the operation is still well-formed: in this case the corresponding row output[..., i, ...] is populated with off_value only, for all i values.

Types of the input scalars on_value and off_value should match and can be any of the supported types. The type of the output tensor is derived from on_value and off_value; they all have the same type.
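For axis = -1, the population rule can be sketched in NumPy (out-of-range indices produce all-off rows):

```python
import numpy as np

def one_hot(indices, depth, on_value, off_value):
    # Sketch for axis=-1: the new dimension is appended at the end.
    # An index outside [0, depth-1] matches nothing, so its row
    # is filled with off_value only.
    return np.where(indices[..., None] == np.arange(depth), on_value, off_value)

print(one_hot(np.array([1, 0, 3]), depth=2, on_value=5, off_value=10))
# [[10  5]
#  [ 5 10]
#  [10 10]]
```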

Attributes:

Inputs:

Outputs:

Examples

<layer ... type="OneHot" ...>
<data axis="-1"/>
<input>
<port id="0">
<dim>3</dim>
</port>
<port id="1">
</port>
<port id="2">
</port>
<port id="3">
</port>
</input>
<output>
<port id="4">
<dim>3</dim>
<dim>2</dim>
</port>
</output>
</layer>

Acos

Back to top

Category: Arithmetic unary operation

Short description: Acos performs element-wise inverse cosine (arccos) operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Acos does the following with the input tensor a:

\[ a_{i} = acos(a_{i}) \]

Examples

Example 1

<layer ... type="Acos">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Asin

Back to top

Category: Arithmetic unary operation

Short description: Asin performs element-wise inverse sine (arcsin) operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Asin does the following with the input tensor a:

\[ a_{i} = asin(a_{i}) \]

Examples

Example 1

<layer ... type="Asin">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Cos

Back to top

Category: Arithmetic unary operation

Short description: Cos performs element-wise cosine operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Cos does the following with the input tensor a:

\[ a_{i} = cos(a_{i}) \]

Examples

Example 1

<layer ... type="Cos">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Sin

Back to top

Category: Arithmetic unary operation

Short description: Sin performs element-wise sine operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Sin does the following with the input tensor a:

\[ a_{i} = sin(a_{i}) \]

Examples

Example 1

<layer ... type="Sin">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Tan

Back to top

Category: Arithmetic unary operation

Short description: Tan performs element-wise tangent operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Tan does the following with the input tensor a:

\[ a_{i} = tan(a_{i}) \]

Examples

Example 1

<layer ... type="Tan">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Atan

Back to top

Category: Arithmetic unary operation

Short description: Atan performs element-wise inverse tangent (arctangent) operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Atan does the following with the input tensor a:

\[ a_{i} = atan(a_{i}) \]

Examples

Example 1

<layer ... type="Atan">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Sinh

Back to top

Category: Arithmetic unary operation

Short description: Sinh performs element-wise hyperbolic sine (sinh) operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Sinh does the following with the input tensor a:

\[ a_{i} = sinh(a_{i}) \]

Examples

Example 1

<layer ... type="Sinh">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Cosh

Back to top

Category: Arithmetic unary operation

Short description: Cosh performs element-wise hyperbolic cosine operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Cosh does the following with the input tensor a:

\[ a_{i} = cosh(a_{i}) \]

Examples

Example 1

<layer ... type="Cosh">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Log

Back to top

Category: Arithmetic unary operation

Short description: Log performs element-wise natural logarithm operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Log does the following with the input tensor a:

\[ a_{i} = log(a_{i}) \]

Examples

Example 1

<layer ... type="Log">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Sqrt

Back to top

Category: Arithmetic unary operation

Short description: Sqrt performs element-wise square root operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Sqrt does the following with the input tensor a:

\[ a_{i} = sqrt(a_{i}) \]

Examples

Example 1

<layer ... type="Sqrt">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Negative

Back to top

Category: Arithmetic unary operation

Short description: Negative performs element-wise negative operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Negative does the following with the input tensor a:

\[ a_{i} = -a_{i} \]

Examples

Example 1

<layer ... type="Negative">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Abs

Back to top

Category: Arithmetic unary operation

Short description: Abs performs element-wise absolute value operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Abs does the following with the input tensor a:

\[ a_{i} = abs(a_{i}) \]

Examples

Example 1

<layer ... type="Abs">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Ceiling

Back to top

Category: Arithmetic unary operation

Short description: Ceiling performs element-wise ceiling operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Ceiling does the following with the input tensor a:

\[ a_{i} = ceiling(a_{i}) \]

Examples

Example 1

<layer ... type="Ceiling">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Floor

Back to top

Category: Arithmetic unary operation

Short description: Floor performs element-wise floor operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Floor does the following with the input tensor a:

\[ a_{i} = floor(a_{i}) \]

Examples

Example 1

<layer ... type="Floor">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

RegionYolo

Back to top

Category: Object detection

Short description: RegionYolo computes the coordinates of regions with probability for each class.

Detailed description: This operation is directly mapped to the original YOLO layer. Reference

Attributes:

Inputs:

Outputs:

Example

<layer type="RegionYolo" ... >
<data anchors="10,14,23,27,37,58,81,82,135,169,344,319" axis="1" classes="80" coords="4" do_softmax="0" end_axis="3" mask="0,1,2" num="6"/>
<input>
<port id="0">
<dim>1</dim>
<dim>255</dim>
<dim>26</dim>
<dim>26</dim>
</port>
</input>
<output>
<port id="1">
<dim>1</dim>
<dim>255</dim>
<dim>26</dim>
<dim>26</dim>
</port>
</output>
</layer>
<layer type="RegionYolo" ... >
<data anchors="1.08,1.19,3.42,4.41,6.63,11.38,9.42,5.11,16.62,10.52" axis="1" classes="20" coords="4" do_softmax="1" end_axis="3" num="5"/>
<input>
<port id="0">
<dim>1</dim>
<dim>125</dim>
<dim>13</dim>
<dim>13</dim>
</port>
</input>
<output>
<port id="1">
<dim>1</dim>
<dim>21125</dim>
</port>
</output>
</layer>

ReorgYolo Layer

Back to top

Category: Object detection

Short description: ReorgYolo reorganizes input tensor taking into account strides.

Detailed description:

Reference

Attributes

Inputs:

Outputs:

Example

<layer id="89" name="ExtractImagePatches" type="ReorgYolo">
<data stride="2"/>
<input>
<port id="0">
<dim>1</dim>
<dim>64</dim>
<dim>26</dim>
<dim>26</dim>
</port>
</input>
<output>
<port id="1" precision="f32">
<dim>1</dim>
<dim>256</dim>
<dim>13</dim>
<dim>13</dim>
</port>
</output>
</layer>

Sign

Back to top

Category: Arithmetic unary operation

Short description: Sign performs element-wise sign operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Sign does the following with the input tensor a:

\[ a_{i} = sign(a_{i}) \]

Examples

Example 1

<layer ... type="Sign">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

ReduceMax

Back to top

Category: Reduction

Short description: ReduceMax operation performs reduction with finding the maximum value of the 1st input tensor in slices specified by the 2nd input.

Attributes

Inputs

Outputs

Detailed Description

Each element in the output is the result of reduction with finding a maximum operation along dimensions specified by the 2nd input:

output[i0, i1, ..., iN] = max[j0, ..., jN](x[j0, ..., jN])

Where indices i0, ..., iN run through all valid indices for the 1st input, and the maximum max[j0, ..., jN] has jk = ik for those dimensions k that are not in the set of indices specified by the 2nd input of the operation. Corner cases:

  1. When the 2nd input is an empty list, this operation does nothing; it is an identity.
  2. When the 2nd input contains all dimensions of the 1st input, a single reduction value is calculated for the entire input tensor.

Example

<layer id="1" type="ReduceMax" ...>
<data keep_dims="True" />
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2">
<dim>6</dim>
<dim>12</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</output>
</layer>

ReduceMin

Back to top

Category: Reduction

Short description: ReduceMin operation performs reduction with finding the minimum value of the 1st input tensor in slices specified by the 2nd input.

Attributes

Inputs

Outputs

Detailed Description

Each element in the output is the result of reduction with finding a minimum operation along dimensions specified by the 2nd input:

output[i0, i1, ..., iN] = min[j0, ..., jN](x[j0, ..., jN])

Where indices i0, ..., iN run through all valid indices for the 1st input, and the minimum min[j0, ..., jN] has jk = ik for those dimensions k that are not in the set of indices specified by the 2nd input of the operation. Corner cases:

  1. When the 2nd input is an empty list, this operation does nothing; it is an identity.
  2. When the 2nd input contains all dimensions of the 1st input, a single reduction value is calculated for the entire input tensor.

Example

<layer id="1" type="ReduceMin" ...>
<data keep_dims="True" />
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2">
<dim>6</dim>
<dim>12</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</output>
</layer>

LogicalAnd

Back to top

Category: Logical binary operation

Short description: LogicalAnd performs element-wise logical AND operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description Before performing the logical operation, input tensors a and b are broadcasted if their shapes are different and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting LogicalAnd does the following with the input tensors a and b:

\[ o_{i} = a_{i} \wedge b_{i} \]
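A NumPy sketch, including the broadcast case from Example 2 below:

```python
import numpy as np

a = np.array([True, True, False, False])
b = np.array([True, False, True, False])
print(np.logical_and(a, b))  # [ True False False False]

# Broadcasting follows the same multi-directional rules as arithmetic ops
m = np.logical_and(np.ones((8, 1, 6, 1), bool), np.ones((7, 1, 5), bool))
print(m.shape)  # (8, 7, 6, 5)
```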

Examples

Example 1

<layer ... type="LogicalAnd">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="LogicalAnd">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

LogicalOr

Back to top

Category: Logical binary operation

Short description: LogicalOr performs element-wise logical OR operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description Before performing the logical operation, input tensors a and b are broadcasted if their shapes are different and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting LogicalOr does the following with the input tensors a and b:

\[ o_{i} = a_{i} \lor b_{i} \]

Examples

Example 1

<layer ... type="LogicalOr">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="LogicalOr">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

LogicalXor

Back to top

Category: Logical binary operation

Short description: LogicalXor performs element-wise logical XOR operation with two given tensors applying multi-directional broadcast rules.

Attributes:

Inputs

Outputs

Types

Detailed description Before performing the logical operation, input tensors a and b are broadcasted if their shapes are different and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting LogicalXor does the following with the input tensors a and b:

\[ o_{i} = a_{i} \oplus b_{i} \]

Examples

Example 1

<layer ... type="LogicalXor">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="LogicalXor">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

LogicalNot

Back to top

Category: Logical unary operation

Short description: LogicalNot performs element-wise logical negation operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

LogicalNot does the following with the input tensor a:

\[ a_{i} = not(a_{i}) \]

Examples

Example 1

<layer ... type="LogicalNot">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

ReduceMean

Back to top

Category: Reduction

Short description: ReduceMean operation performs reduction with finding the arithmetic mean of the 1st input tensor in slices specified by the 2nd input.

Attributes

Inputs

Outputs

Detailed Description

Each element in the output is the result of reduction with finding the arithmetic mean operation along dimensions specified by the 2nd input:

output[i0, i1, ..., iN] = mean[j0, ..., jN](x[j0, ..., jN])

Where indices i0, ..., iN run through all valid indices for the 1st input, and the arithmetic mean mean[j0, ..., jN] has jk = ik for those dimensions k that are not in the set of indices specified by the 2nd input of the operation. Corner cases:

  1. When the 2nd input is an empty list, this operation does nothing; it is an identity.
  2. When the 2nd input contains all dimensions of the 1st input, a single reduction value is calculated for the entire input tensor.

Example

<layer id="1" type="ReduceMean" ...>
<data keep_dims="True" />
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2">
<dim>6</dim>
<dim>12</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</output>
</layer>

ReduceLogicalAnd

Back to top

Category: Reduction

Short description: ReduceLogicalAnd operation performs reduction with logical and operation of the 1st input tensor in slices specified by the 2nd input.

Attributes

Inputs

Outputs

Detailed Description

Each element in the output is the result of reduction with logical and operation along dimensions specified by the 2nd input:

output[i0, i1, ..., iN] = and[j0, ..., jN](x[j0, ..., jN])

Where indices i0, ..., iN run through all valid indices for the 1st input, and the logical and operation and[j0, ..., jN] has jk = ik for those dimensions k that are not in the set of indices specified by the 2nd input of the operation. Corner cases:

  1. When the 2nd input is an empty list, this operation does nothing; it is an identity.
  2. When the 2nd input contains all dimensions of the 1st input, a single reduction value is calculated for the entire input tensor.

Example

<layer id="1" type="ReduceLogicalAnd" ...>
<data keep_dims="True" />
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2">
<dim>6</dim>
<dim>12</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</output>
</layer>

ReduceLogicalOr

Back to top

Category: Reduction

Short description: ReduceLogicalOr operation performs reduction with logical or operation of the 1st input tensor in slices specified by the 2nd input.

Attributes

Inputs

Outputs

Detailed Description

Each element in the output is the result of reduction with logical or operation along dimensions specified by the 2nd input:

output[i0, i1, ..., iN] = or[j0, ..., jN](x[j0, ..., jN])

Where indices i0, ..., iN run through all valid indices for the 1st input, and the logical or operation or[j0, ..., jN] has jk = ik for those dimensions k that are not in the set of indices specified by the 2nd input of the operation. Corner cases:

  1. When the 2nd input is an empty list, this operation does nothing; it is an identity.
  2. When the 2nd input contains all dimensions of the 1st input, a single reduction value is calculated for the entire input tensor.

Example

<layer id="1" type="ReduceLogicalOr" ...>
<data keep_dims="True" />
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2">
<dim>6</dim>
<dim>12</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</output>
</layer>

SquaredDifference

Back to top

Category: Arithmetic binary operation

Short description: SquaredDifference performs element-wise subtraction of two given tensors applying multi-directional broadcast rules, then squares each difference.

Attributes:

Inputs

Outputs

Types

Detailed description Before performing the arithmetic operation, input tensors a and b are broadcasted if their shapes are different and the auto_broadcast attribute is not none. Broadcasting is performed according to the auto_broadcast value.

After broadcasting SquaredDifference does the following with the input tensors a and b:

\[ o_{i} = (a_{i} - b_{i})^2 \]
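For tensors of identical shape (i.e. with broadcasting already applied), the element-wise rule reduces to a one-liner; a minimal hypothetical sketch:

```python
def squared_difference(a, b):
    # Element-wise (a - b)^2 over flat tensors of equal length;
    # any multi-directional broadcasting is assumed to have been
    # applied beforehand.
    return [(x - y) ** 2 for x, y in zip(a, b)]
```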

Examples

Example 1

<layer ... type="SquaredDifference">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Example 2: broadcast

<layer ... type="SquaredDifference">
<input>
<port id="0">
<dim>8</dim>
<dim>1</dim>
<dim>6</dim>
<dim>1</dim>
</port>
<port id="1">
<dim>7</dim>
<dim>1</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>8</dim>
<dim>7</dim>
<dim>6</dim>
<dim>5</dim>
</port>
</output>
</layer>

Transpose

Back to top

Category: Layer

Short description: Transpose operation reorders the input tensor dimensions.

Attributes:

No attributes available.

Inputs:

Outputs:

Types

Detailed description:

Transpose operation reorders the input tensor dimensions. Source indexes and destination indexes are bound by the formula:

\[ output[i(order[0]), i(order[1]), ..., i(order[N-1])] = input[i(0), i(1), ..., i(N-1)], where i(j) in range 0..(input.shape[j]-1). \]
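The index-binding formula can be sketched in pure Python over a flat row-major buffer; a hypothetical reference helper, not the plugin implementation:

```python
from itertools import product

def transpose(data, shape, order):
    # data: flat row-major list; order: permutation of dimensions.
    # Returns (flat output list, output shape).
    n = len(shape)
    out_shape = [shape[order[k]] for k in range(n)]
    # row-major strides of the input
    strides = [1] * n
    for k in range(n - 2, -1, -1):
        strides[k] = strides[k + 1] * shape[k + 1]
    out = []
    for out_idx in product(*(range(d) for d in out_shape)):
        # input index at position order[k] equals output index k
        in_idx = [0] * n
        for k in range(n):
            in_idx[order[k]] = out_idx[k]
        out.append(data[sum(i * s for i, s in zip(in_idx, strides))])
    return out, out_shape
```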

Examples

Example 1

<layer ... type="Transpose">
<input>
<port id="0">
<dim>2</dim>
<dim>3</dim>
<dim>4</dim>
</port>
<port id="1">
<dim>3</dim>
</port>
</input>
<output>
<port id="2">
<dim>4</dim>
<dim>2</dim>
<dim>3</dim>
</port>
</output>
</layer>

Example 2: input_order is not specified

<layer ... type="Transpose">
<input>
<port id="0">
<dim>2</dim>
<dim>3</dim>
<dim>4</dim>
</port>
</input>
<output>
<port id="1">
<dim>4</dim>
<dim>3</dim>
<dim>2</dim>
</port>
</output>
</layer>

Example 3: input_order = empty_list []

<layer ... type="Transpose">
<input>
<port id="0">
<dim>2</dim>
<dim>3</dim>
<dim>4</dim>
</port>
<port id="1">
<dim>0</dim>
</port>
</input>
<output>
<port id="2">
<dim>4</dim>
<dim>3</dim>
<dim>2</dim>
</port>
</output>
</layer>

Tile

Back to top

Category: Layer

Short description: Tile operation repeats an input tensor *"data"* the number of times given by *"repeats"* input tensor along each dimension.

Attributes:

No attributes available.

Inputs:

Outputs:

Types

Detailed description:

Tile operation extends the input tensor by repeating it *repeats[k]* times along the k-th dimension. Assuming *"repeats"* has one value per dimension of *"data"* (shorter or longer cases are aligned by prepending ones), each output element is:

\[ out[i_0, ..., i_{N-1}] = data[i_0 \bmod d_0, ..., i_{N-1} \bmod d_{N-1}] \]

where d_k is the size of the k-th dimension of the *"data"* input.
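A pure-Python sketch of this tiling rule (hypothetical helper over a flat row-major buffer, assuming one repeat factor per dimension):

```python
from itertools import product

def tile(data, shape, repeats):
    # data: flat row-major list; repeats: one factor per dimension.
    # Returns (flat output list, output shape).
    n = len(shape)
    out_shape = [shape[k] * repeats[k] for k in range(n)]
    # row-major strides of the input
    strides = [1] * n
    for k in range(n - 2, -1, -1):
        strides[k] = strides[k + 1] * shape[k + 1]
    out = []
    for idx in product(*(range(d) for d in out_shape)):
        # wrap each output index back into the input's range
        out.append(data[sum((i % shape[k]) * strides[k]
                            for k, i in enumerate(idx))])
    return out, out_shape
```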

Examples

Example 1: number of elements in "repeats" is equal to the rank of "data"

<layer ... type="Tile">
<input>
<port id="0">
<dim>2</dim>
<dim>3</dim>
<dim>4</dim>
</port>
<port id="1">
<dim>3</dim>
</port>
</input>
<output>
<port id="2">
<dim>2</dim>
<dim>6</dim>
<dim>12</dim>
</port>
</output>
</layer>

Example 2: number of elements in "repeats" is greater than the rank of "data"

<layer ... type="Tile">
<input>
<port id="0">
<dim>2</dim>
<dim>3</dim>
<dim>4</dim>
</port>
<port id="1">
<dim>4</dim>
</port>
</input>
<output>
<port id="2">
<dim>5</dim>
<dim>2</dim>
<dim>6</dim>
<dim>12</dim>
</port>
</output>
</layer>

Example 3: number of elements in "repeats" is less than the rank of "data"

<layer ... type="Tile">
<input>
<port id="0">
<dim>5</dim>
<dim>2</dim>
<dim>3</dim>
<dim>4</dim>
</port>
<port id="1">
<dim>3</dim>
</port>
</input>
<output>
<port id="2">
<dim>5</dim>
<dim>2</dim>
<dim>6</dim>
<dim>12</dim>
</port>
</output>
</layer>

Range

Back to top

Category: Layer

Short description: Range operation generates a sequence of numbers over the range [start, stop) with a given step, according to the input values.

Attributes:

No attributes available.

Inputs:

Outputs:

Types

Detailed description:

Range operation generates a sequence of numbers starting from the value in the first input (start) up to but not including the value in the second input (stop) with a step equal to the value in the third input, according to the following formula:

\[ [start, start + step, start + 2*step, ..., start + K*step], where K is the maximal integer value that satisfies the condition start + K*step < stop when step is positive, or start + K*step > stop when step is negative. \]
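The element count implied by this formula is K + 1, which equals ceil((stop - start) / step) clipped at zero; a hypothetical sketch:

```python
import math

def make_range(start, stop, step):
    # K is the maximal integer with start + K*step still inside
    # [start, stop); the element count is therefore K + 1, i.e.
    # ceil((stop - start) / step), clipped at zero.
    count = max(math.ceil((stop - start) / step), 0)
    return [start + k * step for k in range(count)]
```

Both examples below produce 7 elements, e.g. start=1, stop=15, step=2 and start=10, stop=3, step=-1.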

Examples

Example 1: positive step

<layer ... type="Range">
<input>
<port id="0">
</port>
<port id="1">
</port>
<port id="2">
</port>
</input>
<output>
<port id="3">
<dim> 7 </dim>
</port>
</output>
</layer>

Example 2: negative step

<layer ... type="Range">
<input>
<port id="0">
</port>
<port id="1">
</port>
<port id="2">
</port>
</input>
<output>
<port id="3">
<dim> 7 </dim>
</port>
</output>
</layer>

Asinh

Back to top

Category: Arithmetic unary operation

Short description: Asinh performs element-wise hyperbolic inverse sine (arcsinh) operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Asinh does the following with the input tensor a:

\[ a_{i} = asinh(a_{i}) \]
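The element-wise rule maps directly onto the standard library; a minimal hypothetical sketch (the same pattern applies to Atanh and Acosh below via math.atanh and math.acosh):

```python
import math

def asinh_elementwise(a):
    # Element-wise inverse hyperbolic sine over a flat tensor.
    return [math.asinh(x) for x in a]
```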

Examples

Example 1

<layer ... type="Asinh">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Atanh

Back to top

Category: Arithmetic unary operation

Short description: Atanh performs element-wise hyperbolic inverse tangent (arctanh) operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Atanh does the following with the input tensor a:

\[ a_{i} = atanh(a_{i}) \]

Examples

Example 1

<layer ... type="Atanh">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Acosh

Back to top

Category: Arithmetic unary operation

Short description: Acosh performs element-wise hyperbolic inverse cosine (arccosh) operation with given tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Acosh does the following with the input tensor a:

\[ a_{i} = acosh(a_{i}) \]

Examples

Example 1

<layer ... type="Acosh">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

VariadicSplit

Back to top

Category: Data movement operations

Short description: VariadicSplit operation splits an input tensor into pieces along some axis. The pieces may have variadic lengths depending on the *"split_lengths"* input.

Attributes

No attributes available.

Inputs

Outputs

Detailed Description

VariadicSplit operation splits the data input tensor into pieces along axis. The shape of the i-th output tensor is equal to the data shape except along dimension axis, where the size is split_lengths[i]. The sum of the elements of split_lengths must match data.shape[axis].

Shape of output tensor will be:

\[ shape_of_output_i = [data.shape[0], ..., split_lengths[i], ..., data.shape[D-1]], \]

where split_lengths[i] replaces the dimension at position axis and D is the rank of the input tensor.
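The slicing can be sketched in pure Python over a flat row-major buffer; a hypothetical reference helper, not the plugin implementation:

```python
def variadic_split(data, shape, axis, split_lengths):
    # data: flat row-major list. Returns a list of (flat chunk, shape).
    assert sum(split_lengths) == shape[axis]
    # element counts before and after the split axis
    outer = 1
    for d in shape[:axis]:
        outer *= d
    inner = 1
    for d in shape[axis + 1:]:
        inner *= d
    outputs, offset = [], 0
    for length in split_lengths:
        chunk = []
        for o in range(outer):
            base = (o * shape[axis] + offset) * inner
            chunk.extend(data[base: base + length * inner])
        out_shape = list(shape)
        out_shape[axis] = length
        outputs.append((chunk, out_shape))
        offset += length
    return outputs
```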

Types

Examples

<layer id="1" type="VariadicSplit" ...>
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="1">
</port>
<port id="2">
<dim>1</dim>
</port>
</input>
<output>
<port id="3">
<dim>1</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="4">
<dim>2</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="5">
<dim>3</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
</output>
</layer>
<layer id="1" type="VariadicSplit" ...>
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="1">
</port>
<port id="2">
<dim>1</dim>
</port>
</input>
<output>
<port id="3">
<dim>4</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="4">
<dim>2</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
</output>
</layer>

Split

Back to top

Category: Data movement operations

Short description: Split operation splits an input tensor into pieces of the same length along some axis.

Attributes

Inputs

Outputs

Detailed Description

Split operation splits the *"data"* input tensor into pieces of the same length along *"axis"*. The shape of the i-th output tensor will be equal to the *"data"* shape except along dimension *"axis"*, where the size will be data.shape[axis]/num_splits. The value of data.shape[axis] must be evenly divisible by num_splits.

Shape of output tensor will be:

\[ shape_of_output = [data.shape[0], ..., data.shape[axis]/num_splits, ..., data.shape[D-1]], \]

where the dimension at position axis is divided by num_splits and D is the rank of the input tensor.
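Split is the equal-length special case of the same chunking; a hypothetical pure-Python sketch over a flat row-major buffer:

```python
def split(data, shape, axis, num_splits):
    # Equal-length split along `axis`; data: flat row-major list.
    # The axis dimension must be evenly divisible by num_splits.
    assert shape[axis] % num_splits == 0
    length = shape[axis] // num_splits
    outer = 1
    for d in shape[:axis]:
        outer *= d
    inner = 1
    for d in shape[axis + 1:]:
        inner *= d
    out_shape = list(shape)
    out_shape[axis] = length
    outputs = []
    for s in range(num_splits):
        chunk = []
        for o in range(outer):
            base = (o * shape[axis] + s * length) * inner
            chunk.extend(data[base: base + length * inner])
        outputs.append((chunk, list(out_shape)))
    return outputs
```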

Types

Example

<layer id="1" type="Split" ...>
<data num_splits="3" />
<input>
<port id="0">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="1">
</port>
</input>
<output>
<port id="2">
<dim>6</dim>
<dim>4</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="3">
<dim>6</dim>
<dim>4</dim>
<dim>10</dim>
<dim>24</dim>
</port>
<port id="4">
<dim>6</dim>
<dim>4</dim>
<dim>10</dim>
<dim>24</dim>
</port>
</output>
</layer>

Convert

Back to top

Category: type conversion

Short description: Operation converts all elements of the input tensor to a type specified in the *"destination_type"* attribute.

Attributes:

Inputs

Outputs

Types

Detailed description

Conversion from one supported type to another supported type is always allowed. The user must be aware of precision loss and value change caused by the range difference between two types. For example, a 32-bit float 3.141592 may be rounded to a 32-bit int 3.

\[ o_{i} = convert(a_{i}) \]
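A toy illustration of the element-wise cast; the type-name mapping here is a hypothetical sketch covering only a few IR element types, not the full destination_type list:

```python
def convert(data, destination_type):
    # Hypothetical mapping of a few IR element types to Python casts.
    # Note: float -> int conversion truncates toward zero here, which
    # still yields 3 for the value 3.141592 discussed above.
    casters = {"f32": float, "i32": int, "boolean": bool}
    return [casters[destination_type](x) for x in data]
```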

Examples

Example 1

<layer ... type="Convert">
<data destination_type="f32"/>
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
</input>
<output>
<port id="1">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

Result

Back to top

Category: Infrastructure

Short description: Result layer specifies output of the model.

Attributes:

No attributes available.

Inputs

Types

Example

<layer ... type="Result" ...>
<input>
<port id="0">
<dim>1</dim>
<dim>3</dim>
<dim>224</dim>
<dim>224</dim>
</port>
</input>
</layer>

BatchNormInference

Back to top

Category: Normalization

Short description: BatchNormInference layer normalizes an input tensor by mean and variance, and applies a scale (gamma) to it, as well as an offset (beta).

Attributes:

Inputs

Outputs

Types

Mathematical Formulation

BatchNormInference normalizes each channel of the input tensor using the corresponding mean and variance values, then applies the scale (gamma) and offset (beta):

\[ o_{i} = gamma \cdot \frac{a_{i} - mean}{\sqrt{variance + epsilon}} + beta \]
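A per-channel sketch of this normalization in pure Python (hypothetical helper taking one channel's values and that channel's scalar statistics, not the plugin implementation):

```python
import math

def batch_norm_inference_channel(x, gamma, beta, mean, variance, epsilon):
    # Normalizes one channel's values with that channel's statistics,
    # then applies scale (gamma) and offset (beta).
    return [gamma * (v - mean) / math.sqrt(variance + epsilon) + beta
            for v in x]
```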

Example

<layer ... type="BatchNormInference" ...>
<data epsilon="9.99e-06" />
<input>
<port id="0">
<dim>1</dim>
<dim>3</dim>
<dim>224</dim>
<dim>224</dim>
</port>
<port id="1">
<dim>3</dim>
</port>
<port id="2">
<dim>3</dim>
</port>
<port id="3">
<dim>3</dim>
</port>
<port id="4">
<dim>3</dim>
</port>
</input>
<output>
<port id="5">
<dim>1</dim>
<dim>3</dim>
<dim>224</dim>
<dim>224</dim>
</port>
</output>
</layer>

ConvertLike

Back to top

Category: type conversion

Short description: Operation converts all elements of the 1st input tensor to a type of elements of 2nd input tensor.

Attributes:

No attributes available.

Inputs

Outputs

Types

Detailed description

Conversion from one supported type to another supported type is always allowed. The user must be aware of precision loss and value change caused by the range difference between two types. For example, a 32-bit float 3.141592 may be rounded to a 32-bit int 3.

a - data input tensor, b - like input tensor.

\[ o_{i} = Convert[destination_type=type(b)](a_{i}) \]

Examples

Example 1

<layer ... type="ConvertLike">
<input>
<port id="0">
<dim>256</dim>
<dim>56</dim>
</port>
<port id="1">
<dim>3</dim>
</port>
</input>
<output>
<port id="2">
<dim>256</dim>
<dim>56</dim>
</port>
</output>
</layer>

TensorIterator

Back to top

Category: Loops

Short description: TensorIterator layer performs recurrent execution of the network, which is described in the body, iterating through the data.

TensorIterator attributes:

Inputs

Outputs

Detailed description

Similar to other layers, TensorIterator has regular sections: input and output. It allows connecting TensorIterator to the rest of the IR. TensorIterator also has several special sections: body, port_map, back_edges. The principles of their work are described below.

How body is iterated:

At the first iteration: TensorIterator slices input tensors by a specified axis and iterates over all parts in a specified order. It processes the input tensors with an arbitrary network specified as an IR network in the body section. The IR is executed as if no back-edges were present. Edges from the port map are used to connect input ports of TensorIterator to Parameters in the body.

[inputs] - Port map edges -> [Parameters:body:Results]

Parameter and Result layers are part of the body. Parameters are stable entry points in the body. The results of the execution of the body are presented as stable Result layers. Stable means that these nodes cannot be fused.

Next iterations: Back edges define which data is copied back from Result layers to Parameter layers between iterations of the TensorIterator body. That means they pass data from a source layer back to a target layer. Each layer that is a target for a back-edge also has an incoming port map edge as an input. The values from back-edges are used instead of the corresponding edges from the port map. After each iteration of the network, all back edges are executed. Iterations can be considered as a statically unrolled sequence: all edges that flow between two neighboring iterations are back-edges, so in the unrolled loop each back-edge is transformed into a regular edge.

... -> [Parameters:body:Results] - back-edges -> [Parameters:body:Results] - back-edges -> [Parameters:body:Results] - back-edges -> ...

Calculation of results:

If an output entry in the port map doesn't have partitioning attributes (axis, start, end, stride), then the final value of the TensorIterator output is the value of the Result node from the last iteration. Otherwise, the final value of the TensorIterator output is a concatenation of the tensors in the Result node over all body iterations. Concatenation order is specified by the stride attribute.

The last iteration:

[Parameters:body:Results] - Port map edges -> [outputs], if partitioning attributes are not set.

If partitioning attributes are set, then the output tensor is a concatenation of tensors from all body iterations. If stride > 0:

output = Concat(S[0], S[1], ..., S[N-1])

where S[i] is the value of the Result operation at the i-th iteration of the TensorIterator body that corresponds to this output port. If stride < 0, the output is concatenated in reverse order:

output = Concat(S[N-1], S[N-2], ..., S[0])
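The stride-dependent concatenation order can be sketched as follows (hypothetical helper over per-iteration slices represented as flat lists):

```python
def concat_iteration_results(slices, stride):
    # slices: the Result tensor produced at each body iteration, in
    # execution order; the sign of stride selects the concatenation
    # direction along the output axis.
    ordered = slices if stride > 0 else list(reversed(slices))
    out = []
    for s in ordered:
        out.extend(s)
    return out
```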

Examples

Example 1: a typical TensorIterator structure

<layer type="TensorIterator" ... >
<input> ... </input>
<output> ... </output>
<port_map>
<input external_port_id="0" internal_layer_id="0" axis="1" start="-1" end="0" stride="-1"/>
<input external_port_id="1" internal_layer_id="1"/>
...
<output external_port_id="3" internal_layer_id="2" axis="1" start="-1" end="0" stride="-1"/>
...
</port_map>
<back_edges>
<edge from-layer="1" to-layer="1"/>
...
</back_edges>
<body>
<layers> ... </layers>
<edges> ... </edges>
</body>
</layer>

Example 2: a full TensorIterator layer

<layer type="TensorIterator" ...>
<input>
<port id="0">
<dim>1</dim>
<dim>25</dim>
<dim>512</dim>
</port>
<port id="1">
<dim>1</dim>
<dim>256</dim>
</port>
<port id="2">
<dim>1</dim>
<dim>256</dim>
</port>
</input>
<output>
<port id="3" precision="FP32">
<dim>1</dim>
<dim>25</dim>
<dim>256</dim>
</port>
</output>
<port_map>
<input axis="1" external_port_id="0" internal_layer_id="0" start="0"/>
<input external_port_id="1" internal_layer_id="3"/>
<input external_port_id="2" internal_layer_id="4"/>
<output axis="1" external_port_id="3" internal_layer_id="12"/>
</port_map>
<back_edges>
<edge from-layer="8" to-layer="4"/>
<edge from-layer="9" to-layer="3"/>
</back_edges>
<body>
<layers>
<layer id="0" type="Parameter" ...>
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>1</dim>
<dim>512</dim>
</port>
</output>
</layer>
<layer id="1" type="Const" ...>
<data offset="0" size="16"/>
<output>
<port id="1" precision="I64">
<dim>2</dim>
</port>
</output>
</layer>
<layer id="2" type="Reshape" ...>
<input>
<port id="0">
<dim>1</dim>
<dim>1</dim>
<dim>512</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2" precision="FP32">
<dim>1</dim>
<dim>512</dim>
</port>
</output>
</layer>
<layer id="3" type="Parameter" ...>
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>256</dim>
</port>
</output>
</layer>
<layer id="4" type="Parameter" ...>
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>256</dim>
</port>
</output>
</layer>
<layer id="5" type="Const" ...>
<data offset="16" size="3145728"/>
<output>
<port id="1" precision="FP32">
<dim>1024</dim>
<dim>768</dim>
</port>
</output>
</layer>
<layer id="6" type="Const" ...>
<data offset="3145744" size="4096"/>
<output>
<port id="1" precision="FP32">
<dim>1024</dim>
</port>
</output>
</layer>
<layer id="7" type="LSTMCell" ...>
<data hidden_size="256"/>
<input>
<port id="0">
<dim>1</dim>
<dim>512</dim>
</port>
<port id="1">
<dim>1</dim>
<dim>256</dim>
</port>
<port id="2">
<dim>1</dim>
<dim>256</dim>
</port>
<port id="3">
<dim>1024</dim>
<dim>768</dim>
</port>
<port id="4">
<dim>1024</dim>
</port>
</input>
<output>
<port id="5" precision="FP32">
<dim>1</dim>
<dim>256</dim>
</port>
<port id="6" precision="FP32">
<dim>1</dim>
<dim>256</dim>
</port>
</output>
</layer>
<layer id="8" type="Result" ...>
<input>
<port id="0">
<dim>1</dim>
<dim>256</dim>
</port>
</input>
</layer>
<layer id="9" type="Result" ...>
<input>
<port id="0">
<dim>1</dim>
<dim>256</dim>
</port>
</input>
</layer>
<layer id="10" type="Const" ...>
<data offset="3149840" size="24"/>
<output>
<port id="1" precision="I64">
<dim>3</dim>
</port>
</output>
</layer>
<layer id="11" type="Reshape" ...>
<input>
<port id="0">
<dim>1</dim>
<dim>256</dim>
</port>
<port id="1">
<dim>3</dim>
</port>
</input>
<output>
<port id="2" precision="FP32">
<dim>1</dim>
<dim>1</dim>
<dim>256</dim>
</port>
</output>
</layer>
<layer id="12" type="Result" ...>
<input>
<port id="0">
<dim>1</dim>
<dim>1</dim>
<dim>256</dim>
</port>
</input>
</layer>
</layers>
<edges>
<edge from-layer="0" from-port="0" to-layer="2" to-port="0"/>
<edge from-layer="1" from-port="1" to-layer="2" to-port="1"/>
<edge from-layer="2" from-port="2" to-layer="7" to-port="0"/>
<edge from-layer="3" from-port="0" to-layer="7" to-port="1"/>
<edge from-layer="4" from-port="0" to-layer="7" to-port="2"/>
<edge from-layer="5" from-port="1" to-layer="7" to-port="3"/>
<edge from-layer="6" from-port="1" to-layer="7" to-port="4"/>
<edge from-layer="7" from-port="6" to-layer="8" to-port="0"/>
<edge from-layer="7" from-port="5" to-layer="9" to-port="0"/>
<edge from-layer="7" from-port="5" to-layer="11" to-port="0"/>
<edge from-layer="10" from-port="1" to-layer="11" to-port="1"/>
<edge from-layer="11" from-port="2" to-layer="12" to-port="0"/>
</edges>
</body>
</layer>