Activation Layer
Name: Activation
Category: Activation
Short description: Activation layer represents an activation function of each neuron in a layer, which is used to add non-linearity to the computational flow.
Detailed description: Reference
Parameters: Activation layer parameters should be specified in the data
node, which is a child of the layer node.
-
Parameter name: type
-
Description: type specifies the particular activation function. For example, type equal to sigmoid means that neurons of this layer have a sigmoid activation function.
-
Range of values:
-
sigmoid - sigmoid activation function. Learn more from the Detailed description section.
-
tanh - tanh activation function. Learn more from the Detailed description section.
-
elu - elu activation function. Learn more from the Detailed description section.
-
relu6 - relu6 activation function.
-
not - logical NOT function.
-
exp - exponent function.
-
Type: string
-
Default value: None
-
Required: yes
Mathematical Formulation
- Sigmoid function:
- Tanh function:
- Elu function:
- Relu6 function:
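As a minimal sketch of the four functions listed above, the following Python code uses their commonly cited textbook definitions. The definitions are assumptions rather than formulas taken from this document; in particular, `alpha=1.0` for ELU is an illustrative default, since no such parameter is documented here.

```python
import math

def sigmoid(x):
    # Logistic function: 1 / (1 + e^(-x)).
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Hyperbolic tangent: (e^x - e^(-x)) / (e^x + e^(-x)).
    return math.tanh(x)

def elu(x, alpha=1.0):
    # ELU: x for x > 0, alpha * (e^x - 1) otherwise.
    # alpha is a hypothetical parameter, assumed to be 1.0.
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def relu6(x):
    # ReLU6: clip the value to the range [0, 6].
    return min(max(x, 0.0), 6.0)
```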
Inputs:
-
1: Multidimensional input blob. Required.
Example
<layer ... type="Activation" ... >
<data type="sigmoid" />
<input> ... </input>
<output> ... </output>
</layer>
ArgMax Layer
Name: ArgMax
Category: Layer
Short description: ArgMax layer computes indices and values of the top_k maximum values for each datum across all dimensions CxHxW.
Detailed description: Intended for use after a classification layer to produce a prediction. If parameter out_max_val is set to 1, output is a vector of pairs *(max_ind, max_val)* for each batch. The axis parameter specifies an axis along which to maximize.
Parameters: ArgMax layer parameters should be specified in the data
node, which is a child of the layer node.
-
Parameter name: out_max_val
-
Description: if out_max_val equals 1, then output is a list of pairs *(max_ind, max_val)*; otherwise, output is a list of indices of size top_k.
-
Range of values: 0 or 1
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: top_k
-
Description: Number of elements to save in the output.
-
Range of values: positive integer number
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: axis
-
Description: if set, maximizes along the specified axis, else maximizes the flattened trailing dimensions for each index of the first / num dimension.
-
Range of values: integer value. Negative value means counting dimension from the end.
-
Type: int
-
Default value: None
-
Required: yes
Inputs:
-
1: 4D input blob. Required.
Mathematical Formulation
ArgMax generally does the following with the input blobs:
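As an illustration of the top_k selection described above, here is a minimal Python sketch for a single flattened datum. The function name `argmax_top_k` and the tie-breaking rule (lower index first) are assumptions made for this example, not behavior stated in this document.

```python
def argmax_top_k(values, top_k, out_max_val=0):
    # Order indices by value, largest first. sorted() is stable,
    # so equal values keep the lower index first (an assumption).
    order = sorted(range(len(values)), key=lambda i: -values[i])[:top_k]
    if out_max_val:
        # Pairs (max_ind, max_val), as in the out_max_val description.
        return [(i, values[i]) for i in order]
    return order
```

For example, `argmax_top_k([0.1, 0.7, 0.2], 2)` selects indices 1 and 2.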
Example
<layer ... type="ArgMax" ... >
<data top_k="10" out_max_val="1" axis="-1"/>
<input> ... </input>
<output> ... </output>
</layer>
BatchNormalization Layer
Name: BatchNormalization
Category: Normalization
Short description: Reference
Detailed description: Reference
Parameters: BatchNormalization layer parameters should be specified in the data
node, which is a child of the layer node.
-
Parameter name: epsilon
-
Description: epsilon is the number to be added to the variance to avoid division by zero when normalizing the value. For example, epsilon equal 0.001 means that 0.001 is added to the variance.
-
Range of values: positive floating point number
-
Type: float
-
Default value: 1
-
Required: yes
Inputs:
-
1: 4D input blob. Required.
Mathematical Formulation
BatchNormalization is the normalization of the output in each hidden layer.
-
Input: Values of $x$ over a mini-batch:
-
Parameters to learn:
-
Output:
-
Mini-batch mean:
-
Mini-batch variance:
-
Normalize:
-
Scale and shift:
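The steps listed above (mini-batch mean, mini-batch variance, normalize, scale and shift) can be sketched in Python as follows. This is a minimal illustration that assumes the standard batch normalization definitions; the names `gamma` and `beta` for the learned parameters and the default values are assumptions for the example.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, epsilon=1e-5):
    m = len(batch)
    # Mini-batch mean.
    mean = sum(batch) / m
    # Mini-batch variance.
    variance = sum((b - mean) ** 2 for b in batch) / m
    # Normalize; epsilon is added to the variance to avoid division by zero.
    normalized = [(b - mean) / math.sqrt(variance + epsilon) for b in batch]
    # Scale and shift with the learned parameters (names assumed).
    return [gamma * n + beta for n in normalized]
```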
Example
<layer ... type="BatchNormalization" ... >
<data epsilon="9.99e-06" />
<input> ... </input>
<output> ... </output>
</layer>
BinaryConvolution Layer
Name: BinaryConvolution
Category: Layer
Short description: BinaryConvolution is a convolution with binary weights.
Parameters: BinaryConvolution layer parameters should be specified in the data
node, which is a child of the layer node. The layer has the same parameters as the regular Convolution layer, plus several unique ones.
-
Parameter name: input
-
Description: input is the number of input channels.
-
Range of values: positive integer number
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: mode
-
Description: mode defines how input tensor 0/1 values and weights 0/1 are interpreted as real numbers and how the result is computed.
-
Range of values: xnor-popcount
-
Type: string
-
Default value: None
-
Required: yes
-
Parameter name: pad_value
-
Description: pad_value is the floating point value used to fill the pad area.
-
Range of values: floating point value
-
Type: float
-
Default value: None
-
Required: yes
Inputs:
-
1: 4D input blob containing integers or floats; filled with 0/1 values. For mode = xnor-popcount, 0 means -1 and 1 means 1. Required.
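To illustrate the 0/1 encoding described above, the following Python sketch computes a dot product of two binary vectors in two ways: by mapping bits to -1/+1 directly, and by counting agreeing bit positions (XNOR followed by a population count). The identity `dot = 2 * matches - n` is plain arithmetic; treating it as what xnor-popcount mode computes is an assumption based on the mode name, and both function names are hypothetical.

```python
def xnor_popcount_dot(a_bits, w_bits):
    n = len(a_bits)
    # XNOR is 1 where the two bits agree; popcount counts those positions.
    matches = sum(1 for a, w in zip(a_bits, w_bits) if a == w)
    # Each agreement contributes +1 and each disagreement -1.
    return 2 * matches - n

def real_dot(a_bits, w_bits):
    # Reference: interpret 0 as -1 and 1 as +1, then take the dot product.
    def to_real(bit):
        return 1 if bit == 1 else -1
    return sum(to_real(a) * to_real(w) for a, w in zip(a_bits, w_bits))
```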
Clamp Layer
Name: Clamp
Category: Layer
Short description: Clamp layer represents a clipping activation operation.
Detailed description: Reference
Parameters: Clamp layer parameters should be specified in the data
node, which is a child of the layer node.
-
Parameter name: min
-
Description: min is the lower bound of values in the output. Any value in the input that is smaller than the bound, is replaced with the min value. For example, min equal 10 means that any value in the input that is smaller than the bound, is replaced by 10.
-
Range of values: non-negative floating point number
-
Type: float
-
Default value: 0.0
-
Required: yes
-
Parameter name: max
-
Description: max is the upper bound of values in the output. Any value in the input that is greater than the bound, is replaced with the max value. For example, max equals 50 means that any value in the input that is greater than the bound, is replaced by 50.
-
Range of values: positive floating point number
-
Type: float
-
Default value: 6.0
-
Required: yes
Inputs:
-
1: Multidimensional input blob. Required.
Mathematical Formulation
Clamp generally does the following with the input blobs:
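Based on the min and max parameter descriptions above, the clipping can be sketched in Python as follows. The function and argument names are chosen for this example only.

```python
def clamp(values, min_value, max_value):
    # Values below min_value become min_value; values above max_value become max_value.
    return [min(max(v, min_value), max_value) for v in values]
```

With min equal to 10 and max equal to 50, the input `[5, 20, 70]` becomes `[10, 20, 50]`.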
Example
<layer ... type="Clamp" ... >
<data min="10" max="50" />
<input> ... </input>
<output> ... </output>
</layer>
Concat Layer
Name: Concat
Category: Layer
Short description: Reference
Parameters: Concat layer parameters should be specified in the data
node, which is a child of the layer node.
-
Parameter name: axis
-
Description: axis is the index of the axis along which input blobs are concatenated. For example, axis equal to 1 means that input blobs are concatenated along axis 1 (the channel dimension of a BxCxHxW blob).
-
Range of values: integer greater than or equal to 0
-
Type: int
-
Default value: 1
-
Required: yes
Inputs:
-
1: Multidimensional input blob. Required.
-
2: Multidimensional input blob. Required.
Mathematical Formulation
The axis parameter specifies the blob dimension along which values are concatenated. For example, for two input blobs B1xC1xH1xW1 and B2xC2xH2xW2, if axis is 1, the output blob is B1x(C1+C2)xH1xW1. This is only possible if B1=B2, H1=H2, W1=W2.
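The shape rule above can be checked with a short Python sketch. The helper name `concat_shape` is hypothetical; it only computes the output shape and verifies that all other dimensions match.

```python
def concat_shape(shapes, axis=1):
    first = shapes[0]
    for shape in shapes[1:]:
        if len(shape) != len(first):
            raise ValueError("all inputs must have the same number of dimensions")
        for d in range(len(first)):
            # Every dimension except the concatenation axis must be equal.
            if d != axis and shape[d] != first[d]:
                raise ValueError("dimension %d differs between inputs" % d)
    out = list(first)
    # Sizes along the concatenation axis are summed.
    out[axis] = sum(shape[axis] for shape in shapes)
    return out
```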
Example
<layer ... type="Concat" ... >
<data axis="1"/>
<input> ... </input>
<output> ... </output>
</layer>
Const Layer
Name: Const
Category: Layer
Short description: Const layer produces a blob with a constant value specified in the blobs section.
Parameters: Const layer does not have parameters.
Example
<layer ... type="Const" ...>
<output>
<port id="1">
<dim>3</dim>
<dim>100</dim>
</port>
</output>
<blobs>
<custom offset="..." size="..."/>
</blobs>
</layer>
Convolution Layer
Name: Convolution
Category: Layer
Short description: Reference
Detailed description: Reference
Parameters: Convolution layer parameters are specified in the data
node, which is a child of the layer node.
-
Parameter name: strides
-
Description: strides is a distance (in pixels) to slide the filter on the feature map over the (z, y, x) axes for 3D convolutions and (y, x) axes for 2D convolutions. For example, strides equal to 4,2,1 means sliding the filter 4 pixels at a time over the depth dimension, 2 over the height dimension, and 1 over the width dimension.
-
Range of values: integer values starting from 0
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: pads_begin
-
Description: pads_begin is a number of pixels to add to the beginning along each axis. For example, pads_begin equal 1,2 means adding 1 pixel to the top of the input and 2 to the left of the input.
-
Range of values: integer values starting from 0
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: pads_end
-
Description: pads_end is a number of pixels to add to the ending along each axis. For example, pads_end equal 1,2 means adding 1 pixel to the bottom of the input and 2 to the right of the input.
-
Range of values: integer values starting from 0
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: kernel
-
Description: kernel is a size of each filter. For example, kernel equal 2,3 means that each filter has height equal to 2 and width equal to 3.
-
Range of values: integer values starting from 1
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: output
-
Description: output is a number of output feature maps per whole output (when group > 1, output still matches the number of output features regardless of group value). For example, output equals 1 means that there is 1 output feature map in a layer.
-
Range of values: integer values starting from 0
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: group
-
Description: group denotes the number of groups to which output and input should be split. For example, group equal 1 means that all the filters are applied to full input (usual convolution), group equals 2 means that both input and output channels are separated into 2 groups and i-th output group is connected to i-th input group channels. group equals number of output feature maps denotes depth-wise separable convolution (Reference).
-
Range of values: integer values starting from 0
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: dilations
-
Description: dilations denotes the distance in width and height between elements (weights) in the filter. For example, dilation equal 1,1 means that all the elements in the filter are neighbors, so it is the same as for the usual convolution. dilation equal 2,2 means that all the elements in the filter are matched not to adjacent elements in the input matrix, but to those that are adjacent with distance 1.
-
Range of values: integer value starting from 0
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: auto_pad
-
Description: auto_pad specifies how the padding is calculated. Possible values:
- Not specified: use explicit padding values.
-
same_upper (same_lower) - the input is padded to match the output size. In case of an odd padding value, the extra padding is added at the end (at the beginning).
-
valid - do not use padding.
-
Type: string
-
Default value: None
-
Required: no
Inputs:
-
1: 4D or 5D input blob. Required.
Weights Layout
Weights layout is GOIYX (GOIZYX for 3D convolution), which means that X is changing the fastest, then Y, then Input, Output, then Group.
Mathematical Formulation
- For the convolutional layer, the number of output features in each dimension is calculated using the formula:
- The receptive field in each layer is calculated using the formulas:
- Jump in the output feature map:
- Size of the receptive field of output feature:
- Center position of the receptive field of the first output feature:
- Output is calculated using the following formula:
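As a hedged illustration of the quantities listed above, the Python sketch below uses the commonly cited formulas for convolution arithmetic. They are assumptions rather than formulas taken from this document: the output size uses floor rounding, and the receptive field recurrence tracks the jump, size, and center position from one layer to the next. All function and argument names are hypothetical.

```python
def conv_output_size(n_in, kernel, stride=1, pad_begin=0, pad_end=0, dilation=1):
    # A dilated kernel spans dilation * (kernel - 1) + 1 input elements.
    effective_kernel = dilation * (kernel - 1) + 1
    # Floor division is an assumed rounding mode.
    return (n_in + pad_begin + pad_end - effective_kernel) // stride + 1

def receptive_field(jump_in, size_in, start_in, kernel, stride, pad):
    # Jump: distance in input pixels between adjacent output features.
    jump_out = jump_in * stride
    # Size of the receptive field of one output feature.
    size_out = size_in + (kernel - 1) * jump_in
    # Center position of the receptive field of the first output feature.
    start_out = start_in + ((kernel - 1) / 2 - pad) * jump_in
    return jump_out, size_out, start_out
```

For instance, a 224-element axis with kernel 7, stride 2, pads_begin 2, and pads_end 3 gives `conv_output_size(224, 7, stride=2, pad_begin=2, pad_end=3)`, which evaluates to 112.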
Example
<layer ... type="Convolution" ... >
<data auto_pad="same_upper" dilations="1,1" group="3" kernel="7,7" output="24" pads_begin="2,2" pads_end="3,3" strides="2,2"/>
<input> ... </input>
<output> ... </output>
<weights ... />
<biases ... />
</layer>
Crop (Type 1) Layer
Name: Crop
Category: Layer
Short description: Crop layer changes selected dimensions of the input blob according to the specified parameters.
Parameters: Crop layer parameters should be specified in data
section, which is placed as a child of the layer node. Crop Type 1 layer takes two input blobs, and the shape of the second blob specifies the Crop size. The layer has two attributes: axis and offset. The Crop layer of this type supports shape inference.
-
Parameter name: axis
-
Description: axis is a number of a dimension to be used for cropping. For example, axis equal to 1 means that cropping is performed over the first dimension.
-
Range of values: a list of unique integers, where each element is greater than or equal to 0 and less than input shape length.
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: offset
-
Description: offset denotes the starting point for crop in the input blob. For example, offset equal to 2 means that crop is starting from the second value in the given axis.
-
Range of values: a list of integers of the length equal to the length of axis attribute. In the list, offset[i] is greater than or equal to 0 and less than or equal to input_shape[axis[i]] - crop_size[axis[i]], where crop_size is the shape of the second input.
-
Type: int[]
-
Default value: None
-
Required: yes
Inputs
-
1: Multidimensional input blob
-
2: Shape of this input will be used for crop
Example
<layer id="39" name="score_pool4c" precision="FP32" type="Crop">
<data axis="2,3" offset="0,0"/>
<input>
<port id="0">
<dim>1</dim>
<dim>21</dim>
<dim>44</dim>
<dim>44</dim>
</port>
<port id="1">
<dim>1</dim>
<dim>21</dim>
<dim>34</dim>
<dim>34</dim>
</port>
</input>
<output>
<port id="2">
<dim>1</dim>
<dim>21</dim>
<dim>34</dim>
<dim>34</dim>
</port>
</output>
</layer>
Crop (Type 2) Layer
Name: Crop
Category: Layer
Short description: Crop layer changes selected dimensions of the input blob according to the specified parameters.
Parameters: Crop layer parameters should be specified in data
section, which is placed as a child of the layer node. Crop Type 2 layer takes one input blob to Crop and has three attributes: axis, offset, and dim. The Crop layer of this type supports shape inference only when shape propagation is applied to dimensions that are not specified in the axis attribute.
-
Parameter name: axis
-
Description: axis is a number of a dimension to be used for cropping. For example, axis equal to 1 means that cropping is performed over the first dimension.
-
Range of values: a list of unique integers, where each element is greater than or equal to 0 and less than input shape length
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: offset
-
Description: offset denotes the starting point for crop in the input blob. For example, offset equal to 2 means that cropping starts from the second value in the given axis.
-
Range of values: a list of integers with the length equal to the length of the axis attribute, where offset[i] is greater than or equal to 0 and less than or equal to input_shape[axis[i]] - dim[i]
-
Type: int[]
-
Default value: None
-
Required: yes
-
Parameter name: dim
-
Description: dim is the resulting size of the output blob for the given axis. For example, dim equal to 88 means that the output blob gets the dimension equal to 88 for the given axis.
-
Range of values: a list of integers
-
Type: int[]
-
Default value: 1
-
Required: yes
Example
<layer id="39" name="score_pool4c" precision="FP32" type="Crop">
<data axis="2,3" offset="0,0" dim="34,34"/>
<input>
<port id="0">
<dim>1</dim>
<dim>21</dim>
<dim>44</dim>
<dim>44</dim>
</port>
</input>
<output>
<port id="1">
<dim>1</dim>
<dim>21</dim>
<dim>34</dim>
<dim>34</dim>
</port>
</output>
</layer>
Crop (Type 3) Layer
Name: Crop
Category: Layer
Short description: Crop layer changes selected dimensions of the input blob according to the specified parameters.
Parameters: Crop layer parameters should be specified in data
section, which is placed as a child of the layer node. Crop Type 3 layer takes one input blob to Crop and has three attributes: axis, crop_begin, and crop_end. The Crop layer of this type supports shape inference.
-
Parameter name: axis
-
Description: axis is the number of the dimension to be used for cropping. For example, axis equal 1 means that cropping is performed over the first dimension.
-
Range of values: a list of unique integers, where each element is greater than or equal to 0 and less than input shape length
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: crop_begin
-
Description: crop_begin specifies the starting offset for crop in the input blob for given axes.
-
Range of values: a list of integers, where crop_begin[i] is greater than or equal to 0 and less than input_shape[axis[i]] - crop_end[i]
-
Type: int[]
-
Default value: None
-
Required: yes
-
Parameter name: crop_end
-
Description: crop_end specifies the ending offset for crop in the input blob for given axes.
-
Range of values: a list of integers, where crop_end[i] is greater than or equal to 0 and less than input_shape[axis[i]] - crop_begin[i]
-
Type: int[]
-
Default value: None
-
Required: yes
Example
<layer id="39" name="score_pool4c" precision="FP32" type="Crop">
<data axis="2,3" crop_begin="4,4" crop_end="6,6"/>
<input>
<port id="0">
<dim>1</dim>
<dim>21</dim>
<dim>44</dim>
<dim>44</dim>
</port>
</input>
<output>
<port id="1">
<dim>1</dim>
<dim>21</dim>
<dim>34</dim>
<dim>34</dim>
</port>
</output>
</layer>
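The Crop Type 3 example above can be reproduced with a short Python sketch of the output shape calculation. It assumes that each cropped dimension shrinks by crop_begin plus crop_end, which is consistent with the example (44 - 4 - 6 = 34); the helper name is hypothetical.

```python
def crop_type3_shape(input_shape, axis, crop_begin, crop_end):
    out = list(input_shape)
    for ax, begin, end in zip(axis, crop_begin, crop_end):
        # Remove `begin` elements from the start and `end` elements from the end.
        out[ax] = input_shape[ax] - begin - end
    return out
```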
CTCGreedyDecoder Layer
Name: CTCGreedyDecoder
Category: Layer
Short description: CTCGreedyDecoder performs greedy decoding on the logits given in input (best path).
Detailed description: Reference
Parameters: CTCGreedyDecoder layer parameters should be specified in the data
node, which is a child of the layer node.
-
Parameter name: ctc_merge_repeated
-
Description: ctc_merge_repeated is a flag that specifies whether repeated labels are collapsed during the CTC calculation.
-
Range of values: 0 or 1
-
Type: int
-
Default value: 1
-
Required: yes
Mathematical Formulation
Given an input sequence $X$ of length $T$, CTCGreedyDecoder assumes the probability of a length $T$ character sequence $C$ is given by
$$p(C|X) = \prod_{t=1}^{T} p(c_t|X)$$
Example
<layer ... type="CTCGreedyDecoder" ... >
<data ctc_merge_repeated="1"/>
<input> ... </input>
<output> ... </output>
</layer>
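Greedy (best path) decoding with the ctc_merge_repeated flag can be sketched in Python as follows. This is a minimal illustration, not the layer's implementation: the index of the blank label is passed explicitly because this document does not state which class index is the blank, and the function name is hypothetical.

```python
def ctc_greedy_decode(probs, blank, merge_repeated=True):
    # Best path: pick the most probable class at every time step.
    best_path = [max(range(len(step)), key=step.__getitem__) for step in probs]
    decoded = []
    previous = None
    for label in best_path:
        is_repeat = merge_repeated and label == previous
        # Drop blanks, and drop repeats of the previous step when merging is on.
        if label != blank and not is_repeat:
            decoded.append(label)
        previous = label
    return decoded
```

A blank between two identical labels separates them, so both are kept even when merging is enabled.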
Deconvolution Layer
Name: Deconvolution
Category: Layer
Short description: Deconvolution layer is applied for upsampling the output to the higher image resolution.
Detailed description: Reference
Parameters: Deconvolution layer parameters should be specified in the data
node, which is a child of the layer node.
-
Parameter name: strides
-
Description: strides is a distance (in pixels) to slide the filter on the feature map over the (z, y, x) axes for 3D convolutions and (y, x) axes for 2D convolutions. For example, strides equal to 4,2,1 means sliding the filter 4 pixels at a time over the depth dimension, 2 over the height dimension, and 1 over the width dimension.
-
Range of values: integer values starting from 0
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: pads_begin
-
Description: pads_begin is a number of pixels to add to the beginning along each axis. For example, pads_begin equal 1,2 means adding 1 pixel to the top of the input and 2 to the left of the input.
-
Range of values: integer values starting from 0
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: pads_end
-
Description: pads_end is a number of pixels to add to the ending along each axis. For example, pads_end equal 1,2 means adding 1 pixel to the bottom of the input and 2 to the right of the input.
-
Range of values: integer values starting from 0
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: kernel
-
Description: kernel is a size of each filter. For example, kernel equal 2,3 means that each filter has height equal to 2 and width equal to 3.
-
Range of values: integer values starting from 1
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: output
-
Description: output is a number of output feature maps per whole output (when group > 1, output still matches the number of output features regardless of group value). For example, output equals 1 means that there is 1 output feature map in a layer.
-
Range of values: integer values starting from 0
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: group
-
Description: group denotes the number of groups to which output and input should be split. For example, group equal 1 means that all the filters are applied to full input (usual convolution), group equals 2 means that both input and output channels are separated into 2 groups and i-th output group is connected to i-th input group channels. group equals number of output feature maps denotes depth-wise separable convolution (Reference).
-
Range of values: integer values starting from 0
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: dilations
-
Description: dilations denotes the distance in width and height between elements (weights) in the filter. For example, dilation equal 1,1 means that all the elements in the filter are neighbors, so it is the same as for the usual convolution. dilation equal 2,2 means that all the elements in the filter are matched not to adjacent elements in the input matrix, but to those that are adjacent with distance 1.
-
Range of values: integer value starting from 0
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: auto_pad
-
Description: auto_pad specifies how the padding is calculated. Possible values:
- Not specified: use explicit padding values.
-
same_upper (same_lower) - the input is padded to match the output size. In case of an odd padding value, the extra padding is added at the end (at the beginning).
-
valid - do not use padding.
-
Type: string
-
Default value: None
-
Required: No
Inputs:
-
1: 4D or 5D blob with input data. Required.
Weights Layout
Weights layout is the following: GOIYX, which means that X is changing the fastest, then Y, then Input, Output, then Group.
Mathematical Formulation
Deconvolution, also called transposed convolution, performs the operation reverse to convolution. The number of output features for each dimension is calculated as:
$$S_o = \text{stride} \cdot (S_i - 1) + S_f - \text{pads\_begin} - \text{pads\_end}$$
where $S_o$, $S_i$, and $S_f$ are the sizes of the output, input, and filter. Output is calculated in the same way as for the convolution layer:
Example
<layer ... type="Deconvolution" ...>
<data auto_pad="valid" kernel="2,2,2" output="512" pads_begin="0,0,0" pads_end="0,0,0" strides="2,2,2"/>
<input>
<port id="0">
<dim>1</dim>
<dim>512</dim>
<dim>8</dim>
<dim>8</dim>
<dim>8</dim>
</port>
</input>
<output>
<port id="3">
<dim>1</dim>
<dim>512</dim>
<dim>16</dim>
<dim>16</dim>
<dim>16</dim>
</port>
</output>
<blobs>
<weights offset="..." size="..."/>
<biases offset="..." size="..."/>
</blobs>
</layer>
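The output dimensions in the example above can be checked with a small Python sketch. It assumes the usual transposed convolution size rule without dilation, which agrees with the example (stride 2, kernel 2, no padding, 8 becomes 16); the helper name is hypothetical.

```python
def deconv_output_size(n_in, kernel, stride=1, pad_begin=0, pad_end=0):
    # Each input element starts a new window every `stride` positions;
    # padding is removed from the result instead of being added to the input.
    return stride * (n_in - 1) + kernel - pad_begin - pad_end
```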
DepthToSpace Layer
Name: DepthToSpace
Category: Layer
Short description: DepthToSpace permutes data from the depth dimension of the input blob into spatial dimensions.
Detailed description: DepthToSpace layer produces a copy of the input blob where values from the depth (features) dimension are moved in spatial blocks. Refer to ONNX* specification for example of the 4D input blob case.
Parameters: DepthToSpace layer parameters should be specified in the data
node, which is a child of the layer node.
-
Parameter name: block_size
-
Description: block_size specifies the size of the block of values to be moved. The depth dimension size must be evenly divisible by block_size ^ (len(input.shape) - 2).
-
Range of values: positive integer value
-
Type: int
-
Default value: 1
-
Required: No
Inputs:
-
1: 3D+ blob with input data. Required.
Mathematical Formulation
The operation is equivalent to the following transformation of the input blob x with K spatial dimensions of shape [N, C, D1, D2, D3 , ... , DK]:
x' = reshape(x, [N, block_size, block_size, ... , block_size, C / block_size ^ K, D1, D2, ... , DK])
x'' = transpose(x', [0, K + 1, K + 2, 1, K + 3, 2, K + 4, 3, ... K + K + 1, K])
y = reshape(x'', [N, C / block_size ^ K, D1 * block_size, D2 * block_size, D3 * block_size, ... , DK * block_size])
Example
<layer ... type="DepthToSpace">
<data block_size="2"/>
<input>
<port id="0">
<dim>5</dim>
<dim>4</dim>
<dim>2</dim>
<dim>3</dim>
</port>
</input>
<output>
<port id="1">
<dim>5</dim>
<dim>1</dim>
<dim>4</dim>
<dim>6</dim>
</port>
</output>
</layer>
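The output shape in the example above follows directly from the final reshape in the formulation. A minimal Python sketch of that shape calculation is shown below; the helper name is hypothetical, and it computes only the shape, not the data movement.

```python
def depth_to_space_shape(input_shape, block_size):
    n, c = input_shape[0], input_shape[1]
    spatial = list(input_shape[2:])
    # The depth dimension must be divisible by block_size ^ K for K spatial dimensions.
    divisor = block_size ** len(spatial)
    if c % divisor != 0:
        raise ValueError("depth must be divisible by block_size ^ K")
    # Depth shrinks by block_size ^ K; every spatial dimension grows by block_size.
    return [n, c // divisor] + [d * block_size for d in spatial]
```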
DetectionOutput Layer
Name: DetectionOutput
Category: Layer
Short description: DetectionOutput layer performs non-maximum suppression to generate the detection output using information on location and confidence predictions.
Detailed description: Reference. The layer has 3 mandatory inputs: blob with box logits, blob with confidence predictions and blob with box coordinates (proposals). It can have 2 additional inputs with additional confidence predictions and box coordinates described in the article. The 5-input version of the layer is supported with Myriad plugin only. The output blob contains information about filtered detections described with 7 element tuples: [batch_id, class_id, confidence, x_1, y_1, x_2, y_2]. The first tuple with batch_id equal to *-1* means end of output.
Parameters: DetectionOutput layer parameters should be specified in the data
node, which is a child of the layer node.
-
Parameter name: num_classes
-
Description: number of classes to be predicted
-
Range of values: positive integer number
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: background_label_id
-
Description: background label id. If there is no background class, set it to -1.
-
Range of values: integer values
-
Type: int
-
Default value: 0
-
Required: no
-
Parameter name: top_k
-
Description: maximum number of results to be kept per batch after NMS step. -1 means keeping all bounding boxes.
-
Range of values: integer values
-
Type: int
-
Default value: -1
-
Required: no
-
Parameter name: variance_encoded_in_target
-
Description: variance_encoded_in_target is a flag that denotes if variance is encoded in target. If flag is false then it is necessary to adjust the predicted offset accordingly.
-
Range of values: 0 or 1
-
Type: int
-
Default value: 0
-
Required: no
-
Parameter name: keep_top_k
-
Description: maximum number of bounding boxes per batch to be kept after NMS step. -1 means keeping all bounding boxes after NMS step.
-
Range of values: integer values
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: code_type
-
Description: type of coding method for bounding boxes
-
Range of values: "caffe.PriorBoxParameter.CENTER_SIZE", "caffe.PriorBoxParameter.CORNER"
-
Type: string
-
Default value: "caffe.PriorBoxParameter.CORNER"
-
Required: no
-
Parameter name: share_location
-
Description: share_location is a flag that denotes if bounding boxes are shared among different classes.
-
Range of values: 0 or 1
-
Type: int
-
Default value: 1
-
Required: no
-
Parameter name: nms_threshold
-
Description: threshold to be used in the NMS stage
-
Range of values: floating point values
-
Type: float
-
Default value: None
-
Required: yes
-
Parameter name: confidence_threshold
-
Description: only consider detections whose confidences are larger than a threshold. If not provided, consider all boxes.
-
Range of values: floating point values
-
Type: float
-
Default value: -FLT_MAX
-
Required: no
-
Parameter name: clip_after_nms
-
Description: clip_after_nms is a flag that denotes whether to clip bounding boxes after non-maximum suppression.
-
Range of values: 0 or 1
-
Type: int
-
Default value: 0
-
Required: no
-
Parameter name: clip_before_nms
-
Description: clip_before_nms is a flag that denotes whether to clip bounding boxes before non-maximum suppression.
-
Range of values: 0 or 1
-
Type: int
-
Default value: 0
-
Required: no
-
Parameter name: decrease_label_id
-
Description: decrease_label_id is a flag that denotes how to perform NMS.
-
Range of values:
- 0 - perform NMS like in Caffe*.
- 1 - perform NMS like in MxNet*.
-
Type: int
-
Default value: 0
-
Required: no
-
Parameter name: normalized
-
Description: normalized is a flag that denotes whether input blobs with boxes are normalized. If blobs are not normalized, then the input_height and input_width parameters are used to normalize box coordinates.
-
Range of values: 0 or 1
-
Type: int
-
Default value: 0
-
Required: no
-
Parameter name: input_height (input_width)
-
Description: input image height (width). If normalized is 1, these parameters are not used.
-
Range of values: positive integer number
-
Type: int
-
Default value: 1
-
Required: no
-
Parameter name: objectness_score
-
Description: threshold to sort out confidence predictions. Used only when the DetectionOutput layer has 5 inputs.
-
Range of values: non-negative float number
-
Type: float
-
Default value: 0
-
Required: no
Inputs:
-
1: 2D input blob with box logits. Required.
-
2: 2D input blob with class predictions. Required.
-
3: 3D input blob with proposals. Required.
-
4: 2D input blob with additional class predictions information described in the article. Optional.
-
5: 2D input blob with additional box predictions information described in the article. Optional.
Mathematical Formulation
At each feature map cell, DetectionOutput predicts the offsets relative to the default box shapes in the cell, as well as the per-class scores that indicate the presence of a class instance in each of those boxes. Specifically, for each box out of $k$ at a given location, DetectionOutput computes $c$ class scores and the four offsets relative to the original default box shape. This results in a total of $(c + 4)k$ filters that are applied around each location in the feature map, yielding $(c + 4)kmn$ outputs for an $m \times n$ feature map.
Example
<layer ... type="DetectionOutput" ... >
<data num_classes="21" share_location="1" background_label_id="0" nms_threshold="0.450000" top_k="400" input_height="1" input_width="1" code_type="caffe.PriorBoxParameter.CENTER_SIZE" variance_encoded_in_target="0" keep_top_k="200" confidence_threshold="0.010000"/>
<input> ... </input>
<output> ... </output>
</layer>
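The non-maximum suppression step that this layer performs can be illustrated with a generic greedy NMS sketch in Python. This is the textbook algorithm under stated assumptions, not the layer's actual implementation: boxes are (x_1, y_1, x_2, y_2) tuples for a single class, and only the nms_threshold and confidence_threshold parameters are modeled. Function names are hypothetical.

```python
def iou(a, b):
    # Intersection over union of two boxes given as (x_1, y_1, x_2, y_2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, nms_threshold, confidence_threshold=float("-inf")):
    # Consider only detections whose confidence is larger than the threshold,
    # visiting them from the highest score to the lowest.
    order = [i for i in sorted(range(len(boxes)), key=lambda i: -scores[i])
             if scores[i] > confidence_threshold]
    kept = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in kept):
            kept.append(i)
    return kept
```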
Eltwise Layer
Name: Eltwise
Category: Layer
Short description: Eltwise layer performs element-wise operation specified in parameters, over given inputs.
Parameters: Eltwise layer parameters should be specified in the data
node, which is placed as a child of the layer node. Eltwise accepts 2 inputs of arbitrary number of dimensions. The operation supports broadcasting input blobs according to the NumPy specification.
-
Parameter name: operation
-
Description: operation is a mathematical operation to be performed over inputs.
-
Range of values:
-
sum - summation
-
sub - subtraction
-
mul - multiplication
-
div - division
-
max - maximum
-
min - minimum
-
squared_diff - squared difference
-
floor_mod - remainder of division
-
pow - power
-
logical_and - logical AND
-
logical_or - logical OR
-
logical_xor - logical XOR
-
less - less
-
less_equal - less or equal
-
greater - greater
-
greater_equal - greater or equal
-
equal - equal
-
not_equal - not equal
-
Type: string
-
Default value: sum
-
Required: no
Inputs
-
1: Multidimensional input blob. Required.
-
2: Multidimensional input blob. Required.
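Mathematical Formulation
Eltwise does the following with the input blobs:
$$o_i = f(b_i^1, b_i^2)$$
where $b_i^1$ is the $i$-th element of the first blob, $b_i^2$ is the $i$-th element of the second blob, $o_i$ is the $i$-th element of the output blob, and $f(a, b)$ is a function that performs an operation over its two arguments $a$ and $b$.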
Example
<layer ... type="Eltwise" ... >
<data operation="sum"/>
<input> ... </input>
<output> ... </output>
</layer>
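The element-wise behavior can be sketched in Python for a subset of the operations listed above. This illustration assumes two flat inputs of equal length and does not model broadcasting; the `OPERATIONS` table and the function name are hypothetical.

```python
import operator

# A subset of the operation values, mapped to two-argument functions.
OPERATIONS = {
    "sum": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
    "max": max,
    "min": min,
    "squared_diff": lambda a, b: (a - b) ** 2,
    "pow": operator.pow,
}

def eltwise(operation, first, second):
    f = OPERATIONS[operation]
    # Apply the selected function to each pair of corresponding elements.
    return [f(a, b) for a, b in zip(first, second)]
```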
Fill Layer
Name: Fill
Category: Layer
Short description: Fill layer generates a blob of the specified shape filled with the specified value.
Parameters: Fill layer has no parameters.
Inputs:
-
1: 1D blob with an output blob shape. Required.
-
2: 0D blob (constant) with the value for fill. Required.
Example
<layer ... type="Fill">
<input>
<port id="0">
<dim>2</dim>
</port>
<port id="1"/>
</input>
<output>
<port id="2">
<dim>3</dim>
<dim>4</dim>
</port>
</output>
</layer>
Flatten Layer
Name: Flatten
Category: Layer
Short description: Flatten layer performs flattening of specific dimensions of the input blob.
Parameters: Flatten layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: axis
-
Description: axis specifies the first axis to start flattening.
-
Range of values: non-negative integer values
-
Type: int
-
Default value: 0
-
Required: no
-
Parameter name: end_axis
-
Description: end_axis specifies the last dimension to flatten. The value can be negative, which means that axes are counted from the end.
-
Range of values: integer number
-
Type: int
-
Default value: -1
-
Required: no
Inputs
-
1: Multidimensional input blob. Required.
Example
<layer ... type="Flatten" ...>
<data axis="1" end_axis="-1"/>
<input>
<port id="0">
<dim>7</dim>
<dim>19</dim>
<dim>19</dim>
<dim>12</dim>
</port>
</input>
<output>
<port id="1">
<dim>7</dim>
<dim>4332</dim>
</port>
</output>
</layer>
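The output shape in the example can be reproduced with a short Python sketch. The helper `flatten_shape` is a hypothetical name; it assumes that end_axis is inclusive, which is consistent with the example dimensions.

```python
from functools import reduce
from operator import mul


def flatten_shape(shape, axis=0, end_axis=-1):
    """Collapse dimensions axis..end_axis (inclusive) into one dimension."""
    if end_axis < 0:
        end_axis += len(shape)  # negative values count from the end
    if not 0 <= axis <= end_axis < len(shape):
        raise ValueError("axis range is outside the input shape")
    merged = reduce(mul, shape[axis:end_axis + 1], 1)
    return list(shape[:axis]) + [merged] + list(shape[end_axis + 1:])


print(flatten_shape([7, 19, 19, 12], axis=1, end_axis=-1))  # [7, 4332]
```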
FullyConnected Layer
Name: FullyConnected
Category: Layer
Short description: Reference
Detailed description: Reference
Parameters: Specify FullyConnected layer parameters in the data
node, which is a child of the layer node.
-
Parameter name: out-size
-
Description: out-size is a length of the output vector. For example, out-size equal 4096 means that the output vector length is 4096.
-
Range of values: integer value starting from 0
-
Type: int
-
Default value: 1
-
Required: yes
Inputs
-
1: 2D or 4D input blob. Required.
Weights Layout
OI, which means that Input is changing the fastest, then Output.
Mathematical Formulation
- If previous layer is FullyConnected:
- Otherwise:
Example
<layer ... type="FullyConnected" ... >
<data out-size="4096"/>
<input> ... </input>
<output> ... </output>
</layer>
Gather Layer
Name: Gather
Category: Layer
Short description: Gather layer takes slices of data in the second input blob according to the indices specified in the first input blob. The output blob shape is input2.shape[:axis] + input1.shape + input2.shape[axis + 1:].
Parameters: Gather layer parameters are specified in the data
section, which is placed as a child of the layer node.
-
Parameter name: axis
-
Description: axis is an index of the dimension along which data is gathered. For example, axis equal to 1 means that gathering is performed over the dimension with index 1 of the second input blob.
-
Range of values: a single integer in the range [-len(input2.shape), len(input2.shape) - 1].
-
Type: int
-
Default value: 1
-
Required: yes
Mathematical Formulation
Inputs
-
1: Multidimensional input blob with indices to gather. The values for indices are in the range [0, input2.shape[axis] - 1].
-
2: Multidimensional input blob with arbitrary data.
Example
<layer id="1" name="gather_node" precision="FP32" type="Gather">
<data axis="1"/>
<input>
<port id="0">
<dim>15</dim>
<dim>4</dim>
<dim>20</dim>
<dim>28</dim>
</port>
<port id="1">
<dim>6</dim>
<dim>12</dim>
<dim>10</dim>
<dim>24</dim>
</port>
</input>
<output>
<port id="2">
<dim>6</dim>
<dim>15</dim>
<dim>4</dim>
<dim>20</dim>
<dim>28</dim>
<dim>10</dim>
<dim>24</dim>
</port>
</output>
</layer>
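The shape rule from the short description can be checked against the example with a small Python sketch. The helper `gather_output_shape` is a hypothetical name, and treating a negative axis as counting from the end is an assumption based on the documented range.

```python
def gather_output_shape(indices_shape, data_shape, axis):
    """Shape rule: input2.shape[:axis] + input1.shape + input2.shape[axis + 1:]."""
    rank = len(data_shape)
    if not -rank <= axis <= rank - 1:
        raise ValueError("axis is outside the documented range")
    if axis < 0:
        axis += rank  # assumption: negative axis counts from the end
    return list(data_shape[:axis]) + list(indices_shape) + list(data_shape[axis + 1:])


# input1 (indices) is port 0, input2 (data) is port 1 in the example.
print(gather_output_shape([15, 4, 20, 28], [6, 12, 10, 24], axis=1))
# [6, 15, 4, 20, 28, 10, 24]
```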
GRN Layer
Name: GRN
Category: Normalization
Short description: GRN is Global Response Normalization with L2 norm (across channels only).
Parameters: GRN layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: bias
-
Description: bias is added to the variance.
-
Range of values: floating point value
-
Type: float
-
Default value: 1
-
Required: yes
Inputs
-
1: 2D, 3D or 4D input blob. Required.
Mathematical Formulation
GRN computes L2 norm by channels for input blob. GRN generally does the following with the input blob:
Example
<layer ... type="GRN" ... >
<data bias="1.0"/>
<input> ... </input>
<output> ... </output>
</layer>
GRUCell Layer
Name: GRUCell
Category: Layer
Short description: GRUCell layer computes the output using the formula described in the paper.
Parameters: GRUCell layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: hidden_size
-
Description: hidden_size specifies hidden state size.
-
Range of values: positive integer value
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: activations
-
Description: activation functions for gates
-
Range of values: any combination of relu, sigmoid, tanh
-
Type: list of strings
-
Default value: sigmoid,tanh
-
Required: no
-
Parameter name: activations_alpha, activations_beta
-
Description: activations_alpha, activations_beta are parameters of the activation functions
-
Range of values: list of floats
-
Type: float[]
-
Default value: None
-
Required: no
-
Parameter name: clip
-
Description: clip specifies the bound C; tensor values are clipped to the range [-C, C] before activations are applied
-
Range of values: positive float value
-
Type: float
-
Default value: None
-
Required: no
-
Parameter name: linear_before_reset
-
Description: linear_before_reset is a flag that denotes whether the layer behaves according to the modification of GRUCell described by the formula in the ONNX documentation.
-
Range of values: 0 or 1
-
Type: int
-
Default value: 0
-
Required: no
Inputs
-
1: X - 2D ([batch_size, input_size]) input data. Required.
-
2: Hi - 2D ([batch_size, hidden_size]) input hidden state data. Required.
Outputs
-
1: Ho - 2D ([batch_size, hidden_size]) output hidden state.
Input Layer
Name: Input
Category: Layer
Short description: Input layer specifies input to the model.
Parameters: Input layer does not have parameters.
Example
<layer ... type="Input" ...>
<output>
<port id="0">
<dim>1</dim>
<dim>3</dim>
<dim>224</dim>
<dim>224</dim>
</port>
</output>
</layer>
Interp Layer
Name: Interp
Category: Layer
Short description: Interp layer performs bilinear interpolation of the input blob by the specified parameters.
Parameters: Interp layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: height (width)
-
Description: height (width) specifies output height (width). If the parameter is not specified then other parameters are used for output size calculation.
-
Range of values: positive integer values
-
Type: int
-
Default value: None
-
Required: no
-
Parameter name: zoom_factor
-
Description: zoom_factor is a factor by which the input height (width) is multiplied to calculate the output height (width). If the parameter is equal to 0, other parameters are used for output size calculation.
-
Range of values: floating point number
-
Type: float
-
Default value: 0
-
Required: no
-
Parameter name: shrink_factor
-
Description: shrink_factor is a factor by which the input height (width) is divided to calculate the output height (width). If the parameter is equal to 0, other parameters are used for output size calculation.
-
Range of values: floating point number
-
Type: float
-
Default value: 0
-
Required: no
-
Parameter name: factor
-
Description: factor is a factor by which the input height (width) is multiplied to calculate the output height (width). If the parameter is equal to 0, other parameters are used for output size calculation.
-
Range of values: floating point number
-
Type: float
-
Default value: 1.0
-
Required: no
-
Parameter name: align_corners
-
Description: align_corners is a flag that denotes whether to perform aligning of corners or not.
-
Range of values: 0 or 1
-
Type: int
-
Default value: 1
-
Required: no
-
Parameter name: pad_beg (pad_end)
-
Description: pad_beg (pad_end) specifies number of pixels to be added to the beginning (ending) of the image being interpolated.
-
Range of values: non-negative integer values
-
Type: int
-
Default value: 0
-
Required: yes
Inputs
-
1: 4D input blob. Required.
Example
<layer ... type="Interp" ...>
<data align_corners="0" factor="2.0" pad_beg="0" pad_end="0"/>
<input>
<port id="0">
<dim>1</dim>
<dim>2</dim>
<dim>48</dim>
<dim>80</dim>
</port>
</input>
<output>
<port id="1">
<dim>1</dim>
<dim>2</dim>
<dim>96</dim>
<dim>160</dim>
</port>
</output>
</layer>
LSTMCell Layer
Name: LSTMCell
Category: Layer
Short description: LSTMCell layer computes the output using the formula described in the original paper Long Short-Term Memory.
Parameters: LSTMCell layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: hidden_size
-
Description: hidden_size specifies hidden state size.
-
Range of values: positive integer value
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: activations
-
Description: activation functions for gates
-
Range of values: any combination of relu, sigmoid, tanh
-
Type: list of strings
-
Default value: sigmoid,tanh,tanh
-
Required: no
-
Parameter name: activations_alpha, activations_beta
-
Description: activations_alpha, activations_beta are parameters of the activation functions
-
Range of values: list of floats
-
Type: float[]
-
Default value: None
-
Required: no
-
Parameter name: clip
-
Description: clip specifies the bound C; tensor values are clipped to the range [-C, C] before activations are applied
-
Range of values: positive float value
-
Type: float
-
Default value: None
-
Required: no
Inputs
-
1: X - 2D ([batch_size, input_size]) input data. Required.
-
2: Hi - 2D ([batch_size, hidden_size]) input hidden state data. Required.
-
3: Ci - 2D ([batch_size, hidden_size]) input cell state data. Required.
Outputs
-
1: Ho - 2D ([batch_size, hidden_size]) output hidden state.
-
2: Co - 2D ([batch_size, hidden_size]) output cell state.
Mathematical Formulation
Formula:
* - matrix mult
(.) - eltwise mult
[,] - concatenation
sigm - 1/(1 + e^{-x})
tanh - (e^{2x} - 1)/(e^{2x} + 1)
f = sigm(Wf*[Hi, X] + Bf)
i = sigm(Wi*[Hi, X] + Bi)
c = tanh(Wc*[Hi, X] + Bc)
o = sigm(Wo*[Hi, X] + Bo)
Co = f (.) Ci + i (.) c
Ho = o (.) tanh(Co)
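The formula above can be traced step by step with a plain Python sketch for a single sample (batch_size equal to 1). The names `lstm_cell`, `gate`, and `matvec`, as well as the dictionary layout of the weights and biases, are hypothetical; the sketch uses the default activations and does not cover clip or custom activations.

```python
import math


def sigm(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a [rows x cols] matrix by a vector of length cols."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gate(weights, bias, hx, activation):
    """Compute activation(W * [Hi, X] + B) for one gate."""
    return [activation(z + b) for z, b in zip(matvec(weights, hx), bias)]


def lstm_cell(x, hi, ci, w, b):
    """One step for a single sample; w and b are dicts keyed by 'f', 'i', 'c', 'o'."""
    hx = list(hi) + list(x)                  # [Hi, X] concatenation
    f = gate(w["f"], b["f"], hx, sigm)
    i = gate(w["i"], b["i"], hx, sigm)
    c = gate(w["c"], b["c"], hx, math.tanh)
    o = gate(w["o"], b["o"], hx, sigm)
    co = [fk * cp + ik * ck for fk, cp, ik, ck in zip(f, ci, i, c)]  # Co = f (.) Ci + i (.) c
    ho = [ok * math.tanh(cv) for ok, cv in zip(o, co)]               # Ho = o (.) tanh(Co)
    return ho, co


# hidden_size = 2, input_size = 3, all weights and biases set to zero.
zero_w = [[0.0] * 5 for _ in range(2)]
w = {k: zero_w for k in "fico"}
b = {k: [0.0, 0.0] for k in "fico"}
ho, co = lstm_cell([1.0, 2.0, 3.0], [0.1, 0.2], [1.0, -1.0], w, b)
print(co)  # [0.5, -0.5]
```

With zero weights every sigmoid gate equals 0.5 and the candidate c equals 0, so the new cell state is half of the previous one.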
Example
<layer ... type="LSTMCell" ... >
<input> ... </input>
<output> ... </output>
</layer>
Memory Layer
Name: Memory
Category: Layer
Short description: Memory layer represents a delay layer in LSTM terminology. To read more about LSTM topologies, refer to this link.
Detailed description: Memory layer saves state between two infer requests. In the topology, it is a single layer; however, in the Intermediate Representation, it is always represented as a pair of Memory layers. One of these layers does not have outputs and the other does not have inputs (in terms of the Intermediate Representation).
Parameters: Memory layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: id
-
Description: id is the id of the pair of Memory layers. For example, id equals r_27-28 means that layers with id 27 and 28 are in one pair.
-
Range of values: positive integer number
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: index
-
Description: index represents if the given layer is input or output. For example, index equal 0 means this layer is output one.
-
Range of values:
- 0 - current layer is output one
- 1 - current layer is input one
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: size
-
Description: size represents the size of the group. For example, size equals 2 means this group is a pair.
-
Range of values: only 2 is supported
-
Type: int
-
Default value: 1
-
Required: yes
Mathematical Formulation
Memory saves data from the input blob.
Example
<layer ... type="Memory" ... >
<data id="r_27-28" index="0" size="2" />
<input> ... </input>
<output> ... </output>
</layer>
MVN Layer
Name: MVN
Category: Normalization
Short description: Reference
Parameters: MVN layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: across_channels
-
Description: across_channels is a flag that denotes if mean values are shared across channels. For example, across_channels equal 0 means that mean values are not shared across channels.
-
Range of values:
- 0 - mean values are not shared across channels
- 1 - mean values are shared across channels
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: normalize_variance
-
Description: normalize_variance is a flag that denotes whether to perform variance normalization.
-
Range of values:
- 0 - variance normalization is not performed
- 1 - variance normalization is performed
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: eps
-
Description: eps is the number added to the variance to avoid division by zero when normalizing the value. For example, eps equal to 0.001 means that 0.001 is added to the variance.
-
Range of values: floating point positive number
-
Type: float
-
Default value: 1
-
Required: yes
Inputs
-
1: 4D or 5D input blob. Required.
Mathematical Formulation
MVN subtracts mean from the input blob:
If normalize_variance is set to 1, the output blob is divided by variance:
Example
<layer ... type="MVN">
<data across_channels="1" eps="9.999999717180685e-10" normalize_variance="1"/>
<input>
...
</input>
<output>
...
</output>
</layer>
Norm Layer
Name: Norm
Category: Normalization
Short description: Reference
Detailed description: Reference
Parameters: Norm layer parameters should be specified in the data
node, which is a child of the layer node.
-
Parameter name: alpha
-
Description: alpha represents the scaling parameter for the normalizing sum. For example, alpha equal 0.0001 means that the normalizing sum is multiplied by 0.0001.
-
Range of values: floating point positive number
-
Type: float
-
Default value: 1
-
Required: yes
-
Parameter name: beta
-
Description: beta represents the exponent for the normalizing sum. For example, beta equal 0.75 means that the normalizing sum is raised to the power of 0.75.
-
Range of values: floating point positive number
-
Type: float
-
Default value: 1
-
Required: yes
-
Parameter name: region
-
Description: region represents strategy of local regions extension. For example, region equal across means that the normalizing sum is performed over adjacent channels.
-
Range of values:
-
across - normalizing sum is performed over adjacent channels
-
same - normalizing sum is performed over nearby spatial locations
-
Type: string
-
Default value: 1
-
Required: yes
-
Parameter name: local-size
-
Description: local-size represents the side length of the region to be used for the normalization sum or number of channels depending on the strategy specified in the region parameter. For example, local-size equal 5 for the across strategy means application of sum across 5 adjacent channels.
-
Range of values: positive integer
-
Type: int
-
Default value: 1
-
Required: yes
Inputs
-
1: 4D input blob. Required.
Mathematical Formulation
Where is the size of each local region.
Example
<layer ... type="Norm" ... >
<data alpha="9.9999997e-05" beta="0.75" local-size="5" region="across"/>
<input> ... </input>
<output> ... </output>
</layer>
Normalize Layer
Name: Normalize
Category: Normalization
Short description: Normalize layer performs l-p normalization of the input blob.
Parameters: Normalize layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: across_spatial
-
Description: across_spatial is a flag that denotes if normalization is performed over CHW or HW. For example, across_spatial equals 0 means that normalization is not shared across channels.
-
Range of values:
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: channel_shared
-
Description: channel_shared is a flag that denotes if scale parameters are shared across channels. For example, channel_shared equal 0 means that scale parameters are not shared across channels.
-
Range of values:
- 0 - scale parameters are not shared across channels
- 1 - not supported
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: eps
-
Description: eps is the epsilon used to avoid division by zero when normalizing the value. For example, eps equals 0.001 means that 0.001 is used if all the values in normalization are equal to zero.
-
Range of values: positive floating point number
-
Type: float
-
Default value: 1
-
Required: yes
Inputs
-
1: 2D, 3D or 4D input blob. Required.
Mathematical Formulation
Example
<layer ... type="Normalize" ... >
<data across_spatial="0" channel_shared="0" eps="0.000000"/>
<input> ... </input>
<output> ... </output>
</layer>
Pad Layer
Name: Pad
Category: Layer
Short description: Pad layer extends an input blob on edges. New element values are generated based on the Pad layer parameters described below.
Parameters: Pad layer parameters should be specified in the data
section, which is placed as a child of the layer node. The parameters specify the number of elements to be added along each axis and the rule by which new element values are generated: for example, whether they are filled with a given constant or generated based on the input blob content.
-
Parameter name: pads_begin
-
Description: A number of padding elements at the beginning of each axis. Required.
-
Range of values: A list of non-negative integers. The length of the list must be equal to the number of dimensions in the input blob.
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: pads_end
-
Description: A number of padding elements at the end of each axis. Required.
-
Range of values: A list of non-negative integers. The length of the list must be equal to the number of dimensions in the input blob.
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: pad_mode
-
Description: A method used to generate new element values. Required.
-
Range of values: Name of the method in string format:
-
constant: Padded values are equal to the value of the pad_value layer parameter.
-
edge: Padded values are copied from the respective edge of the input blob.
-
reflect: Padded values are a reflection of the input blob; values on the edges are not duplicated. pads_begin[D] and pads_end[D] must be not greater than input.shape[D] - 1 for any valid D.
-
symmetric: Padded values are symmetrically added from the input blob. This method is similar to reflect, but values on the edges are duplicated. Refer to the examples below for more details. pads_begin[D] and pads_end[D] must be not greater than input.shape[D] for any valid D.
-
Type: string
-
Default value: 1
-
Required: yes
-
Parameter name: pad_value
-
Description: Applicable for pad_mode = "constant" only. All new elements are filled with this value. Optional, default value is 0.
-
Range of values: An arbitrary floating point value.
-
Type: float
-
Default value: 0
-
Required: no
Inputs
-
1: Multidimensional input blob. Required.
Outputs
-
1: Multidimensional output blob with dimensions pads_begin[D] + input.shape[D] + pads_end[D] for each D from 0 to len(input.shape) - 1.
pad_mode Examples
The following examples illustrate how the output blob is generated by the Pad layer for a given input blob:
INPUT =
[[ 1 2 3 4 ]
[ 5 6 7 8 ]
[ 9 10 11 12 ]]
with the following parameters:
pads_begin = [0, 1]
pads_end = [2, 3]
depending on the pad_mode.
-
pad_mode = "constant":
OUTPUT =
[[ 0 1 2 3 4 0 0 0 ]
[ 0 5 6 7 8 0 0 0 ]
[ 0 9 10 11 12 0 0 0 ]
[ 0 0 0 0 0 0 0 0 ]
[ 0 0 0 0 0 0 0 0 ]]
-
pad_mode = "edge":
OUTPUT =
[[ 1 1 2 3 4 4 4 4 ]
[ 5 5 6 7 8 8 8 8 ]
[ 9 9 10 11 12 12 12 12 ]
[ 9 9 10 11 12 12 12 12 ]
[ 9 9 10 11 12 12 12 12 ]]
-
pad_mode = "reflect":
OUTPUT =
[[ 2 1 2 3 4 3 2 1 ]
[ 6 5 6 7 8 7 6 5 ]
[ 10 9 10 11 12 11 10 9 ]
[ 6 5 6 7 8 7 6 5 ]
[ 2 1 2 3 4 3 2 1 ]]
-
pad_mode = "symmetric":
OUTPUT =
[[ 1 1 2 3 4 4 3 2 ]
[ 5 5 6 7 8 8 7 6 ]
[ 9 9 10 11 12 12 11 10 ]
[ 9 9 10 11 12 12 11 10 ]
[ 5 5 6 7 8 8 7 6 ]]
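The four outputs above can be reproduced with a small Python sketch restricted to 2D blobs. The helpers `source_index` and `pad_2d` are hypothetical names; the index arithmetic is one way to express the mode descriptions, not a reference implementation.

```python
def source_index(pos, size, mode):
    """Map an output position (relative to the input start) to an input index.

    Returns None when the element must be taken from pad_value.
    """
    if 0 <= pos < size:
        return pos
    if mode == "constant":
        return None
    if mode == "edge":
        return 0 if pos < 0 else size - 1
    if mode == "reflect":      # the edge value is not duplicated
        return -pos if pos < 0 else 2 * (size - 1) - pos
    if mode == "symmetric":    # the edge value is duplicated
        return -pos - 1 if pos < 0 else 2 * size - 1 - pos
    raise ValueError("unknown pad_mode: " + mode)


def pad_2d(blob, pads_begin, pads_end, pad_mode, pad_value=0):
    """Pad a 2D nested list along both axes."""
    rows, cols = len(blob), len(blob[0])
    out = []
    for r in range(-pads_begin[0], rows + pads_end[0]):
        src_r = source_index(r, rows, pad_mode)
        row = []
        for c in range(-pads_begin[1], cols + pads_end[1]):
            src_c = source_index(c, cols, pad_mode)
            if src_r is None or src_c is None:
                row.append(pad_value)
            else:
                row.append(blob[src_r][src_c])
        out.append(row)
    return out


INPUT = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
print(pad_2d(INPUT, [0, 1], [2, 3], "reflect")[0])    # [2, 1, 2, 3, 4, 3, 2, 1]
print(pad_2d(INPUT, [0, 1], [2, 3], "symmetric")[4])  # [5, 5, 6, 7, 8, 8, 7, 6]
```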
Example
<layer ... type="Pad" ...>
<data pads_begin="0,5,2,1" pads_end="1,0,3,7" pad_mode="constant" pad_value="666.0"/>
<input>
<port id="0">
<dim>1</dim>
<dim>3</dim>
<dim>32</dim>
<dim>40</dim>
</port>
</input>
<output>
<port id="2">
<dim>2</dim>
<dim>8</dim>
<dim>37</dim>
<dim>48</dim>
</port>
</output>
</layer>
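The output dimensions in this example follow from the rule pads_begin[D] + input.shape[D] + pads_end[D]. A minimal Python check, using the hypothetical helper name `pad_output_shape`:

```python
def pad_output_shape(input_shape, pads_begin, pads_end):
    """Output dimension D is pads_begin[D] + input.shape[D] + pads_end[D]."""
    if not len(input_shape) == len(pads_begin) == len(pads_end):
        raise ValueError("pads must have one entry per input dimension")
    return [b + d + e for b, d, e in zip(pads_begin, input_shape, pads_end)]


print(pad_output_shape([1, 3, 32, 40], [0, 5, 2, 1], [1, 0, 3, 7]))  # [2, 8, 37, 48]
```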
Permute Layer
Name: Permute
Category: Layer
Short description: Permute layer performs reordering of input blob dimensions.
Detailed description: Reference
Parameters: Permute layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: order
-
Description: order is the list of input dimension indexes that defines the output blob. For example, order equal to 0,2,3,1 means that the output blob has the following dimensions: first dimension from the input blob, third dimension from the input blob, fourth dimension from the input blob, second dimension from the input blob.
-
Range of values: list of non-negative integer numbers separated by commas
-
Type: int[]
-
Default value: 1
-
Required: yes
Inputs:
-
1: Multidimensional input blob. Required.
Mathematical Formulation
Permute layer reorders the dimensions of the input blob. Source indexes and destination indexes are bound by the formula:
Example
<layer ... type="Permute" ... >
<data order="0,2,3,1"/>
<input> ... </input>
<output> ... </output>
</layer>
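The effect of order on the blob shape can be sketched in a few lines of Python. The helper `permute_shape` is a hypothetical name, and the input shape 1x3x224x224 is chosen only for illustration since the example above elides the port dimensions.

```python
def permute_shape(input_shape, order):
    """Output dimension k is taken from input dimension order[k]."""
    if sorted(order) != list(range(len(input_shape))):
        raise ValueError("order must be a permutation of the input axes")
    return [input_shape[axis] for axis in order]


print(permute_shape([1, 3, 224, 224], [0, 2, 3, 1]))  # [1, 224, 224, 3]
```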
Pooling Layer
Name: Pooling
Category: Pool
Short description: Reference
Detailed description: Reference
Parameters: Pooling layer parameters are specified in the data
node, which is a child of the layer node.
-
Parameter name: strides
-
Description: strides is a distance (in pixels) to slide the window on the feature map over the (z, y, x) axes for 3D poolings and (y, x) axes for 2D poolings. For example, strides equal to "4,2,1" means sliding the window 4 pixels at a time over the depth dimension, 2 over the height dimension, and 1 over the width dimension.
-
Range of values: integer values starting from 0
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: pads_begin
-
Description: pads_begin is a number of pixels to add to the beginning along each axis. For example, pads_begin equal "1,2" means adding 1 pixel to the top of the input and 2 to the left of the input.
-
Range of values: integer values starting from 0
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: pads_end
-
Description: pads_end is a number of pixels to add to the ending along each axis. For example, pads_end equal "1,2" means adding 1 pixel to the bottom of the input and 2 to the right of the input.
-
Range of values: integer values starting from 0
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: kernel
-
Description: kernel is a size of each filter. For example, kernel equal (2, 3) means that each filter has height equal to 2 and width equal to 3.
-
Range of values: integer values starting from 1
-
Type: int[]
-
Default value: 1
-
Required: yes
-
Parameter name: pool-method
-
Description: pool-method is a type of pooling strategy for values.
-
Range of values:
-
max - chooses the biggest value in a feature map for each window position
-
avg - takes the average value in a feature map for each window position
-
Type: string
-
Default value: 1
-
Required: yes
-
Parameter name: exclude-pad
-
Description: exclude-pad is a type of pooling strategy for values in the padding area. For example, if exclude-pad is "true", zero-values in the padding are not used.
-
Range of values: "true" or "false"
-
Type: string
-
Default value: 1
-
Required: yes
-
Parameter name: rounding_type
-
Description: rounding_type is a type of rounding to be applied.
-
Range of values:
-
Type: string
-
Default value: floor
-
Parameter name: auto_pad
-
Description: auto_pad specifies how the padding is calculated. Possible values:
- Not specified: use explicit padding values.
-
same_upper (same_lower) - the input is padded to match the output size. If the padding value is odd, the extra padding is added at the end (at the beginning).
-
valid - do not use padding.
-
Type: string
-
Default value: None
-
Required: no
Inputs:
-
1: 4D or 5D input blob. Required.
Mathematical Formulation
- For max pool-method:
- For avg pool-method:
Example
<layer ... type="Pooling" ... >
<data auto_pad="same_upper" exclude-pad="true" kernel="3,3" pads_begin="0,0" pads_end="1,1" pool-method="max" strides="2,2"/>
<input> ... </input>
<output> ... </output>
</layer>
Power Layer
Name: Power
Category: Layer
Short description: Power layer computes the output as (shift + scale * x) ^ power for each input element x.
Parameters: Power layer parameters should be specified as the data
node, which is a child of the layer node.
Inputs:
-
1: Multidimensional input blob. Required.
Mathematical Formulation
Example
<layer ... type="Power" ... >
<data power="2" scale="0.1" shift="5"/>
<input> ... </input>
<output> ... </output>
</layer>
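The formula from the short description, (shift + scale * x) ^ power, can be evaluated with a minimal Python sketch. The function name `power_layer` and its default argument values are assumptions made for illustration; the sketch assumes that the base stays non-negative when power is fractional.

```python
def power_layer(values, power=1.0, scale=1.0, shift=0.0):
    """Compute (shift + scale * x) ^ power for each input element x."""
    return [(shift + scale * x) ** power for x in values]


# Parameters taken from the example above: power=2, scale=0.1, shift=5.
print(power_layer([0.0, 10.0], power=2, scale=0.1, shift=5))  # [25.0, 36.0]
```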
PReLU Layer
Name: PReLU
Category: Activation
Short description: PReLU is the Parametric Rectified Linear Unit. The difference from ReLU is that negative slopes can vary across channels.
Parameters: PReLU layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: channel_shared
-
Description: channel_shared specifies whether negative slope is shared across channels or not.
-
Range of values: 0 or 1
-
Type: int
-
Default value: 0
-
Required: yes
Inputs:
-
1: 4D or 5D input blob. Required.
Mathematical Formulation
PReLU accepts one input with four dimensions. The produced blob has the same dimensions as input. PReLU does the following with the input blob:
$$out_{i} = \max(0, x_{i}) + w_{i} \cdot \min(0, x_{i})$$
where $w_{i}$ is from weights blob.
Example
<layer ... type="PReLU" ... >
<data channel_shared="1"/>
<input> ... </input>
<output> ... </output>
</layer>
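A minimal Python sketch of the computation, assuming the usual PReLU definition out = max(0, x) + w * min(0, x) with w taken from the weights blob. The function name `prelu` and the list-per-channel data layout are hypothetical.

```python
def prelu(channels, slopes, channel_shared=0):
    """channels: one list of values per channel; slopes: the negative slopes."""
    out = []
    for index, values in enumerate(channels):
        # A single slope is used for all channels when channel_shared is set.
        w = slopes[0] if channel_shared else slopes[index]
        out.append([max(0.0, x) + w * min(0.0, x) for x in values])
    return out


print(prelu([[-2.0, 3.0], [-4.0, 1.0]], [0.5, 0.25]))  # [[-1.0, 3.0], [-1.0, 1.0]]
```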
PriorBox Layer
Name: PriorBox
Category: Layer
Short description: PriorBox layer generates prior boxes of specified sizes and aspect ratios across all dimensions.
Parameters: PriorBox layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: min_size (max_size)
-
Description: min_size (max_size) is the minimum (maximum) box size (in pixels). For example, min_size (max_size) equal 15 means that the minimum (maximum) box size is 15.
-
Range of values: positive floating point numbers
-
Type: float[]
-
Default value: []
-
Required: yes
-
Parameter name: aspect_ratio
-
Description: aspect_ratio is a list of aspect ratios. Duplicate values are ignored. For example, aspect_ratio equal to "2.0,3.0" means that for the first box aspect_ratio is equal to 2.0 and for the second box it is 3.0.
-
Range of values: set of positive floating point numbers
-
Type: float[]
-
Default value: []
-
Required: yes
-
Parameter name: flip
-
Description: flip is a flag that denotes whether each aspect_ratio is duplicated and flipped. For example, flip equal to 1 and aspect_ratio equal to "4.0,2.0" mean that aspect_ratio is equal to "4.0,2.0,0.25,0.5".
-
Range of values:
- 0 - each aspect_ratio is not flipped
- 1 - each aspect_ratio is flipped
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: clip
-
Description: clip is a flag that denotes if each value in the output blob should be clipped to [0,1] interval.
-
Range of values:
- 0 - clipping is not performed
- 1 - each value in the output blob is clipped to [0,1] interval.
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: step
-
Description: step is a distance between box centers. For example, step equal to 85 means that the distance between neighboring prior box centers is 85.
-
Range of values: floating point non-negative number
-
Type: float
-
Default value: 0
-
Required: yes
-
Parameter name: offset
-
Description: offset is a shift of the box relative to the top-left corner. For example, offset equal to 85 means that the prior box centers are shifted by 85.
-
Range of values: floating point non-negative number
-
Type: float
-
Default value: None
-
Required: yes
-
Parameter name: variance
-
Description: variance denotes a variance of adjusting bounding boxes. The parameter could contain 0, 1 or 4 elements.
-
Range of values: floating point positive numbers
-
Type: float[]
-
Default value: []
-
Required: yes
-
Parameter name: scale_all_sizes
-
Description: scale_all_sizes is a flag that denotes type of inference. For example, scale_all_sizes equals 0 means that the PriorBox layer is inferred in MXNet-like manner. In particular, max_size parameter is ignored.
-
Range of values:
- 0 - max_size is ignored
- 1 - max_size is used
-
Type: int
-
Default value: 1
-
Required: yes
Inputs:
-
1: 4D input blob. Used to get height and width only. Required.
-
2: 4D input blob. Used to get image height and image width only. Required.
Mathematical Formulation:
PriorBox computes coordinates of prior boxes as follows:
- First calculates center_x and center_y of prior box:
- Then, for each calculates coordinates of prior boxes:
Example
<layer ... type="PriorBox" ... >
<data step="64.000000" min_size="162.000000" max_size="213.000000" offset="0.500000" flip="1" clip="0" aspect_ratio="2.000000,3.000000" variance="0.100000,0.100000,0.200000,0.200000" />
<input> ... </input>
<output> ... </output>
</layer>
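The interaction between aspect_ratio and flip described in the parameter list can be sketched in Python. The helper `expand_aspect_ratios` is a hypothetical name; the ordering of the result follows the "4.0,2.0,0.25,0.5" example from the flip description, and an actual implementation may order the values differently or add further ratios.

```python
def expand_aspect_ratios(aspect_ratio, flip):
    """Drop duplicate ratios, then append the reciprocal of each one if flip is set."""
    unique = []
    for ratio in aspect_ratio:
        if ratio not in unique:      # duplicate values are ignored
            unique.append(ratio)
    if not flip:
        return unique
    flipped = [1.0 / ratio for ratio in unique if 1.0 / ratio not in unique]
    return unique + flipped


print(expand_aspect_ratios([4.0, 2.0], flip=1))  # [4.0, 2.0, 0.25, 0.5]
```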
PriorBoxClustered Layer
Name: PriorBoxClustered
Category: Layer
Short description: PriorBoxClustered layer generates prior boxes of specified sizes normalized to the input image size.
Parameters: PriorBoxClustered layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: width (height)
-
Description: width (height) specifies desired boxes widths (heights) in pixels.
-
Range of values: floating point positive numbers
-
Type: float[]
-
Default value: 1.0
-
Required: yes
-
Parameter name: clip
-
Description: clip is a flag that denotes if each value in the output blob should be clipped within [0,1].
-
Range of values:
- 0 - clipping is not performed
- 1 - each value in the output blob is within [0,1]
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: step (step_w, step_h)
-
Description: step (step_w, step_h) is a distance between box centers. For example, step equal to 85 means that the distance between neighboring prior box centers is 85. If both step_h and step_w are 0, they are set to the value of step. If they are still 0 after that, they are calculated as the input image width (height) divided by the first input width (height).
-
Range of values: floating point positive number
-
Type: float
-
Default value: 0.0
-
Required: yes
-
Parameter name: offset
-
Description: offset is a shift of the box relative to the top-left corner. For example, offset equal to 85 means that the prior box centers are shifted by 85.
-
Range of values: floating point positive number
-
Type: float
-
Default value: None
-
Required: yes
-
Parameter name: variance
-
Description: variance denotes a variance of adjusting bounding boxes.
-
Range of values: floating point positive numbers
-
Type: float[]
-
Default value: []
-
Required: yes
-
Parameter name: img_h (img_w)
-
Description: img_h (img_w) specifies height (width) of input image. These parameters are calculated as second input height(width) unless provided explicitly.
-
Range of values: floating point positive number
-
Type: float
-
Default value: 1
-
Required: yes
Inputs:
-
1: 4D input blob. Used to get height and width only. Required.
-
2: 4D input blob. Used to get image height and image width only. Required.
Mathematical Formulation
PriorBoxClustered computes coordinates of prior boxes as follows:
- Calculates the center_x and center_y of prior box:
- For each calculates the prior boxes coordinates:
If clip is defined, the coordinates of prior boxes are recalculated with the formula:
Example
<layer ... type="PriorBoxClustered">
<data clip="0" flip="0" height="44.0,10.0,30.0,19.0,94.0,32.0,61.0,53.0,17.0" offset="0.5" step="16.0" variance="0.1,0.1,0.2,0.2"
width="86.0,13.0,57.0,39.0,68.0,34.0,142.0,50.0,23.0"/>
<input>
...
</input>
<output>
...
</output>
</layer>
Proposal Layer
Name: Proposal
Category: Layer
Short description: Proposal layer filters bounding boxes and outputs only those with the highest prediction confidence.
Parameters: Proposal layer parameters should be specified as the data
node, which is a child of the layer node. The layer has three inputs: blob with probabilities whether particular bounding box corresponds to background and foreground, blob with logits for each of the bounding boxes, blob with input image size: [image_height, image_width, scale_height_and_width] or [image_height, image_width, scale_height, scale_width].
-
Parameter name: base_size
-
Description: base_size is the size of the anchor to which scale and ratio parameters are applied.
-
Range of values: positive integer number
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: pre_nms_topn (post_nms_topn)
-
Description: pre_nms_topn (post_nms_topn) is the quantity of bounding boxes before (after) applying NMS operation. For example, pre_nms_topn (post_nms_topn) equal 15 means that at most 15 boxes are kept before (after) NMS.
-
Range of values: positive integer number
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: nms_thresh
-
Description: nms_thresh is the minimum value of the proposal to be taken into consideration. For example, nms_thresh equal 0.5 means that all boxes with prediction probability less than 0.5 are filtered out.
-
Range of values: floating point positive number
-
Type: float
-
Default value: 1
-
Required: yes
-
Parameter name: feat_stride
-
Description: feat_stride is the step size to slide over boxes (in pixels). For example, feat_stride equal 16 means that all boxes are analyzed with the slide 16.
-
Range of values: positive integer number
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: min_size
-
Description: min_size is the minimum size of box to be taken into consideration. For example, min_size equal 35 means that all boxes with box size less than 35 are filtered out.
-
Range of values: positive integer number
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: ratio
-
Description: ratio is the ratios for anchor generation.
-
Range of values: array of float numbers
-
Type: float[]
-
Default value: 1
-
Required: yes
-
Parameter name: scale
-
Description: scale is the scales for anchor generation.
-
Range of values: array of float numbers
-
Type: float[]
-
Default value: 1
-
Required: yes
-
Parameter name: clip_before_nms
-
Description: clip_before_nms is a flag that denotes whether to clip bounding boxes before non-maximum suppression.
-
Range of values: 0 or 1
-
Type: int
-
Default value: 1
-
Required: no
-
Parameter name: clip_after_nms
-
Description: clip_after_nms is a flag that denotes whether to clip bounding boxes after non-maximum suppression.
-
Range of values: 0 or 1
-
Type: int
-
Default value: 0
-
Required: no
-
Parameter name: normalize
-
Description: normalize is a flag that denotes whether to normalize output boxes to the [0,1] interval.
-
Range of values: 0 or 1
-
Type: int
-
Default value: 0
-
Required: no
-
Parameter name: box_size_scale
-
Description: box_size_scale specifies the scale factor applied to logits of box sizes before decoding.
-
Range of values: positive floating point number
-
Type: float
-
Default value: 1.0
-
Required: no
-
Parameter name: box_coordinate_scale
-
Description: box_coordinate_scale specifies the scale factor applied to logits of box coordinates before decoding.
-
Range of values: positive floating point number
-
Type: float
-
Default value: 1.0
-
Required: no
-
Parameter name: framework
-
Description: framework affects how the box coordinates are calculated.
-
Range of values:
- "" (empty string) - calculate box coordinates like in Caffe*
-
tensorflow - calculate box coordinates like in the TensorFlow* Object Detection API models
-
Type: string
-
Default value: "" (empty string)
-
Required: no
Mathematical Formulation
Proposal layer accepts three inputs: two 4D blobs and a 1D blob with the image size. The produced blob has two dimensions; the first one equals batch_size * post_nms_topn. Proposal layer does the following with the input blobs:
- Generates initial anchor boxes. The top-left corner of all boxes is at (0, 0). Width and height of boxes are calculated from base_size with scale and ratio parameters
- For each point in the first input blob:
- pins anchor boxes to the image according to the second input blob that contains four deltas for each box: for x and y of center, for width and for height
- finds out score in the first input blob
- Filters out boxes with size less than min_size
- Sorts all proposals (box, score) by score from highest to lowest
- Takes top pre_nms_topn proposals
- Calculates intersections for boxes and filter out all with
- Takes top post_nms_topn proposals
- Returns top proposals
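The filtering, sorting, and suppression steps above can be sketched in plain Python. This is a minimal illustration, not the layer's implementation. The names `iou` and `filter_proposals` are hypothetical, boxes are assumed to be (x_min, y_min, x_max, y_max) tuples, box size is assumed to be the smaller of width and height, and the suppression criterion is assumed to be the common one: a box is dropped when its intersection-over-union with an already kept box exceeds `overlap_thresh`.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_proposals(boxes, scores, min_size, pre_nms_topn, post_nms_topn, overlap_thresh):
    # Drop boxes that are smaller than min_size (assumption: min side length).
    proposals = [(b, s) for b, s in zip(boxes, scores)
                 if min(b[2] - b[0], b[3] - b[1]) >= min_size]
    # Sort by score from highest to lowest and keep the top pre_nms_topn.
    proposals.sort(key=lambda p: p[1], reverse=True)
    proposals = proposals[:pre_nms_topn]
    # Greedy suppression (assumed criterion): keep a box only if it does not
    # overlap any already kept box by more than overlap_thresh.
    kept = []
    for box, score in proposals:
        if all(iou(box, k[0]) <= overlap_thresh for k in kept):
            kept.append((box, score))
        if len(kept) == post_nms_topn:
            break
    return kept
```

Anchor generation and box decoding from the second input are left out on purpose, because they depend on the framework parameter.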
Inputs:
-
1: 4D input blob with class prediction scores. Required.
-
2: 4D input blob with box logits. Required.
-
3: 1D input blob with 3 or 4 elements: [image height, image width, scale for image height/width OR scale for image height and scale for image width]. Required.
Example
<layer ... type="Proposal" ... >
<data base_size="16" feat_stride="16" min_size="16" nms_thresh="0.6" post_nms_topn="200" pre_nms_topn="6000"
ratio="2.67" scale="4.0,6.0,9.0,16.0,24.0,32.0"/>
<input> ... </input>
<output> ... </output>
</layer>
PSROIPooling Layer
Back to top
Name: PSROIPooling
Category: Pool
Short description: PSROIPooling layer computes position-sensitive pooling on regions of interest specified by input.
Detailed description: Reference
Parameters: PSROIPooling layer parameters should be specified as the data
node, which is a child of the layer node. PSROIPooling layer takes two input blobs: feature maps and regions of interest (box coordinates). The latter are specified as 5-element tuples: [batch_id, x_1, y_1, x_2, y_2]. ROI coordinates are specified in absolute values for the "average" mode and in normalized values (to the [0,1] interval) for the "bilinear" mode.
-
Parameter name: output_dim
-
Description: pooled output channel number
-
Range of values: positive integer number
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: group_size
-
Description: number of groups to encode position-sensitive score maps. Used for "average" mode only.
-
Range of values: positive integer number
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: spatial_scale
-
Description: multiplicative spatial scale factor to translate ROI coordinates from their input scale to the scale used when pooling.
-
Range of values: positive floating point value
-
Type: float
-
Default value: 1
-
Required: yes
-
Parameter name: mode
-
Description: mode specifies mode for pooling.
-
Range of values:
-
average - perform average pooling
-
bilinear - perform pooling with bilinear interpolation
-
Type: string
-
Default value: average
-
Required: yes
-
Parameter name: spatial_bins_x (spatial_bins_y)
-
Description: spatial_bins_x (spatial_bins_y) specifies numbers of bins to divide the input feature maps over width (height). Used for "bilinear" mode only.
-
Range of values: positive integer number
-
Type: int
-
Default value: 1
-
Required: yes
Inputs:
-
1: 4D input blob with feature maps. Required.
-
2: 2D input blob describing box consisting of 5 element tuples: [batch_id, x_1, y_1, x_2, y_2]. Required.
Example
<layer ... type="PSROIPooling" ... >
<data group_size="6" mode="bilinear" output_dim="360" spatial_bins_x="3" spatial_bins_y="3" spatial_scale="1"/>
<input>
<port id="0">
<dim>1</dim>
<dim>3240</dim>
<dim>38</dim>
<dim>38</dim>
</port>
<port id="1">
<dim>100</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2">
<dim>100</dim>
<dim>360</dim>
<dim>6</dim>
<dim>6</dim>
</port>
</output>
</layer>
Quantize Layer
Back to top
Name: Quantize
Category: Layer
Short description: Element-wise linear quantization of floating point input values into a discrete set of floating point values.
Detailed description: Input and output ranges, as well as the number of quantization levels, are specified by dedicated inputs and attributes. Limits can differ for each element or group of elements (channels) of the input blob; otherwise, one limit applies to all elements. This depends on the shape of the inputs that specify the limits and on the regular broadcasting rules applied to input blobs. The output of the operator is a floating point number of the same type as the input blob. In general, four values specify quantization for each element: input_low, input_high, output_low, output_high. Values input_low and input_high specify the input range of quantization. All input values outside this range are clipped to the range before actual quantization. Values output_low and output_high define the minimum and maximum quantized values at the output.
Parameters: Quantize layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: levels
-
Description: levels denotes the number of quantization levels
-
Range of values: integer value greater or equal to 2
-
Type: int
-
Default value: None
-
Required: yes
Inputs:
-
1: X - multidimensional input blob to quantize. Required.
-
2: input_low - minimum limit for input value. The shape should be broadcastable to shape of X. Required.
-
3: input_high - maximum limit for input value. Can be the same as input_low for binarization. The shape should be broadcastable to shape of X. Required.
-
4: output_low - minimum quantized value. The shape should be broadcastable to shape of X. Required.
-
5: output_high - maximum quantized value. The shape should be broadcastable to shape of X. Required.
Mathematical Formulation
Each element of the output is defined as the result of the following expression:
if x <= input_low:
    output = output_low
elif x > input_high:
    output = output_high
else:
    # input_low < x <= input_high
    output = round((x - input_low) / (input_high - input_low) * (levels-1)) / (levels-1) * (output_high - output_low) + output_low
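As a quick check of the expression above, here is a scalar sketch in Python. The function name `quantize` is hypothetical, broadcasting of the limit inputs is omitted, and Python's built-in `round` is used, which resolves exact ties to the nearest even integer; the rounding mode of the actual layer is not stated in this section.

```python
def quantize(x, input_low, input_high, output_low, output_high, levels):
    """Scalar version of the Quantize expression (no broadcasting)."""
    if x <= input_low:
        return output_low
    if x > input_high:
        return output_high
    # input_low < x <= input_high
    steps = levels - 1
    level = round((x - input_low) / (input_high - input_low) * steps)
    return level / steps * (output_high - output_low) + output_low


# With 5 levels on [0, 1], 0.3 snaps to the second level.
print(quantize(0.3, 0.0, 1.0, 0.0, 1.0, levels=5))  # 0.25
```

Because the first two branches cover every x when input_low equals input_high, the division in the last branch is never reached in the binarization case.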
Range Layer
Back to top
Name: Range
Category: Layer
Short description: Range layer generates a sequence of numbers according to input values.
Detailed description: Range layer generates a sequence of numbers starting from the value specified in the first input up to but not including the value in the second input, with a step equal to the value in the third input.
Parameters: Range layer has no parameters.
Inputs:
-
1: 0D blob (constant) with the start value of the range. Required.
-
2: 0D blob (constant) with the limit value of the range. Required.
-
3: 0D blob (constant) with the step value. Required.
Example
<layer ... type="Range">
<input>
<port id="0"/>
<port id="1"/>
<port id="2"/>
</input>
<output>
<port id="3">
<dim>10</dim>
</port>
</output>
</layer>
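The Range semantics (start included, limit excluded) can be sketched in Python as follows. The function name `range_layer` is hypothetical, and the handling of negative steps is an assumption that mirrors the usual half-open range convention.

```python
import math


def range_layer(start, limit, step):
    """Values start, start + step, ... up to but not including limit."""
    if step == 0:
        raise ValueError("step must be non-zero")
    # Number of elements in the half-open interval [start, limit).
    count = max(0, math.ceil((limit - start) / step))
    return [start + i * step for i in range(count)]


# start=0, limit=10, step=1 gives 10 values, matching the output
# dimension shown in the example above.
print(len(range_layer(0, 10, 1)))  # 10
```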
RegionYolo Layer
Back to top
Name: RegionYolo
Category: Layer
Short description: RegionYolo computes coordinates of regions with probability for each class.
Detailed description: Reference
Parameters: RegionYolo layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: coords
-
Description: coords is a number of coordinates for each region
-
Range of values: integer value
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: classes
-
Description: classes is a number of classes for each region
-
Range of values: integer value
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: num
-
Description: num is a number of regions
-
Range of values: integer value
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: do_softmax
-
Description: do_softmax is a flag that specifies the inference method and also affects how the number of regions is determined
-
Range of values:
-
0 - softmax is not performed
-
1 - softmax is performed
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: mask
-
Description: the length of mask specifies the number of regions, as the num parameter does. This parameter is used instead of num when do_softmax is equal to 0.
-
Range of values: integer values
-
Type: int[]
-
Default value: []
-
Required: yes
Inputs:
-
1: 4D input blob. Required.
Example
<layer ... type="RegionYolo" ... >
<data axis="1" classes="80" coords="4" do_softmax="0" end_axis="3" mask="0,1,2" num="9"/>
<input> ... </input>
<output> ... </output>
<weights .../>
</layer>
ReLU Layer
Back to top
Name: ReLU
Category: Activation
Short description: Reference
Detailed description: Reference
Parameters: ReLU layer parameters can be (not mandatory) specified in the data
node, which is a child of the layer node.
-
Parameter name: negative_slope
-
Description: negative_slope is a multiplier, which is used if the unit is not active (that is negative). For example, negative_slope equal 0.1 means that an inactive unit value would be multiplied by 0.1 and this is the Leaky ReLU. If negative_slope is equal to 0, this is the usual ReLU.
-
Range of values: floating point value starting from 0
-
Type: float
-
Default value: 1
-
Required: yes
Mathematical Formulation
Y = max(0, X) + negative_slope * min(0, X)
Inputs:
-
1: Multidimensional input blob. Required.
Example
<layer ... type="ReLU" ... >
<data negative_slope="0.100000"/>
<input> ... </input>
<output> ... </output>
</layer>
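The effect of negative_slope on the ReLU layer can be illustrated with a short Python sketch. The function name `relu` is hypothetical, and the expression is written from the parameter description above: positive values pass through, negative values are multiplied by negative_slope.

```python
def relu(values, negative_slope):
    """Element-wise ReLU; negative_slope=0 gives the usual ReLU."""
    return [max(0.0, v) + negative_slope * min(0.0, v) for v in values]


print(relu([-2.0, 0.0, 3.0], 0.1))  # [-0.2, 0.0, 3.0]
```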
ReorgYolo Layer
Back to top
Name: ReorgYolo
Category: Layer
Short description: ReorgYolo reorganizes input blob taking into account strides.
Detailed description: Reference
Parameters: ReorgYolo layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: stride
-
Description: stride is distance of cut throws in output blobs.
-
Range of values: integer values
-
Type: int[]
-
Default value: 1
-
Required: yes
Inputs:
-
1: 4D input blob. Required.
Example
<layer ... type="ReorgYolo" ... >
<data stride="1"/>
<input> ... </input>
<output> ... </output>
</layer>
Resample (Type 1) Layer
Back to top
Name: Resample
Category: Layer
Short description: Resample layer scales the input blob by the specified parameters.
Parameters: Resample layer parameters should be specified as the data
node, which is a child of the layer node. Resample Type 1 layer has one input blob containing image to resample.
-
Parameter name: type
-
Description: type parameter specifies type of blob interpolation.
-
Range of values:
-
caffe.ResampleParameter.LINEAR - linear blob interpolation
-
caffe.ResampleParameter.NEAREST - nearest-neighbor blob interpolation
-
Type: string
-
Default value: 1
-
Required: yes
-
Parameter name: antialias
-
Description: antialias is a flag that denotes whether to perform anti-aliasing.
-
Range of values:
- 0 - anti-aliasing is not performed
- 1 - anti-aliasing is performed
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: factor
-
Description: factor specifies scale factor for output height and width.
-
Range of values: positive integer number
-
Type: int
-
Default value: 1
-
Required: yes
Inputs:
-
1: 4D input blob. Required.
Example
<layer type="Resample">
<data antialias="0" factor="2" type="caffe.ResampleParameter.LINEAR"/>
<input>
<port id="0">
<dim>1</dim>
<dim>3</dim>
<dim>25</dim>
<dim>30</dim>
</port>
</input>
<output>
<port id="1">
<dim>1</dim>
<dim>3</dim>
<dim>50</dim>
<dim>60</dim>
</port>
</output>
</layer>
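To make the role of factor concrete, the following Python sketch upsamples a single 2D plane by an integer factor using pixel replication. It is only an illustration of nearest-neighbor scaling under stated assumptions: the function name `resample_nearest` is hypothetical, and the exact sampling grid used by the layer is not described in this section.

```python
def resample_nearest(plane, factor):
    """Repeat every pixel of a 2D list `factor` times along both axes."""
    out = []
    for row in plane:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


plane = [[0] * 30 for _ in range(25)]
scaled = resample_nearest(plane, 2)
print(len(scaled), len(scaled[0]))  # 50 60
```

The printed size matches the 25x30 to 50x60 spatial dimensions of the example above with factor equal to 2.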
Resample (Type 2) Layer
Back to top
Name: Resample
Category: Layer
Short description: Resample layer scales the input blob by the specified parameters.
Parameters: Resample layer parameters should be specified as the data
node, which is a child of the layer node. Resample Type 2 layer has two input blobs containing image to resample and output dimensions.
-
Parameter name: type
-
Description: type parameter specifies type of blob interpolation.
-
Range of values:
-
caffe.ResampleParameter.LINEAR - linear blob interpolation
-
caffe.ResampleParameter.NEAREST - nearest-neighbor blob interpolation
-
Type: string
-
Default value: 1
-
Required: yes
-
Parameter name: antialias
-
Description: antialias is a flag that denotes whether to perform anti-aliasing.
-
Range of values:
- 0 - anti-aliasing is not performed
- 1 - anti-aliasing is performed
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: factor
-
Description: factor is not used.
-
Range of values: positive integer number
-
Type: int
-
Default value: 1
-
Required: yes
Inputs:
-
1: 4D input blob. Required.
-
2: 1D blob describing output shape. Required.
Example
<layer type="Resample">
<data antialias="0" factor="1" type="caffe.ResampleParameter.LINEAR"/>
<input>
<port id="0">
<dim>1</dim>
<dim>3</dim>
<dim>25</dim>
<dim>30</dim>
</port>
<port id="1">
<dim>4</dim>
</port>
</input>
<output>
<port id="2">
<dim>1</dim>
<dim>3</dim>
<dim>50</dim>
<dim>60</dim>
</port>
</output>
</layer>
Reshape Layer
Back to top
Name: Reshape
Category: Layer
Short description: Reshape layer changes dimensions of the input blob according to the specified order. Input blob volume is equal to output blob volume, where volume is the product of dimensions.
Detailed description: Reference
Parameters: Reshape layer does not have parameters. Reshape layer takes two input blobs: the blob to be reshaped and the output blob shape. The values in the second blob can be -1, 0, or any positive integer number. The two special values -1 and 0:
- 0 means "copy the respective dimension of the input blob".
- -1 means that this dimension is calculated to keep the overall elements count the same as in the input blob. No more than one -1 can be used in a reshape operation.
Inputs:
-
1: Multidimensional input blob. Required.
-
2: 1D blob describing output shape. Required.
Example
<layer ... type="Reshape" ...>
<input>
<port id="0">
<dim>2</dim>
<dim>5</dim>
<dim>5</dim>
<dim>24</dim>
</port>
<port id="1">
<dim>3</dim>
</port>
</input>
<output>
<port id="2">
<dim>2</dim>
<dim>150</dim>
<dim>4</dim>
</port>
</output>
</layer>
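The rules for the special values 0 and -1 in the Reshape shape input can be sketched in Python. The function name `resolve_shape` is hypothetical, and the target values `[0, -1, 4]` are an assumed content of the second input that would produce the output shape shown in the example above.

```python
from math import prod


def resolve_shape(input_shape, target):
    """Resolve 0 (copy dimension) and -1 (infer dimension) entries."""
    if target.count(-1) > 1:
        raise ValueError("no more than one -1 is allowed")
    # 0 copies the respective dimension of the input blob.
    out = [input_shape[i] if d == 0 else d for i, d in enumerate(target)]
    total = prod(input_shape)
    if -1 in out:
        known = prod(d for d in out if d != -1)
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        # -1 is chosen so that the element count stays the same.
        out[out.index(-1)] = total // known
    if prod(out) != total:
        raise ValueError("input and output volumes differ")
    return out


print(resolve_shape([2, 5, 5, 24], [0, -1, 4]))  # [2, 150, 4]
```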
ReverseSequence Layer
Back to top
Name: ReverseSequence
Category: Layer
Short description: ReverseSequence reverses variable length slices of data.
Detailed description: ReverseSequence first slices input along the dimension batch_axis, and for each slice i, reverses the first lengths[i] (the second input) elements along the dimension seq_axis.
Parameters: ReverseSequence layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: batch_axis
-
Description: batch_axis specifies the index of the batch dimension.
-
Range of values: integer value. Could be negative.
-
Type: int
-
Default value: 0
-
Required: No
-
Parameter name: seq_axis
-
Description: seq_axis specifies the index of the sequence dimension.
-
Range of values: integer value. Could be negative.
-
Type: int
-
Default value: 1
-
Required: No
Inputs:
-
1: Blob with input data to reverse. Required.
-
2: 1D blob with sequence lengths in the first input blob. Required.
Example
<layer ... type="ReverseSequence">
<data batch_axis="0" seq_axis="1"/>
<input>
<port id="0">
<dim>3</dim>
<dim>10</dim>
<dim>100</dim>
<dim>200</dim>
</port>
<port id="1">
<dim>3</dim>
</port>
</input>
<output>
<port id="2">
<dim>3</dim>
<dim>10</dim>
<dim>100</dim>
<dim>200</dim>
</port>
</output>
</layer>
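The behavior of ReverseSequence is easiest to see on a small 2D case. The sketch below assumes batch_axis=0 and seq_axis=1 and works on nested Python lists; the function name `reverse_sequence` is hypothetical.

```python
def reverse_sequence(data, lengths):
    """For each row i, reverse the first lengths[i] elements."""
    return [row[:n][::-1] + row[n:] for row, n in zip(data, lengths)]


data = [[1, 2, 3, 4],
        [5, 6, 7, 8]]
print(reverse_sequence(data, [2, 3]))  # [[2, 1, 3, 4], [7, 6, 5, 8]]
```

Elements past lengths[i] keep their original positions, so only the valid part of each sequence is reversed.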
RNNCell Layer
Back to top
Name: RNNCell
Category: Layer
Short description: RNNCell layer computes the output using the formula described in the article.
Parameters: RNNCell layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: hidden_size
-
Description: hidden_size specifies hidden state size.
-
Range of values: positive integer value
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: activations
-
Description: activation functions for gates
-
Range of values: any combination of relu, sigmoid, tanh
-
Type: list of strings
-
Default value: sigmoid,tanh
-
Required: no
-
Parameter name: activations_alpha, activations_beta
-
Description: activations_alpha and activations_beta are parameters of the activation functions
-
Range of values: list of floats
-
Type: float[]
-
Default value: None
-
Required: no
-
Parameter name: clip
-
Description: clip specifies value for tensor clipping to be in [-C, C] before activations
-
Range of values: positive float value
-
Type: float
-
Default value: None
-
Required: no
Inputs:
-
1: X - 2D ([batch_size, input_size]) input data. Required.
-
2: Hi - 2D ([batch_size, hidden_size]) input hidden state data. Required.
Outputs:
-
1: Ho - 2D ([batch_size, hidden_size]) output hidden state.
ROIPooling Layer
Back to top
Name: ROIPooling
Category: Pool
Short description: ROIPooling is a pooling layer that is applied over feature maps of non-uniform input sizes and outputs a feature map of a fixed size.
Detailed description: deepsense.io reference
Parameters: Specify ROIPooling layer parameters in the data
node, which is a child of the layer node.
-
Parameter name: pooled_h
-
Description: pooled_h is a height of the ROI output feature map. For example, pooled_h equal 6 means that the height of the output of ROIPooling is 6.
-
Range of values: integer value starting from 0
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: pooled_w
-
Description: pooled_w is a width of the ROI output feature map. For example, pooled_w equal 6 means that the width of the output of ROIPooling is 6.
-
Range of values: integer value starting from 0
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: spatial_scale
-
Description: spatial_scale is a ratio of the input feature map over the input image size.
-
Range of values: floating point positive value
-
Type: float
-
Default value: 1
-
Required: yes
-
Parameter name: method
-
Description: method specifies method to perform pooling. If the method is bilinear then the input box coordinates must be normalized to [0,1] interval.
-
Range of values: max or bilinear
-
Type: string
-
Default value: max
-
Required: no
Inputs:
-
1: 4D input blob with feature maps. Required.
-
2: 2D input blob describing box consisting of 5 element tuples: [batch_id, x_1, y_1, x_2, y_2]. Required.
Mathematical Formulation
Example
<layer ... type="ROIPooling" ... >
<data pooled_h="6" pooled_w="6" spatial_scale="0.062500"/>
<input> ... </input>
<output> ... </output>
</layer>
ScaleShift Layer
Back to top
Name: ScaleShift
Category: Layer
Short description: ScaleShift layer performs a linear transformation of the input blob. Weights denote the scaling parameter, and biases denote the shift.
Parameters: ScaleShift layer does not have additional parameters.
Inputs:
-
1: 4D input blob. Required.
Mathematical Formulation
Example
<layer ... type="ScaleShift" ... >
<input> ... </input>
<output> ... </output>
</layer>
Shape Layer
Back to top
Name: Shape
Category: Layer
Short description: Shape produces blob with the input blob shape.
Parameters: Shape layer has no parameters.
Inputs:
-
1: Multidimensional input blob. Required.
Example
<layer ... type="Shape">
<input>
<port id="0">
<dim>2</dim>
<dim>3</dim>
<dim>224</dim>
<dim>224</dim>
</port>
</input>
<output>
<port id="1">
<dim>4</dim>
</port>
</output>
</layer>
ShuffleChannels Layer
Back to top
Name: ShuffleChannels
Category: Layer
Short description: ShuffleChannels permutes data in the channel dimension of the input blob.
Parameters: ShuffleChannels layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: axis
-
Description: axis specifies the index of the channel dimension.
-
Range of values: non-negative integer value
-
Type: int
-
Default value: 1
-
Required: No
-
Parameter name: group
-
Description: group specifies the number of groups to split the channel dimension. This number must evenly divide the channel dimension size.
-
Range of values: positive integer value
-
Type: int
-
Default value: 1
-
Required: No
Inputs:
-
1: 4D input blob. Required.
Mathematical Formulation
The operation is equivalent to the following transformation of the input blob x of shape [N, C, H, W]:
x' = reshape(x, [N, group, C / group, H * W])
x'' = transpose(x', [0, 2, 1, 3])
y = reshape(x'', [N, C, H, W])
where group is the layer parameter described above.
Example
<layer ... type="ShuffleChannels" ...>
<data group="3" axis="1"/>
<input>
<port id="0">
<dim>3</dim>
<dim>12</dim>
<dim>200</dim>
<dim>400</dim>
</port>
</input>
<output>
<port id="1">
<dim>3</dim>
<dim>12</dim>
<dim>200</dim>
<dim>400</dim>
</port>
</output>
</layer>
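The reshape and transpose steps of ShuffleChannels amount to a fixed permutation of the channel axis. The following Python sketch applies that permutation to a flat list that stands for the channels of one batch item; the function name `shuffle_channels` is hypothetical.

```python
def shuffle_channels(channels, group):
    """Permute channels as reshape [group, C/group] -> transpose -> flatten."""
    count = len(channels)
    if count % group != 0:
        raise ValueError("group must evenly divide the channel count")
    per_group = count // group
    # reshape: split the channel axis into `group` consecutive chunks
    grouped = [channels[g * per_group:(g + 1) * per_group] for g in range(group)]
    # transpose and flatten: take one channel from each chunk in turn
    return [grouped[g][k] for k in range(per_group) for g in range(group)]


print(shuffle_channels([0, 1, 2, 3, 4, 5], 3))  # [0, 2, 4, 1, 3, 5]
```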
SimplerNMS Layer
Back to top
Name: SimplerNMS
Category: Layer
Short description: SimplerNMS layer performs filtering of bounding boxes and outputs only those with the highest confidence of prediction.
Parameters: SimplerNMS layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: pre_nms_topn (post_nms_topn)
-
Description: pre_nms_topn (post_nms_topn) is the quantity of bounding boxes before (after) applying NMS operation. For example, pre_nms_topn (post_nms_topn) equal 15 means that at most 15 boxes are kept before (after) NMS.
-
Range of values: positive integer number
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: iou_threshold
-
Description: iou_threshold is the minimum ratio of boxes overlapping to be taken into consideration. For example, iou_threshold equal 0.7 means that all boxes with overlapping ratio less than 0.7 are filtered out.
-
Range of values: positive floating point number
-
Type: float
-
Default value: None
-
Required: yes
-
Parameter name: feat_stride
-
Description: feat_stride is the step size to slide over boxes (in pixels). For example, feat_stride equal 16 means that all boxes are analyzed with the slide 16.
-
Range of values: positive integer number
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: min_bbox_size
-
Description: min_bbox_size is the minimum size of box to be taken into consideration.
-
Range of values: positive integer number.
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: scale
-
Description: scale is the list of scales for anchor box generation.
-
Range of values: positive floating point numbers
-
Type: float[]
-
Default value: []
-
Required: yes
Inputs:
-
1: 4D input blob with class prediction scores. Required.
-
2: 4D input blob with box logits. Required.
-
3: 1D input blob with 3 or 4 elements: [image height, image width, scale for image height/width OR scale for image height and scale for image width]. Required.
Mathematical Formulation
SimplerNMS accepts three inputs: two 4D blobs and a 1D blob with the image size. The produced blob has two dimensions; the first one equals post_nms_topn. SimplerNMS does the following with the input blobs:
- Generates initial anchor boxes. Left top corner of all boxes is (0, 0). Width and height of boxes are calculated based on scaled (according to the scale parameter) default widths and heights
- For each point in the first input blob:
- pins anchor boxes to picture according to the second input blob, which contains four deltas for each box: for x and y of center, for width, and for height
- finds out score in the first input blob
- Filters out boxes with size less than min_bbox_size.
- Sorts all proposals (box, score) by score from highest to lowest
- Takes top pre_nms_topn proposals
- Calculates intersections for boxes and filters out all with
- Takes top post_nms_topn proposals
- Returns top proposals
Example
<layer ... type="SimplerNMS" ... >
<data iou_threshold="0.700000" min_bbox_size="16" feat_stride="16" pre_nms_topn="6000" post_nms_topn="150"/>
<input> ... </input>
<output> ... </output>
</layer>
Slice Layer
Back to top
Name: Slice
Category: Layer
Short description: Slice layer splits the input blob into several pieces over the specified axis.
Parameters: Slice layer parameters should be specified as the data
node, which is a child of the layer node.
-
Parameter name: axis
-
Description: axis specifies the axis to split the input blob.
-
Range of values: non-negative integer value
-
Type: int
-
Default value: None
-
Required: yes
Inputs:
-
1: Multidimensional input blob. Required.
Example
<layer ... type="Slice" ...>
<data axis="1"/>
<input>
<port id="0">
<dim>1</dim>
<dim>1048</dim>
<dim>14</dim>
<dim>14</dim>
</port>
</input>
<output>
<port id="1">
<dim>1</dim>
<dim>1024</dim>
<dim>14</dim>
<dim>14</dim>
</port>
<port id="2">
<dim>1</dim>
<dim>24</dim>
<dim>14</dim>
<dim>14</dim>
</port>
</output>
</layer>
SoftMax Layer
Back to top
Name: SoftMax
Category: Activation
Short description: Reference
Detailed description: Reference
Parameters: SoftMax layer parameters can be (not mandatory) specified in the data
node, which is a child of the layer node.
-
Parameter name: axis
-
Description: axis represents the axis along which the SoftMax is calculated. axis equal 1 is a default value.
-
Range of values: positive integer value
-
Type: int
-
Default value: 1
-
Required: yes
Mathematical Formulation
y_c = exp(z_c) / (exp(z_1) + ... + exp(z_C))
where z is the input, y is the output, and C is the number of classes
Example
<layer ... type="SoftMax" ... >
<data axis="1" />
<input> ... </input>
<output> ... </output>
</layer>
Inputs:
-
1: Multidimensional input blob. Required.
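For reference, a one-dimensional SoftMax over the class axis can be sketched in Python as follows. The function name `softmax` is hypothetical, and subtracting the maximum before exponentiation is a common numerical-stability choice, not something this section prescribes.

```python
import math


def softmax(values):
    """Normalized exponentials of a 1D list of scores."""
    shift = max(values)  # does not change the result, avoids overflow
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


print(softmax([0.0, 0.0]))  # [0.5, 0.5]
```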
Split Layer
Back to top
Name: Split
Category: Layer
Short description: Split layer splits the input along the specified axis into several output pieces.
Detailed description: Reference
Parameters: Split layer parameters should be specified in the data
node, which is a child of the layer node.
-
Parameter name: axis
-
Description: axis is the index of the axis along which the input blob is split.
-
Range of values: non-negative integer number less than number of dimensions in the input.
-
Type: int
-
Default value: None
-
Required: yes
-
Parameter name: num_split
-
Description: num_split is the number of pieces to split the input into. The num_split should evenly divide the size of the axis dimension.
-
Range of values: positive integer number less or equal to the size of the dimension being split over.
-
Type: int
-
Default value: None
-
Required: yes
Mathematical Formulation
For example, if the input blob has shape Bx(C+C)xHxW, axis=1, and num_split=2, then each of the two output blobs has shape BxCxHxW.
Inputs:
-
1: Multidimensional input blob. Required.
Example
<layer ... type="Split" ... >
<data axis="0" num_split="2"/>
<input> ... </input>
<output> ... </output>
</layer>
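The output shapes of the Split layer follow directly from axis and num_split. A small Python sketch, with the hypothetical helper name `split_shapes`:

```python
def split_shapes(shape, axis, num_split):
    """Shapes of the output blobs when `shape` is split evenly along `axis`."""
    if shape[axis] % num_split != 0:
        raise ValueError("num_split must evenly divide the axis dimension")
    piece = list(shape)
    piece[axis] = shape[axis] // num_split
    return [list(piece) for _ in range(num_split)]


print(split_shapes([4, 6, 8, 8], axis=1, num_split=2))
# [[4, 3, 8, 8], [4, 3, 8, 8]]
```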
StridedSlice Layer
Name: StridedSlice
Short description: StridedSlice layer extracts a strided slice of a blob. It is similar to the generalized array indexing in Python*.
Parameters:
-
Parameter name: begin_mask
-
Description: a bit mask. If begin_mask[i]=0, the corresponding dimension of the begin input is ignored.
-
Range of values: A list of 0s and 1s
-
Type: int[]
-
Default value: [1]
-
Required: Yes
-
Parameter name: end_mask
-
Description: a bit mask. If end_mask[i]=0, the corresponding dimension of the end input is ignored.
-
Range of values: A list of 0s and 1s
-
Type: int[]
-
Default value: [1]
-
Required: Yes
-
Parameter name: new_axis_mask
-
Description: a bit mask. If new_axis_mask[i]=1, a length 1 dimension is inserted on the i-th position of input blob.
-
Range of values: A list of 0s and 1s
-
Type: int[]
-
Default value: [0]
-
Required: No
-
Parameter name: shrink_axis_mask
-
Description: a bit mask. If shrink_axis_mask[i]=1, the dimension on i-th position is deleted.
-
Range of values: A list of 0s and 1s
-
Type: int[]
-
Default value: [0]
-
Required: No
-
Parameter name: ellipsis_mask
-
Description: a bit mask. It inserts missing dimensions on a position of a non-zero bit.
-
Range of values: A list of 0s and 1s. Only one non-zero bit is allowed.
-
Type: int[]
-
Default value: [0]
-
Required: No
Inputs:
-
1: Multidimensional input blob. Required.
-
2: begin input - 1D input blob with begin indexes for input blob slicing. Required. Out-of-bounds values will be silently clamped. If begin_mask[k] is 0, the value of begin[k] is ignored and the range of the appropriate dimension starts from 0. Negative values cause indexing to start from the highest element. For example, if foo=[1,2,3], begin[0]=-1 means begin[0]=3.
-
3: end input - 1D input blob with end indexes for input blob slicing. Required. Out-of-bounds values will be silently clamped. If end_mask[k] is 0, the value of end[k] is ignored and the full range of the appropriate dimension is used instead. Negative values cause indexing to start from the highest element. For example, if foo=[1,2,3], end[0]=-1 means end[0]=3.
-
4: stride input - 1D input blob with strides. Optional.
Example
<layer ... type="StridedSlice" ...>
<data begin_mask="0,1,0,0,0" ellipsis_mask="0,0,0,0,0" end_mask="0,1,0,0,0" new_axis_mask="0,0,0,0,0" shrink_axis_mask="0,1,0,0,0"/>
<input>
<port id="0">
<dim>1</dim>
<dim>2</dim>
<dim>384</dim>
<dim>640</dim>
<dim>8</dim>
</port>
<port id="1">
<dim>5</dim>
</port>
<port id="2">
<dim>5</dim>
</port>
<port id="3">
<dim>5</dim>
</port>
</input>
<output>
<port id="4">
<dim>1</dim>
<dim>384</dim>
<dim>640</dim>
<dim>8</dim>
</port>
</output>
</layer>
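The output shape in the example above can be checked with a short sketch. It assumes Python-like slice semantics for begin, end, and stride, and it does not model new_axis_mask or ellipsis_mask. The function name strided_slice_shape and the begin and end values are hypothetical, since the example does not show the contents of those input blobs.

```python
def strided_slice_shape(in_shape, begin, end, begin_mask, end_mask,
                        shrink_axis_mask, stride=None):
    """Compute the output shape; new_axis_mask and ellipsis_mask are not modeled."""
    stride = stride or [1] * len(in_shape)
    out_shape = []
    for i, size in enumerate(in_shape):
        if shrink_axis_mask[i]:
            # The dimension is deleted from the output.
            continue
        # A zero mask bit means the corresponding index is ignored.
        b = begin[i] if begin_mask[i] else None
        e = end[i] if end_mask[i] else None
        # slice.indices() wraps negative indexes and clamps out-of-range ones.
        out_shape.append(len(range(*slice(b, e, stride[i]).indices(size))))
    return out_shape


shape = strided_slice_shape(
    in_shape=[1, 2, 384, 640, 8],
    begin=[0, 1, 0, 0, 0],   # hypothetical values
    end=[0, 2, 0, 0, 0],     # hypothetical values
    begin_mask=[0, 1, 0, 0, 0],
    end_mask=[0, 1, 0, 0, 0],
    shrink_axis_mask=[0, 1, 0, 0, 0],
)
print(shape)  # [1, 384, 640, 8]
```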
TensorIterator Layer
Name: TensorIterator
Category: Layer
Short description: TensorIterator (TI) layer performs recurrent sub-graph execution iterating through the data.
Parameters: port_map and back_edges sections specifying data mapping rules:
Example
<layer ... type="TensorIterator" ... >
<input> ... </input>
<output> ... </output>
<port_map>
<input external_port_id="0" internal_layer_id="0" internal_port_id="0" axis="1" start="-1" end="0" stride="-1"/>
<input external_port_id="1" internal_layer_id="1" internal_port_id="1"/>
...
<output external_port_id="3" internal_layer_id="2" internal_port_id="1" axis="1" start="-1" end="0" stride="-1"/>
...
</port_map>
<back_edges>
<edge from-layer="1" from-port="1" to-layer="1" to-port="1"/>
...
</back_edges>
<body>
<layers> ... </layers>
<edges> ... </edges>
</body>
</layer>
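As a rough conceptual model only, the recurrent execution can be pictured as a loop that feeds one slice of the input to the body per iteration, while a back edge carries a value from one iteration to the next. Every name below is hypothetical. The sketch treats a negative stride as reverse iteration and does not model the exact meaning of the start and end attributes or the placement of output slices.

```python
def run_tensor_iterator(sequence, initial_state, body, stride=1):
    """Apply body to each slice of sequence, carrying state over a back edge."""
    # Assumption: a negative stride walks the slices in reverse order.
    steps = sequence if stride > 0 else reversed(sequence)
    state = initial_state
    outputs = []
    for x in steps:
        # The body returns an output slice and the state for the next iteration.
        y, state = body(x, state)
        outputs.append(y)
    return outputs, state


def body(x, state):
    # Toy sub-graph: a running sum of the slices seen so far.
    new_state = state + x
    return new_state, new_state


outs, final = run_tensor_iterator([1, 2, 3], 0, body, stride=-1)
print(outs, final)  # [3, 5, 6] 6
```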
Tile Layer
Name: Tile
Category: Layer
Short description: Tile layer extends input blob with copies of data along specific axis.
Detailed description: Reference
Parameters: Tile layer parameters should be specified as the data node, which is a child of the layer node.
-
Parameter name: axis
-
Description: axis is the index of the axis to tile. For example, axis equal to 3 means that the fourth axis is used for tiling.
-
Range of values: positive integer number
-
Type: int
-
Default value: 1
-
Required: yes
-
Parameter name: tiles
-
Description: tiles is the number of copies of the data placed along the specified axis of the output blob. For example, tiles equal to 88 means that the output blob gets 88 copies of the data along the specified axis.
-
Range of values: positive integer number
-
Type: int
-
Default value: 1
-
Required: yes
Mathematical Formulation
Tile extends the input blob and fills the output blob according to the following rules:
Inputs:
-
1: Multidimensional input blob. Required.
Example
<layer ... type="Tile" ... >
<data axis="3" tiles="88"/>
<input> ... </input>
<output> ... </output>
</layer>
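The tiling behavior can be illustrated on nested lists. This is a minimal sketch, not the layer implementation: the helper name tile is hypothetical, and it assumes that whole copies of the data are concatenated along the chosen axis.

```python
def tile(blob, axis, tiles):
    """Repeat a nested-list blob `tiles` times along `axis`."""
    if axis == 0:
        # Concatenate `tiles` copies of the data along the current axis.
        return [item for _ in range(tiles) for item in blob]
    # Otherwise descend one level and tile each sub-blob.
    return [tile(sub, axis - 1, tiles) for sub in blob]


blob = [[1, 2], [3, 4]]     # shape 2x2
tiled = tile(blob, axis=1, tiles=2)
print(tiled)                # [[1, 2, 1, 2], [3, 4, 3, 4]]
```

Under this assumption, the size of the tiled axis in the output is the input size multiplied by tiles, so a 2x2 blob tiled twice along axis 1 becomes 2x4.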