This chapter describes the Inference Engine plugins that enable inference of deep learning models on the supported VPU devices:
'ScaleShift' layer is supported for zero value of
'CTCGreedyDecoder' layer works with the 'ctc_merge_repeated' attribute equal to 1.
'DetectionOutput' layer works with zero values of
'MVN' layer uses a fixed value for
'Normalize' layer uses a fixed value for the 'eps' parameter (1e-9) and is supported for zero value of
'Pad' layer works only with 4D tensors.
The VPU plugins support the configuration parameters listed below. The parameters are passed as std::map<std::string, std::string> on
|Parameter Name||Parameter Values||Default||Description|
||Turn on HW stages usage. Applicable for Intel Movidius Myriad X and Intel Vision Accelerator Design devices only.|
||VPU Network Configuration||empty string||Extra configuration for network compilation and optimization.|
||Specify internal input and output layouts for network layers.|
||Set the log level for the device side.|
||Add the device-side time spent waiting for input to PerformanceCounts. See the Data Transfer Pipelining section for details.|
||The VPU plugin can use statistics present in the IR to try to improve calculation precision. Enable this option if you do not want the statistics to be used.|
||path to XML file||empty string||This option passes an XML file with custom layer bindings. If a layer is present in this file, it is used during inference even if the layer is natively supported.|
||Define the normalization coefficient for the network input.|
||Specify the bias value that is added to each element of the network input.|
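As a rough illustration of how such parameters can be collected into a std::map<std::string, std::string>, the sketch below builds a configuration map. The helper names, the literal key spelling and the "YES"/"NO" values are assumptions made for this example; a real application should take the key and value constants from the plugin's configuration header.

```cpp
#include <cassert>
#include <map>
#include <string>

// Configuration is a plain string-to-string map, as described in the text.
using PluginConfig = std::map<std::string, std::string>;

// ASSUMPTION: literal spelling of the key. Prefer the constant exported by
// the plugin's configuration header in real code.
static const char* const kPrintReceiveTensorTime = "VPU_PRINT_RECEIVE_TENSOR_TIME";

// Hypothetical helper: stores one option, rejecting empty keys.
void setOption(PluginConfig& config, const std::string& key, const std::string& value) {
    assert(!key.empty() && "configuration keys must not be empty");
    config[key] = value;
}

// Hypothetical helper: builds a config that toggles the receive-time counter.
// ASSUMPTION: boolean options are encoded as "YES" / "NO".
PluginConfig makeVpuConfig(bool printReceiveTensorTime) {
    PluginConfig config;
    setOption(config, kPrintReceiveTensorTime, printReceiveTensorTime ? "YES" : "NO");
    return config;
}
```

The resulting map would then be handed to the plugin when the network is loaded.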
The VPU network configuration mechanism allows you to override the behavior of the VPU network compiler and tune its optimizations. This mechanism is optional; by default, the VPU network compiler uses automatic heuristics for network optimizations. The KEY_VPU_NETWORK_CONFIG configuration parameter allows you to specify the exact behavior of the compiler.
Terminology used for VPU network configuration:
The KEY_VPU_NETWORK_CONFIG parameter is a list of key/value pairs separated by
<value> is the path to an XML file with the configuration; the file format is described below.
<value> is the name of a Data object; the options that follow apply to this Data:
<value> is a SCALE factor. See the Data section.
The VPU network compiler treats the configuration as a hard requirement and fails if it cannot satisfy it.
The KEY_VPU_NETWORK_CONFIG parameter allows you to use a separate file with the network configuration. The file is an XML file and must have the following format:
The version attribute specifies the file format version (currently only 1 is supported). The configuration is divided into sections for passes, data, layers and stages. Each section is optional.
The data section allows you to configure properties of data objects. Example of such a section:
The data name corresponds to its producer layer in the original IR (the layer that declares this data as output). If the original layer has only one output, the output data name is equal to the layer name. If the original layer has more than one output, each output data is named <layer name>.<port id>, where <port id> corresponds to the <port id="3"> XML node in the IR.
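The naming rule can be sketched as a small function. This is only an illustration of the rule as stated here; vpuDataName and its parameters are hypothetical and not part of the Inference Engine API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the Inference Engine API.
// layerName:   name of the producer layer in the original IR
// outputCount: number of outputs declared by that layer
// portId:      id attribute of the <port> node for this output
std::string vpuDataName(const std::string& layerName,
                        std::size_t outputCount,
                        unsigned portId) {
    assert(outputCount >= 1 && "a producer layer declares at least one output");
    if (outputCount == 1) {
        // Single output: the data object is named after the layer itself.
        return layerName;
    }
    // Several outputs: <layer name>.<port id>
    return layerName + "." + std::to_string(portId);
}
```

For example, a layer named split with two outputs would yield split.3 for the output declared by port 3, while a single-output layer named conv1 would yield conv1.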
The scale property applies a SCALE factor to the specified data object. The SCALE factor is used to increase the data range to avoid floating-point math errors on HW. The SCALE factor propagates through the network until its end or until it reaches a layer that cannot propagate it.
If the data section is missing from the network configuration file, the network compiler tries to estimate the SCALE factor automatically from the range of the layer weights. Manual configuration can be used if the automatic estimate did not work or did not give the desired accuracy.
Hint: it is better to use power-of-two values for SCALE factors.
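A likely reason for this hint is that multiplying a binary floating-point value by a power of two only changes its exponent, so the scaling itself adds no rounding error (barring overflow or underflow). The sketch below shows one way to check a candidate SCALE factor and to snap an arbitrary value to a power of two; both helpers are hypothetical and independent of the plugin.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: true if scale is a positive, finite power of two.
bool isPowerOfTwo(float scale) {
    if (!(scale > 0.0f) || !std::isfinite(scale)) {
        return false;
    }
    int exponent = 0;
    // frexp splits the value into mantissa * 2^exponent with the mantissa
    // in [0.5, 1); the mantissa is exactly 0.5 only for powers of two.
    return std::frexp(scale, &exponent) == 0.5f;
}

// Hypothetical helper: power of two closest to scale on a logarithmic axis.
float nearestPowerOfTwo(float scale) {
    assert(scale > 0.0f && std::isfinite(scale));
    return std::exp2(std::round(std::log2(scale)));
}
```

With these helpers, a measured factor such as 100 would be replaced by 128 before being written to the configuration file.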
The layers section allows you to configure compiler behavior for layer optimization. Per-layer configuration is applied to all stages that implement the selected layer. Example of such a section:
The layer name corresponds to the original IR.
The layer configuration supports only the HW section, which controls HW optimizations. The HW section makes sense only for Convolution, Pooling and FullyConnected layers.
The HW optimization configuration section consists of the following options:
enable - turns HW optimization of the selected layer on or off.
The enable option has the following syntax:
By default, HW optimization is turned on for all supported layers.
The MYRIAD plugin tries to pipeline data transfer to and from the device with computations. While one infer request is executing, the data for the next infer request can be uploaded to the device in parallel. The same applies to downloading results.
The KEY_VPU_PRINT_RECEIVE_TENSOR_TIME configuration parameter can be used to check the efficiency of the current pipelining. A new record in the performance counters shows the time the device spent waiting for input before starting inference. In a perfect pipeline this time should be close to zero, which means that the data was already transferred when the new inference started.
Get the following message when running inference with the VPU plugin: "[VPU] Cannot convert layer <layer_name> due to unsupported layer type <layer_type>"
This means that your topology has a layer that is not supported by your target VPU plugin. To resolve this issue, you can implement a custom layer for the target device using the Inference Engine Kernels Extensibility mechanism. Alternatively, to quickly get a working prototype, you can use the heterogeneous scenario with the default fallback policy (see the HETERO Plugin section). Use the HETERO plugin with a fallback device that supports this layer, for example, CPU: HETERO:MYRIAD,CPU. For a list of layers supported on VPU, see the Supported Layers section of the Supported Devices topic.
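The device string in the example above follows a simple pattern: the HETERO: prefix followed by a comma-separated list of devices in priority order. A minimal sketch of building such a string is shown below; makeHeteroDeviceName is a hypothetical helper, not an Inference Engine function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds "HETERO:<dev1>,<dev2>,..." from a list of
// device names ordered from the preferred device to the last fallback.
std::string makeHeteroDeviceName(const std::vector<std::string>& devices) {
    assert(!devices.empty() && "at least one device is required");
    std::string name = "HETERO:";
    for (std::size_t i = 0; i < devices.size(); ++i) {
        if (i != 0) {
            name += ',';  // separator between consecutive devices
        }
        name += devices[i];
    }
    return name;
}
```

Passing the list MYRIAD, CPU produces HETERO:MYRIAD,CPU, so layers that the VPU plugin cannot convert fall back to the CPU.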
NOTE: Using the heterogeneous scenario with a VPU may cause accuracy issues on the VPU side. You can use the Collect Statistics Tool to collect statistics and save them in the IR. The VPU plugin can use these statistics to restore accuracy.