An Inference Engine plugin usually represents a wrapper around a backend. Backends can be:
- an OpenCL-like backend (for example, the clDNN library) for GPU devices
- a library of optimized CPU kernels (for example, MKL-DNN) for CPU devices
The responsibilities of an Inference Engine plugin:
- Initializes a backend and throws an exception from the Engine constructor if the backend cannot be initialized.
- Provides information about devices enabled by the backend.
- Loads or imports executable network objects.

In addition to the Inference Engine Public API, the Inference Engine provides the Plugin API, which is a set of functions and helper classes that simplify new plugin development:
- header files in the inference_engine/src/plugin_api directory
- implementations in the inference_engine/src/inference_engine directory

To build an Inference Engine plugin with the Plugin API, see the Inference Engine Plugin Building guide.
The Inference Engine Plugin API provides the helper InferenceEngine::InferencePluginInternal class, which is recommended as a base class for a plugin. Based on that, the declaration of a plugin class can look as follows:
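A minimal sketch of such a declaration is shown below. The TemplatePlugin namespace, the Configuration class, and the template_config.hpp header are assumptions that follow the fields described later in this section; exact override signatures vary between Inference Engine versions:

```cpp
#include <map>
#include <memory>
#include <string>

#include <cpp_interfaces/impl/ie_plugin_internal.hpp>  // InferenceEngine::InferencePluginInternal
#include <ngraph/runtime/backend.hpp>

#include "template_config.hpp"  // assumed header declaring Configuration

namespace TemplatePlugin {

class Plugin : public InferenceEngine::InferencePluginInternal {
public:
    Plugin();

    // Creates an ExecutableNetwork from an input network representation.
    InferenceEngine::ExecutableNetworkInternal::Ptr LoadExeNetworkImpl(
        const InferenceEngine::ICNNNetwork& network,
        const std::map<std::string, std::string>& config) override;

    void SetConfig(const std::map<std::string, std::string>& config) override;

    void QueryNetwork(const InferenceEngine::ICNNNetwork& network,
                      const std::map<std::string, std::string>& config,
                      InferenceEngine::QueryNetworkResult& res) const override;

    // GetConfig, GetMetric, AddExtension, and ImportNetworkImpl overrides
    // are discussed in the sections below.

private:
    std::shared_ptr<ngraph::runtime::Backend> _backend;  // performs actual computations
    InferenceEngine::ITaskExecutor::Ptr _waitExecutor;   // waits for device task completion
    Configuration _cfg;                                  // plugin-wide configuration
};

}  // namespace TemplatePlugin
```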
The provided plugin class also has several fields:
- _backend - a backend engine that is used to perform actual computations for network inference. For the Template plugin, ngraph::runtime::Backend is used, which performs computations using ngraph reference implementations.
- _waitExecutor - a task executor that waits for a response from a device about device task completion.
- _cfg - a configuration of type Configuration (a sketch of a possible Configuration type appears in the constructor example below).

As an example, a plugin configuration has three value parameters:
- deviceId - a particular device ID to work with. Applicable if a plugin supports more than one Template device. In this case, some plugin methods, like SetConfig, QueryNetwork, and LoadNetwork, must support the CONFIG_KEY(KEY_DEVICE_ID) parameter.
- perfCounts - a boolean value that identifies whether to collect performance counters during Inference Request execution.
- _streamsExecutorConfig - a configuration of InferenceEngine::IStreamsExecutor to handle settings of a multi-threaded context.

Engine Constructor

A plugin constructor must contain code that checks the ability to work with a device of the Template type. For example, if some drivers are required, the code must check driver availability. If a driver is not available (for example, the OpenCL runtime is not installed in case of a GPU device, or an improper version of a driver is on the host machine), an exception must be thrown from the plugin constructor.
A plugin must define a device name enabled via the _pluginName
field of a base class:
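A sketch of a possible constructor, together with a hypothetical Configuration type that mirrors the three parameters above. The "INTERPRETER" backend name and the executor setup are assumptions, not the exact Template plugin sources:

```cpp
// Hypothetical Configuration holding the parameters described above.
struct Configuration {
    std::string deviceId = "0";  // CONFIG_KEY(KEY_DEVICE_ID)
    bool perfCounts = false;     // collect performance counters during inference
    InferenceEngine::IStreamsExecutor::Config _streamsExecutorConfig;
};

Plugin::Plugin() {
    // Device name enabled by this plugin.
    _pluginName = "TEMPLATE";

    // Initialize the backend. If a required driver or runtime is missing,
    // backend creation fails and the exception propagates out of the
    // constructor, as the plugin contract requires.
    _backend = ngraph::runtime::Backend::create("INTERPRETER");

    // Executor that waits for a response from the device about task completion.
    _waitExecutor = InferenceEngine::ExecutorManager::getInstance()->getExecutor("TemplateWait");
}
```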
LoadExeNetworkImpl()
Implementation details: The base InferenceEngine::InferencePluginInternal class provides a common implementation of the public InferenceEngine::InferencePluginInternal::LoadNetwork method that calls plugin-specific LoadExeNetworkImpl
, which is defined in a derived class.
This is the most important function of the Plugin class. It creates an instance of a compiled ExecutableNetwork, which holds a backend-dependent compiled graph in an internal representation:
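A sketch of what this function can look like. The precision check mirrors the one mentioned below, while the merging Configuration constructor and the ExecutableNetwork constructor signature are assumptions:

```cpp
InferenceEngine::ExecutableNetworkInternal::Ptr Plugin::LoadExeNetworkImpl(
        const InferenceEngine::ICNNNetwork& network,
        const std::map<std::string, std::string>& config) {
    // Check that the device supports the input precisions of the network.
    InferenceEngine::InputsDataMap inputs;
    network.getInputsInfo(inputs);
    for (const auto& input : inputs) {
        const InferenceEngine::Precision precision = input.second->getPrecision();
        if (precision != InferenceEngine::Precision::FP32) {
            THROW_IE_EXCEPTION << "Unsupported input precision: " << precision.name();
        }
    }

    // Merge the global plugin configuration with the load-time `config`;
    // values from `config` have a higher priority (see the NOTE below).
    Configuration fullConfig{config, _cfg};

    // The ExecutableNetwork constructor compiles the transformed graph
    // for the backend (see the ExecutableNetwork Implementation Guide).
    return std::make_shared<ExecutableNetwork>(network.getFunction(), fullConfig);
}
```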
Before creating an ExecutableNetwork instance via its constructor, a plugin may check whether the provided InferenceEngine::ICNNNetwork object is supported by the device. In the example above, the plugin checks precision information.
An important step before creating the ExecutableNetwork instance is the call to the TransformNetwork method, which applies ngraph transformation passes.
Actual graph compilation is done in the ExecutableNetwork
constructor. Refer to the ExecutableNetwork Implementation Guide for details.
NOTE: The actual configuration map used in ExecutableNetwork is constructed as a base plugin configuration set via Plugin::SetConfig, where some values are overwritten with the config passed to Plugin::LoadExeNetworkImpl. Therefore, the config of Plugin::LoadExeNetworkImpl has a higher priority.
TransformNetwork()
The function accepts a const shared pointer to an ngraph::Function object and performs the following steps:
1. Deep copies the const object to a local object that can later be modified.
2. Applies common and backend-specific transformations on the copied graph to make it friendlier to backend kernels.
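A sketch of such a function, using ngraph's clone_function helper and pass manager; the device-specific fusion pass is a hypothetical placeholder:

```cpp
#include <ngraph/graph_util.hpp>   // ngraph::clone_function
#include <ngraph/pass/manager.hpp>

std::shared_ptr<ngraph::Function> TransformNetwork(
        const std::shared_ptr<const ngraph::Function>& function) {
    // 1. Deep copy the const function into a modifiable local object.
    auto transformedNetwork = ngraph::clone_function(*function);

    // 2. Apply common and backend-specific transformation passes.
    ngraph::pass::Manager passManager;
    // passManager.register_pass<FuseAddIntoCustomOp>();  // hypothetical fusion pass
    passManager.run_passes(transformedNetwork);

    return transformedNetwork;
}
```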
NOTE: After all these transformations, an ngraph::Function object contains operations that can be perfectly mapped to backend kernels. For example, if the backend has a kernel that computes the A + B operations at once, the TransformNetwork function should contain a pass that fuses operations A and B into a single custom operation A + B that fits the backend kernel set.
QueryNetwork()
Use this method with the HETERO mode, which allows distributing network execution between different devices based on the ngraph::Node::get_rt_info() map, which can contain the "affinity" key. The QueryNetwork method analyzes operations of the provided network and returns a list of supported operations via the InferenceEngine::QueryNetworkResult structure. QueryNetwork first applies the TransformNetwork passes to the input ngraph::Function argument. After this, in the ideal case, the transformed network contains only operations that are 1:1 mapped to kernels in the computational backend. In this case, it is easy to analyze which operations are supported (_backend has a kernel for the operation, or an extension is provided for it) and which are not (the kernel is missing in _backend):
1. Store the original names of all operations in the input ngraph::Function.
2. Apply the TransformNetwork passes. Note that the names of operations in a transformed network can be different, so the mapping to the original names must be restored in the steps below.
3. Construct supported and unsupported maps which contain the names of original operations. Note that since the inference is performed using the ngraph reference backend, the decision whether an operation is supported or not depends on whether the latest OpenVINO opset contains the operation.
4. QueryNetworkResult.supportedLayersMap contains only operations which are fully supported by _backend.
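A sketch of that flow, assuming the TransformNetwork helper shown earlier; the opset-based check stands in for a real backend kernel lookup, and the choice of ngraph::get_opset4() as the "latest" opset is an assumption:

```cpp
#include <unordered_set>

#include <ngraph/opsets/opset.hpp>  // ngraph::get_opset4

void Plugin::QueryNetwork(const InferenceEngine::ICNNNetwork& network,
                          const std::map<std::string, std::string>& config,
                          InferenceEngine::QueryNetworkResult& res) const {
    auto function = network.getFunction();

    // 1. Remember the friendly names of the original operations.
    std::unordered_set<std::string> originalOps;
    for (const auto& node : function->get_ops()) {
        originalOps.insert(node->get_friendly_name());
    }

    // 2. Apply the same transformations as LoadExeNetworkImpl does.
    auto transformed = TransformNetwork(function);

    // 3-4. An operation is reported as supported if it kept its original
    // name through the transformations and the reference opset contains it.
    const ngraph::OpSet& opset = ngraph::get_opset4();
    for (const auto& node : transformed->get_ops()) {
        if (originalOps.count(node->get_friendly_name()) && opset.contains_op_type(node.get())) {
            res.supportedLayersMap.emplace(node->get_friendly_name(), GetName());
        }
    }
}
```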
AddExtension()
Adds an extension of the InferenceEngine::IExtensionPtr type to a plugin. If a plugin does not support extensions, the method must throw an exception:
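For a plugin without extension support, the override can be as small as this sketch:

```cpp
void Plugin::AddExtension(InferenceEngine::IExtensionPtr /*extension*/) {
    // The Template plugin does not support extensions.
    THROW_IE_EXCEPTION << "Template plugin does not support extensions";
}
```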
SetConfig()
Sets new values for plugin configuration keys:
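A sketch of a possible implementation, assuming a Configuration constructor that parses and validates the map against the current values:

```cpp
void Plugin::SetConfig(const std::map<std::string, std::string>& config) {
    // Parse and validate `config`, overriding the current values;
    // the hypothetical Configuration constructor throws on unsupported keys.
    _cfg = Configuration{config, _cfg};
}
```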
In the snippet above, the Configuration class overrides previous configuration values with the new ones. All these values are used during backend-specific graph compilation and the execution of inference requests.
NOTE: The function must throw an exception if it receives an unsupported configuration key.
GetConfig()
Returns a current value for a specified configuration key:
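A sketch, assuming the Configuration::Get helper mentioned below:

```cpp
InferenceEngine::Parameter Plugin::GetConfig(
        const std::string& name,
        const std::map<std::string, InferenceEngine::Parameter>& /*options*/) const {
    // Configuration::Get wraps the value into an InferenceEngine::Parameter
    // and throws if `name` is not a supported configuration key.
    return _cfg.Get(name);
}
```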
The function is implemented with the Configuration::Get method, which wraps an actual configuration key value into an InferenceEngine::Parameter and returns it.
NOTE: The function must throw an exception if it receives an unsupported configuration key.
GetMetric()
Returns a value for a metric with the given name. A device metric is a static type of information that a plugin reports about its devices or device capabilities.
Examples of metrics:
- METRIC_KEY(AVAILABLE_DEVICES) - a list of available devices. In this case, you can use all devices of the same Template type with the automatic logic of the MULTI device plugin.
- METRIC_KEY(FULL_DEVICE_NAME) - a full device name. In this case, a device ID is passed via the option parameter as { CONFIG_KEY(KEY_DEVICE_ID), "deviceID" }.
- Any other device-specific metrics. In this case, place the metric declaration and its possible values in a plugin-specific public header file, for example, template/template_config.hpp. The example below demonstrates the definition of a new optimization capability value specific for a device.

The snippet below provides an example of an implementation for GetMetric:
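A sketch covering both pieces; the custom capability value and the exact set of reported metrics are assumptions:

```cpp
// In a plugin-specific public header, e.g. template/template_config.hpp:
// a hypothetical device-specific optimization capability value.
#define TEMPLATE_METRIC_VALUE_CUSTOM_WINOGRAD "WINOGRAD"

InferenceEngine::Parameter Plugin::GetMetric(
        const std::string& name,
        const std::map<std::string, InferenceEngine::Parameter>& options) const {
    if (name == METRIC_KEY(SUPPORTED_METRICS)) {
        return InferenceEngine::Parameter{std::vector<std::string>{
            METRIC_KEY(AVAILABLE_DEVICES), METRIC_KEY(FULL_DEVICE_NAME),
            METRIC_KEY(SUPPORTED_METRICS), METRIC_KEY(SUPPORTED_CONFIG_KEYS),
            METRIC_KEY(OPTIMIZATION_CAPABILITIES)}};
    } else if (name == METRIC_KEY(FULL_DEVICE_NAME)) {
        return InferenceEngine::Parameter{std::string{"Template Device"}};
    } else if (name == METRIC_KEY(OPTIMIZATION_CAPABILITIES)) {
        // Supported data types and special optimizations, including the
        // hypothetical device-specific value declared above.
        return InferenceEngine::Parameter{std::vector<std::string>{
            METRIC_VALUE(FP32), TEMPLATE_METRIC_VALUE_CUSTOM_WINOGRAD}};
    }
    THROW_IE_EXCEPTION << "Unsupported metric key: " << name;
}
```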
NOTE: If an unsupported metric key is passed to the function, it must throw an exception.
ImportNetworkImpl()
The network import mechanism allows importing a previously exported backend-specific graph and wrapping it in an ExecutableNetwork object. This functionality is useful if backend-specific graph compilation takes significant time and/or cannot be done on a target host device for other reasons.
Implementation details: The base plugin class InferenceEngine::InferencePluginInternal implements InferenceEngine::InferencePluginInternal::ImportNetwork as follows: it exports a device type (InferenceEngine::InferencePluginInternal::_pluginName) and then calls ImportNetworkImpl, which is implemented in a derived class. If a plugin cannot use the base implementation InferenceEngine::InferencePluginInternal::ImportNetwork, it can override it and define an output blob structure to suit its needs. This can be useful if a plugin exports a blob in a special format for integration with other frameworks, where a common Inference Engine header from the base class implementation is not appropriate.
During export of a backend-specific graph using ExecutableNetwork::Export, a plugin may export any type of information it needs to import the compiled graph properly and check its correctness. For example, the export information may include:
- Compilation options (the Plugin::_cfg structure).
- Information about a plugin and a device type, to check it later during the import and throw an exception if the model stream contains wrong data. For example, if devices have different capabilities and a graph compiled for a particular device cannot be used for another, such information must be stored and checked during the import.

Create Instance of Plugin Class

An Inference Engine plugin library must export only one function that creates a plugin instance, using the IE_DEFINE_PLUGIN_CREATE_FUNCTION macro:
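A sketch of that exported entry point; the version numbers and description string are placeholders:

```cpp
static const InferenceEngine::Version version = {
    {2, 1},            // plugin API version (placeholder)
    "0",               // build number, normally provided by the build system
    "TemplatePlugin"   // plugin description (placeholder)
};
IE_DEFINE_PLUGIN_CREATE_FUNCTION(TemplatePlugin::Plugin, version)
```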
The next step in a plugin library implementation is the ExecutableNetwork class.