Custom Layers Guide

The Intel® Distribution of OpenVINO™ toolkit supports neural network model layers in multiple frameworks, including TensorFlow*, Caffe*, MXNet*, Kaldi*, and ONNX*. The list of known layers is different for each of the supported frameworks. To see the layers supported by your framework, refer to the supported frameworks documentation.

Custom layers are layers that are not included in the list of known layers. If your topology contains any layers that are not in the list of known layers, the Model Optimizer classifies them as custom.

This guide illustrates the workflow for running inference on topologies featuring custom layers, allowing you to plug in your own implementation for existing or completely new layers. For a step-by-step example of creating and executing a custom layer, see the Custom Layer Implementation Tutorials for Linux and Windows.

Terms used in this guide

Custom Layer Overview

The Model Optimizer searches the list of known layers for each layer contained in the input model topology before building the model's internal representation, optimizing the model, and producing the Intermediate Representation files.

The Inference Engine loads the layers from the input model IR files into the specified device plugin, which will search a list of known layer implementations for the device. If your topology contains layers that are not in the list of known layers for the device, the Inference Engine considers the layer to be unsupported and reports an error. To see the layers that are supported by each device plugin for the Inference Engine, refer to the Supported Devices documentation.
Note: If a device doesn't support a particular layer, an alternative to creating a new custom layer is to target an additional device using the HETERO plugin. The Heterogeneous Plugin may be used to run an inference model on multiple devices allowing the unsupported layers on one device to "fallback" to run on another device (e.g., CPU) that does support those layers.
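
For example, the device can be specified as a HETERO combination when loading the network. The following is a minimal sketch only, assuming a recent Python* Inference Engine API (IECore); the model file names are hypothetical placeholders:

# Sketch: layers unsupported by the first device (GPU) fall back to the second (CPU).
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="model.xml", weights="model.bin")  # hypothetical IR files
exec_net = ie.load_network(network=net, device_name="HETERO:GPU,CPU")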

Custom Layer Implementation Workflow

When implementing a custom layer for your pre-trained model in the Intel® Distribution of OpenVINO™ toolkit, you will need to add extensions to both the Model Optimizer and the Inference Engine.

Custom Layer Extensions for the Model Optimizer

The following figure shows the basic processing steps for the Model Optimizer highlighting the two necessary custom layer extensions, the Custom Layer Extractor and the Custom Layer Operation.

Figure: Model Optimizer custom layer extension processing flow (MO_extensions_flow.png)

The Model Optimizer first extracts information from the input model, including the topology of the model's layers along with the parameters, input and output format, and so on for each layer. The model is then optimized based on the known characteristics of the layers, interconnects, and data flow, which come in part from the layer operation providing details such as the output shape of each layer. Finally, the optimized model is output to the model IR files needed by the Inference Engine to run the model.

The Model Optimizer starts with a library of known extractors and operations for each supported model framework, which must be extended for each unknown custom layer. The custom layer extensions needed by the Model Optimizer are:

Custom Layer Extractor: Identifies the custom layer operation and extracts the parameters for each instance of the custom layer so that they can be carried through to the output IR.

Custom Layer Operation: Specifies the attributes supported by the custom layer and computes the output shape for each instance of the custom layer from its parameters.
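
As an illustration, a Custom Layer Extractor is a small Python* class registered with the Model Optimizer. The following is a minimal sketch only, loosely modeled on the templates produced by the Model Extension Generator described below; the operation name 'Cosh' is a hypothetical placeholder:

# Minimal sketch of a Model Optimizer extractor; 'Cosh' is a placeholder name.
from mo.front.extractor import FrontExtractorOp
from mo.ops.op import Op


class CoshFrontExtractor(FrontExtractorOp):
    op = 'Cosh'        # framework operation name this extractor matches
    enabled = True     # register the extractor with the Model Optimizer

    @staticmethod
    def extract(node):
        # Copy the layer parameters from the framework node into the graph node;
        # this example layer has no extra parameters.
        Op.get_op_class_by_name(__class__.op).update_node_stat(node, {})
        return __class__.enabled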

Custom Layer Extensions for the Inference Engine

The following figure shows the basic flow for the Inference Engine, highlighting two custom layer extensions for the CPU and GPU Plugins, the Custom Layer CPU Extension and the Custom Layer GPU Extension.

Figure: Inference Engine custom layer extension flow (IE_extensions_flow.png)

Each device plugin includes a library of optimized implementations to execute known layer operations, which must be extended to execute a custom layer. The custom layer extension is implemented according to the target device:

Custom Layer CPU Extension: A compiled shared library (.so or .dll) needed by the CPU Plugin to execute the custom layer on the CPU.

Custom Layer GPU Extension: OpenCL* source code (.cl) for the custom layer kernel, compiled for execution on the GPU, along with a layer description file (.xml) needed by the GPU Plugin for the custom layer kernel.
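
At runtime, these extensions are registered with the Inference Engine. The following is a hedged sketch using the Python* IECore API; the library and configuration file names are hypothetical placeholders:

# Sketch: registering custom layer extensions with the CPU and GPU plugins.
from openvino.inference_engine import IECore

ie = IECore()

# CPU: load the compiled custom layer shared library into the CPU plugin.
ie.add_extension(extension_path="libcustom_cpu_extension.so", device_name="CPU")

# GPU: point the GPU plugin at the custom kernel description (.xml), which in
# turn references the OpenCL (.cl) kernel source.
ie.set_config({"CONFIG_FILE": "custom_layer_kernel.xml"}, device_name="GPU")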

Model Extension Generator

Using answers to interactive questions or a .json configuration file, the Model Extension Generator tool generates template source code files for each of the extensions needed by the Model Optimizer and the Inference Engine. To complete the implementation of each extension, edit the template functions as needed to fill in the details specific to the custom layer or to implement the custom layer functionality itself.

Command-line

The Model Extension Generator is included in the Intel® Distribution of OpenVINO™ toolkit installation and is run using the command (here with the "--help" option):

python3 /opt/intel/openvino/deployment_tools/tools/extension_generator/extgen.py new --help

where the output will appear similar to:

usage: You can use any combination of the following arguments:
Arguments to configure extension generation in the interactive mode:
optional arguments:
  -h, --help            show this help message and exit
  --mo-caffe-ext        generate a Model Optimizer Caffe* extractor
  --mo-mxnet-ext        generate a Model Optimizer MXNet* extractor
  --mo-tf-ext           generate a Model Optimizer TensorFlow* extractor
  --mo-op               generate a Model Optimizer operation
  --ie-cpu-ext          generate an Inference Engine CPU extension
  --ie-gpu-ext          generate an Inference Engine GPU extension
  --output_dir OUTPUT_DIR
                        set an output directory. If not specified, the current
                        directory is used by default.

The available command-line arguments specify which Model Optimizer or Inference Engine extension templates to generate. The generated extension files for each argument appear under the output directory as follows:

Command-line Argument    Output Directory Location
--mo-caffe-ext           user_mo_extensions/front/caffe
--mo-mxnet-ext           user_mo_extensions/front/mxnet
--mo-tf-ext              user_mo_extensions/front/tf
--mo-op                  user_mo_extensions/ops
--ie-cpu-ext             user_ie_extensions/cpu
--ie-gpu-ext             user_ie_extensions/gpu
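
For example, to generate all four template sets for a TensorFlow* custom layer in a single run, a command like the following might be used (the output directory name is only a placeholder):

python3 /opt/intel/openvino/deployment_tools/tools/extension_generator/extgen.py new \
       --mo-tf-ext --mo-op --ie-cpu-ext --ie-gpu-ext \
       --output_dir=$HOME/custom_layer_extensions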

Extension Workflow

The workflow for each generated extension follows the same basic steps:

Figure: Model Extension Generator workflow (MEG_generic_flow.png)

Step 1: Generate: Use the Model Extension Generator to generate the Custom Layer Template Files.

Step 2: Edit: Edit the Custom Layer Template Files as necessary to create the specialized Custom Layer Extension Source Code.

Step 3: Specify: Specify the custom layer extension locations to be used by the Model Optimizer or Inference Engine.
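
As a sketch of Step 3 for the Model Optimizer, the generated extension directory can be passed with the --extensions argument (the model file name below is a hypothetical placeholder); the Inference Engine extensions are specified at runtime as shown in the earlier IECore sketch:

python3 /opt/intel/openvino/deployment_tools/model_optimizer/mo.py \
       --input_model model.pb \
       --extensions user_mo_extensions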

Caffe* Models with Custom Layers

If your Caffe* model has custom layers:

Register the custom layers as extensions to the Model Optimizer. For instructions, see Extending Model Optimizer with New Primitives. When your custom layers are registered as extensions, the Model Optimizer generates a valid and optimized Intermediate Representation. You will need a bit of Python* code that tells the Model Optimizer how to handle the custom layers.

If your model contains custom layers, it is important to understand the internal workflow of the Model Optimizer. Consider the following example.

Example:

The network has:

The custom and standard layer types are:

NOTE: If any of the layers are not in one of the three categories described above, the Model Optimizer fails with an appropriate message and a link to the corresponding question in the Model Optimizer FAQ.

The general process is as shown:

Figure: Example custom layer network (mo_caffe_priorities.png)


Step 1: The example model is fed to the Model Optimizer, which loads the model with the special parser built on top of the caffe.proto file. In case of failure, the Model Optimizer asks you to prepare the parser that can read the model. For more information, refer to the Model Optimizer FAQ #1.

Step 2: The Model Optimizer extracts the attributes of all layers by going through the list of layers and attempting to find the appropriate extractor. In order of priority, the Model Optimizer checks if the layer is:

When the Model Optimizer finds a matching condition from the list above, it extracts the attributes according to the following rules:

Step 3: The Model Optimizer calculates the output shape of all layers. The shape calculation follows the same priority order as the extraction above. Important: the Model Optimizer always takes the first available option.

Step 4: The Model Optimizer optimizes the original model and produces the Intermediate Representation (IR) as two files: an .xml file and a .bin file.
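
The output shape calculation in Step 3 above comes from the Model Optimizer operation (the Custom Layer Operation extension). The following is a minimal sketch only, again using the hypothetical 'Cosh' layer, where the output shape simply equals the input shape:

# Minimal sketch of a Model Optimizer operation with shape inference; 'Cosh' is
# a placeholder name for an element-wise custom layer.
from mo.graph.graph import Graph, Node
from mo.ops.op import Op


class CoshOp(Op):
    op = 'Cosh'

    def __init__(self, graph: Graph, attrs: dict):
        super().__init__(graph, {
            'type': __class__.op,
            'op': __class__.op,
            'infer': CoshOp.infer,
        }, attrs)

    @staticmethod
    def infer(node: Node):
        # Element-wise layer: output shape equals input shape.
        node.out_node().shape = node.in_node().shape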

TensorFlow* Models with Custom Layers

You have two options for TensorFlow* models with custom layers:

MXNet* Models with Custom Layers

There are two options to convert your MXNet* model that contains custom layers:

  1. Register the custom layers as extensions to the Model Optimizer. For instructions, see Extending MXNet Model Optimizer with New Primitives. When your custom layers are registered as extensions, the Model Optimizer generates a valid and optimized Intermediate Representation. You can create Model Optimizer extensions for both MXNet layers with op Custom and layers which are not standard MXNet layers.
  2. If you have sub-graphs that should not be expressed with the analogous sub-graph in the Intermediate Representation, but another sub-graph should appear in the model instead, the Model Optimizer provides such an option. In MXNet, this feature is actively used for SSD models: it provides an opportunity to search for the necessary sub-graph sequences and replace them. To read more, see Sub-graph Replacement in the Model Optimizer.

Kaldi* Models with Custom Layers

For information on converting your Kaldi* model containing custom layers, see Converting a Kaldi Model in the Model Optimizer Developer Guide.

ONNX* Models with Custom Layers

For information on converting your ONNX* model containing custom layers, see Converting an ONNX Model in the Model Optimizer Developer Guide.

Step-by-Step Custom Layers Tutorial

For a step-by-step walk-through of creating and executing a custom layer, see the Custom Layer Implementation Tutorial for Linux and Windows.

Additional Resources

Converting Models: