Model Optimizer is a cross-platform command-line tool that facilitates the transition between the training and deployment environment, performs static model analysis, and adjusts deep learning models for optimal execution on end-point target devices.
The Model Optimizer process assumes you have a network model trained using a supported deep learning framework. The typical workflow for deploying a trained deep learning model is to convert it to an Intermediate Representation with the Model Optimizer and then run it with the Inference Engine.
The Model Optimizer produces an Intermediate Representation (IR) of the network, which can be read, loaded, and inferred with the Inference Engine. The Inference Engine offers a unified API across a number of supported Intel® platforms. The Intermediate Representation is a pair of files describing the model:
- `.xml` - Describes the network topology
- `.bin` - Contains the weights and biases binary data
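As a rough illustration of how such an IR pair might be consumed, the sketch below uses the Inference Engine Python API (`openvino.inference_engine`); the class and method names can differ between releases, and the model paths and input data are placeholders rather than part of this guide.

```python
import numpy as np
from openvino.inference_engine import IECore, IENetwork

# Paths to the IR files produced by the Model Optimizer (placeholders)
model_xml = "model.xml"   # network topology
model_bin = "model.bin"   # weights and biases

ie = IECore()
net = IENetwork(model=model_xml, weights=model_bin)

# Pick the first input/output blob names declared in the IR
input_blob = next(iter(net.inputs))
output_blob = next(iter(net.outputs))

# Load the network onto a target device and run inference on dummy data
exec_net = ie.load_network(network=net, device_name="CPU")
dummy_input = np.zeros(net.inputs[input_blob].shape, dtype=np.float32)
result = exec_net.infer(inputs={input_blob: dummy_input})
print(result[output_blob].shape)
```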
What's New in the Model Optimizer in this Release?
- TensorFlow*
- Added support for the following TensorFlow* topologies: quantized image classification topologies, TensorFlow Object Detection API RFCN version 1.10+, Tiny YOLO v3, and full DeepLab v3 without the need to remove the pre-processing part.
- Added support for batch sizes greater than 1 for TensorFlow* Object Detection API Faster/Mask RCNNs and RFCNs.
- Added support for the following TensorFlow* operations: ReverseSequence, ReverseV2, ZerosLike, Exp, Sum. The full list of supported operations is defined in the Supported Framework Layers.
- Caffe*
- Added support for the following Caffe* operations: StridedSlice, Bias. The full list of supported operations is defined in the Supported Framework Layers.
- Added support for the following Caffe* topologies: RefineDet.
- Caffe fallback for shape inference is deprecated.
- MXNet*
- Added support for the following MXNet* operations: Embedding, Zero with "shape" equal to 0, and RNN with mode="gru", "rnn_relu", or "rnn_tanh" and the parameters "num_layer", "bidirectional", and "clip". The full list of supported operations is defined in the Supported Framework Layers.
- Added support for bidirectional LSTMs.
- Added support for LSTMs with batch sizes greater than 1 and multiple layers.
- Fixed loading Gluon models with attributes equal to "None".
- ONNX*
- Added support for the following ONNX* operations: ArgMax, Clip, Exp, DetectionOutput, PriorBox, RNN, and GRU with the parameters "direction", "activations", "clip", "hidden_size", and "linear_before_reset"; LSTM support is extended with the "activations", "clip", and "direction" attributes. The full list of supported operations is defined in the Supported Framework Layers.
- Extended the ConvTranspose operation to support N-dimensional inputs.
- Resolved issue with Gemm operation with biases.
- Common
- Implemented an experimental feature to generate the IR with "Shape" layers. This makes it possible to change model input shapes in the Inference Engine (using the dedicated reshape API) instead of re-generating the model in the Model Optimizer. The feature is enabled by passing the `--keep_shape_ops` command-line parameter to the Model Optimizer (see the sketch after this list).
- Introduced a new graph transformation API to perform graph modifications in the Model Optimizer.
- Added the ability to enable/disable Model Optimizer extensions using the `MO_ENABLED_TRANSFORMS` and `MO_DISABLED_TRANSFORMS` environment variables, respectively.
- Fixed an issue with Deconvolution shape inference when the stride is not equal to 1.
- Added a Concat optimization pass that removes excess edges between Concat operations.
- Updated the IR version from 4 to 5. An IR of version 2 can still be generated using the `--generate_deprecated_IR_V2` command-line parameter.
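The reshape workflow enabled by `--keep_shape_ops` (mentioned in the list above) could look roughly like the following sketch; the Model Optimizer invocation and the new input shape are illustrative assumptions, and the reshape API details may vary by release.

```python
from openvino.inference_engine import IECore, IENetwork

# Assumed prior step (shell): generate the IR keeping "Shape" layers, e.g.
#   python mo.py --input_model frozen_model.pb --keep_shape_ops

net = IENetwork(model="model.xml", weights="model.bin")
ie = IECore()

input_blob = next(iter(net.inputs))
print("Original input shape:", net.inputs[input_blob].shape)

# Change the input shape at load time instead of re-running the Model Optimizer
net.reshape({input_blob: [1, 3, 320, 320]})  # illustrative NCHW shape
exec_net = ie.load_network(network=net, device_name="CPU")
```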
Note that certain topology-specific layers (such as DetectionOutput used in SSD*) are now shipped as source code, which assumes that the extensions library is compiled and loaded. The extensions are also required for inference of the pre-trained models.
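Where the extensions library is needed, it is typically registered with the target device before loading the network; the following is a minimal sketch assuming the Python API's `add_extension` call, with an illustrative, platform-dependent library name and model paths.

```python
from openvino.inference_engine import IECore, IENetwork

ie = IECore()
# Register the CPU extensions library implementing layers such as DetectionOutput
# (the library name below is illustrative and platform-dependent)
ie.add_extension("libcpu_extension_avx2.so", "CPU")

net = IENetwork(model="ssd_model.xml", weights="ssd_model.bin")
exec_net = ie.load_network(network=net, device_name="CPU")
```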
Typical Next Step: Introduction to Intel® Deep Learning Deployment Toolkit