Model Optimizer Usage

Model Optimizer is a cross-platform command-line tool that facilitates the transition between training and deployment environments, performs static model analysis, and adjusts deep learning models for optimal execution on end-point target devices.

To use it, you need a pre-trained deep learning model in one of the supported formats: TensorFlow, PyTorch, PaddlePaddle, MXNet, Caffe, Kaldi, or ONNX. Model Optimizer converts the model to the OpenVINO Intermediate Representation (IR) format, on which you can later run inference with OpenVINO™ Runtime.

Note that Model Optimizer does not infer models.

The figure below illustrates the typical workflow for deploying a trained deep learning model:

_images/BASIC_FLOW_MO_simplified.svg

where IR is a pair of files describing the model:

  • .xml - Describes the network topology.

  • .bin - Contains the weights and biases binary data.
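As a minimal sketch, a deployment script could verify that both halves of an IR are present before handing the model to the runtime. The helper names `ir_pair` and `ir_is_complete` are hypothetical and not part of OpenVINO, and the code assumes the two files sit in the same directory and share a base name:

```python
from pathlib import Path


def ir_pair(xml_path):
    """Return (xml, bin) paths for an IR, assuming both share a base name."""
    xml = Path(xml_path)
    if xml.suffix != ".xml":
        raise ValueError(f"expected an .xml file, got {xml.name}")
    # Assumption: the weights file sits next to the topology file.
    return xml, xml.with_suffix(".bin")


def ir_is_complete(xml_path):
    """Return True only if both the topology and the weights files exist."""
    xml, weights = ir_pair(xml_path)
    return xml.is_file() and weights.is_file()
```

Calling `ir_is_complete("model.xml")` returns `False` if either `model.xml` or `model.bin` is missing, which is a cheap way to catch an incomplete copy of the converted model.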

The OpenVINO IR can be further optimized for inference with Post-training optimization, which applies post-training quantization methods.

Tip

You can also work with Model Optimizer in OpenVINO™ Deep Learning Workbench (DL Workbench), a web-based tool with a GUI for optimizing, fine-tuning, analyzing, visualizing, and comparing the performance of deep learning models.

How to Run Model Optimizer

To convert a model to IR, you can run Model Optimizer by using the following command:

mo --input_model INPUT_MODEL

If the out-of-the-box conversion (only the --input_model parameter is specified) is not successful, use the parameters mentioned below to override input shapes and cut the model:

  • Model Optimizer provides two parameters to override original input shapes for model conversion: --input and --input_shape. For more information about these parameters, refer to the Setting Input Shapes guide.

  • To cut off unwanted parts of a model (such as unsupported operations and training sub-graphs), use the --input and --output parameters to define new inputs and outputs of the converted model. For a more detailed description, refer to the Cutting Off Parts of a Model guide.
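When conversions are scripted, the parameters above can be assembled programmatically. The following is a rough sketch only: `build_mo_command` is a hypothetical helper, not an OpenVINO API, and the model and tensor names are placeholders. It uses only the `--input_model`, `--input`, `--input_shape`, and `--output` parameters described in this section:

```python
import shlex


def build_mo_command(input_model, inputs=None, input_shapes=None, outputs=None):
    """Assemble an `mo` argument list from the parameters discussed above."""
    cmd = ["mo", "--input_model", input_model]
    if inputs:
        cmd += ["--input", ",".join(inputs)]
    if input_shapes:
        # Each shape becomes "[d0,d1,...]"; shapes are joined with commas.
        shapes = ",".join(
            "[" + ",".join(str(dim) for dim in shape) + "]" for shape in input_shapes
        )
        cmd += ["--input_shape", shapes]
    if outputs:
        cmd += ["--output", ",".join(outputs)]
    return cmd


# Placeholder model and input names, for illustration only.
cmd = build_mo_command(
    "model.pb",
    inputs=["mask", "word_ids"],
    input_shapes=[[2, 30], [2, 30]],
)
# shlex.join quotes arguments so the line can be pasted into a POSIX shell.
print(shlex.join(cmd))
```

Passing the returned list directly to `subprocess.run` avoids shell quoting problems with the square brackets in shape values.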

You can also insert additional input pre-processing sub-graphs into the converted model by using the --mean_values, --scale_values, --layout, and other parameters described in the Embedding Preprocessing Computation article.
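To make the effect of mean and scale values concrete, here is a small sketch of the per-channel arithmetic that such a pre-processing sub-graph performs. It assumes the mean is subtracted first and the result is then divided by the scale; the function name and the pixel values are illustrative only:

```python
def normalize_pixel(pixel, mean_values, scale_values):
    """Apply per-channel mean subtraction followed by division by scale."""
    return [(p - m) / s for p, m, s in zip(pixel, mean_values, scale_values)]


# Illustrative values only: one pixel with three channels.
pixel = [123.0, 117.0, 104.0]
result = normalize_pixel(
    pixel,
    mean_values=[123, 117, 104],
    scale_values=[255, 255, 255],
)
print(result)
```

Embedding this computation in the IR means the application can feed raw input data and does not need to repeat the normalization in its own code.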

The --compress_to_fp16 compression parameter in Model Optimizer allows generating IR with constants (for example, weights for convolutions and matrix multiplications) compressed to FP16 data type. For more details, refer to the Compression of a Model to FP16 guide.
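The precision cost of storing a constant in FP16 can be illustrated with the Python standard library alone. This sketch is not how Model Optimizer implements compression; it only rounds a few example values to half precision and back to show the size of the rounding error:

```python
import struct


def to_fp16_and_back(value):
    """Round a Python float to the nearest FP16 value and return it as a float."""
    # The "e" format code is IEEE 754 binary16 (half precision).
    return struct.unpack("<e", struct.pack("<e", value))[0]


# Example weights chosen for illustration only.
for weight in (0.5, 0.1, 1e-3):
    compressed = to_fp16_and_back(weight)
    error = abs(weight - compressed)
    print(f"{weight!r} -> {compressed!r} (abs. error {error:.2e})")
```

Values that are exactly representable in half precision, such as 0.5, survive unchanged, while values such as 0.1 pick up a small rounding error. Magnitudes above the FP16 maximum of 65504 cannot be packed this way and raise `OverflowError`.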

To get the full list of conversion parameters available in Model Optimizer, run the following command:

mo --help

Examples of CLI Commands

Below is a list of separate examples for different frameworks and Model Optimizer parameters:

  1. Launch Model Optimizer for a TensorFlow MobileNet model in the binary protobuf format:

    mo --input_model MobileNet.pb

    Launch Model Optimizer for a TensorFlow BERT model in the SavedModel format with three inputs. Specify the input shapes explicitly, with a batch size of 2 and a sequence length of 30:

    mo --saved_model_dir BERT --input mask,word_ids,type_ids --input_shape [2,30],[2,30],[2,30]

    For more information, refer to the Converting a TensorFlow Model guide.

  2. Launch Model Optimizer for an ONNX OCR model and specify new output explicitly:

    mo --input_model ocr.onnx --output probabilities

    For more information, refer to the Converting an ONNX Model guide.

Note

PyTorch models must be exported to the ONNX format before conversion into IR. More information can be found in Converting a PyTorch Model.

  3. Launch Model Optimizer for a PaddlePaddle UNet model and apply mean-scale normalization to the input:

    mo --input_model unet.pdmodel --mean_values [123,117,104] --scale 255

    For more information, refer to the Converting a PaddlePaddle Model guide.

  4. Launch Model Optimizer for an Apache MXNet SSD Inception V3 model and specify first-channel layout for the input:

    mo --input_model ssd_inception_v3-0000.params --layout NCHW

    For more information, refer to the Converting an Apache MXNet Model guide.

  5. Launch Model Optimizer for a Caffe AlexNet model whose input channels are in the RGB format and need to be reversed:

    mo --input_model alexnet.caffemodel --reverse_input_channels

    For more information, refer to the Converting a Caffe Model guide.

  6. Launch Model Optimizer for a Kaldi LibriSpeech nnet2 model:

    mo --input_model librispeech_nnet2.mdl --input_shape [1,140]

    For more information, refer to the Converting a Kaldi Model guide.