Model Preparation

Every deep learning workflow begins with obtaining a model. You can choose to prepare a custom one, use a ready-made solution and adjust it to your needs, or even download and run a pre-trained network from an online database, such as TensorFlow Hub, Hugging Face, or Torchvision models.

Import a model using read_model()

Model files (not Python objects) from ONNX, PaddlePaddle, TensorFlow, and TensorFlow Lite (check TensorFlow Frontend Capabilities and Limitations) do not require a separate model conversion step, that is, a call to mo.convert_model().

The read_model() method reads a model from a file and produces an openvino.runtime.Model object. If the file is in one of the supported original framework formats, the method runs internal conversion to an OpenVINO model format. If the file is already in the OpenVINO IR format, it is read "as-is", without any conversion involved.
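For example, a minimal sketch of reading a model file (the file name model.onnx is a placeholder for your own model file):

    import openvino.runtime as ov

    core = ov.Core()
    # Reads an original framework file (here: ONNX) or an OpenVINO IR file
    # and returns an openvino.runtime.Model object.
    model = core.read_model("model.onnx")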

You can also convert a model from an original framework format to openvino.runtime.Model with the convert_model() method. More details about convert_model() are provided in the model conversion guide.

An ov.Model can be serialized to IR with the ov.serialize() method. The serialized IR can be further optimized with the Neural Network Compression Framework (NNCF), which applies post-training quantization methods.
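A minimal sketch of serializing the model read above to IR (the output path model.xml is an assumption; the matching model.bin file is written alongside it):

    from openvino.runtime import serialize

    # Writes model.xml (topology) and model.bin (weights) to disk.
    serialize(model, "model.xml")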

Note

convert_model() also allows you to cut model inputs/outputs, add pre-processing, or add custom Python conversion extensions.

Convert a model with Python using mo.convert_model()

Model conversion API, specifically the mo.convert_model() method, converts a model from an original framework format to ov.Model. mo.convert_model() returns an ov.Model object in memory, so the read_model() method is not required. The resulting ov.Model can be inferred in the same training environment (Python script or Jupyter Notebook). mo.convert_model() provides a convenient way to quickly switch from framework-based code to OpenVINO-based code in your inference application.

In addition to model files, mo.convert_model() can take OpenVINO extension objects constructed directly in Python for easier conversion of operations that are not supported in OpenVINO. The mo.convert_model() method also has a set of parameters to cut the model, set input shapes or layout, add preprocessing, and so on, as shown in the sketch below.
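A minimal sketch of in-memory conversion with mo.convert_model() (the file name model.onnx and the input shape are assumptions for illustration):

    from openvino.runtime import Core
    from openvino.tools.mo import convert_model

    # Convert the original model and override its input shape in one call.
    ov_model = convert_model("model.onnx", input_shape=[1, 3, 224, 224])

    # The resulting ov.Model can be compiled directly, without read_model().
    compiled_model = Core().compile_model(ov_model, "CPU")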

The figure below illustrates the typical workflow for deploying a trained deep learning model, where IR is a pair of files describing the model:

  • .xml - Describes the network topology.

  • .bin - Contains the weights and biases binary data.

[Figure: model conversion diagram]

Convert a model using mo command-line tool

Another option for converting a model is to use the mo command-line tool. mo is a cross-platform tool that facilitates the transition between training and deployment environments, performs static model analysis, and adjusts deep learning models for optimal execution on end-point target devices, in the same way as the mo.convert_model() method.

mo requires a pre-trained deep learning model in one of the supported formats: TensorFlow, TensorFlow Lite, PaddlePaddle, or ONNX. mo converts the model to the OpenVINO Intermediate Representation (IR) format, which needs to be read with the read_model() method. You can then compile and infer the ov.Model with OpenVINO™ Runtime.
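A minimal sketch of loading IR produced by mo (assuming the tool was run as, for example, mo --input_model model.onnx, which writes model.xml and model.bin):

    import openvino.runtime as ov

    core = ov.Core()
    model = core.read_model("model.xml")               # also picks up model.bin
    compiled_model = core.compile_model(model, "CPU")  # ready for inference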

The results of the mo and mo.convert_model() conversion methods described above are identical, provided the same set of parameters is used. You can choose whichever is more convenient for you.

This section describes how to obtain and prepare your model for work with OpenVINO to get the best inference results: