Every deep learning workflow begins with obtaining a model. You can choose to prepare a custom one, use a ready-made solution and adjust it to your needs, or even download and run a pre-trained network from an online database, such as TensorFlow Hub, Hugging Face, or Torchvision models.
Import a model using read_model()
Model files (not Python objects) from ONNX, PaddlePaddle, TensorFlow, and TensorFlow Lite (check TensorFlow Frontend Capabilities and Limitations) do not require a separate model conversion step, that is, a call to mo.convert_model().
The read_model() method reads a model from a file and produces an openvino.runtime.Model. If the file is in one of the supported original framework file formats, the method runs an internal conversion to the OpenVINO model format. If the file is already in the OpenVINO IR format, it is read "as-is", without any conversion involved.
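The reading step described above can be sketched as follows. This is a minimal illustration, not the official recipe: the suffix table and both helper names are hypothetical, and the sketch assumes that read_model() is called on an openvino.runtime.Core instance, as in the 2023 Python API.

```python
from pathlib import Path

# Assumed mapping from file suffix to format name; illustrative only,
# based on the formats listed in the text above.
DIRECTLY_READABLE = {
    ".xml": "OpenVINO IR",
    ".onnx": "ONNX",
    ".pdmodel": "PaddlePaddle",
    ".pb": "TensorFlow",
    ".tflite": "TensorFlow Lite",
}


def describe_model_file(path):
    """Return the assumed format name for a model file, or None if unknown."""
    return DIRECTLY_READABLE.get(Path(path).suffix.lower())


def load_model(path):
    """Read a model file into an ov.Model (hypothetical helper)."""
    # Imported lazily so describe_model_file() works without OpenVINO installed.
    import openvino.runtime as ov

    core = ov.Core()
    # Framework files are converted internally; IR files are read as-is.
    return core.read_model(model=str(path))
```

With OpenVINO installed, `load_model("model.onnx")` and `load_model("model.xml")` would both return an ov.Model; the file names here are placeholders.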
ov.Model can be serialized to IR using the ov.serialize() method. The serialized IR can be further optimized using the Neural Network Compression Framework (NNCF), which applies post-training quantization methods.
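A minimal sketch of the serialization step, assuming `ov` refers to openvino.runtime and that serialize() takes the model followed by the .xml and .bin paths as strings. The helper names `ir_paths` and `save_as_ir` are hypothetical and exist only for this illustration.

```python
from pathlib import Path


def ir_paths(output_dir, name):
    """Build the .xml/.bin path pair for an IR called `name` (hypothetical helper)."""
    out = Path(output_dir)
    return out / f"{name}.xml", out / f"{name}.bin"


def save_as_ir(model, output_dir, name="model"):
    """Serialize an in-memory ov.Model to an IR file pair and return both paths."""
    # Imported lazily so ir_paths() can be used without OpenVINO installed.
    import openvino.runtime as ov

    xml_path, bin_path = ir_paths(output_dir, name)
    xml_path.parent.mkdir(parents=True, exist_ok=True)
    # Paths are passed as strings, which is assumed to work across versions.
    ov.serialize(model, str(xml_path), str(bin_path))
    return xml_path, bin_path
```

The resulting pair of files could then be handed to a post-training quantization tool such as NNCF, which is not shown here.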
convert_model() also allows you to cut model inputs and outputs, add pre-processing, or add custom Python conversion extensions.
Convert a model with Python using mo.convert_model()
The model conversion API, specifically the mo.convert_model() method, converts a model from the original framework to an ov.Model object in memory, so the read_model() method is not required. The resulting ov.Model can be inferred in the same training environment (Python script or Jupyter Notebook).
mo.convert_model() provides a convenient way to quickly switch from framework-based code to OpenVINO-based code in your inference application.
In addition to model files, mo.convert_model() can take OpenVINO extension objects constructed directly in Python for easier conversion of operations that are not supported in OpenVINO. The mo.convert_model() method also has a set of parameters to cut the model, set input shapes or layout, add preprocessing, etc.
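The in-memory conversion path can be sketched as below. The sketch assumes that convert_model is importable from openvino.tools.mo and accepts the keyword parameters input_model, input_shape, layout, mean_values, and scale_values, as in the 2023 releases. The helper names, the default "CPU" device, and any file name passed in are hypothetical.

```python
def build_convert_kwargs(input_shape=None, layout=None,
                         mean_values=None, scale_values=None):
    """Collect only the conversion parameters that were actually set."""
    kwargs = {}
    if input_shape is not None:
        kwargs["input_shape"] = list(input_shape)
    if layout is not None:
        kwargs["layout"] = layout
    if mean_values is not None:
        kwargs["mean_values"] = list(mean_values)
    if scale_values is not None:
        kwargs["scale_values"] = list(scale_values)
    return kwargs


def convert_and_compile(model_path, device="CPU", **options):
    """Convert a framework model in memory and compile it (hypothetical helper)."""
    # Imported lazily so build_convert_kwargs() works without OpenVINO installed.
    import openvino.runtime as ov
    from openvino.tools.mo import convert_model

    # convert_model() returns an ov.Model, so no read_model() call is needed.
    ov_model = convert_model(input_model=str(model_path),
                             **build_convert_kwargs(**options))
    return ov.Core().compile_model(ov_model, device)
```

For example, `convert_and_compile("model.onnx", input_shape=(1, 3, 224, 224), layout="nchw")` would fix the input shape and layout during conversion, assuming such a file exists.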
The figure below illustrates the typical workflow for deploying a trained deep learning model, where IR is a pair of files describing the model:
.xml - Describes the network topology.
.bin - Contains the weights and biases binary data.
Convert a model using the mo command-line tool
Another option to convert a model is to use the mo command-line tool. mo is a cross-platform tool that facilitates the transition between training and deployment environments, performs static model analysis, and adjusts deep learning models for optimal execution on end-point target devices, to the same extent as the mo.convert_model() method.
mo requires a pre-trained deep learning model in one of the supported formats: TensorFlow, TensorFlow Lite, PaddlePaddle, or ONNX. mo converts the model to the OpenVINO Intermediate Representation (IR) format, which needs to be read with the ov.read_model() method. Then, you can compile and infer the ov.Model later with OpenVINO™ Runtime.
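The command-line route followed by reading the IR can be sketched from Python as follows. It assumes the mo executable is on PATH and accepts the --input_model, --output_dir, and --input_shape options. It also assumes that the generated .xml file is named after the input file, which is believed to be the default but is not confirmed by the text above. Helper names and paths are hypothetical.

```python
import subprocess
from pathlib import Path


def build_mo_command(input_model, output_dir, input_shape=None):
    """Build the argument list for an mo invocation (hypothetical helper)."""
    cmd = ["mo", "--input_model", str(input_model), "--output_dir", str(output_dir)]
    if input_shape is not None:
        # mo expects the shape in bracketed form, for example [1,3,224,224].
        cmd += ["--input_shape", "[" + ",".join(str(d) for d in input_shape) + "]"]
    return cmd


def convert_with_mo_and_compile(input_model, output_dir, device="CPU"):
    """Run mo, then read and compile the resulting IR (hypothetical helper)."""
    # Imported lazily so build_mo_command() works without OpenVINO installed.
    import openvino.runtime as ov

    subprocess.run(build_mo_command(input_model, output_dir), check=True)
    # Assumption: the IR is named after the input file's stem.
    xml_path = Path(output_dir) / (Path(input_model).stem + ".xml")
    core = ov.Core()
    model = core.read_model(model=str(xml_path))
    return core.compile_model(model, device)
```

Keeping the command as a list rather than a single string avoids shell quoting problems with the bracketed shape argument.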
The results of both the mo and mo.convert_model() conversion methods described above are the same. You can choose either one, depending on what is most convenient for you. Keep in mind that there should not be any differences in the results of model conversion if the same set of parameters is used.
This section describes how to obtain and prepare your model for work with OpenVINO to get the best inference results: