Every deep learning workflow begins with obtaining a model. You can choose to prepare a custom one, use a ready-made solution and adjust it to your needs, or even download and run a pre-trained network from an online database, such as TensorFlow Hub, Hugging Face, or Torchvision models.
OpenVINO™ supports several model formats and allows converting them to its own format, openvino.runtime.Model (ov.Model), providing a tool dedicated to this task.
There are several options to convert a model from the original framework to the OpenVINO model format (ov.Model).
The read_model() method reads a model from a file and produces ov.Model. If the file is in one of the supported original framework file formats, it is converted automatically to OpenVINO Intermediate Representation (IR). If the file is already in the OpenVINO IR format, it is read “as-is”, without any conversion involved. ov.Model can be serialized to IR using the ov.serialize() method. The serialized IR can be further optimized using the Post-Training Optimization Tool, which applies post-training quantization methods.
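The read-then-serialize flow described above can be sketched as follows. This is a minimal sketch, assuming OpenVINO is installed and that the model is read through a Core object; the function names and file paths are hypothetical, and the weights path is passed explicitly so that the location of the .bin file does not depend on any default.

```python
from pathlib import Path


def expected_ir_files(ir_xml_path):
    """Return the (.xml, .bin) paths used when serializing to ir_xml_path."""
    xml_path = Path(ir_xml_path)
    # Assumption for this sketch: the weights file shares the stem of the .xml file.
    return xml_path, xml_path.with_suffix(".bin")


def read_and_serialize(model_path, ir_xml_path):
    """Read a model file into ov.Model, then save it as OpenVINO IR."""
    # Imported lazily so this module loads even where OpenVINO is not installed.
    import openvino.runtime as ov

    core = ov.Core()
    # A supported framework file is converted on the fly; an IR file is read as-is.
    model = core.read_model(str(model_path))
    xml_path, bin_path = expected_ir_files(ir_xml_path)
    # Write the topology (.xml) and the weights (.bin) to the chosen paths.
    ov.serialize(model, str(xml_path), str(bin_path))
    return xml_path, bin_path
```

A call such as read_and_serialize("model.onnx", "ir/model.xml") would then be expected to leave ir/model.xml and ir/model.bin on disk; both file names here are placeholders.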
Convert a model in Python
The model conversion API, specifically the mo.convert_model() method, converts a model from the original framework to an ov.Model object in memory, so the read_model() method is not required. The resulting ov.Model can be used for inference in the same training environment (a Python script or Jupyter Notebook).
mo.convert_model() provides a convenient way to quickly switch from framework-based code to OpenVINO-based code in your inference application. In addition to model files, mo.convert_model() can take OpenVINO extension objects constructed directly in Python for easier conversion of operations that are not supported in OpenVINO. The mo.convert_model() method also has a set of parameters to cut the model, set input shapes or layout, add preprocessing, etc.
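As a rough illustration of in-memory conversion, the sketch below passes a model source and a couple of optional parameters to mo.convert_model() and compiles the result. It assumes OpenVINO with the openvino.tools.mo package is installed; the helper names, the "CPU" device, and the choice of input_shape and layout as the only forwarded parameters are assumptions made for this example.

```python
def conversion_params(input_shape=None, layout=None):
    """Collect only the optional conversion parameters that were actually set."""
    candidates = {"input_shape": input_shape, "layout": layout}
    return {name: value for name, value in candidates.items() if value is not None}


def convert_and_compile(model_source, device="CPU", **params):
    """Convert a framework model to ov.Model in memory and compile it."""
    # Imported lazily so this module loads even where OpenVINO is not installed.
    import openvino.runtime as ov
    from openvino.tools import mo

    # model_source may be a model file path or a framework object held in Python.
    ov_model = mo.convert_model(model_source, **params)
    # No read_model() step is needed: the ov.Model already lives in memory.
    return ov.Core().compile_model(ov_model, device)
```

For example, convert_and_compile("model.onnx", **conversion_params(input_shape=[1, 3, 224, 224])) would convert a hypothetical ONNX file with a fixed input shape and return a compiled model ready for inference.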
Convert a model with the mo command-line tool
Another option to convert a model is to use the mo command-line tool. mo is a cross-platform tool that facilitates the transition between training and deployment environments, performs static model analysis, and adjusts deep learning models for optimal execution on end-point target devices, to the same extent as the mo.convert_model() method.
mo requires a pre-trained deep learning model in one of the supported formats: TensorFlow, TensorFlow Lite, PaddlePaddle, or ONNX. mo converts the model to the OpenVINO Intermediate Representation (IR) format, which needs to be read with the ov.read_model() method. You can then compile the resulting ov.Model and run inference on it with OpenVINO™ Runtime.
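The command-line route can be sketched from Python as well: run mo as a subprocess, then read and compile the IR it writes. This is a minimal sketch, assuming the mo executable is on the PATH and OpenVINO is installed; the helper names, paths, and the "CPU" device are hypothetical, and the model is read through a Core object. The output name is set explicitly with --model_name so that the location of the resulting .xml file is known.

```python
import subprocess
from pathlib import Path


def build_mo_command(input_model, output_dir, model_name):
    """Assemble the mo invocation as an argument list."""
    return [
        "mo",
        "--input_model", str(input_model),
        "--output_dir", str(output_dir),
        "--model_name", model_name,
    ]


def convert_with_mo_and_compile(input_model, output_dir, model_name="model", device="CPU"):
    """Convert a model file to IR with mo, then read and compile the IR."""
    # Imported lazily so this module loads even where OpenVINO is not installed.
    import openvino.runtime as ov

    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_mo_command(input_model, output_dir, model_name), check=True)
    ir_xml = Path(output_dir) / f"{model_name}.xml"
    core = ov.Core()
    # The IR produced by mo has to be read back before it can be compiled.
    model = core.read_model(str(ir_xml))
    return core.compile_model(model, device)
```

Keeping the command in a separate helper makes it easy to print or log the exact invocation before running it.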
The figure below illustrates the typical workflow for deploying a trained deep learning model:
where IR is a pair of files describing the model:
.xml - Describes the network topology.
.bin - Contains the binary data of the weights and biases.
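Because an IR is a pair of files, it can be useful to confirm that both are present before loading. The helper below is a small sketch of such a check; its name is hypothetical, and it assumes the .bin file sits next to the .xml file with the same stem, which may not hold if the weights were saved under a different name.

```python
from pathlib import Path


def missing_ir_files(xml_path):
    """Return the files of the IR pair that do not exist on disk."""
    xml_path = Path(xml_path)
    # Assumption: topology and weights share a stem, e.g. model.xml and model.bin.
    pair = [xml_path, xml_path.with_suffix(".bin")]
    return [path for path in pair if not path.is_file()]
```

An empty list means both files were found; otherwise the list names what is missing.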
Model files (not Python objects) from ONNX, PaddlePaddle, TensorFlow, and TensorFlow Lite (check TensorFlow Frontend Capabilities and Limitations) do not require a separate model conversion step, that is, mo.convert_model. OpenVINO provides C++ and Python APIs for importing such models into OpenVINO Runtime directly by just calling the read_model method.
The results of both the mo and mo.convert_model() conversion methods described above are the same. You can choose whichever is most convenient for you. Keep in mind that there should not be any differences in the results of model conversion if the same set of parameters is used.
This section describes how to obtain and prepare your model for work with OpenVINO to get the best inference results: