Model Preparation¶
Every deep learning workflow begins with obtaining a model. You can prepare a custom one, use a ready-made solution and adjust it to your needs, or download and run a pre-trained network from an online database, such as TensorFlow Hub, Hugging Face, or Torchvision models.
OpenVINO™ supports several model formats and provides a dedicated tool for converting them to its own format, openvino.runtime.Model (ov.Model).
There are several options for converting a model from an original framework format to the OpenVINO model format (ov.Model).
The read_model() method reads a model from a file and produces ov.Model. If the file is in one of the supported original framework formats, it is converted automatically to OpenVINO Intermediate Representation (IR). If the file is already in the OpenVINO IR format, it is read "as-is", without any conversion involved. An ov.Model can be serialized to IR using ov.serialize(). The serialized IR can be further optimized using the Post-Training Optimization Tool, which applies post-training quantization methods.
Convert a model in Python¶
Model conversion API, specifically the mo.convert_model() method, converts a model from an original framework format to ov.Model. mo.convert_model() returns the ov.Model object in memory, so the read_model() method is not required. The resulting ov.Model can be used for inference in the same training environment (a Python script or Jupyter Notebook). mo.convert_model() provides a convenient way to quickly switch from framework-based code to OpenVINO-based code in your inference application. In addition to model files, mo.convert_model() can take OpenVINO extension objects constructed directly in Python for easier conversion of operations that are not supported in OpenVINO. The mo.convert_model() method also has a set of parameters to cut the model, set input shapes or layout, add preprocessing, and so on.
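A short sketch of this flow, assuming an ONNX model file as input (the path and input shape are placeholders):

```python
# A sketch assuming an ONNX model file; the path and shape are placeholders.
from openvino.tools.mo import convert_model
from openvino.runtime import Core

# convert_model() returns an ov.Model in memory; read_model() is not needed.
ov_model = convert_model("model.onnx", input_shape=[1, 3, 224, 224])

# The converted model can be compiled and inferred in the same script.
compiled_model = Core().compile_model(ov_model, "CPU")
```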
Convert a model with the mo command-line tool¶
Another option for converting a model is the mo command-line tool. mo is a cross-platform tool that facilitates the transition between training and deployment environments, performs static model analysis, and adjusts deep learning models for optimal execution on end-point target devices, in the same way as the mo.convert_model() method.
mo requires a pre-trained deep learning model in one of the supported formats: TensorFlow, TensorFlow Lite, PaddlePaddle, or ONNX. mo converts the model to the OpenVINO Intermediate Representation (IR) format, which then needs to be read with the ov.read_model() method. You can compile and infer the resulting ov.Model later with OpenVINO™ Runtime.
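For example, converting an ONNX file to IR from the shell might look like this (the file and directory names are placeholders):

```sh
# Convert a model file to OpenVINO IR; "model.onnx" and "ir_output" are placeholders.
mo --input_model model.onnx --output_dir ir_output
```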
The figure below illustrates the typical workflow for deploying a trained deep learning model:
where IR is a pair of files describing the model:
.xml - Describes the network topology.
.bin - Contains the weights and biases binary data.
Model files (not Python objects) from ONNX, PaddlePaddle, TensorFlow, and TensorFlow Lite (check TensorFlow Frontend Capabilities and Limitations) do not require a separate model conversion step, that is, running mo.convert_model. OpenVINO provides C++ and Python APIs for importing such models into OpenVINO Runtime directly, by simply calling the read_model method.
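As a sketch of this direct import, assuming an ONNX file (the path is a placeholder):

```python
# A sketch of direct import, assuming an ONNX file; the path is a placeholder.
from openvino.runtime import Core

core = Core()
# No explicit conversion step: compile_model() accepts the original model file directly.
compiled_model = core.compile_model("model.onnx", "CPU")
```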
The mo command-line tool and the mo.convert_model() method described above produce the same results, so you can choose whichever is more convenient for you. Keep in mind that there should be no difference in the results of model conversion as long as the same set of parameters is used.
This section describes how to obtain and prepare your model for work with OpenVINO to get the best inference results: