OpenVINO Workflow

With the model conversion API guide, you will learn how to convert pre-trained models for use with OpenVINO™. You can use your own models or choose from a broad selection available in online databases such as TensorFlow Hub, Hugging Face, and Torchvision.
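As a quick illustration, a minimal conversion sketch might look like the following; it assumes the current openvino Python API and a hypothetical local ONNX file named model.onnx:

```python
import openvino as ov

# Convert a pre-trained model (here, a hypothetical local ONNX file)
# into OpenVINO's in-memory model representation.
ov_model = ov.convert_model("model.onnx")

# Save it as OpenVINO IR (model.xml + model.bin) for later optimization and inference.
ov.save_model(ov_model, "model.xml")
```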
In this section, you will also find out how to optimize a model to achieve better inference performance. The model optimization guide describes multiple optimization methods for both the training and post-training stages.
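For example, a post-training optimization sketch with NNCF might look like this; the IR file name, the 1×3×224×224 input shape, and the random calibration samples are placeholders for your own model and data:

```python
import numpy as np
import nncf
import openvino as ov

core = ov.Core()
model = core.read_model("model.xml")  # IR produced by the conversion step

# Stand-in calibration data shaped like the assumed model input;
# in practice, use a few hundred representative samples from your dataset.
data_source = [np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in range(10)]
calibration_dataset = nncf.Dataset(data_source, lambda sample: sample)

# Post-training 8-bit quantization: no retraining required.
quantized_model = nncf.quantize(model, calibration_dataset)
ov.save_model(quantized_model, "model_int8.xml")
```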
This section also describes how to run inference, which is the most basic form of deployment and the quickest way to get results from a model.
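A minimal synchronous inference sketch, assuming the IR produced above and the same 1×3×224×224 input shape, could look like this:

```python
import numpy as np
import openvino as ov

core = ov.Core()

# Compile the IR for a target device; "AUTO" lets OpenVINO pick the
# best available device on the system (CPU, GPU, ...).
compiled_model = core.compile_model("model.xml", "AUTO")

# Stand-in input data with the assumed input shape.
input_data = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Run a single inference and read the first output tensor.
result = compiled_model([input_data])[compiled_model.output(0)]
print(result.shape)
```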

Once you have a model that meets both OpenVINO™ requirements and your own, you can choose how to deploy it with your application.

Local deployment uses OpenVINO Runtime, which is called from, and linked to, the application directly.
It utilizes resources available to the system and provides the quickest way of launching inference.
Deployment on a local system requires performing the steps from the running inference section.
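Since local deployment runs on whatever hardware the host exposes, it can help to check which devices OpenVINO Runtime sees before compiling the model. A small sketch (device names will vary per system):

```python
import openvino as ov

core = ov.Core()

# List the inference devices OpenVINO Runtime can use on this machine,
# e.g. ["CPU", "GPU"]; the exact set depends on hardware and drivers.
print(core.available_devices)

# Compile for a specific device, or use "AUTO" to let the runtime choose.
compiled_model = core.compile_model("model.xml", "CPU")
```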
Deployment via OpenVINO Model Server allows the application to connect to an inference server set up remotely.
This way, inference can use external resources instead of those available to the application itself.
Deployment on a model server can be done quickly and does not require the steps described in the running inference section.
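On the client side, the application only sends requests to the server. A hedged sketch using the ovmsclient package, where the server address, the "input" tensor name, and the "my_model" model name are all placeholders for a real deployment:

```python
import numpy as np
from ovmsclient import make_grpc_client

# Connect to a remote OpenVINO Model Server instance (placeholder address).
client = make_grpc_client("localhost:9000")

# Stand-in request data; shape and input name depend on the served model.
input_data = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Inference runs on the server's resources, not the application's.
output = client.predict(inputs={"input": input_data}, model_name="my_model")
print(output)
```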