OpenVINO Workflow

OpenVINO offers multiple workflows, depending on the use case and personal or project preferences. This section gives you a detailed view of how to go from preparing your model, through optimizing it, to executing inference and deploying your solution.

Once you obtain a model in one of the supported model formats, you can decide how to proceed:

Workflow for convenience: this approach assumes you run your model directly in its original format, so any conversion happens inside the application itself.

[Diagram: OpenVINO workflow for convenience]
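For example, a minimal sketch of this path in Python might look as follows, assuming an ONNX model at a hypothetical path and an input shape chosen only for illustration:

```python
# A minimal sketch of the "convenience" path: the original model file is
# compiled directly, and conversion happens as part of the application.
# model.onnx and the input shape below are assumptions for illustration.
import numpy as np
import openvino as ov

core = ov.Core()

# compile_model accepts the original model file directly;
# "AUTO" lets OpenVINO pick an available device.
compiled_model = core.compile_model("model.onnx", "AUTO")

# Run one inference on random data with an assumed image-like shape.
input_data = np.random.rand(1, 3, 224, 224).astype(np.float32)
results = compiled_model(input_data)
```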

Workflow for performance: this approach assumes you convert your model to OpenVINO IR explicitly, which means the conversion stage is not part of the final application.

[Diagram: OpenVINO workflow for performance]
Model preparation: learn how to convert pre-trained models to OpenVINO IR.
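A minimal conversion sketch, assuming a pre-trained ONNX model at a hypothetical path (the same call also accepts in-memory PyTorch or TensorFlow model objects):

```python
# Explicit conversion to OpenVINO IR; model.onnx is a hypothetical path.
import openvino as ov

ov_model = ov.convert_model("model.onnx")

# save_model writes OpenVINO IR (model.xml + model.bin);
# weights are compressed to FP16 by default.
ov.save_model(ov_model, "model.xml")
```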
Model optimization and compression: find out how to optimize a model to achieve better inference performance, using multiple optimization methods for both in-training compression and post-training quantization.
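As one example of post-training quantization, the sketch below uses NNCF (pip install nncf). The IR path, input shape, and the tiny random calibration set are placeholders; in practice you would feed a few hundred real, preprocessed samples.

```python
# Post-training quantization sketch with NNCF; paths and shapes are assumed.
import numpy as np
import nncf
import openvino as ov

core = ov.Core()
ov_model = core.read_model("model.xml")

# Placeholder calibration data; replace with real preprocessed samples.
calibration_data = [
    np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in range(10)
]
calibration_dataset = nncf.Dataset(calibration_data)

quantized_model = nncf.quantize(ov_model, calibration_dataset)
ov.save_model(quantized_model, "model_int8.xml")
```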
Running inference: see how to run inference with OpenVINO, which is the most basic form of deployment and the quickest way of running a deep learning model.
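A minimal inference sketch on a previously saved IR file, using an explicit infer request; the file path and input shape are assumptions for illustration.

```python
# Basic inference on an IR file; model.xml and the input shape are assumed.
import numpy as np
import openvino as ov

core = ov.Core()
model = core.read_model("model.xml")
compiled_model = core.compile_model(model, "CPU")

infer_request = compiled_model.create_infer_request()
infer_request.infer({0: np.random.rand(1, 3, 224, 224).astype(np.float32)})
output = infer_request.get_output_tensor(0).data
```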
Local deployment: deploy a model locally, reading the file directly from your application and utilizing the resources available to the system. Deployment on a local system uses the steps described in the section on running inference.
Remote deployment: deploy a model remotely by connecting your application to an inference server, OpenVINO Model Server, and utilizing external resources, with no impact on the app's performance. Deployment on OpenVINO Model Server is quick and does not require any additional steps described in the section on running inference.
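A sketch of remote inference against OpenVINO Model Server, assuming a server is already running with a model named "model" on gRPC port 9000 and an input tensor named "input"; the names, port, and shape are all assumptions, and the ovmsclient package (pip install ovmsclient) is used on the client side.

```python
# Remote inference via OpenVINO Model Server; server address, model name,
# input name, and shape are assumptions for illustration.
import numpy as np
from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")
data = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = client.predict(inputs={"input": data}, model_name="model")
```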
PyTorch-native deployment: deploy a PyTorch model using OpenVINO in a PyTorch-native application.
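A sketch of keeping the application PyTorch-native while executing through OpenVINO via the "openvino" torch.compile backend; resnet18 is only an example model, and the openvino and torchvision packages are assumed to be installed.

```python
# PyTorch-native execution through OpenVINO via torch.compile.
import openvino.torch  # registers the "openvino" backend for torch.compile
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()
compiled_model = torch.compile(model, backend="openvino")

with torch.no_grad():
    output = compiled_model(torch.randn(1, 3, 224, 224))
```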