Running and Deploying Inference

Once you have a model that is compatible with OpenVINO™ and meets your requirements, you can choose how to deploy it with your application.

Local deployment uses the OpenVINO Runtime, which is called from, and linked to, the application directly. It uses the resources available to the host system and provides the quickest way to launch inference.
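As a minimal sketch of local deployment, the snippet below reads and compiles a model with the OpenVINO Runtime Python API and runs inference on random data. The model path, device name, and the assumption of a single static-shaped input are placeholders; the `import openvino as ov` form applies to recent OpenVINO releases.

```python
import numpy as np
import openvino as ov

# Create the Runtime Core and read a model
# ("model.xml" is a placeholder path to an OpenVINO IR model).
core = ov.Core()
model = core.read_model("model.xml")

# Compile the model for a local device, e.g. "CPU" or "GPU".
compiled_model = core.compile_model(model, "CPU")

# Run inference on dummy data shaped like the model's first input
# (assumes a single input with a static shape).
input_tensor = np.random.rand(*compiled_model.inputs[0].shape).astype(np.float32)
results = compiled_model(input_tensor)
```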

Deployment via OpenVINO Model Server allows the application to connect to an inference server set up on a remote machine. This way, inference can use external resources instead of those available to the machine running the application.
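A sketch of the client side of this setup is shown below, using the `ovmsclient` package to send a prediction request over gRPC. The server address, model name, input name, and input shape are assumptions and must match your running Model Server instance.

```python
import numpy as np
from ovmsclient import make_grpc_client

# Connect to an OpenVINO Model Server instance exposed over gRPC
# (address and port are placeholders).
client = make_grpc_client("localhost:9000")

# Prepare dummy input data matching the served model's expected shape
# (shape and input name "input" are assumptions).
data = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Request a prediction from the remote server for the model "my_model".
results = client.predict(inputs={"input": data}, model_name="my_model")
```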

Apart from the default deployment options, you may also deploy applications built on the TensorFlow framework with OpenVINO™ Integration with TensorFlow.
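A minimal sketch of this option is shown below, assuming the separately installed `openvino_tensorflow` add-on package; the backend name and the model used are illustrative only.

```python
import tensorflow as tf
import openvino_tensorflow as ovtf

# Route supported TensorFlow operations to an OpenVINO backend
# ("CPU" here; other backends may be available on your system).
ovtf.set_backend("CPU")

# An existing TensorFlow/Keras model then runs unchanged, with
# OpenVINO accelerating the clusters of operations it supports.
model = tf.keras.applications.MobileNetV2(weights=None)
dummy_input = tf.random.uniform((1, 224, 224, 3))
predictions = model(dummy_input)
```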