Running and Deploying Inference

Once you have a model that works with OpenVINO™ and meets your requirements, you can choose how to deploy it with your application.

Local deployment uses OpenVINO Runtime, which is called from, and linked to, the application directly. It uses the resources available on the local system and provides the quickest way to launch inference.
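A minimal sketch of local inference with the OpenVINO Runtime Python API is shown below. The model path "model.xml" and the 1x3x224x224 input shape are placeholders for illustration.

```python
# Local deployment sketch: load, compile, and run a model on the local machine.
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")               # placeholder model path
compiled_model = core.compile_model(model, "CPU")  # compile for a local device

# Placeholder input shape; replace with your model's expected input.
input_data = np.random.rand(1, 3, 224, 224).astype(np.float32)

results = compiled_model([input_data])             # synchronous inference
output = results[compiled_model.output(0)]
print(output.shape)
```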

Deployment via OpenVINO Model Server allows the application to connect to an inference server set up remotely. This way, inference can use external resources instead of those available to the application itself.
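As a sketch of the remote setup, the client below sends a request to OpenVINO Model Server over its TensorFlow-Serving-compatible REST API. The server address, port, model name ("my_model"), and input shape are assumptions for illustration only.

```python
# Remote deployment sketch: query a model served by OpenVINO Model Server.
import numpy as np
import requests

# Placeholder input; replace with data matching the served model's input shape.
input_data = np.random.rand(1, 3, 224, 224).astype(np.float32)
payload = {"instances": input_data.tolist()}

# Hypothetical endpoint: host, port, and model name depend on the server setup.
response = requests.post(
    "http://localhost:8000/v1/models/my_model:predict",
    json=payload,
)
response.raise_for_status()
predictions = response.json()["predictions"]
print(len(predictions))
```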

OpenVINO 2023.0 provides more options, such as running inference on TensorFlow models with no additional conversion step.
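For example, a TensorFlow model can be read directly by the runtime. The sketch below assumes a SavedModel directory at "saved_model_dir", which is a placeholder path.

```python
# Sketch: load a TensorFlow SavedModel directly, without a prior conversion step.
from openvino.runtime import Core

core = Core()
model = core.read_model("saved_model_dir")         # placeholder SavedModel path
compiled_model = core.compile_model(model, "CPU")
print(f"Inputs: {len(model.inputs)}, Outputs: {len(model.outputs)}")
```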