Inference with OpenVINO Runtime

OpenVINO Runtime is a set of C++ libraries with C and Python bindings providing a common API to deliver inference solutions on the platform of your choice. Use the OpenVINO Runtime API to read PyTorch, TensorFlow, TensorFlow Lite, ONNX, and PaddlePaddle models and execute them on preferred devices. OpenVINO gives you the option to use these models directly or convert them to the OpenVINO IR (Intermediate Representation) format explicitly, for maximum performance.
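As a rough illustration of reading a model and running it on a preferred device, the following Python sketch may help. It assumes the `openvino` package is installed and uses `Core.read_model`, `Core.compile_model`, and `Core.available_devices`. The helper `pick_device`, the function `run_inference`, the model path, and the fallback to CPU are assumptions made for this example and are not part of the OpenVINO API.

```python
from typing import Sequence


def pick_device(available: Sequence[str], preferred: Sequence[str]) -> str:
    """Return the first preferred device that is available, else "CPU".

    Hypothetical helper for this example; falling back to "CPU" is an
    assumption, not OpenVINO behavior.
    """
    for name in preferred:
        # Available device names may carry an index suffix such as "GPU.0".
        if any(dev == name or dev.startswith(name + ".") for dev in available):
            return name
    return "CPU"


def run_inference(model_path: str, input_data, preferred=("GPU", "CPU")):
    """Read a model, compile it for a device, and run one synchronous inference."""
    # Imported lazily so that pick_device can be used without OpenVINO installed.
    import openvino as ov

    core = ov.Core()
    device = pick_device(core.available_devices, preferred)
    model = core.read_model(model_path)           # e.g. an IR .xml or an .onnx file
    compiled = core.compile_model(model, device)  # handled by the device plugin
    results = compiled(input_data)                # blocking call
    return results[compiled.output(0)]            # tensor for the first output
```

A call such as `run_inference("model.xml", input_array)` would then return the first output tensor; the file name and the input array are placeholders to replace with your own model and data.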

Note

For more detailed information on how to convert, read, and compile supported model formats, see the Supported Formats article.

Note that PyTorch models can be run using the torch.compile feature, in addition to the standard options of converting PyTorch models or reading them directly.

OpenVINO Runtime uses a plugin architecture. Each plugin is a software component that contains a complete implementation of inference on a particular Intel® hardware device: CPU, GPU, GNA, etc. Every plugin implements the unified API and also provides hardware-specific APIs, either for configuring its device or for interoperability between OpenVINO Runtime and the underlying plugin backend.

The scheme below illustrates the typical workflow for deploying a trained deep learning model:

_images/BASIC_FLOW_IE_C.svg