OpenVINO 2023.0

  • An open-source toolkit for optimizing and deploying deep learning models.
    Boost your AI deep-learning inference performance!
  • Even more integrations in 2023.0!
    Load TensorFlow, TensorFlow Lite, and PyTorch models directly, without manual conversion.
    See the supported model formats...
  • CPU inference has become even better. ARM processors are supported and thread scheduling is available on 12th gen Intel® Core and up.
    See how to run OpenVINO on various devices...
  • Post-training optimization and quantization-aware training now in one tool!
    See the new NNCF capabilities...
  • OpenVINO is enabled in the PyTorch 2.0 torch.compile() backend.
    See how it works...

Get started

Performance Benchmarks

See the latest benchmark numbers for OpenVINO and OpenVINO Model Server

Flexible Workflow

Load models directly (for TensorFlow, ONNX, PaddlePaddle) or convert to the OpenVINO format.
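
As a rough illustration of both paths, the sketch below reads a model in its original format and then saves it in the OpenVINO format (IR). It assumes the `openvino` Python package is installed; `ir_paths` and `load_and_export` are hypothetical helper names, and the model path is a placeholder.

```python
from pathlib import Path


def ir_paths(model_path):
    """Return the (.xml, .bin) paths an IR export of model_path would use."""
    src = Path(model_path)
    return src.with_suffix(".xml"), src.with_suffix(".bin")


def load_and_export(model_path):
    """Read a model directly, then save it in the OpenVINO format (IR)."""
    # Imported lazily so ir_paths() works without OpenVINO installed.
    from openvino.runtime import Core, serialize

    core = Core()
    # Direct loading: the file (for example an .onnx file) is read as-is.
    model = core.read_model(str(model_path))
    # Conversion: write the topology (.xml) and weights (.bin) next to it.
    xml_path, bin_path = ir_paths(model_path)
    serialize(model, str(xml_path), str(bin_path))
    return xml_path, bin_path
```

Calling `load_and_export("models/resnet.onnx")` would, under these assumptions, produce `models/resnet.xml` and `models/resnet.bin`.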

Run Inference

Get results in just a few lines of code
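
A minimal sketch of what those few lines might look like with the Python API, assuming the `openvino` package is installed. `run_inference` is a hypothetical wrapper, and the model path, input data, and device name are supplied by the caller.

```python
def run_inference(model_path, input_data, device="CPU"):
    """Read a model, compile it for a device, and run one synchronous inference."""
    # Imported lazily so this module can be loaded without OpenVINO installed.
    from openvino.runtime import Core

    core = Core()
    model = core.read_model(model_path)            # load the model from disk
    compiled = core.compile_model(model, device)   # compile for the target device
    results = compiled([input_data])               # blocking inference call
    return results[compiled.output(0)]             # data of the first output
```

The input is expected to match the shape and type of the model's first input, for example a NumPy array.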

Deploy at Scale With OpenVINO Model Server

Cloud-ready deployments for microservice applications

Model Optimization

Reach for performance with post-training and training-time compression with NNCF
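
For the post-training path, a quantization call might be sketched as follows. This assumes the `nncf` and `openvino` packages are installed and that calibration items are `(input, label)` pairs; `quantize_ir` and `take_input` are hypothetical names used only for illustration.

```python
def take_input(item):
    """Hypothetical transform: keep the model input, drop the label."""
    data, _label = item
    return data


def quantize_ir(model_path, calibration_items, transform_fn=take_input):
    """Apply NNCF post-training quantization to a model read by OpenVINO."""
    # Imported lazily so take_input() works without NNCF or OpenVINO installed.
    import nncf
    from openvino.runtime import Core

    model = Core().read_model(model_path)
    # nncf.Dataset wraps an iterable and a function that extracts model inputs.
    calibration = nncf.Dataset(calibration_items, transform_fn)
    return nncf.quantize(model, calibration)
```

The returned model can then be compiled and run like any other OpenVINO model.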

PyTorch 2.0 - torch.compile() backend

Optimize generation of the graph model with the PyTorch 2.0 torch.compile() backend


Feature Overview

Local Inference & Model Serving

You can either link directly with OpenVINO Runtime to run inference locally, or use OpenVINO Model Server to serve model inference from a separate server or within a Kubernetes environment.
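
For the served case, a client only needs to send requests over the network. The sketch below assumes a Model Server instance that exposes its TensorFlow Serving-compatible REST API and returns a `predictions` field; the host, port, and model name are placeholders, and `build_predict_request` and `predict` are hypothetical helpers.

```python
import json
from urllib import request


def build_predict_request(host, port, model_name, instances):
    """Build a POST request for the TensorFlow Serving-style predict endpoint."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(host, port, model_name, instances, timeout=10.0):
    """Send the request and return the 'predictions' field of the JSON reply."""
    req = build_predict_request(host, port, model_name, instances)
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["predictions"]
```

The application itself does not link against OpenVINO Runtime here; only the standard library is used on the client side.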

Improved Application Portability

Write an application once, deploy it anywhere, and achieve maximum performance from the hardware. Automatic device discovery allows for superior deployment flexibility. OpenVINO Runtime supports Linux, Windows, and macOS and provides Python, C++, and C APIs. Use your preferred language and OS.
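
Device discovery can be sketched as follows, assuming the `openvino` Python package is installed. The preference order and the `pick_device` and `discover_and_pick` helpers are hypothetical; falling back to the `AUTO` device leaves the choice to the runtime.

```python
def pick_device(available, preferred=("GPU", "CPU")):
    """Return the first available device that matches the preference order."""
    for want in preferred:
        for name in available:
            # Device names may carry an index suffix, such as "GPU.0".
            if name == want or name.startswith(want + "."):
                return name
    return "AUTO"  # nothing matched: let the runtime select a device


def discover_and_pick():
    """Query the devices visible to OpenVINO Runtime and pick one."""
    # Imported lazily so pick_device() works without OpenVINO installed.
    from openvino.runtime import Core

    return pick_device(Core().available_devices)
```

The same application code can then pass the returned name to `compile_model` on any machine, whatever hardware happens to be present.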

Minimal External Dependencies

OpenVINO is designed with minimal external dependencies, which reduces the application footprint and simplifies installation and dependency management. Popular package managers enable application dependencies to be easily installed and upgraded. Custom compilation for your specific model(s) further reduces the final binary size.

Enhanced App Start-Up Time

In applications where fast start-up is required, OpenVINO significantly reduces first-inference latency by using the CPU for initial inference and then switching to another device once the model has been compiled and loaded to memory. Compiled models are also cached, which improves start-up time even more.
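
Both mechanisms might be enabled as in the sketch below, assuming the `openvino` Python package is installed. The cache directory name is a placeholder, and `startup_properties` and `compile_for_fast_startup` are hypothetical helpers.

```python
from pathlib import Path


def startup_properties(cache_dir):
    """Build the runtime properties that enable model caching."""
    return {"CACHE_DIR": str(Path(cache_dir))}


def compile_for_fast_startup(model_path, cache_dir="model_cache"):
    """Compile with caching enabled and automatic device selection."""
    # Imported lazily so startup_properties() works without OpenVINO installed.
    from openvino.runtime import Core

    Path(cache_dir).mkdir(parents=True, exist_ok=True)
    core = Core()
    # Compiled models are stored in cache_dir and reused on later runs.
    core.set_property(startup_properties(cache_dir))
    model = core.read_model(model_path)
    # "AUTO" lets the runtime choose the device, including the start-up
    # behavior described above.
    return core.compile_model(model, "AUTO")
```

On the first run the cache directory is populated; later runs on a device that supports caching can load the compiled model from it instead of compiling again.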