OpenVINO™ Model Server

OpenVINO Model Server (OVMS) is a high-performance system for serving machine learning models. It is implemented in C++ for scalability and optimized for Intel hardware, so you can expose the full performance of Intel® Xeon® processors or Intel's AI accelerators over a network interface. OVMS uses the same architecture and API as TensorFlow Serving, while using OpenVINO for inference execution. The inference service is available over gRPC or a REST API, making it easy to deploy new algorithms and AI experiments.
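
Because the API follows TensorFlow Serving, a REST inference call can be sketched with nothing but the Python standard library. The example below is a minimal sketch, not the project's official client: it assumes the TensorFlow Serving REST convention of `POST /v1/models/<name>:predict` with a JSON body in the row-oriented `instances` format. The host, port, and model name in the usage note are hypothetical placeholders, and the helper function names are introduced here for illustration only.

```python
import json
import urllib.request


def build_predict_request(host, port, model_name, instances):
    """Build (but do not send) a TensorFlow Serving-style REST predict request.

    Assumption: the server exposes its REST API on `port` and follows the
    TensorFlow Serving URL scheme /v1/models/<model_name>:predict.
    """
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # Row-oriented request body: one entry in "instances" per input sample.
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(host, port, model_name, instances, timeout=10.0):
    """Send the request to a running server and return the decoded JSON reply."""
    request = build_predict_request(host, port, model_name, instances)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, a call such as `predict("localhost", 8000, "my_model", [[1.0, 2.0, 3.0]])` would send one input row. The port and model name must match however the server was started.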

Model repositories may reside on a locally accessible file system (e.g. NFS), as well as online storage compatible with Google Cloud Storage (GCS), Amazon S3, or Azure Blob Storage.
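
For a repository on a local file system, the sketch below shows one way to find the newest version of a model. It assumes the TensorFlow Serving-style layout in which each model directory contains numbered version subdirectories (for example `models/my_model/1/`, `models/my_model/2/`). The paths and the helper name `latest_version_dir` are hypothetical and are not part of the OVMS API.

```python
from pathlib import Path


def latest_version_dir(repository, model_name):
    """Return the highest-numbered version directory of a model, or None.

    Assumption: versions are stored as subdirectories whose names are
    non-negative integers, e.g. <repository>/<model_name>/3/.
    """
    model_dir = Path(repository) / model_name
    # Keep only subdirectories with purely numeric names; ignore other entries.
    versions = [
        entry
        for entry in model_dir.iterdir()
        if entry.is_dir() and entry.name.isdecimal()
    ]
    if not versions:
        return None
    # Compare numerically so that version 10 sorts above version 2.
    return max(versions, key=lambda entry: int(entry.name))
```

Comparing names as integers rather than strings matters once a model has ten or more versions, since a plain string sort would rank `2` above `10`.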

Read the release notes to find out what's new.

Review the Architecture concept document for more details.

Key features:

Note: OVMS has been tested on Red Hat, CentOS, and Ubuntu. The latest publicly released Docker images are based on Ubuntu and UBI. They are stored in:

Run OpenVINO Model Server

A demonstration of how to use OpenVINO Model Server can be found in our quick-start guide. For more information on using Model Server in various scenarios, see the following guides:


If you have a question, a feature request, or a bug report, feel free to submit a GitHub issue.

* Other names and brands may be claimed as the property of others.