Deploy Model Server#

  1. Docker is the recommended way to deploy OpenVINO Model Server. Pre-built container images are available on Docker Hub and Red Hat Ecosystem Catalog.

  2. Host Model Server on baremetal.

  3. Deploy OpenVINO Model Server in Kubernetes via helm chart, Kubernetes Operator or OpenShift Operator.

Deploying Model Server in Docker Container#

This is a step-by-step guide on how to deploy OpenVINO™ Model Server on Linux, using a pre-built Docker container.

Before you start, make sure you have:

  • Docker Engine installed

  • Intel® Core™ processor (6th to 13th gen.) or Intel® Xeon® processor (1st to 4th gen.)

  • Linux, macOS or Windows via WSL

  • (optional) AI accelerators supported by OpenVINO. Accelerators are tested only on bare-metal Linux hosts.
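
The first prerequisite above can be checked from a script. The following is a minimal sketch, not part of the Model Server tooling: the helper name `docker_available` is hypothetical, and it only confirms that a `docker` command-line client is on the PATH and responds to `docker --version`. It does not verify that the Docker daemon is running.

```python
import shutil
import subprocess


def docker_available() -> bool:
    """Return True if a `docker` executable is on PATH and `docker --version` exits with 0."""
    docker = shutil.which("docker")
    if docker is None:
        return False
    try:
        result = subprocess.run(
            [docker, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        # The executable exists but could not be run or did not answer in time.
        return False
    return result.returncode == 0
```

Calling `docker_available()` before the steps below gives an early, readable failure instead of a shell error at the first `docker pull`.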

Launch Model Server Container#

This example shows how to launch the model server with a ResNet50 image classification model downloaded from cloud storage:

Step 1. Pull Model Server Image#

Pull an image from Docker Hub:

docker pull openvino/model_server:latest

or from the Red Hat Ecosystem Catalog:

docker pull registry.connect.redhat.com/intel/openvino-model-server:latest

Step 2. Prepare Data for Serving#

2.1 Start the container with the model#

wget https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.{xml,bin} -P models/resnet50/1
docker run -u $(id -u) -v $(pwd)/models:/models -p 9000:9000 openvino/model_server:latest \
--model_name resnet --model_path /models/resnet50 \
--layout NHWC:NCHW --port 9000
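
The `wget` command in step 2.1 places the model files in a numbered version subdirectory (`models/resnet50/1`), while `--model_path` points at the parent directory. The sketch below checks that layout before the container is started. It assumes that each version directory should hold both an `.xml` and a `.bin` file, as in the download above; the function name `find_model_versions` is hypothetical and is not part of Model Server.

```python
from pathlib import Path


def find_model_versions(model_path):
    """Return the sorted numeric version directories that contain both .xml and .bin files."""
    root = Path(model_path)
    if not root.is_dir():
        return []
    versions = []
    for entry in root.iterdir():
        # Only directories with purely numeric names (1, 2, ...) are treated as versions.
        if not (entry.is_dir() and entry.name.isdigit()):
            continue
        has_xml = any(entry.glob("*.xml"))
        has_bin = any(entry.glob("*.bin"))
        if has_xml and has_bin:
            versions.append(int(entry.name))
    return sorted(versions)


if __name__ == "__main__":
    # Path taken from the wget command in step 2.1.
    print(find_model_versions("models/resnet50"))
```

An empty list suggests that the download did not complete or that the files landed in a different directory than the one mounted into the container.
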

2.2 Download input files: an image and a label mapping file#

wget https://raw.githubusercontent.com/openvinotoolkit/model_server/releases/2024/5/demos/common/static/images/zebra.jpeg
wget https://raw.githubusercontent.com/openvinotoolkit/model_server/releases/2024/5/demos/common/python/classes.py

2.3 Install the Python-based ovmsclient package#

pip3 install ovmsclient
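
Before running a prediction, it can help to confirm that the container started in step 2.1 is accepting connections on the published port (9000 in this guide). The sketch below uses only the Python standard library and polls the port until it opens or a timeout expires. The helper name `wait_for_port` and its default timings are assumptions made for illustration. An open port shows that something is listening, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll host:port until a TCP connection succeeds; return False once the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connection is closed immediately by the context manager.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the setup in this guide, `wait_for_port("localhost", 9000)` would be called before creating the gRPC client.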

Step 3. Run Prediction#

echo 'import numpy as np
from classes import imagenet_classes
from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")

with open("zebra.jpeg", "rb") as f:
   img = f.read()

output = client.predict({"0": img}, "resnet")
result_index = np.argmax(output[0])
print(imagenet_classes[result_index])' > predict.py

python predict.py
zebra

If everything is set up correctly, you will see the ‘zebra’ prediction in the output.
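
The script above prints only the single best class. To inspect the runners-up as well, the scores can be ranked. The following is a minimal sketch with made-up scores and labels. The helper `top_k` is hypothetical, and it assumes that `output[0]` is a one-dimensional sequence of class scores, as the `np.argmax` call in the script suggests.

```python
def top_k(scores, labels, k=5):
    """Return the k highest-scoring (label, score) pairs, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(labels[i], scores[i]) for i in ranked[:k]]


if __name__ == "__main__":
    # Made-up scores and labels, for illustration only.
    demo_scores = [0.1, 2.5, 0.3, 1.7]
    demo_labels = {0: "cat", 1: "zebra", 2: "dog", 3: "horse"}
    print(top_k(demo_scores, demo_labels, k=2))  # [('zebra', 2.5), ('horse', 1.7)]
```

In predict.py, a call along the lines of `top_k(output[0].tolist(), imagenet_classes)` should behave the same way, under the assumption stated above.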

Deploying Model Server on Baremetal (without container)#

Model Server can also be deployed outside of a container. To deploy Model Server on baremetal, use the pre-compiled binaries for Ubuntu20, Ubuntu22 or RHEL8. The steps for each operating system follow.

For Ubuntu20, build the binary:

# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=ubuntu20 PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/ubuntu20/ovms.tar.gz

Install required libraries:

sudo apt update -y && sudo apt install -y libxml2 curl

Set the path to the libraries:

export LD_LIBRARY_PATH=$(pwd)/ovms/lib

If the build includes Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:

export PYTHONPATH=$(pwd)/ovms/lib/python
sudo apt -y install libpython3.8

For Ubuntu22, download the precompiled package:

wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.4/ovms_ubuntu22.tar.gz
tar -xzvf ovms_ubuntu22.tar.gz

or build it yourself:

# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/ubuntu22/ovms.tar.gz

Install required libraries:

sudo apt update -y && sudo apt install -y libxml2 curl

Set the path to the libraries:

export LD_LIBRARY_PATH=$(pwd)/ovms/lib

If the build includes Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:

export PYTHONPATH=$(pwd)/ovms/lib/python
sudo apt -y install libpython3.10

For RHEL8, download the precompiled package:

wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.4/ovms_redhat.tar.gz
tar -xzvf ovms_redhat.tar.gz

or build it yourself:

# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=redhat PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/redhat/ovms.tar.gz

Set the path to the libraries:

export LD_LIBRARY_PATH=$(pwd)/ovms/lib

If the build includes Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:

export PYTHONPATH=$(pwd)/ovms/lib/python
sudo yum install -y python39-libs

For Red Hat systems that also require the OpenSSL 1.1 compatibility library, download the precompiled package:

wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.4/ovms_redhat.tar.gz
tar -xzvf ovms_redhat.tar.gz

or build it yourself:

# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=redhat PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/redhat/ovms.tar.gz

Install required libraries:

sudo yum install compat-openssl11.x86_64

Set the path to the libraries:

export LD_LIBRARY_PATH=$(pwd)/ovms/lib

If the build includes Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:

export PYTHONPATH=$(pwd)/ovms/lib/python
sudo yum install -y python39-libs
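
The per-distribution steps above differ mainly in which precompiled package is downloaded. As a rough sketch, the choice can be derived from the contents of `/etc/os-release`. The mapping below uses only the two v2024.4 package URLs quoted in this section. The function names are hypothetical, and treating every `ID=rhel` system as a match for the `ovms_redhat.tar.gz` package is an assumption.

```python
RELEASE_BASE = "https://github.com/openvinotoolkit/model_server/releases/download/v2024.4"


def parse_os_release(text):
    """Parse KEY=VALUE lines in the style of /etc/os-release into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip("\"'")
    return fields


def package_url(fields):
    """Return the precompiled package URL for this OS, or None if none is listed above."""
    os_id = fields.get("ID", "")
    major = fields.get("VERSION_ID", "").split(".")[0]
    if os_id == "ubuntu" and major == "22":
        return f"{RELEASE_BASE}/ovms_ubuntu22.tar.gz"
    if os_id == "rhel":
        return f"{RELEASE_BASE}/ovms_redhat.tar.gz"
    # No precompiled package is listed for this system; build from source instead.
    return None
```

On a real host the input would come from reading `/etc/os-release`, for example `package_url(parse_os_release(open("/etc/os-release").read()))`.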

Start the server:

wget https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.{xml,bin} -P models/resnet50/1

./ovms/bin/ovms --model_name resnet --model_path models/resnet50

Alternatively, start it as a background process or as a daemon initiated by systemctl/initd, depending on the Linux distribution and specific hosting requirements.
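
One way to run the binary as a background process is to launch it from a small Python wrapper. The sketch below is an illustration under stated assumptions rather than a recommended service setup: the helper names, the log file name, and the optional `port` argument are hypothetical. The `--model_name`, `--model_path`, and `--port` flags and the `LD_LIBRARY_PATH` setting are the ones used earlier in this guide.

```python
import os
import subprocess


def build_ovms_command(binary, model_name, model_path, port=None):
    """Build the argument list for the ovms binary using flags shown in this guide."""
    cmd = [binary, "--model_name", model_name, "--model_path", model_path]
    if port is not None:
        cmd += ["--port", str(port)]
    return cmd


def start_in_background(cmd, lib_dir, log_path="ovms.log"):
    """Start the server detached from the current session, appending its output to a log file."""
    # Mirrors `export LD_LIBRARY_PATH=.../ovms/lib` from the steps above.
    env = dict(os.environ, LD_LIBRARY_PATH=os.path.abspath(lib_dir))
    with open(log_path, "ab") as log_file:
        return subprocess.Popen(
            cmd,
            env=env,
            stdout=log_file,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # keep the server running after the shell exits
        )
```

For example, `start_in_background(build_ovms_command("./ovms/bin/ovms", "resnet", "models/resnet50"), "ovms/lib")` returns a `Popen` object whose `pid` can be stored for a later shutdown. A systemd unit is usually the better fit for long-lived production services.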

Most of the Model Server documentation demonstrates container usage, but the same can be achieved with just the binary package.
Learn more about model server starting parameters.

NOTE: When serving models on AI accelerators, some additional steps may be required to install device drivers and dependencies. Learn more in the Additional Configurations for Hardware documentation.

Deploying Model Server in Kubernetes#

There are three recommended methods for deploying OpenVINO Model Server in Kubernetes:

  1. helm chart - deploys Model Server instances using the helm package manager for Kubernetes

  2. Kubernetes Operator - manages Model Server using a Kubernetes Operator

  3. OpenShift Operator - manages Model Server instances in Red Hat OpenShift

For the operators mentioned in 2. and 3., see the description of the deployment process.

Deploying ovms.exe on Windows#

Once you have built ovms.exe by following the Developer Guide for Windows, follow the experimental/alpha Windows deployment instructions in the Deployment Guide for Windows to start the ovms server as a standalone binary on a Windows 11 system.