Deploy Model Server

  1. Docker is the recommended way to deploy OpenVINO Model Server. Pre-built container images are available on Docker Hub and Red Hat Ecosystem Catalog.

  2. Host Model Server on bare metal.

  3. Deploy OpenVINO Model Server in Kubernetes via Helm chart, Kubernetes Operator, or OpenShift Operator.

Deploying Model Server in Docker Container

This is a step-by-step guide on how to deploy OpenVINO™ Model Server on Linux, using a pre-built Docker container.

Before you start, make sure you have:

  • Docker Engine installed

  • Intel® Core™ processor (6th to 13th gen.) or Intel® Xeon® processor (1st to 4th gen.)

  • Linux, macOS or Windows via WSL

  • (optional) AI accelerators supported by OpenVINO. Accelerators are tested only on bare-metal Linux hosts.

Launch Model Server Container

This example shows how to launch the model server with a ResNet50 image classification model downloaded from cloud storage:

Step 1. Pull Model Server Image

Pull an image from Docker Hub:

docker pull openvino/model_server:latest

or from Red Hat Ecosystem Catalog:

docker pull registry.connect.redhat.com/intel/openvino-model-server:latest

Step 2. Prepare Data for Serving

2.1 Start the container with the model

wget https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.{xml,bin} -P models/resnet50/1
docker run -u $(id -u) -v $(pwd)/models:/models -p 9000:9000 openvino/model_server:latest \
--model_name resnet --model_path /models/resnet50 \
--layout NHWC:NCHW --port 9000

2.2 Download input files: an image and a label mapping file

wget https://raw.githubusercontent.com/openvinotoolkit/model_server/main/demos/common/static/images/zebra.jpeg
wget https://raw.githubusercontent.com/openvinotoolkit/model_server/main/demos/common/python/classes.py

2.3 Install the Python-based ovmsclient package

pip3 install ovmsclient
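
The wget command in step 2.1 places the model files under models/resnet50/1, where 1 is the model version directory. As an optional sanity check before running predictions, the following is a minimal Python sketch that lists each numeric version directory and reports whether its .xml and .bin files are present. The function name find_model_versions is hypothetical and not part of any OpenVINO tooling; the check only assumes the layout shown in this guide.

```python
from pathlib import Path


def find_model_versions(model_dir):
    """Map each numeric version directory to the list of missing IR file suffixes."""
    root = Path(model_dir)
    if not root.is_dir():
        # Nothing to report when the model directory itself does not exist.
        return {}
    report = {}
    for entry in sorted(root.iterdir()):
        # Version directories are named with integers, e.g. "1".
        if not (entry.is_dir() and entry.name.isdecimal()):
            continue
        present = {p.suffix for p in entry.iterdir() if p.is_file()}
        # The model downloaded in this guide consists of an .xml and a .bin file.
        report[int(entry.name)] = sorted({".xml", ".bin"} - present)
    return report
```

Calling find_model_versions("models/resnet50") after the download in step 2.1 should return a dictionary with an empty list for version 1 if both files are in place. An empty dictionary suggests the directory is missing or contains no version subdirectories.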

Step 3. Run Prediction

echo 'import numpy as np
from classes import imagenet_classes
from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")

with open("zebra.jpeg", "rb") as f:
   img = f.read()

output = client.predict({"0": img}, "resnet")
result_index = np.argmax(output[0])
print(imagenet_classes[result_index])' > predict.py

python3 predict.py
zebra

If everything is set up correctly, you will see the "zebra" prediction in the output.
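
The script above prints only the single best class. To inspect several candidates at once, a small helper along the following lines could rank the scores. This is a sketch, not part of ovmsclient: the name top_k is hypothetical, and it assumes the scores are a one-dimensional sequence (such as output[0] in predict.py) and that the labels can be indexed by class number.

```python
def top_k(scores, labels, k=5):
    """Return the k highest-scoring (label, score) pairs, best first."""
    # Rank class indices by score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(labels[i], float(scores[i])) for i in ranked[:k]]
```

In predict.py this could be used as top_k(output[0], imagenet_classes), assuming imagenet_classes maps integer class indices to names.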

Deploying Model Server on Bare Metal (without container)

It is possible to deploy Model Server outside of a container. To deploy Model Server on bare metal, use pre-compiled binaries for Ubuntu 20.04, Ubuntu 22.04, or RHEL 8.

Ubuntu 20.04

Build the binary:

# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=ubuntu20
# Unpack the package
tar -xzvf dist/ubuntu/ovms.tar.gz

Install required libraries:

sudo apt update -y && sudo apt install -y libpugixml1v5 libtbb2

Ubuntu 22.04

Download and unpack the precompiled package:

wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.0/ovms_ubuntu22.tar.gz
tar -xzvf ovms_ubuntu22.tar.gz

or build it yourself:

# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build
# Unpack the package
tar -xzvf dist/ubuntu/ovms.tar.gz

Install required libraries:

sudo apt update -y && sudo apt install -y libpugixml1v5 libtbb12

RHEL 8

Download and unpack the precompiled package:

wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.0/ovms_redhat.tar.gz
tar -xzvf ovms_redhat.tar.gz

or build it yourself:

# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=redhat
# Unpack the package
tar -xzvf dist/redhat/ovms.tar.gz

Install required libraries:

sudo dnf install -y pkg-config && sudo rpm -ivh https://vault.centos.org/centos/8/AppStream/x86_64/os/Packages/tbb-2018.2-9.el8.x86_64.rpm

Start the server:

wget https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.{xml,bin} -P models/resnet50/1

./ovms/bin/ovms --model_name resnet --model_path models/resnet50

Alternatively, start it as a background process or as a daemon managed by systemctl or init.d, depending on the Linux distribution and specific hosting requirements.
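
As one illustration of the background-process option, the following Python sketch assembles the ovms command line, launches it detached from the terminal, and polls until the port accepts TCP connections. The helper names (build_ovms_command, start_in_background, wait_for_port) and the ovms.log file name are hypothetical. Passing --port 9000 is an assumption carried over from the Docker example, since the bare-metal command above does not set a port. An open port only indicates that the process is listening, not necessarily that the model has finished loading.

```python
import socket
import subprocess
import time


def build_ovms_command(binary="./ovms/bin/ovms", model_name="resnet",
                       model_path="models/resnet50", port=9000):
    """Assemble the argument list for the ovms binary."""
    return [binary, "--model_name", model_name,
            "--model_path", model_path, "--port", str(port)]


def start_in_background(command, log_path="ovms.log"):
    """Start the server in its own session (POSIX only), appending output to a log file."""
    with open(log_path, "ab") as log:
        # The child keeps its own copy of the log file descriptor.
        return subprocess.Popen(command, stdout=log, stderr=subprocess.STDOUT,
                                start_new_session=True)


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

A typical sequence would be process = start_in_background(build_ovms_command()) followed by wait_for_port("localhost", 9000). For production hosting, a service manager is usually a better fit than a hand-rolled launcher like this one.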

Most of the Model Server documentation demonstrates container usage, but the same can be achieved with just the binary package.
Learn more about model server starting parameters.

NOTE: When serving models on AI accelerators, some additional steps may be required to install device drivers and dependencies. Learn more in the Additional Configurations for Hardware documentation.

Deploying Model Server in Kubernetes

There are three recommended methods for deploying OpenVINO Model Server in Kubernetes:

  1. Helm chart - deploys Model Server instances using the Helm package manager for Kubernetes

  2. Kubernetes Operator - manages Model Server using a Kubernetes Operator

  3. OpenShift Operator - manages Model Server instances in Red Hat OpenShift

For the operators mentioned in 2. and 3., see the description of the deployment process.

Next Steps