Deploy Model Server#
Docker is the recommended way to deploy OpenVINO Model Server. Pre-built container images are available on Docker Hub and Red Hat Ecosystem Catalog.
Host Model Server on baremetal.
Deploy OpenVINO Model Server in Kubernetes via helm chart, Kubernetes Operator or OpenShift Operator.
Deploying Model Server in Docker Container#
This is a step-by-step guide on how to deploy OpenVINO™ Model Server on Linux, using a pre-built Docker container.
Before you start, make sure you have:
Docker Engine installed
Intel® Core™ processor (6th to 13th gen.) or Intel® Xeon® processor (1st to 4th gen.)
Linux, macOS or Windows via WSL
(optional) AI accelerators supported by OpenVINO. Accelerators are tested only on bare-metal Linux hosts.
Launch Model Server Container#
This example shows how to launch the model server with a ResNet50 image classification model downloaded from cloud storage:
Step 1. Pull Model Server Image#
Pull an image from Docker Hub or from the Red Hat Ecosystem Catalog:
docker pull openvino/model_server:latest
docker pull registry.connect.redhat.com/intel/openvino-model-server:latest
Step 2. Prepare Data for Serving#
2.1 Start the container with the model#
wget https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.{xml,bin} -P models/resnet50/1
docker run -u $(id -u) -v $(pwd)/models:/models -p 9000:9000 openvino/model_server:latest \
--model_name resnet --model_path /models/resnet50 \
--layout NHWC:NCHW --port 9000
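The wget command above places the model files in models/resnet50/1, that is, in a numbered version directory below the directory passed to --model_path. The following is a minimal sketch for checking that layout before starting the container. The function name find_model_versions is hypothetical and not part of any OpenVINO tool; the check only assumes that each version directory has a numeric name and holds an .xml/.bin pair, as in this example.

```python
from pathlib import Path


def find_model_versions(model_path):
    """Map version number -> sorted file names for each numeric
    subdirectory of model_path that holds both an .xml and a .bin file."""
    root = Path(model_path)
    versions = {}
    if not root.is_dir():
        return versions  # nothing to inspect
    for entry in root.iterdir():
        # Version directories are named with digits only, e.g. "1"
        if not (entry.is_dir() and entry.name.isdigit()):
            continue
        files = sorted(f.name for f in entry.iterdir() if f.is_file())
        suffixes = {Path(name).suffix for name in files}
        if {".xml", ".bin"} <= suffixes:
            versions[int(entry.name)] = files
    return versions
```

Calling find_model_versions("models/resnet50") after the download should return a dictionary with a single key 1; an empty dictionary suggests the files ended up in the wrong directory.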
2.2 Download input files: an image and a label mapping file#
wget https://raw.githubusercontent.com/openvinotoolkit/model_server/main/demos/common/static/images/zebra.jpeg
wget https://raw.githubusercontent.com/openvinotoolkit/model_server/main/demos/common/python/classes.py
2.3 Install the Python-based ovmsclient package#
pip3 install ovmsclient
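Before sending a request, it can help to confirm that the container started in step 2.1 accepts connections on the published port. The sketch below uses only the Python standard library; wait_for_port is a hypothetical helper, and an open TCP port only shows that the server is listening, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds,
    or False if none succeeds within `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the example above, wait_for_port("localhost", 9000) would be the matching call.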
Step 3. Run Prediction#
echo 'import numpy as np
from classes import imagenet_classes
from ovmsclient import make_grpc_client

# Connect to the gRPC port published by the container
client = make_grpc_client("localhost:9000")

# Read the raw JPEG bytes
with open("zebra.jpeg", "rb") as f:
    img = f.read()

# "0" is the model input name, "resnet" is the served model name
output = client.predict({"0": img}, "resnet")
result_index = np.argmax(output[0])
print(imagenet_classes[result_index])' > predict.py
python predict.py
zebra
If everything is set up correctly, the output contains the ‘zebra’ prediction.
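The script above prints only the single best class. As a sketch of how the same output could be ranked, the hypothetical helper below returns the k highest-scoring labels. It assumes that output[0] is a one-dimensional sequence of scores and that imagenet_classes can be indexed by class number, as predict.py already does.

```python
def top_k(scores, labels, k=5):
    """Return the k (label, score) pairs with the highest scores, best first."""
    # Rank class indices by their score, highest first
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(labels[i], float(scores[i])) for i in ranked[:k]]
```

In predict.py this could be called as top_k(output[0], imagenet_classes).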
Deploying Model Server on Baremetal (without container)#
Model Server can also be deployed outside of a container. To deploy it on bare metal, use the pre-compiled binaries for Ubuntu20, Ubuntu22 or RHEL8.
On Ubuntu 20.04, build the binary:
# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=ubuntu20 PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/ubuntu20/ovms.tar.gz
Install required libraries:
sudo apt update -y && sudo apt install -y libxml2 curl
Set the path to the libraries:
export LD_LIBRARY_PATH=${PWD}/ovms/lib
If the package was built with Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:
export PYTHONPATH=${PWD}/ovms/lib/python
sudo apt -y install libpython3.8
On Ubuntu 22.04, download the precompiled package:
wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.4/ovms_ubuntu22.tar.gz
tar -xzvf ovms_ubuntu22.tar.gz
or build it yourself:
# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/ubuntu22/ovms.tar.gz
Install required libraries:
sudo apt update -y && sudo apt install -y libxml2 curl
Set the path to the libraries:
export LD_LIBRARY_PATH=${PWD}/ovms/lib
If the package was built with Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:
export PYTHONPATH=${PWD}/ovms/lib/python
sudo apt -y install libpython3.10
On RHEL, download the precompiled package:
wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.4/ovms_redhat.tar.gz
tar -xzvf ovms_redhat.tar.gz
or build it yourself:
# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=redhat PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/redhat/ovms.tar.gz
Set the path to the libraries:
export LD_LIBRARY_PATH=${PWD}/ovms/lib
If the package was built with Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:
export PYTHONPATH=${PWD}/ovms/lib/python
sudo yum install -y python39-libs
On RHEL systems that additionally need the compat-openssl11 library, download the precompiled package:
wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.4/ovms_redhat.tar.gz
tar -xzvf ovms_redhat.tar.gz
or build it yourself:
# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=redhat PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/redhat/ovms.tar.gz
Install required libraries:
sudo yum install compat-openssl11.x86_64
Set the path to the libraries:
export LD_LIBRARY_PATH=${PWD}/ovms/lib
If the package was built with Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:
export PYTHONPATH=${PWD}/ovms/lib/python
sudo yum install -y python39-libs
Start the server:
wget https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.{xml,bin} -P models/resnet50/1
./ovms/bin/ovms --model_name resnet --model_path models/resnet50
Alternatively, start the server as a background process or as a daemon managed by systemd (systemctl) or init.d, depending on the Linux distribution and specific hosting requirements.
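As one possible way to run the binary in the background, the sketch below starts it as a child process from Python. The helper names are hypothetical. It assumes the package was unpacked into an ovms directory as in the steps above, and it passes only the --model_name, --model_path and --port parameters that appear elsewhere on this page. The library path is set for the child process only, so no export is needed in the calling shell.

```python
import os
import subprocess


def ovms_command(package_dir, model_name, model_path, port=None):
    """Build the argument list for the ovms binary in the unpacked package."""
    cmd = [os.path.join(package_dir, "bin", "ovms"),
           "--model_name", model_name,
           "--model_path", str(model_path)]
    if port is not None:
        cmd += ["--port", str(port)]
    return cmd


def ovms_environment(package_dir, base_env=None):
    """Copy the environment and prepend <package_dir>/lib to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    lib_dir = os.path.abspath(os.path.join(package_dir, "lib"))
    existing = env.get("LD_LIBRARY_PATH")
    env["LD_LIBRARY_PATH"] = lib_dir if not existing else lib_dir + os.pathsep + existing
    return env


def start_ovms(package_dir="ovms", model_name="resnet",
               model_path="models/resnet50", port=9000):
    """Start ovms as a child process and return the Popen handle.
    The call returns immediately and does not wait for the model to load."""
    return subprocess.Popen(
        ovms_command(package_dir, model_name, model_path, port),
        env=ovms_environment(package_dir),
    )
```

The returned handle can be stopped with its terminate() method. For a long-running service, a systemd unit is usually the more robust choice than a process started from a script.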
Most of the Model Server documentation demonstrates container usage, but the same can be achieved with just the binary package.
Learn more about model server starting parameters.
NOTE: When serving models on AI accelerators, some additional steps may be required to install device drivers and dependencies. Learn more in the Additional Configurations for Hardware documentation.
Deploying Model Server in Kubernetes#
There are three recommended methods for deploying OpenVINO Model Server in Kubernetes:
helm chart - deploys Model Server instances using the helm package manager for Kubernetes
Kubernetes Operator - manages Model Server using a Kubernetes Operator
OpenShift Operator - manages Model Server instances in Red Hat OpenShift
For the Kubernetes Operator and the OpenShift Operator, see the description of the deployment process.