Deploying Model Server on Baremetal#

It is possible to deploy Model Server outside of a container. To deploy Model Server on bare metal, use the pre-compiled binaries for Ubuntu 20.04, Ubuntu 22.04, RHEL 8, RHEL 9, or Windows 11.

Ubuntu 20.04#

Build the binary:

# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=ubuntu20 PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/ubuntu20/ovms.tar.gz

Install required libraries:

sudo apt update -y && sudo apt install -y libxml2 curl

Set the path to the libraries and add the binary to the PATH:

export LD_LIBRARY_PATH=${PWD}/ovms/lib
export PATH=$PATH:${PWD}/ovms/bin

If the package was built with Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:

export PYTHONPATH=${PWD}/ovms/lib/python
sudo apt -y install libpython3.8
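
After setting the paths, you can sanity-check that all of the binary's shared-library dependencies resolve; the same check works for any of the Linux packages below. Any line reporting "not found" points to a missing library:

ldd ovms/bin/ovms | grep "not found"

No output means every dependency was resolved.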

Ubuntu 22.04#

Download the precompiled package:

wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.5/ovms_ubuntu22.tar.gz
tar -xzvf ovms_ubuntu22.tar.gz

or build it yourself:

# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/ubuntu22/ovms.tar.gz

Install required libraries:

sudo apt update -y && sudo apt install -y libxml2 curl

Set the path to the libraries and add the binary to the PATH:

export LD_LIBRARY_PATH=${PWD}/ovms/lib
export PATH=$PATH:${PWD}/ovms/bin

If the package was built with Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:

export PYTHONPATH=${PWD}/ovms/lib/python
sudo apt -y install libpython3.10

Additionally, to use text generation, for example to run the text-generation demo, you need pip installed and the following dependencies:

pip3 install "Jinja2==3.1.4" "MarkupSafe==3.0.2"
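
A quick way to confirm the dependencies are importable (jinja2 and markupsafe are the import names of the packages above):

python3 -c "import jinja2, markupsafe; print(jinja2.__version__)"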

RHEL 8#

Download the precompiled package:

wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.5/ovms_redhat.tar.gz
tar -xzvf ovms_redhat.tar.gz

or build it yourself:

# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=redhat PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/redhat/ovms.tar.gz

Set the path to the libraries and add the binary to the PATH:

export LD_LIBRARY_PATH=${PWD}/ovms/lib
export PATH=$PATH:${PWD}/ovms/bin

If the package was built with Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:

export PYTHONPATH=${PWD}/ovms/lib/python
sudo yum install -y python39-libs

Additionally, to use text generation, for example to run the text-generation demo, you need pip installed and the following dependencies:

pip3 install "Jinja2==3.1.4" "MarkupSafe==3.0.2"

RHEL 9#

Download the precompiled package:

wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.5/ovms_redhat.tar.gz
tar -xzvf ovms_redhat.tar.gz

or build it yourself:

# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=redhat PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/redhat/ovms.tar.gz

Install required libraries:

sudo yum install -y compat-openssl11.x86_64

Set the path to the libraries and add the binary to the PATH:

export LD_LIBRARY_PATH=${PWD}/ovms/lib
export PATH=$PATH:${PWD}/ovms/bin

If the package was built with Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:

export PYTHONPATH=${PWD}/ovms/lib/python
sudo yum install -y python39-libs

Additionally, to use text generation, for example to run the text-generation demo, you need pip installed and the following dependencies:

pip3 install "Jinja2==3.1.4" "MarkupSafe==3.0.2"

Windows#

Make sure you have Microsoft Visual C++ Redistributable installed before moving forward.

Download and unpack model server archive for Windows:

curl -L -o ovms.zip <url_to_be_provided>
tar -xf ovms.zip

Run the setupvars script to set the required environment variables.

Windows Command Line

ovms\setupvars.bat

Windows PowerShell

.\ovms\setupvars.ps1

Note: Running this script changes Python settings for the shell that runs it. Environment variables are set only for the current shell, so make sure you rerun the script before using Model Server in a new shell.

You can also build Model Server from source by following the developer guide.
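
Regardless of how the package was obtained, you can verify that the binary is reachable from the current shell; the --version flag should print the deployed Model Server version:

ovms --version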

Test the Deployment#

Download a ResNet50 model:

mkdir -p models/resnet50/1

curl -k https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.xml -o models/resnet50/1/model.xml
curl -k https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.bin -o models/resnet50/1/model.bin

Start the server:

ovms --model_name resnet --model_path models/resnet50

or start it as a background process, a daemon initiated by systemctl/initd, or a Windows service, depending on the operating system and specific hosting requirements; a minimal systemd sketch is shown below.
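
As an illustration, a minimal systemd unit for the Linux deployments could look like the following sketch; the installation prefix /opt/ovms, the model location, the port, and the service user are assumptions to adjust for your host:

# /etc/systemd/system/ovms.service (hypothetical path and layout)
[Unit]
Description=OpenVINO Model Server
After=network.target

[Service]
# Assumes the ovms package was unpacked to /opt/ovms
Environment=LD_LIBRARY_PATH=/opt/ovms/lib
ExecStart=/opt/ovms/bin/ovms --model_name resnet --model_path /opt/models/resnet50 --port 9000
Restart=on-failure
User=ovms

[Install]
WantedBy=multi-user.target

After placing the file, sudo systemctl daemon-reload && sudo systemctl enable --now ovms.service starts the server and keeps it running across reboots.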

Most of the Model Server documentation demonstrates container usage, but the same can be achieved with just the binary package. Learn more about the model server starting parameters.
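
For example, assuming the server is started with the REST interface enabled (the port number below is an arbitrary choice), the TensorFlow Serving-style status endpoint can confirm the model is loaded:

# start the server with a REST port (example value)
ovms --model_name resnet --model_path models/resnet50 --rest_port 8000
# in another shell, query the model status
curl http://localhost:8000/v1/models/resnet

A response listing the model version in the AVAILABLE state means the deployment is serving correctly.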

NOTE: When serving models on AI accelerators, some additional steps may be required to install device drivers and dependencies. Learn more in the Additional Configurations for Hardware documentation.

Next Steps#

Additional Resources#