Deploying Model Server on Baremetal
Model Server can be deployed outside of a container. To deploy it on bare metal, use the pre-compiled binaries for Ubuntu 22.04, Ubuntu 24.04, RHEL 9, or Windows 11.
The model server package comes in two configurations: one with Python support (bundling a Python environment for Python code execution) and one without the Python dependency (C++ only). The lack of Python code execution in the C++-only package comes with the following limitations:
- Deploying Python nodes is not available.
- Chat template application for LLM servables (used when requesting generation on the chat/completions endpoint) supports only basic user/assistant messages; more complex templates that use Pythonic syntax for flow control or input processing might not render all parts of the prompt correctly (see the example request after this list).
- The system message is not included in the prompt.
- Due to limited template support, using tools is not possible.
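For illustration, a request with only user/assistant messages renders correctly with the C++-only package. A minimal sketch, assuming a model is already served under the hypothetical name llama and the REST port is 8000:
curl http://localhost:8000/v3/chat/completions -H "Content-Type: application/json" -d '{"model": "llama", "messages": [{"role": "user", "content": "What is OpenVINO?"}]}'
Adding a system message or a tools list to such a request would not be reflected in the rendered prompt when using the C++-only package.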
Ubuntu 22.04
Download the precompiled package (without Python):
wget https://github.com/openvinotoolkit/model_server/releases/download/v2025.2/ovms_ubuntu22.tar.gz
tar -xzvf ovms_ubuntu22.tar.gz
or the precompiled package (with Python):
wget https://github.com/openvinotoolkit/model_server/releases/download/v2025.2/ovms_ubuntu22_python_on.tar.gz
tar -xzvf ovms_ubuntu22_python_on.tar.gz
Install required libraries:
sudo apt update -y && sudo apt install -y libxml2 curl
Set the path to the libraries and add the binary to the PATH:
export LD_LIBRARY_PATH=${PWD}/ovms/lib
export PATH=$PATH:${PWD}/ovms/bin
For the version with Python, also run:
export PYTHONPATH=${PWD}/ovms/lib/python
sudo apt -y install libpython3.10
pip3 install "Jinja2==3.1.6" "MarkupSafe==3.0.2"
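As a quick sanity check of the Python dependencies (a hypothetical verification step, not part of the official instructions), you can confirm that Jinja2 imports and matches the pinned version:
python3 -c "import jinja2; print(jinja2.__version__)"
This should print 3.1.6.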
Ubuntu 24.04
Download the precompiled package (without Python):
wget https://github.com/openvinotoolkit/model_server/releases/download/v2025.2/ovms_ubuntu24.tar.gz
tar -xzvf ovms_ubuntu24.tar.gz
or the precompiled package (with Python):
wget https://github.com/openvinotoolkit/model_server/releases/download/v2025.2/ovms_ubuntu24_python_on.tar.gz
tar -xzvf ovms_ubuntu24_python_on.tar.gz
Install required libraries:
sudo apt update -y && sudo apt install -y libxml2 curl
Set the path to the libraries and add the binary to the PATH:
export LD_LIBRARY_PATH=${PWD}/ovms/lib
export PATH=$PATH:${PWD}/ovms/bin
For the version with Python, also run:
export PYTHONPATH=${PWD}/ovms/lib/python
sudo apt -y install libpython3.12
pip3 install "Jinja2==3.1.6" "MarkupSafe==3.0.2"
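To verify that the shared libraries resolve (assuming the package was extracted into the current directory and LD_LIBRARY_PATH is exported as above), you can inspect the binary with ldd; empty output means every dependency was found:
ldd ovms/bin/ovms | grep "not found"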
RHEL 9
Download the precompiled package (without Python):
wget https://github.com/openvinotoolkit/model_server/releases/download/v2025.2/ovms_redhat.tar.gz
tar -xzvf ovms_redhat.tar.gz
or the precompiled package (with Python):
wget https://github.com/openvinotoolkit/model_server/releases/download/v2025.2/ovms_redhat_python_on.tar.gz
tar -xzvf ovms_redhat_python_on.tar.gz
Install required libraries:
sudo yum install compat-openssl11.x86_64
Set the path to the libraries and add the binary to the PATH:
export LD_LIBRARY_PATH=${PWD}/ovms/lib
export PATH=$PATH:${PWD}/ovms/bin
For the version with Python, also run:
export PYTHONPATH=${PWD}/ovms/lib/python
sudo yum install -y python39-libs
pip3 install "Jinja2==3.1.6" "MarkupSafe==3.0.2"
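At this point the binary should be on the PATH and its libraries should load. A simple verification, which should print the model server and OpenVINO versions:
ovms --version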
Windows 11
Make sure you have the Microsoft Visual C++ Redistributable installed before moving forward.
Download and unpack the model server archive for Windows (with Python):
curl -L https://github.com/openvinotoolkit/model_server/releases/download/v2025.2/ovms_windows_python_on.zip -o ovms.zip
tar -xf ovms.zip
or the archive without Python:
curl -L https://github.com/openvinotoolkit/model_server/releases/download/v2025.2/ovms_windows_python_off.zip -o ovms.zip
tar -xf ovms.zip
Run the setupvars script to set the required environment variables.
Windows Command Line
.\ovms\setupvars.bat
Windows PowerShell
.\ovms\setupvars.ps1
Note: If the package contains Python, running this script changes Python settings for the shell that runs it. Environment variables are set only for the current shell, so make sure to rerun the script before using the model server in a new shell.
Note: If the package contains Python, OVMS uses Python's Jinja package to apply chat templates when serving LLMs. In that case, make sure the Windows "Beta: Use Unicode UTF-8 for worldwide language support" option is enabled.
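To confirm the environment variables are set in the current shell, you can run a simple check such as:
ovms --version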
You can also build the model server from source by following the developer guide.
Test the Deployment
Download the ResNet50 model:
curl --create-dirs -k https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.xml -o models/resnet50/1/model.xml
curl --create-dirs -k https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.bin -o models/resnet50/1/model.bin
On Linux, run:
chmod -R 755 models
Start the server:
ovms --port 9000 --model_name resnet --model_path models/resnet50
Alternatively, start it as a background process, a daemon initiated by systemctl/initd, or a Windows service, depending on the operating system and specific hosting requirements (see the sketch below).
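For example, on systemd-based Linux distributions a minimal unit file (saved e.g. as /etc/systemd/system/ovms.service) could look like the following sketch; the installation paths are assumptions and need to match your setup:
[Unit]
Description=OpenVINO Model Server
After=network.target

[Service]
Environment=LD_LIBRARY_PATH=/opt/ovms/lib
ExecStart=/opt/ovms/bin/ovms --port 9000 --model_name resnet --model_path /opt/models/resnet50
Restart=on-failure

[Install]
WantedBy=multi-user.target
Once the server is running, you can verify it responds. A minimal check, assuming you also expose a REST port with --rest_port, using the KServe readiness endpoints:
ovms --port 9000 --rest_port 8000 --model_name resnet --model_path models/resnet50
curl http://localhost:8000/v2/health/ready
curl http://localhost:8000/v2/models/resnet/ready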
Most of the Model Server documentation demonstrates container usage, but the same can be achieved with just the binary package. Learn more about the model server starting parameters.
NOTE: When serving models on AI accelerators, some additional steps may be required to install device drivers and dependencies. Learn more in the Additional Configurations for Hardware documentation.