Quickstart Guide
OpenVINO Model Server can perform inference using pre-trained models in OpenVINO IR, ONNX, PaddlePaddle, or TensorFlow format. You can get them by:
downloading models from Open Model Zoo
converting other formats using Model Optimizer
This guide uses a face detection model in IR format.
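If you start from another format, the Model Optimizer Python API can perform the conversion. Below is a minimal sketch, assuming the openvino-dev package is installed and using an illustrative input file named model.onnx:

# Convert a model to OpenVINO IR and save it where the server expects it;
# "model.onnx" and the output paths are illustrative.
from openvino.tools.mo import convert_model
from openvino.runtime import serialize

ov_model = convert_model("model.onnx")  # returns an in-memory openvino Model
serialize(ov_model, "model/1/model.xml", "model/1/model.bin")  # writes the IR pair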
To quickly start using OpenVINO™ Model Server, follow these steps:
Prepare Docker
Download or build the OpenVINO™ Model Server
Provide a model
Start the Model Server Container
Prepare the Example Client Components
Download data for inference
Run inference
Review the results
Step 1: Prepare Docker
Install Docker Engine, including its post-installation steps, on your development system. To verify the installation, run the following command. If it pulls a test image and prints a confirmation message, Docker is ready.
docker run hello-world
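A successful run prints a confirmation that begins like this:

Hello from Docker!
This message shows that your installation appears to be working correctly.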
Step 2: Download the Model Server
Download the Docker image that contains OpenVINO Model Server:
docker pull openvino/model_server:latest
Step 3: Provide a Model
Store components of the model in the model/1 directory. Here is an example command using curl and a face detection model:
curl --create-dirs \
  https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/face-detection-retail-0004/FP32/face-detection-retail-0004.xml \
  https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/face-detection-retail-0004/FP32/face-detection-retail-0004.bin \
  -o model/1/face-detection-retail-0004.xml \
  -o model/1/face-detection-retail-0004.bin
Note
For ONNX models, additional steps are required. For a detailed description, refer to our ONNX format example.
OpenVINO Model Server expects a particular folder structure for models. In this case, the model directory has the following content:
model/
└── 1
├── face-detection-retail-0004.bin
└── face-detection-retail-0004.xml
Sub-folder 1 indicates the model version. If you want to upgrade the model, add other versions in separate subfolders (2, 3, …). For more information about the directory structure and how to deploy multiple models at a time, refer to the model repository documentation.
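For example, after adding a second version the repository could look as follows (illustrative layout; by default, the server serves the latest available version):

model/
├── 1
│   ├── face-detection-retail-0004.bin
│   └── face-detection-retail-0004.xml
└── 2
    ├── face-detection-retail-0004.bin
    └── face-detection-retail-0004.xml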
Step 4: Start the Model Server Container
Start the container:
docker run -d -u $(id -u):$(id -g) -v $(pwd)/model:/models/face-detection -p 9000:9000 openvino/model_server:latest \
--model_path /models/face-detection --model_name face-detection --port 9000 --shape auto
During this step, the model folder is mounted into the Docker container. This folder serves as the model storage from which the server loads models. The --shape auto parameter makes the server reshape the model to match the input shape of incoming requests.
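Before sending inference requests, you can check that the model has loaded. Below is a minimal status query, assuming the ovmsclient Python package (pip install ovmsclient) is installed on the host:

# Query the model status over gRPC; assumes the container from this step
# is running on localhost:9000 and serving the "face-detection" model.
from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")
# Maps each served version to its state,
# e.g. {1: {'state': 'AVAILABLE', 'error_code': 0, 'error_message': 'OK'}}
print(client.get_model_status(model_name="face-detection"))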
Step 5: Prepare the Example Client Components
Client scripts are available for quick access to the Model Server. Run the following command to download all required components:
curl --fail \
  https://raw.githubusercontent.com/openvinotoolkit/model_server/releases/2022/3/demos/common/python/client_utils.py -o client_utils.py \
  https://raw.githubusercontent.com/openvinotoolkit/model_server/releases/2022/3/demos/face_detection/python/face_detection.py -o face_detection.py \
  https://raw.githubusercontent.com/openvinotoolkit/model_server/releases/2022/3/demos/common/python/requirements.txt -o client_requirements.txt
Step 6: Download Data for Inference
Provide inference data by putting the files in a separate folder, as inference will be performed on all the files it contains. You can download example images for inference. This example uses the file people1.jpeg. Run the following command to download the image:
curl --fail --create-dirs https://raw.githubusercontent.com/openvinotoolkit/model_server/releases/2022/3/demos/common/static/images/people/people1.jpeg -o images/people1.jpeg
Step 7: Run Inference
Go to the folder with the client script and install dependencies. Create a folder for inference results and run the client script:
pip install --upgrade pip
pip install -r client_requirements.txt
mkdir results
python face_detection.py --batch_size 1 --width 600 --height 400 --input_images_dir images --output_dir results --grpc_port 9000
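For reference, the core of what face_detection.py does can be sketched with the ovmsclient package. The input name data and the fixed 200-detection output are assumptions based on this particular model; verify them via the model's metadata if you use a different one:

# Minimal gRPC inference sketch (pip install ovmsclient numpy opencv-python).
import cv2
import numpy as np
from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")

img = cv2.imread("images/people1.jpeg")        # HWC, BGR, uint8
img = cv2.resize(img, (600, 400))              # the width x height used in this guide
batch = img.transpose(2, 0, 1)[np.newaxis].astype(np.float32)  # NCHW, batch of 1

# With --shape auto the model is reshaped to the request shape (1, 3, 400, 600).
result = client.predict(inputs={"data": batch}, model_name="face-detection")
print(result.shape)                            # e.g. (1, 1, 200, 7)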
Step 8: Review the Results
You will see inference output similar to the following:
Start processing 1 iterations with batch size 1
Request shape (1, 3, 400, 600)
Response shape (1, 1, 200, 7)
image in batch item 0 , output shape (3, 400, 600)
detection 0 [[[0. 1. 1. 0.55241716 0.3024692 0.59122956
0.39170963]]]
x_min 331
y_min 120
x_max 354
y_max 156...
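Each detection row has the form [image_id, label, confidence, x_min, y_min, x_max, y_max], with coordinates normalized to the [0, 1] range. Scaling them by the request width and height reproduces the pixel values above:

# Rescale the normalized detection above to pixel coordinates,
# using the 600x400 request shape from this guide.
width, height = 600, 400
x_min, y_min, x_max, y_max = 0.55241716, 0.3024692, 0.59122956, 0.39170963
print(int(x_min * width), int(y_min * height),
      int(x_max * width), int(y_max * height))  # -> 331 120 354 156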
In the results folder, you can find files containing inference results. In this case, it is a modified input image with bounding boxes indicating detected faces.
Note: Similar steps can be performed with an ONNX model. Check the inference use case example with a public ResNet model in ONNX format or the TensorFlow model demo.
Congratulations, you have completed the Quickstart guide. Try Model Server demos or explore more features to create your application.