Hello Reshape SSD Python Sample

This sample demonstrates how to do synchronous inference of object detection models using the Shape Inference feature. Models with only one input and one output are supported.
| Options | Values |
|---|---|
| Validated Models | mobilenet-ssd |
| Validated Layout | NCHW |
| Model Format | OpenVINO™ toolkit Intermediate Representation (.xml + .bin), ONNX (.onnx) |
| Supported devices | All |
| Other language realization | C++ |
The following Python API is used in the application:
| Feature | API | Description |
|---|---|---|
| Model Operations | openvino.runtime.Model.reshape, openvino.runtime.Model.input, openvino.runtime.Output.get_any_name, openvino.runtime.PartialShape | Managing the model |
Basic OpenVINO™ Runtime API is covered by Hello Classification Python* Sample.
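For orientation, the reshape-specific API from the table boils down to a few calls made before compilation. A minimal sketch, assuming a single-input model; the model path, device, and target shape are placeholders:

from openvino.runtime import Core, PartialShape

core = Core()
model = core.read_model('model.xml')  # placeholder path
# Reshape the single model input to a new N, C, H, W shape before compiling
model.reshape({model.input().get_any_name(): PartialShape([1, 3, 480, 640])})
compiled_model = core.compile_model(model, 'CPU')

The full sample below applies the same call with dimensions taken from the input image.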
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Copyright (C) 2018-2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import logging as log
import os
import sys

import cv2
import numpy as np
from openvino.preprocess import PrePostProcessor
from openvino.runtime import Core, Layout, PartialShape, Type


def main():
    log.basicConfig(format='[ %(levelname)s ] %(message)s', level=log.INFO, stream=sys.stdout)

    # Parsing and validation of input arguments
    if len(sys.argv) != 4:
        log.info(f'Usage: {sys.argv[0]} <path_to_model> <path_to_image> <device_name>')
        return 1

    model_path = sys.argv[1]
    image_path = sys.argv[2]
    device_name = sys.argv[3]

    # --------------------------- Step 1. Initialize OpenVINO Runtime Core ------------------------------------------------
    log.info('Creating OpenVINO Runtime Core')
    core = Core()

    # --------------------------- Step 2. Read a model --------------------------------------------------------------------
    log.info(f'Reading the model: {model_path}')
    # (.xml and .bin files) or (.onnx file)
    model = core.read_model(model_path)

    if len(model.inputs) != 1:
        log.error('Sample supports only single input topologies')
        return -1

    if len(model.outputs) != 1:
        log.error('Sample supports only single output topologies')
        return -1

    # --------------------------- Step 3. Set up input --------------------------------------------------------------------
    # Read input image
    image = cv2.imread(image_path)
    # Add N dimension
    input_tensor = np.expand_dims(image, 0)

    log.info('Reshaping the model to the height and width of the input image')
    n, h, w, c = input_tensor.shape
    model.reshape({model.input().get_any_name(): PartialShape((n, c, h, w))})

    # --------------------------- Step 4. Apply preprocessing -------------------------------------------------------------
    ppp = PrePostProcessor(model)

    # 1) Set input tensor information:
    # - input() provides information about a single model input
    # - precision of tensor is supposed to be 'u8'
    # - layout of data is 'NHWC'
    ppp.input().tensor() \
        .set_element_type(Type.u8) \
        .set_layout(Layout('NHWC'))  # noqa: N400

    # 2) Here we suppose model has 'NCHW' layout for input
    ppp.input().model().set_layout(Layout('NCHW'))

    # 3) Set output tensor information:
    # - precision of tensor is supposed to be 'f32'
    ppp.output().tensor().set_element_type(Type.f32)

    # 4) Apply preprocessing, modifying the original 'model'
    model = ppp.build()

    # --------------------------- Step 5. Loading model to the device -----------------------------------------------------
    log.info('Loading the model to the plugin')
    compiled_model = core.compile_model(model, device_name)

    # --------------------------- Step 6. Create infer request and do inference synchronously -----------------------------
    log.info('Starting inference in synchronous mode')
    results = compiled_model.infer_new_request({0: input_tensor})

    # --------------------------- Step 7. Process output ------------------------------------------------------------------
    predictions = next(iter(results.values()))

    # Reshape the numpy.ndarray with results from [1, 1, N, 7] to [N, 7],
    # where N is the number of detected bounding boxes
    detections = predictions.reshape(-1, 7)

    # Each detection is [image_id, class_id, confidence, xmin, ymin, xmax, ymax],
    # with box coordinates normalized to [0, 1]
    for detection in detections:
        confidence = detection[2]

        if confidence > 0.5:
            class_id = int(detection[1])

            xmin = int(detection[3] * w)
            ymin = int(detection[4] * h)
            xmax = int(detection[5] * w)
            ymax = int(detection[6] * h)

            log.info(f'Found: class_id = {class_id}, confidence = {confidence:.2f}, '
                     f'coords = ({xmin}, {ymin}), ({xmax}, {ymax})')

            # Draw a bounding box on the output image
            cv2.rectangle(image, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)

    cv2.imwrite('out.bmp', image)
    if os.path.exists('out.bmp'):
        log.info('Image out.bmp was created!')
    else:
        log.error('Image out.bmp was not created. Check your permissions.')

    # ----------------------------------------------------------------------------------------------------------------------
    log.info('This sample is an API example, for any performance measurements please use the dedicated benchmark_app tool\n')
    return 0


if __name__ == '__main__':
    sys.exit(main())
How It Works
At startup, the sample application reads command-line parameters, prepares input data, loads the specified model and image to the OpenVINO™ Runtime plugin, performs synchronous inference, and processes output data. As a result, the program creates an output image, logging each step in the standard output stream.
You can find the explicit description of each sample step in the Integration Steps section of the “Integrate OpenVINO™ Runtime with Your Application” guide.
Running

python hello_reshape_ssd.py <path_to_model> <path_to_image> <device_name>

To run the sample, you need to specify a model and an image:

- You can use public or Intel’s pre-trained models from the Open Model Zoo. The models can be downloaded using the Model Downloader.
- You can use images from the media files collection available at https://storage.openvinotoolkit.org/data/test_data.
Note
By default, OpenVINO™ Toolkit Samples and demos expect input with BGR channel order. If you trained your model to work with RGB order, you need to manually rearrange the default channel order in the sample or demo application, or reconvert your model using model conversion API with the reverse_input_channels argument specified. For more information about the argument, refer to the When to Reverse Input Channels section of Embedding Preprocessing Computation.
Before running the sample with a trained model, make sure the model is converted to the intermediate representation (IR) format (*.xml + *.bin) using model conversion API. The sample accepts models in ONNX format (.onnx) that do not require preprocessing. A conversion sketch follows this note.
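As an illustration only, here is a hedged sketch of converting a model while reversing its input channels, using the Python model conversion API from the openvino-dev package; the model path is a placeholder, and convert_model availability depends on the installed OpenVINO version:

from openvino.runtime import serialize
from openvino.tools.mo import convert_model

# Convert the model and reverse its input channels (RGB -> BGR) in one step
ov_model = convert_model('model.onnx', reverse_input_channels=True)  # placeholder path
serialize(ov_model, 'model.xml')  # writes model.xml and model.bin (IR)

The equivalent command-line form is mo --input_model model.onnx --reverse_input_channels.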
Example

1. Install the openvino-dev Python package to use Open Model Zoo Tools:

   python -m pip install openvino-dev[caffe]

2. Download a pre-trained model:

   omz_downloader --name mobilenet-ssd

3. If a model is not in the IR or ONNX format, it must be converted. You can do this using the model converter:

   omz_converter --name mobilenet-ssd

4. Perform inference of banana.jpg using the mobilenet-ssd model on a GPU, for example:

   python hello_reshape_ssd.py mobilenet-ssd.xml banana.jpg GPU
Sample Output

The sample application logs each step in the standard output stream and creates an output image, drawing bounding boxes for inference results with over 50% confidence.
[ INFO ] Creating OpenVINO Runtime Core
[ INFO ] Reading the model: C:/test_data/models/mobilenet-ssd.xml
[ INFO ] Reshaping the model to the height and width of the input image
[ INFO ] Loading the model to the plugin
[ INFO ] Starting inference in synchronous mode
[ INFO ] Found: class_id = 52, confidence = 0.98, coords = (21, 98), (276, 210)
[ INFO ] Image out.bmp was created!
[ INFO ] This sample is an API example, for any performance measurements please use the dedicated benchmark_app tool