This tutorial shows, step by step, how to run inference on a PyTorch
classification model using OpenVINO Runtime.
Starting from the 2023.0 release, OpenVINO supports direct PyTorch
model conversion without an intermediate conversion to ONNX
format. If you use an earlier OpenVINO version or prefer
to use ONNX, please check this
In this tutorial, we will use the RegNetY_800MF model from torchvision
to demonstrate how to convert PyTorch models to OpenVINO Intermediate
Representation (IR).
The RegNet model was proposed in Designing Network Design
Spaces by Ilija Radosavovic, Raj
Prateek Kosaraju, Ross Girshick, Kaiming He, Piotr Dollár. The authors
design search spaces to perform Neural Architecture Search (NAS). They
first start from a high dimensional search space and iteratively reduce
the search space by empirically applying constraints based on the
best-performing models sampled by the current search space. Instead of
focusing on designing individual network instances, the authors design
network design spaces that parametrize populations of networks. The
overall process is analogous to the classic manual design of networks
but elevated to the design space level. The RegNet design space provides
simple and fast networks that work well across a wide range of flop
regimes.
Generally, PyTorch models are instances of the torch.nn.Module class,
initialized by a state dictionary with model weights. Typical steps for
getting a pre-trained model:
1. Create an instance of a model class.
2. Load the checkpoint state dict, which contains the pre-trained model weights.
3. Switch the model to evaluation mode so that some operations run in
   inference mode.
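The three steps above can be sketched on a toy model. This is only an illustration: TinyClassifier and the in-memory checkpoint are hypothetical stand-ins for a real model class and a real checkpoint file.

```python
import torch
from torch import nn


class TinyClassifier(nn.Module):
    """Hypothetical stand-in for a real model class."""

    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.features = nn.Linear(4, 8)
        self.dropout = nn.Dropout(p=0.5)
        self.head = nn.Linear(8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.dropout(torch.relu(self.features(x))))


# Stand-in for a checkpoint that would normally be read from disk
checkpoint = TinyClassifier().state_dict()

# 1. Create an instance of the model class
model = TinyClassifier()
# 2. Load the state dict with the pre-trained weights
model.load_state_dict(checkpoint)
# 3. Switch to evaluation mode (dropout becomes a no-op)
model.eval()

with torch.no_grad():
    first = model(torch.ones(1, 4))
    second = model(torch.ones(1, 4))
```

Because dropout is disabled in evaluation mode, the two forward passes return identical outputs.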
The torchvision module provides a ready-to-use set of functions for
model class initialization. We will use
torchvision.models.regnet_y_800mf. You can directly pass pre-trained
model weights to the model initialization function using the weights
enum RegNet_Y_800MF_Weights.DEFAULT.
import torchvision

# get default weights using available weights Enum for model
weights = torchvision.models.RegNet_Y_800MF_Weights.DEFAULT

# create model topology and load weights
model = torchvision.models.regnet_y_800mf(weights=weights)

# switch model to inference mode
model.eval()
The code below demonstrates how to preprocess input data using the
model-specific transforms module from torchvision. After
transformation, images should be combined into a batched tensor. In our
case, we run the model with batch size 1, so we just unsqueeze the input
on the first dimension.
import torch

# Initialize the Weight Transforms
preprocess = weights.transforms()

# Apply it to the input image
img_transformed = preprocess(image)

# Add batch dimension to image tensor
input_tensor = img_transformed.unsqueeze(0)
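The batching step can also be seen in isolation. The sketch below uses a random tensor as a hypothetical preprocessed image and contrasts unsqueeze for a single image with torch.stack for several images.

```python
import torch

# Hypothetical preprocessed image: channels x height x width
img_transformed = torch.rand(3, 224, 224)

# Batch of one: insert a new leading dimension
input_tensor = img_transformed.unsqueeze(0)

# Several images would instead be combined with torch.stack
batch = torch.stack([img_transformed, img_transformed])

print(input_tensor.shape, batch.shape)
```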
The model returns a vector of raw logits; softmax can be applied to
turn them into probabilities in the [0, 1] range. To demonstrate that
the original model and the converted OpenVINO model produce the same
output, we define a common postprocessing function that can be reused
later.
import numpy as np
from scipy.special import softmax

# Perform model inference on input tensor
result = model(input_tensor)


# Postprocessing function for getting results in the same way for both PyTorch model inference and OpenVINO
def postprocess_result(output_tensor: np.ndarray, top_k: int = 5):
    """
    Postprocess model results. This function applies softmax to the output tensor
    and returns the specified top_k number of labels with the highest probability.

    Parameters:
      output_tensor (np.ndarray): model output tensor with raw logits
      top_k (int, *optional*, default 5): number of labels with highest probability for return
    Returns:
      topk_labels: label ids for selected top_k scores
      topk_scores: selected top_k highest scores predicted by model
    """
    softmaxed_scores = softmax(output_tensor, -1)[0]
    topk_labels = np.argsort(softmaxed_scores)[-top_k:][::-1]
    topk_scores = softmaxed_scores[topk_labels]
    return topk_labels, topk_scores


# Postprocess results
top_labels, top_scores = postprocess_result(result.detach().numpy())

# Show results
display(image)
for idx, (label, score) in enumerate(zip(top_labels, top_scores)):
    _, predicted_label = imagenet_classes[label].split(" ", 1)
    print(f"{idx + 1}: {predicted_label} - {score * 100:.2f}%")
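To make the softmax and top-k selection concrete, here is a minimal sketch that uses NumPy only. The softmax_np helper and the four-class logits are hypothetical and serve just to show how argsort picks the highest-scoring labels.

```python
import numpy as np


def softmax_np(logits: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax written with NumPy only."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


# Hypothetical logits for a batch of one image and four classes
logits = np.array([[2.0, 1.0, 0.1, 3.0]])
probs = softmax_np(logits)[0]

# Indices of the two highest probabilities, best first
top2 = np.argsort(probs)[-2:][::-1]
print(top2, probs[top2])
```

Sorting in ascending order, taking the last top_k entries, and reversing them yields the labels from most to least probable, which is the same pattern postprocess_result relies on.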
18.1 ms ± 484 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Convert PyTorch Model to OpenVINO Intermediate Representation
Starting from the 2023.0 release, OpenVINO supports direct conversion of
PyTorch models to the OpenVINO Intermediate Representation (IR) format.
The OpenVINO model conversion API should be used for this purpose. More
details regarding PyTorch model conversion can be found in the OpenVINO
documentation.
The convert_model function accepts the PyTorch model object and
returns an openvino.Model instance that is ready to be loaded on a
device using core.compile_model or saved to disk for later use using
ov.save_model. Optionally, we can provide additional parameters,
such as:
- compress_to_fp16: flag to compress model weights to the FP16 data
  format. It may reduce the space required to store the model on disk
  and speed up inference on devices where FP16 calculation is supported.
- example_input: input data sample which can be used for model tracing.
- input: the shape of the input tensor for conversion.
and any other advanced options supported by the model conversion Python API.
More details can be found on this
import openvino as ov

# Create OpenVINO Core object instance
core = ov.Core()

# Convert model to openvino.runtime.Model object
ov_model = ov.convert_model(model)

# Save openvino.runtime.Model object on disk
ov.save_model(ov_model, MODEL_DIR / f"{MODEL_NAME}_dynamic.xml")

ov_model
# Run model inference
result = compiled_model(input_tensor)[0]

# Postprocess results
top_labels, top_scores = postprocess_result(result)

# Show results
display(image)
for idx, (label, score) in enumerate(zip(top_labels, top_scores)):
    _, predicted_label = imagenet_classes[label].split(" ", 1)
    print(f"{idx + 1}: {predicted_label} - {score * 100:.2f}%")
The default conversion path preserves dynamic input shapes. If you want
to convert the model with static shapes, you can explicitly specify
them during conversion using the input parameter or
reshape the model into the desired shape after conversion. For the model
reshaping example please check the following
# Convert model to openvino.runtime.Model object
ov_model = ov.convert_model(model, input=[[1, 3, 224, 224]])

# Save openvino.runtime.Model object on disk
ov.save_model(ov_model, MODEL_DIR / f"{MODEL_NAME}_static.xml")

ov_model
Now we can see that the input of our converted model is a tensor of
shape [1, 3, 224, 224] instead of the [?, 3, ?, ?] reported by the
previously converted model.
Run OpenVINO Model Inference with Static Input Shape
# Run model inference
result = compiled_model(input_tensor)[0]

# Postprocess results
top_labels, top_scores = postprocess_result(result)

# Show results
display(image)
for idx, (label, score) in enumerate(zip(top_labels, top_scores)):
    _, predicted_label = imagenet_classes[label].split(" ", 1)
    print(f"{idx + 1}: {predicted_label} - {score * 100:.2f}%")
Benchmark OpenVINO Model Inference with Static Input Shape
2.91 ms ± 10.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Convert TorchScript Model to OpenVINO Intermediate Representation
TorchScript is a way to create serializable and optimizable models from
PyTorch code. Any TorchScript program can be saved from a Python process
and loaded in a process where there is no Python dependency. More
details about TorchScript can be found in the PyTorch documentation.
There are two possible ways to convert a PyTorch model to TorchScript:
- torch.jit.script: scripting a function or nn.Module will inspect the
  source code, compile it as TorchScript code using the TorchScript
  compiler, and return a ScriptModule or ScriptFunction.
- torch.jit.trace: trace a function and return an executable or
  ScriptFunction that will be optimized using just-in-time compilation.
Let’s consider both approaches and their conversion into OpenVINO IR.
torch.jit.script inspects the model source code and compiles it to a
ScriptModule. After compilation, the model can be used for inference or
saved to disk using the torch.jit.save function and later restored with
torch.jit.load in any other environment without the original PyTorch
model code definitions.
TorchScript itself is a subset of the Python language, so not all
features in Python work, but TorchScript provides enough functionality
to compute on tensors and do control-dependent operations. For a
complete guide, see the TorchScript Language Reference.
# Get model path
scripted_model_path = MODEL_DIR / f"{MODEL_NAME}_scripted.pth"

# Compile and save model if it has not been compiled before or load compiled model
if not scripted_model_path.exists():
    scripted_model = torch.jit.script(model)
    torch.jit.save(scripted_model, scripted_model_path)
else:
    scripted_model = torch.jit.load(scripted_model_path)

# Run scripted model inference
result = scripted_model(input_tensor)

# Postprocess results
top_labels, top_scores = postprocess_result(result.detach().numpy())

# Show results
display(image)
for idx, (label, score) in enumerate(zip(top_labels, top_scores)):
    _, predicted_label = imagenet_classes[label].split(" ", 1)
    print(f"{idx + 1}: {predicted_label} - {score * 100:.2f}%")
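The remark about control-dependent operations can be illustrated with a small sketch. The Gate module is hypothetical; it contains a data-dependent branch that scripting preserves, and the model is saved to and restored from an in-memory buffer instead of a file.

```python
import io

import torch
from torch import nn


class Gate(nn.Module):
    """Hypothetical module whose output depends on the input values."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Data-dependent branch: kept as real control flow by scripting
        if x.sum() > 0:
            return x * 2
        return x + 10


scripted = torch.jit.script(Gate())

# Serialize to an in-memory buffer (a file path works the same way)
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)

# Restore the model from the serialized TorchScript program
restored = torch.jit.load(buffer)
print(restored(torch.ones(2)), restored(-torch.ones(2)))
```

Both branches remain available after the round trip, so positive and negative inputs are handled differently, exactly as in the eager module.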
14.3 ms ± 70 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Convert PyTorch Scripted Model to OpenVINO Intermediate Representation
The conversion step for the scripted model to OpenVINO IR is similar to
the original PyTorch model.
# Convert model to openvino.runtime.Model object
ov_model = ov.convert_model(scripted_model)

# Load OpenVINO model on device
compiled_model = core.compile_model(ov_model, device.value)

# Run OpenVINO model inference
result = compiled_model(input_tensor)[0]

# Postprocess results
top_labels, top_scores = postprocess_result(result)

# Show results
display(image)
for idx, (label, score) in enumerate(zip(top_labels, top_scores)):
    _, predicted_label = imagenet_classes[label].split(" ", 1)
    print(f"{idx + 1}: {predicted_label} - {score * 100:.2f}%")
Using torch.jit.trace, you can turn an existing module or Python
function into a TorchScript ScriptFunction or ScriptModule. You
must provide example inputs; the model will be executed, and the
operations performed on all the tensors will be recorded.
- The resulting recording of a standalone function produces a
  ScriptFunction.
- The resulting recording of nn.Module.forward or nn.Module
  produces a ScriptModule.
As with the scripted model, the traced model can be used for
inference or saved to disk using the torch.jit.save function and later
restored with torch.jit.load in any other environment without the
original PyTorch model code definitions.
# Get model path
traced_model_path = MODEL_DIR / f"{MODEL_NAME}_traced.pth"

# Trace and save model if it has not been traced before or load traced model
if not traced_model_path.exists():
    traced_model = torch.jit.trace(model, example_inputs=input_tensor)
    torch.jit.save(traced_model, traced_model_path)
else:
    traced_model = torch.jit.load(traced_model_path)

# Run traced model inference
result = traced_model(input_tensor)

# Postprocess results
top_labels, top_scores = postprocess_result(result.detach().numpy())

# Show results
display(image)
for idx, (label, score) in enumerate(zip(top_labels, top_scores)):
    _, predicted_label = imagenet_classes[label].split(" ", 1)
    print(f"{idx + 1}: {predicted_label} - {score * 100:.2f}%")
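Because tracing only records the operations executed for the example input, data-dependent branches are not captured. The following sketch, built around a hypothetical gate function, shows the traced result diverging from eager execution once the input takes the other branch.

```python
import warnings

import torch


def gate(x: torch.Tensor) -> torch.Tensor:
    # Data-dependent branch
    if x.sum() > 0:
        return x * 2
    return x + 10


with warnings.catch_warnings():
    # Tracing warns that the branch condition is treated as a constant
    warnings.simplefilter("ignore")
    traced = torch.jit.trace(gate, torch.ones(2))

positive = traced(torch.ones(2))
negative = traced(-torch.ones(2))  # still follows the recorded "x * 2" path
eager_negative = gate(-torch.ones(2))

print(positive, negative, eager_negative)
```

For models such as the classifier in this tutorial, whose forward pass has no input-dependent control flow, tracing and scripting should produce equivalent results.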
15.2 ms ± 315 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Convert PyTorch Traced Model to OpenVINO Intermediate Representation
The conversion step for a traced model to OpenVINO IR is similar to the
original PyTorch model.
# Convert model to openvino.runtime.Model object
ov_model = ov.convert_model(traced_model)

# Load OpenVINO model on device
compiled_model = core.compile_model(ov_model, device.value)

# Run OpenVINO model inference
result = compiled_model(input_tensor)[0]

# Postprocess results
top_labels, top_scores = postprocess_result(result)

# Show results
display(image)
for idx, (label, score) in enumerate(zip(top_labels, top_scores)):
    _, predicted_label = imagenet_classes[label].split(" ", 1)
    print(f"{idx + 1}: {predicted_label} - {score * 100:.2f}%")