This tutorial describes how to run post-training quantization of the MobileNet v2 model from the PyTorch framework. It covers every step, from preparing the model and validating the full-precision model to quantizing it and benchmarking the performance gain after optimization. All steps below rely on the tools and sample configuration files distributed with the Intel® Distribution of OpenVINO™ toolkit.

In the instructions below, <INSTALL_DIR> is the directory where the Intel® Distribution of OpenVINO™ toolkit is installed, and <POT_DIR> is the Post-Training Optimization Tool (POT) directory, which is <INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit.

Sample configuration files are located in the <POT_DIR>/configs/examples folder.

Environment setup

Install OpenVINO together with its Post-Training Optimization Tool and the AccuracyChecker tool. For more details, please refer to this document.
Activate the Python* environment and the OpenVINO environment as described in the same document.
Create a separate working directory and navigate to it.
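
Before moving on, it can help to confirm that the command-line tools used later in this tutorial are visible in the activated environment. The following is a minimal sketch, assuming that `accuracy_check` and `pot` are installed on `PATH` under the names used in the commands of this tutorial; `find_tools` is a hypothetical helper, not part of the toolkit.

```python
import shutil

# Command-line tools invoked later in this tutorial.
REQUIRED_TOOLS = ("accuracy_check", "pot")


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND (is the environment activated?)'}")
```

If either tool is reported as not found, the environment from the previous step is most likely not activated in the current shell.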

Model preparation

Download the MobileNet v2 PyTorch model using the Model Downloader tool from the Open Model Zoo repository:
python3 <INSTALL_DIR>/deployment_tools/open_model_zoo/tools/downloader/downloader.py --name mobilenet-v2-pytorch
Launch the Model Converter tool to generate the OpenVINO IR model:
python3 <INSTALL_DIR>/deployment_tools/open_model_zoo/tools/downloader/converter.py --name mobilenet-v2-pytorch
Prepare the ImageNet dataset by following the instructions in <POT_DIR>/libs/open_model_zoo/datasets.md.
Prepare the AccuracyChecker configuration file (.yml) for the MobileNet v2 PyTorch model:
- Copy a template of this configuration file:
  cp <POT_DIR>/libs/open_model_zoo/tools/accuracy_checker/configs/mobilenet-v2-pytorch.yml ./
- Override the paths to the model, dataset, and annotations inside the configuration file.
Evaluate the accuracy of the full-precision model:
accuracy_check -c mobilenet-v2-pytorch.yml

The expected result is a top-1 accuracy of 71.82%.
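
The download, conversion, and evaluation commands above can be chained in a small driver script. This is a sketch under stated assumptions: `preparation_commands` and `run_all` are hypothetical helpers, the directory holding `downloader.py` and `converter.py` is passed in by the caller, and the dataset and the AccuracyChecker configuration file are assumed to be prepared already. By default it only prints the commands.

```python
import shlex
import subprocess
from pathlib import Path

MODEL_NAME = "mobilenet-v2-pytorch"


def preparation_commands(downloader_dir, ac_config="mobilenet-v2-pytorch.yml",
                         python="python3"):
    """Build the download, convert, and evaluate commands as argument lists."""
    downloader_dir = Path(downloader_dir)
    return [
        [python, str(downloader_dir / "downloader.py"), "--name", MODEL_NAME],
        [python, str(downloader_dir / "converter.py"), "--name", MODEL_NAME],
        ["accuracy_check", "-c", ac_config],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it too unless dry_run is set."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder path: point this at the Model Downloader directory.
    run_all(preparation_commands("/path/to/open_model_zoo/tools/downloader"))
```

Passing `dry_run=False` executes the steps in order and stops at the first failure, so a broken download does not lead to a misleading evaluation run.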

Model quantization

Prepare the POT configuration file (.json) for INT8 quantization of the MobileNet v2 PyTorch model with the DefaultQuantization algorithm:
- Copy a template of this configuration file:
  cp <POT_DIR>/configs/examples/quantization/classification/mobilenetV2_pytorch_int8.json ./
- Override the paths to the model and to the AccuracyChecker YAML configuration file.
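
To make the overridden fields concrete, the sketch below builds a configuration of the general shape used by POT sample files and prints it as JSON. It is an illustration, not a copy of the shipped template: the key layout reflects the usual structure of POT configuration files as far as the author of this sketch knows it, the `preset` and `stat_subset_size` values are assumptions, and `build_pot_config` is a hypothetical helper. The template copied above remains the authoritative source.

```python
import json


def build_pot_config(model_xml, model_bin, ac_config):
    """Return a DefaultQuantization configuration as a plain dictionary."""
    return {
        "model": {
            "model_name": "mobilenet-v2-pytorch",
            "model": str(model_xml),    # IR topology file (.xml)
            "weights": str(model_bin),  # IR weights file (.bin)
        },
        "engine": {
            "config": str(ac_config),   # AccuracyChecker YAML file
        },
        "compression": {
            "target_device": "CPU",
            "algorithms": [
                {
                    "name": "DefaultQuantization",
                    "params": {
                        "preset": "performance",   # assumed value
                        "stat_subset_size": 300,   # assumed value
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder paths: replace them with the converted IR files.
    config = build_pot_config("mobilenet-v2-pytorch.xml",
                              "mobilenet-v2-pytorch.bin",
                              "./mobilenet-v2-pytorch.yml")
    print(json.dumps(config, indent=4))
```
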
Run the POT tool to produce the quantized model. The -e option also evaluates the accuracy of the quantized model. The resulting model is placed in a subfolder of the result directory:
pot -c mobilenetV2_pytorch_int8.json -e

The expected result is a top-1 accuracy of 71.42% on a VNNI-based CPU. Note: results may differ on CPUs with other instruction sets.
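
The cost of quantization can be read off the two accuracy figures quoted in this tutorial (71.82% for the full-precision model and 71.42% for the INT8 model). The short sketch below does the arithmetic; `accuracy_drop` is a hypothetical helper introduced only for illustration.

```python
def accuracy_drop(fp_top1, int8_top1):
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = fp_top1 - int8_top1
    relative = 100.0 * absolute / fp_top1
    return absolute, relative


if __name__ == "__main__":
    # Top-1 accuracies quoted in this tutorial.
    absolute, relative = accuracy_drop(71.82, 71.42)
    print(f"absolute drop: {absolute:.2f} percentage points")  # 0.40
    print(f"relative drop: {relative:.2f}%")                   # 0.56
```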

Performance benchmarking of the quantized model

To observe the speedup gained from quantization, run benchmark_app for both the original and the quantized models:

<INSTALL_DIR>/deployment_tools/tools/benchmark_tool/benchmark_app.py -m <PATH_TO_MODEL>
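
The comparison can also be scripted. The following is a rough sketch, not part of the toolkit: it assumes that benchmark_app prints a line of the form `Throughput: <number> FPS` (adjust the pattern if your version reports results differently), and `parse_throughput`, `run_benchmark`, and `speedup` are hypothetical helpers. Only the `-m` option shown above is used.

```python
import re
import subprocess

# Assumed output format; adjust if your benchmark_app prints something else.
THROUGHPUT_RE = re.compile(r"Throughput:\s*([0-9]+(?:\.[0-9]+)?)\s*FPS")


def parse_throughput(output):
    """Extract the throughput in FPS from benchmark_app output, or None."""
    match = THROUGHPUT_RE.search(output)
    return float(match.group(1)) if match else None


def run_benchmark(benchmark_app, model_path):
    """Run benchmark_app on one model and return its reported throughput."""
    result = subprocess.run(
        [benchmark_app, "-m", model_path],
        capture_output=True, text=True, check=True,
    )
    return parse_throughput(result.stdout)


def speedup(original_fps, quantized_fps):
    """Ratio of quantized to original throughput (above 1 means faster)."""
    return quantized_fps / original_fps
```

Calling `run_benchmark` once with the full-precision IR and once with the quantized IR, then passing both values to `speedup`, gives a single number to compare across machines.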