Post-Training Optimization Tool

Introduction

The Post-training Optimization Tool (POT) accelerates the inference of deep learning models by applying methods that do not require model retraining or fine-tuning, such as post-training quantization. The tool therefore needs neither a training dataset nor a training pipeline. To apply post-training algorithms from the POT, you need:

  • A floating-point model (FP32 or FP16) converted into the OpenVINO Intermediate Representation (IR) format that runs on CPU with OpenVINO.

  • A calibration dataset that is representative of the use case, for example, 300 images.
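As a rough illustration of the second requirement, the sketch below draws a reproducible random subset of samples to serve as a calibration set. It is not part of POT: the function name `select_calibration_subset`, the file names, and the fixed seed are all hypothetical, and the default of 300 only mirrors the example size mentioned above.

```python
import random


def select_calibration_subset(sample_paths, subset_size=300, seed=0):
    """Pick a reproducible random subset of samples for calibration."""
    # Sort first so the result does not depend on the input order.
    paths = sorted(str(p) for p in sample_paths)
    if len(paths) <= subset_size:
        return paths
    rng = random.Random(seed)  # fixed seed keeps the subset stable between runs
    return sorted(rng.sample(paths, subset_size))


# Hypothetical dataset of 1000 image files.
all_images = [f"images/img_{i:04d}.jpg" for i in range(1000)]
calibration_set = select_calibration_subset(all_images)
print(len(calibration_set))
```

Random sampling is only one way to obtain a representative set. If the data has strong class imbalance, sampling per class may describe the use case better.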

Post-training Optimization Tool provides the following key features:

  • Two post-training 8-bit quantization algorithms: fast DefaultQuantization and precise AccuracyAwareQuantization.

  • Compression for different hardware targets such as CPU and GPU.

  • Multiple domains: Computer Vision, Natural Language Processing, Recommendation Systems, Speech Recognition.

  • An API for applying optimization methods within a custom inference script written with the OpenVINO Python* API.

  • Symmetric and asymmetric quantization schemes. For details, see the Quantization section.

  • Per-channel quantization for Convolutional and Fully-Connected layers.

  • Global optimization of post-training quantization parameters using the Tree-Structured Parzen Estimator.
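To make the symmetric, asymmetric, and per-channel terms in the list above more concrete, here is a minimal pure-Python sketch of generic uniform 8-bit quantization. It is a textbook illustration, not POT's implementation: the function names, the narrow signed range [-127, 127] for the symmetric case, and the sample values are assumptions made for this example. The Quantization section remains the reference for what POT actually does.

```python
def symmetric_params(values, num_bits=8):
    """Scale for a signed range centered on zero; the zero point is always 0."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return scale, 0


def asymmetric_params(values, num_bits=8):
    """Scale and zero point that map [min, max] onto the unsigned range."""
    qmax = 2 ** num_bits - 1  # 255 for 8 bits
    lo = min(min(values), 0.0)  # keep 0.0 inside the range
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point, qmin, qmax):
    """Map floats to integers and clamp them to [qmin, qmax]."""
    return [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map integers back to approximate float values."""
    return [(q - zero_point) * scale for q in q_values]


# Hypothetical weights of a layer with two output channels.
weights = [[0.5, -1.0, 0.25], [0.01, -0.02, 0.005]]

# Per-tensor: one scale shared by all channels.
per_tensor_scale, _ = symmetric_params([w for row in weights for w in row])

# Per-channel: one scale per output channel, so small channels keep precision.
per_channel_scales = [symmetric_params(row)[0] for row in weights]
q_weights = [quantize(row, s, 0, -127, 127)
             for row, s in zip(weights, per_channel_scales)]

# Asymmetric quantization of hypothetical activations with a skewed range.
activations = [-1.0, 0.0, 0.5, 3.0]
a_scale, a_zp = asymmetric_params(activations)
q_activations = quantize(activations, a_scale, a_zp, 0, 255)

print(per_tensor_scale, per_channel_scales)
print(q_weights, q_activations)
```

In this sketch the second channel has much smaller weights than the first, so a shared per-tensor scale would spend almost the entire integer range on the first channel. Per-channel scales avoid that, which is the usual motivation for per-channel quantization of convolutional and fully connected layers.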

The tool aims to fully automate the model transformation process, with no need to change the model on the user’s side. The POT is available only in the Intel distribution of the OpenVINO toolkit and is not open source. For details about the low-precision flow in OpenVINO, see the Low Precision Optimization Guide.

For benchmarking results collected for models optimized with the POT, see INT8 vs FP32 Comparison on Select Networks and Platforms.

The rest of this documentation assumes that you are familiar with basic Deep Learning concepts, such as model inference, dataset preparation, and model optimization, as well as with the OpenVINO toolkit and its components, such as Model Optimizer and Accuracy Checker Tool.

Use POT

_images/workflow.png

The POT provides three basic usage scenarios:

  • Command-line interface: this is the recommended path if the model is from the OpenVINO Model Zoo or if there is a valid Accuracy Checker Tool configuration file for the model, so that model accuracy can be validated with the Accuracy Checker Tool.

  • Python* API: it integrates the optimization methods implemented in the POT into a Python* inference script written with the OpenVINO Python* API. This flow is recommended when the Accuracy Checker Tool cannot be used for validation on the dedicated dataset.

  • Deep Learning Workbench (DL Workbench): the OpenVINO toolkit UI that enables you to import a model, analyze its performance and accuracy, visualize the outputs, and optimize and prepare the model for deployment on various Intel platforms.
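For the command-line scenario, the tool is driven by a configuration file. The sketch below assembles such a configuration as a Python dictionary and serializes it to JSON. The key names (`model`, `engine`, `compression`, `algorithms`, `preset`, `stat_subset_size`) reflect the layout commonly used by POT configuration files, but treat them as assumptions and check them against the Configuration File Description for your release. The helper `build_pot_config` and all file paths are hypothetical.

```python
import json

# Algorithm names taken from the feature list of this document.
SUPPORTED_ALGORITHMS = ("DefaultQuantization", "AccuracyAwareQuantization")


def build_pot_config(model_xml, weights_bin, accuracy_checker_config,
                     algorithm="DefaultQuantization", target_device="CPU",
                     stat_subset_size=300):
    """Return a dict shaped like a POT configuration file (assumed layout)."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"Unknown algorithm: {algorithm}")
    return {
        # IR files produced by Model Optimizer (placeholder paths).
        "model": {
            "model_name": "example_model",
            "model": model_xml,
            "weights": weights_bin,
        },
        # Accuracy Checker Tool configuration used for data loading and validation.
        "engine": {"config": accuracy_checker_config},
        "compression": {
            "target_device": target_device,
            "algorithms": [
                {
                    "name": algorithm,
                    "params": {
                        "preset": "performance",
                        # Number of calibration samples used to collect statistics.
                        "stat_subset_size": stat_subset_size,
                    },
                }
            ],
        },
    }


config = build_pot_config("model.xml", "model.bin", "accuracy_checker.yml")
config_text = json.dumps(config, indent=4)
print(config_text)
```

The resulting JSON text would be saved to a file and passed to the command-line interface. The exact command and the full set of supported parameters depend on the POT version, so this sketch deliberately leaves them out.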

Note

POT also supports optimization in the so-called Simplified mode (see Configuration File Description), which is essentially a local implementation of the POT Python API aimed at quantizing Computer Vision models with a simple pre-processing and inference flow. However, using this mode can lead to an inaccurate model after optimization, because the preprocessing applied in this mode may differ from the preprocessing the model expects.

To get started with POT, follow the Installation Guide.