The Post-training Optimization Toolkit (POT) is designed to accelerate the inference of deep learning models by applying special methods, such as post-training quantization, that do not require model retraining or fine-tuning. Therefore, the tool does not require a training dataset or a training pipeline. To apply post-training algorithms from the POT, you need:
The tool aims to fully automate the model transformation process without changing the model structure. The POT is available only in the Intel® distribution of OpenVINO™ toolkit and is not open source. For details about the low-precision flow in OpenVINO™, see the Low Precision Optimization Guide.
Post-training Optimization Toolkit includes a standalone command-line tool and a Python* API that provide the following key features:
For benchmarking results collected for models optimized with the POT, see INT8 vs FP32 Comparison on Select Networks and Platforms.
Further documentation presumes that you are familiar with the basic deep learning concepts, such as model inference, dataset preparation, model optimization, as well as with the OpenVINO™ toolkit and its components such as Model Optimizer and AccuracyChecker.
In the instructions below, `<INSTALL_DIR>` is the directory where the Intel® distribution of OpenVINO™ toolkit is installed. The POT is distributed as a part of the OpenVINO™ release package. To use it as a command-line tool, you need to install it separately, together with its dependencies, namely Model Optimizer and AccuracyChecker. POT source files are available from the `<INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit` directory after the OpenVINO™ installation. It is recommended to create a separate Python* environment before installing OpenVINO™ and its components. To set up the POT in your environment, follow the steps below:
1. Set up the Model Optimizer from its `install_prerequisites` directory: `cd /opt/intel/openvino/deployment_tools/model_optimizer/install_prerequisites`
2. Go to the `<INSTALL_DIR>/deployment_tools/open_model_zoo/tools/accuracy_checker` directory.
3. Run the `setup.py` script:
   ```sh
   python3 setup.py install
   ```
4. Go to the `<INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit` directory.
5. Run the `setup.py` script, as in step 3.

Now the POT is available in the command line by the `pot` alias. To verify this, run `pot -h`.
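The verification step can also be scripted. The sketch below is a minimal example that assumes only what is stated above, namely that installation exposes a `pot` executable on `PATH`. The helper name `is_cli_available` is hypothetical and is not part of the POT.

```python
import shutil
import subprocess


def is_cli_available(name: str = "pot") -> bool:
    """Return True if an executable called `name` can be found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    if is_cli_available():
        # Same as running `pot -h` by hand; the help text goes to the terminal.
        subprocess.run(["pot", "-h"], check=True)
    else:
        print("pot was not found on PATH; check that the right Python environment is active")
```

A check like this is useful at the top of an automation script, because a missing alias usually means the wrong Python environment is active.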
Before running the POT, convert your pretrained model into the OpenVINO™ IR format with the Model Optimizer. In addition, it is highly recommended to use the AccuracyChecker to make sure that the model can be successfully inferred and achieves accuracy similar to that of the reference model from the original framework.
To run the command-line Post-training Optimization Tool:
1. Prepare a configuration file for the tool; see the `configs` folder. To simplify this step, use the Accuracy Checker configuration file for the floating-point model and refer to it when necessary. See Post-Training Optimization Best Practices.
2. Run the tool with the configuration file. For all available usage options, use the `-h`, `--help` arguments or refer to the Command-Line Arguments below.

By default, results are saved to the `results` folder, which is created in the same directory where the tool is run from. Use the `-e` option to evaluate the accuracy directly from the tool.

See the How to Run Examples tutorial about how to run a particular example of 8-bit quantization with the POT.
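As an illustration of the run step, the following sketch assembles a basic invocation from two options described in this document (`-c` for the configuration file, `-e` for evaluation) and works out where the default `results` folder would appear. The helper names and the placeholder config path are hypothetical; only the `pot` alias, the two flags, and the default folder name come from this document.

```python
import subprocess
from pathlib import Path
from typing import List


def basic_pot_command(config: str, evaluate: bool = False) -> List[str]:
    """Build the argument list for a basic `pot` run."""
    cmd = ["pot", "-c", config]
    if evaluate:
        cmd.append("-e")  # evaluate accuracy after optimization
    return cmd


def default_results_dir(run_dir: Path) -> Path:
    """Default output location: a `results` folder inside the directory the tool is run from."""
    return run_dir / "results"


def run_pot(config: str, run_dir: Path, evaluate: bool = False) -> Path:
    """Run the tool from `run_dir` and return the expected results folder.

    A relative `config` path is resolved against `run_dir`, because the
    process is started with that directory as its working directory.
    """
    subprocess.run(basic_pot_command(config, evaluate), cwd=run_dir, check=True)
    return default_results_dir(run_dir)


if __name__ == "__main__":
    # Placeholder path; printing the command does not require the POT to be installed.
    print(" ".join(basic_pot_command("path/to/config", evaluate=True)))
```

Passing the command as a list rather than a single string avoids shell quoting problems when the config path contains spaces.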
The following command-line options are available to run the tool:
Argument | Description |
---|---|
`-h`, `--help` | Optional. Show help message and exit. |
`-c CONFIG`, `--config CONFIG` | Path to a config file with task- or model-specific parameters. |
`-e`, `--evaluate` | Optional. Evaluate model on the whole dataset after optimization. |
`--output-dir OUTPUT_DIR` | Optional. A directory where results are saved. Default: `./results`. |
`-sm`, `--save-model` | Optional. Save the original full-precision model. |
`-d`, `--direct-dump` | Optional. Save results directly to output directory without additional subfolders. |
`--log-level {CRITICAL,ERROR,WARNING,INFO,DEBUG}` | Optional. Log level to print. Default: `INFO`. |
`--progress-bar` | Optional. Disable CL logging and enable progress bar. |
`--stream-output` | Optional. Switch model quantization progress display to a multiline mode. Use with third-party components. |
`--keep-uncompressed-weights` | Optional. Keep Convolution, Deconvolution and FullyConnected weights uncompressed. Use with third-party components. |
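When the tool is driven from a script, the options in the table can be collected in one place and turned into an argument list. The sketch below is one possible way to do that. The `PotOptions` class and its field names are hypothetical; the flag spellings and the log-level choices are taken from the table, and the two options marked for third-party components are left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional

# Choices accepted by --log-level, as listed in the table.
LOG_LEVELS = ("CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG")


@dataclass
class PotOptions:
    """Hypothetical container for the command-line options in the table."""

    config: str
    evaluate: bool = False
    output_dir: Optional[str] = None  # None keeps the tool's default (./results)
    save_model: bool = False
    direct_dump: bool = False
    log_level: Optional[str] = None  # None keeps the tool's default (INFO)
    progress_bar: bool = False

    def to_argv(self) -> List[str]:
        """Translate the options into an argument list for the `pot` alias."""
        if self.log_level is not None and self.log_level not in LOG_LEVELS:
            raise ValueError(f"unsupported log level: {self.log_level!r}")
        argv = ["pot", "--config", self.config]
        if self.evaluate:
            argv.append("--evaluate")
        if self.output_dir is not None:
            argv += ["--output-dir", self.output_dir]
        if self.save_model:
            argv.append("--save-model")
        if self.direct_dump:
            argv.append("--direct-dump")
        if self.log_level is not None:
            argv += ["--log-level", self.log_level]
        if self.progress_bar:
            argv.append("--progress-bar")
        return argv
```

For example, `PotOptions(config="path/to/config", evaluate=True, log_level="DEBUG").to_argv()` yields a list that can be passed to `subprocess.run`. Validating the log level before launching gives an earlier and clearer error than waiting for the tool to reject it.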