In the instructions below, the Post-Training Optimization Tool directory <INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit is referred to as <POT_DIR>. <INSTALL_DIR> is the directory where the Intel® Distribution of OpenVINO™ toolkit is installed.
The toolkit is designed to work with a configuration file in which all the parameters required for optimization are specified. These parameters are organized as a dictionary and stored in a JSON file. The JSON file may contain comments, which are supported by the jstyleson Python* package. Logically, all parameters are divided into three groups:
- Model parameters that are related to the model definition (e.g. model name, model path, etc.)
- Engine parameters that define parameters of the engine which is responsible for the model inference and data preparation used for optimization and evaluation (e.g. preprocessing parameters, dataset path, etc.)
- Compression parameters that are related to the optimization algorithm (e.g. algorithm name and specific parameters)
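As a rough illustration of the three groups, the sketch below parses a commented JSON configuration. It uses a small standard-library comment stripper as a stand-in for jstyleson, so it runs without extra packages. The top-level keys "model", "engine", and "compression", the "algorithms" key, and the YAML path are assumptions made for illustration, not values confirmed by this text.

```python
import json


def strip_json_comments(text: str) -> str:
    """Remove // and /* */ comments that appear outside of JSON strings."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(text[i + 1])  # keep the escaped character verbatim
                i += 1
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            while i < n and text[i] != "\n":  # skip to the end of the line
                i += 1
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2  # skip the whole block comment
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Hypothetical configuration: key names and paths are illustrative only.
CONFIG_TEXT = """
{
    // Model parameters
    "model": {"model_name": "MobileNetV2"},
    /* Engine parameters */
    "engine": {"config": "path/to/accuracy_checker.yml"},
    "compression": {"algorithms": []}  // optimization algorithms
}
"""

config = json.loads(strip_json_comments(CONFIG_TEXT))
print(sorted(config))
```

The stripper tracks whether it is inside a string, so a value that happens to contain `//` is left untouched.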
Model Parameters
This section contains only three parameters:
- "model_name" - string parameter that defines the model name, e.g. "MobileNetV2"
- "model" - string parameter that defines the path to the input model topology (.xml)
- "weights" - string parameter that defines the path to the input model weights (.bin)
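The three model parameters can be checked before a run with a few lines of Python. This is a minimal sketch rather than part of the tool: the helper `check_model_section` and the file paths are hypothetical, and the only facts taken from the text are the three key names and the .xml/.bin extensions.

```python
from pathlib import PurePath

REQUIRED_MODEL_KEYS = ("model_name", "model", "weights")


def check_model_section(section: dict) -> list:
    """Return a list of problems found; an empty list means the section looks fine."""
    problems = [f'missing "{key}"' for key in REQUIRED_MODEL_KEYS if key not in section]
    expected_suffix = {"model": ".xml", "weights": ".bin"}
    for key, suffix in expected_suffix.items():
        value = section.get(key)
        if value is not None and PurePath(value).suffix != suffix:
            problems.append(f'"{key}" should point to a {suffix} file')
    return problems


model_section = {
    "model_name": "MobileNetV2",
    "model": "models/mobilenet_v2.xml",    # hypothetical path
    "weights": "models/mobilenet_v2.bin",  # hypothetical path
}

print(check_model_section(model_section))
```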
Engine Parameters
There are two engine types in the Post-Training Optimization Tool:
- AccuracyChecker engine. It relies on the Deep Learning Accuracy Validation Framework (AccuracyChecker) to run inference on DL models and to work with datasets. The benefit of this mode is that you can compute accuracy if you have annotations and use the accuracy-aware family of algorithms. There are two options to define engine parameters in this mode:
  - Refer to an existing AccuracyChecker configuration file, which is a YAML file. It can be the file used for full-precision model validation. In this case, you should define only the "config" parameter, which contains the path to the AccuracyChecker configuration file.
  - Define all the required AccuracyChecker parameters directly in the JSON file. For more details, refer to the corresponding AccuracyChecker information and the examples of configuration files provided with the tool:
    - For the SE-ResNet-50 model: <POT_DIR>/configs/examples/quantization/classification/se_resnet50_pytorch_int8.json
    - For the SSD-MobileNet model: <POT_DIR>/configs/examples/quantization/object_detection/ssd_mobilenetv1_int8.json
- Simplified engine. It does not use the AccuracyChecker tool or annotations. To measure accuracy, you should implement your own pipeline similar to the sample, or run the evaluation script from the tool folder if your model and dataset are supported by AccuracyChecker. If you use the evaluation script, you should also define an AccuracyChecker config.
  - To run the simplified mode, define the engine section similarly to the example file mobilenetV2_tf_int8_simple_mode.json from the <POT_DIR>/configs/examples/quantization/classification/ directory.
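The two engine variants might be expressed as follows. Only the "config" parameter is named in the text above. The "type" and "data_source" keys used for the simplified mode are assumptions, so check mobilenetV2_tf_int8_simple_mode.json for the actual key names. All paths are hypothetical.

```python
def accuracy_checker_engine(config_path: str) -> dict:
    """Engine section that refers to an existing AccuracyChecker YAML file."""
    return {"config": config_path}


def simplified_engine(data_source: str) -> dict:
    """Engine section for the simplified mode.

    Assumption: the key names "type" and "data_source" are illustrative;
    the example file shipped with the tool is the authoritative reference.
    """
    return {"type": "simplified", "data_source": data_source}


ac_engine = accuracy_checker_engine("configs/mobilenet_v2.yml")  # hypothetical path
simple_engine = simplified_engine("data/calibration_images")     # hypothetical path

print(ac_engine)
print(simple_engine)
```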
Compression Parameters
This section defines the optimization algorithms and their parameters. For details about the parameters of a specific optimization algorithm, refer to the documentation for that algorithm.
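A compression section could be assembled along these lines. This is a sketch under stated assumptions: the key names "algorithms", "name", and "params", the algorithm name "DefaultQuantization", and the parameter "stat_subset_size" are illustrative and should be verified against the documentation of the algorithm you intend to use.

```python
import json


def make_compression_section(algorithm_name: str, params: dict) -> dict:
    """Build a compression section holding one algorithm and its parameters.

    Assumption: the keys "algorithms", "name", and "params" are illustrative.
    """
    return {"algorithms": [{"name": algorithm_name, "params": dict(params)}]}


# Hypothetical algorithm name and parameter, used only for illustration.
compression = make_compression_section("DefaultQuantization", {"stat_subset_size": 300})

print(json.dumps(compression, indent=4))
```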
Examples of the Configuration File
For a quick start, several examples of configuration files for popular DL models are provided. The configuration files are located in the <POT_DIR>/configs/examples folder. For details on how to run the Post-Training Optimization Tool with a sample configuration file, see the instructions.