Post-training Optimization Tool API Examples

The Post-training Optimization Tool includes multiple examples that demonstrate how to use its API to optimize deep learning (DL) models. All available examples can be found on GitHub.

The following examples demonstrate the implementation of Engine, Metric, and DataLoader interfaces for various use cases:

  1. Quantizing Image Classification Model

    • Uses a single MobileNetV2 model from TensorFlow

    • Implements DataLoader to load .JPEG images and annotations from the ImageNet database

    • Implements Metric interface to calculate the top-1 accuracy metric

    • Uses DefaultQuantization algorithm to quantize the model

  2. Quantizing Object Detection Model with Accuracy Control

    • Uses a single MobileNetV1 FPN model from TensorFlow

    • Implements DataLoader to load images from the COCO database

    • Implements Metric interface to calculate the mAP@[.5:.95] metric

    • Uses AccuracyAwareQuantization algorithm to quantize the model

  3. Quantizing Semantic Segmentation Model

    • Uses a single DeepLabV3 model from TensorFlow

    • Implements DataLoader to load .JPEG images and annotations from the Pascal VOC 2012 database

    • Implements Metric interface to calculate the Mean Intersection Over Union metric

    • Uses DefaultQuantization algorithm to quantize the model

  4. Quantizing 3D Segmentation Model

    • Uses a single Brain Tumor Segmentation model from PyTorch

    • Implements DataLoader to load images in NIfTI format from the Medical Segmentation Decathlon BRATS 2017 database

    • Implements Metric interface to calculate the Dice Index metric

    • Demonstrates how to use image metadata obtained during data loading to post-process the raw model output

    • Uses DefaultQuantization algorithm to quantize the model

  5. Quantizing Cascaded Model

    • Uses a cascaded (composite) MTCNN model from Caffe that consists of three separate models in the OpenVINO Intermediate Representation (IR)

    • Implements DataLoader to load .jpg images from the WIDER FACE database

    • Implements Metric interface to calculate the Recall metric

    • Implements an Engine class inherited from IEEngine to create a staged pipeline that sequentially executes the three stages of the MTCNN model, each represented by a separate model in IR. It uses engine helpers to set the model in the OpenVINO Inference Engine and to process the raw model output so that statistics are collected correctly

    • Uses DefaultQuantization algorithm to quantize the model

  6. Quantizing for GNA Device

    • Uses models from Kaldi

    • Implements DataLoader to load data in .ark format

    • Uses DefaultQuantization algorithm to quantize the model
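
To make the roles of the DataLoader and Metric interfaces listed above more concrete, the following is a minimal, dependency-free sketch of a data loader and a top-1 accuracy metric. It does not use the tool's API: the class names InMemoryDataLoader and Top1Accuracy, the evaluate helper, and the (annotation, data) item layout are all hypothetical and chosen only for illustration. The real examples on GitHub implement the tool's own base classes instead.

```python
# Illustrative sketch only: these classes mirror the general shape of a data
# loader and a metric. They are hypothetical and do NOT inherit from the
# tool's base classes.
from typing import Callable, List, Sequence, Tuple


class InMemoryDataLoader:
    """Hypothetical loader that returns (annotation, data) pairs by index."""

    def __init__(self, samples: Sequence[Sequence[float]], labels: Sequence[int]):
        if len(samples) != len(labels):
            raise ValueError("samples and labels must have the same length")
        self._samples = samples
        self._labels = labels

    def __len__(self) -> int:
        return len(self._samples)

    def __getitem__(self, index: int) -> Tuple[Tuple[int, int], Sequence[float]]:
        if index >= len(self):
            raise IndexError(index)
        # Annotation is (sample index, label); data is the raw sample.
        return (index, self._labels[index]), self._samples[index]


class Top1Accuracy:
    """Hypothetical metric: share of samples whose best score matches the label."""

    def __init__(self) -> None:
        self._matches: List[int] = []

    def update(self, scores: Sequence[float], label: int) -> None:
        # Index of the highest score is the predicted class.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        self._matches.append(int(predicted == label))

    @property
    def avg_value(self) -> float:
        # Average over all samples seen since the last reset.
        return sum(self._matches) / len(self._matches) if self._matches else 0.0

    def reset(self) -> None:
        self._matches = []


def evaluate(model: Callable, loader: InMemoryDataLoader, metric: Top1Accuracy) -> float:
    """Run the model over every sample and return the accumulated metric."""
    metric.reset()
    for i in range(len(loader)):
        (_, label), data = loader[i]
        metric.update(model(data), label)
    return metric.avg_value


# Usage with a stand-in "model" that returns its input as class scores.
loader = InMemoryDataLoader([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]], [1, 0, 0])
accuracy = evaluate(lambda scores: scores, loader, Top1Accuracy())
print(f"top-1 accuracy: {accuracy:.3f}")
```

Keeping data loading and metric calculation in separate classes, as the examples do, lets the same quantization flow be reused with a different dataset or a different quality measure.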

After each example above is executed, the quantized model is placed in the optimized folder. Accuracy validation of the quantized model is performed right after quantization.
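
As a quick check after running an example, the output folder can be inspected for the saved model files. The sketch below assumes the quantized model is stored as an IR pair (an .xml file with a matching .bin file); the find_ir_models helper is hypothetical, and the exact layout inside the optimized folder is not specified here, so the search is recursive.

```python
# Illustrative sketch: list IR model pairs under the output folder.
# Assumption: the quantized model is saved as an .xml file with a matching
# .bin file somewhere below the folder; the exact layout is not assumed.
from pathlib import Path
from typing import List, Tuple


def find_ir_models(folder: str = "optimized") -> List[Tuple[Path, Path]]:
    """Return (.xml, .bin) path pairs found anywhere under the folder."""
    root = Path(folder)
    pairs: List[Tuple[Path, Path]] = []
    if not root.is_dir():
        return pairs  # Nothing to report if the folder does not exist yet.
    for xml_path in sorted(root.rglob("*.xml")):
        bin_path = xml_path.with_suffix(".bin")
        if bin_path.is_file():
            pairs.append((xml_path, bin_path))
    return pairs


for xml_file, bin_file in find_ir_models():
    print(f"model: {xml_file}  weights: {bin_file}")
```

An empty result most likely means the example has not finished or was started from a different working directory.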