Quantization Aware Training with NNCF, using TensorFlow Framework

This Jupyter notebook can be launched online to open an interactive environment in a browser window. You can also install it locally. Choose one of the following options:

  • Google Colab

  • GitHub

The goal of this notebook is to demonstrate how to use 8-bit quantization from the Neural Network Compression Framework (NNCF) to optimize a TensorFlow model for inference with the OpenVINO™ Toolkit. The optimization process contains the following steps:

  • Transforming the original FP32 model to INT8.

  • Using fine-tuning to restore the accuracy.

  • Exporting the optimized and original models to OpenVINO IR.

  • Measuring and comparing the performance of the models.

For more advanced usage, refer to these examples.

This tutorial uses the ResNet-18 model with the Imagenette dataset. Imagenette is a subset of 10 easily classified classes from the ImageNet dataset. Using a smaller model and dataset reduces training and download time.


Imports and Settings

Import NNCF and all auxiliary packages from your Python code. Set a name for the model, the input image size, the batch size, and the learning rate. Also, define the paths where the Frozen Graph and OpenVINO IR versions of the models will be stored.

NOTE: All NNCF logging messages below ERROR level (INFO and WARNING) are disabled to simplify the tutorial. For production use, it is recommended to enable logging by removing set_log_level(logging.ERROR).

import sys
import importlib.util

%pip install -q "openvino>=2023.1.0" "nncf>=2.5.0"
if sys.platform == "win32":
    if importlib.util.find_spec("tensorflow_datasets"):
        %pip uninstall -q -y tensorflow-datasets
    %pip install -q --upgrade "tfds-nightly"
else:
    %pip install -q "tensorflow-datasets>=4.8.0"
DEPRECATION: pytorch-lightning 1.6.5 has a non-standard dependency specifier torch>=1.8.*. pip 24.1 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of pytorch-lightning or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
Note: you may need to restart the kernel to use updated packages.
DEPRECATION: pytorch-lightning 1.6.5 has a non-standard dependency specifier torch>=1.8.*. pip 24.1 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of pytorch-lightning or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pytorch-lightning 1.6.5 requires protobuf<=3.20.1, but you have protobuf 3.20.3 which is incompatible.
Note: you may need to restart the kernel to use updated packages.
from pathlib import Path
import logging

import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras import layers
from tensorflow.keras import models

from nncf import NNCFConfig
from nncf.tensorflow.helpers.model_creation import create_compressed_model
from nncf.tensorflow.initialization import register_default_init_args
from nncf.common.logging.logger import set_log_level
import openvino as ov

set_log_level(logging.ERROR)

MODEL_DIR = Path("model")
OUTPUT_DIR = Path("output")
MODEL_DIR.mkdir(exist_ok=True)
OUTPUT_DIR.mkdir(exist_ok=True)

BASE_MODEL_NAME = "ResNet-18"

fp32_h5_path = Path(MODEL_DIR / (BASE_MODEL_NAME + "_fp32")).with_suffix(".h5")
fp32_ir_path = Path(OUTPUT_DIR / "saved_model").with_suffix(".xml")
int8_pb_path = Path(OUTPUT_DIR / (BASE_MODEL_NAME + "_int8")).with_suffix(".pb")
int8_ir_path = int8_pb_path.with_suffix(".xml")

BATCH_SIZE = 128
IMG_SIZE = (64, 64)  # Input image size used in this tutorial
NUM_CLASSES = 10  # For Imagenette dataset

LR = 1e-5

MEAN_RGB = (0.485 * 255, 0.456 * 255, 0.406 * 255)  # From Imagenet dataset
STDDEV_RGB = (0.229 * 255, 0.224 * 255, 0.225 * 255)  # From Imagenet dataset

fp32_pth_url = "https://storage.openvinotoolkit.org/repositories/nncf/openvino_notebook_ckpts/305_resnet18_imagenette_fp32_v1.h5"
_ = tf.keras.utils.get_file(fp32_h5_path.resolve(), fp32_pth_url)
print(f'Absolute path where the model weights are saved:\n {fp32_h5_path.resolve()}')
2024-03-13 01:11:54.839379: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-03-13 01:11:54.874069: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-13 01:11:55.482764: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
INFO:nncf:NNCF initialized successfully. Supported frameworks detected: torch, tensorflow, onnx, openvino
Downloading data from https://storage.openvinotoolkit.org/repositories/nncf/openvino_notebook_ckpts/305_resnet18_imagenette_fp32_v1.h5
134604992/134604992 [==============================] - 3s 0us/step

Absolute path where the model weights are saved:
 /opt/home/k8sworker/ci-ai/cibuilds/ov-notebook/OVNotebookOps-632/.workspace/scm/ov-notebook/notebooks/305-tensorflow-quantization-aware-training/model/ResNet-18_fp32.h5

Dataset Preprocessing

Download and prepare the Imagenette 160px dataset.

  • Number of classes: 10

  • Download size: 94.18 MiB

| Split        | Examples |
|--------------|----------|
| 'train'      | 12,894   |
| 'validation' | 500      |
datasets, datasets_info = tfds.load('imagenette/160px', shuffle_files=True, as_supervised=True, with_info=True,
                                    read_config=tfds.ReadConfig(shuffle_seed=0))
train_dataset, validation_dataset = datasets['train'], datasets['validation']
fig = tfds.show_examples(train_dataset, datasets_info)
2024-03-13 01:12:03.781864: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:266] failed call to cuInit: CUDA_ERROR_COMPAT_NOT_SUPPORTED_ON_DEVICE: forward compatibility was attempted on non supported HW
2024-03-13 01:12:03.781896: I tensorflow/compiler/xla/stream_executor/cuda/cuda_diagnostics.cc:168] retrieving CUDA diagnostic information for host: iotg-dev-workstation-07
2024-03-13 01:12:03.781901: I tensorflow/compiler/xla/stream_executor/cuda/cuda_diagnostics.cc:175] hostname: iotg-dev-workstation-07
2024-03-13 01:12:03.782051: I tensorflow/compiler/xla/stream_executor/cuda/cuda_diagnostics.cc:199] libcuda reported version is: 470.223.2
2024-03-13 01:12:03.782066: I tensorflow/compiler/xla/stream_executor/cuda/cuda_diagnostics.cc:203] kernel reported version is: 470.182.3
2024-03-13 01:12:03.782070: E tensorflow/compiler/xla/stream_executor/cuda/cuda_diagnostics.cc:312] kernel version 470.182.3 does not match DSO version 470.223.2 -- cannot find working devices in this configuration
2024-03-13 01:12:03.899468: I tensorflow/core/common_runtime/executor.cc:1197] [/device:CPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INVALID_ARGUMENT: You must feed a value for placeholder tensor 'Placeholder/_4' with dtype int64 and shape [1]
     [[{{node Placeholder/_4}}]]
2024-03-13 01:12:03.899790: I tensorflow/core/common_runtime/executor.cc:1197] [/device:CPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INVALID_ARGUMENT: You must feed a value for placeholder tensor 'Placeholder/_1' with dtype string and shape [1]
     [[{{node Placeholder/_1}}]]
2024-03-13 01:12:03.971431: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to dataset.cache().take(k).repeat(). You should use dataset.take(k).cache().repeat() instead.
../_images/305-tensorflow-quantization-aware-training-with-output_6_1.png
def preprocessing(image, label):
    image = tf.image.resize(image, IMG_SIZE)
    image = image - MEAN_RGB
    image = image / STDDEV_RGB
    label = tf.one_hot(label, NUM_CLASSES)
    return image, label


train_dataset = (train_dataset.map(preprocessing, num_parallel_calls=tf.data.experimental.AUTOTUNE)
                              .batch(BATCH_SIZE)
                              .prefetch(tf.data.experimental.AUTOTUNE))

validation_dataset = (validation_dataset.map(preprocessing, num_parallel_calls=tf.data.experimental.AUTOTUNE)
                                        .batch(BATCH_SIZE)
                                        .prefetch(tf.data.experimental.AUTOTUNE))
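
As a quick sanity check (not part of the original notebook), you can pull one batch from the pipeline and confirm that the shapes match what the model defined below expects; the batch here is consumed only for inspection.

# Hypothetical sanity check: inspect one preprocessed batch.
images, labels = next(iter(validation_dataset))
print("images:", images.shape)  # expected (BATCH_SIZE, 64, 64, 3); the last batch may be smaller
print("labels:", labels.shape)  # expected (BATCH_SIZE, NUM_CLASSES) after one-hot encoding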

Define a Floating-Point Model

def residual_conv_block(filters, stage, block, strides=(1, 1), cut='pre'):
    def layer(input_tensor):
        x = layers.BatchNormalization(epsilon=2e-5)(input_tensor)
        x = layers.Activation('relu')(x)

        # Defining shortcut connection.
        if cut == 'pre':
            shortcut = input_tensor
        elif cut == 'post':
            shortcut = layers.Conv2D(filters, (1, 1), strides=strides, kernel_initializer='he_uniform',
                                     use_bias=False)(x)

        # Continue with convolution layers.
        x = layers.ZeroPadding2D(padding=(1, 1))(x)
        x = layers.Conv2D(filters, (3, 3), strides=strides, kernel_initializer='he_uniform', use_bias=False)(x)

        x = layers.BatchNormalization(epsilon=2e-5)(x)
        x = layers.Activation('relu')(x)
        x = layers.ZeroPadding2D(padding=(1, 1))(x)
        x = layers.Conv2D(filters, (3, 3), kernel_initializer='he_uniform', use_bias=False)(x)

        # Add residual connection.
        x = layers.Add()([x, shortcut])
        return x

    return layer


def ResNet18(input_shape=None):
    """Instantiates the ResNet18 architecture."""
    img_input = layers.Input(shape=input_shape, name='data')

    # ResNet18 bottom
    x = layers.BatchNormalization(epsilon=2e-5, scale=False)(img_input)
    x = layers.ZeroPadding2D(padding=(3, 3))(x)
    x = layers.Conv2D(64, (7, 7), strides=(2, 2), kernel_initializer='he_uniform', use_bias=False)(x)
    x = layers.BatchNormalization(epsilon=2e-5)(x)
    x = layers.Activation('relu')(x)
    x = layers.ZeroPadding2D(padding=(1, 1))(x)
    x = layers.MaxPooling2D((3, 3), strides=(2, 2), padding='valid')(x)

    # ResNet18 body
    repetitions = (2, 2, 2, 2)
    for stage, rep in enumerate(repetitions):
        for block in range(rep):
            filters = 64 * (2 ** stage)
            if block == 0 and stage == 0:
                x = residual_conv_block(filters, stage, block, strides=(1, 1), cut='post')(x)
            elif block == 0:
                x = residual_conv_block(filters, stage, block, strides=(2, 2), cut='post')(x)
            else:
                x = residual_conv_block(filters, stage, block, strides=(1, 1), cut='pre')(x)
    x = layers.BatchNormalization(epsilon=2e-5)(x)
    x = layers.Activation('relu')(x)

    # ResNet18 top
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(NUM_CLASSES)(x)
    x = layers.Activation('softmax')(x)

    # Create the model.
    model = models.Model(img_input, x)

    return model
IMG_SHAPE = IMG_SIZE + (3,)
fp32_model = ResNet18(input_shape=IMG_SHAPE)

Pre-train a Floating-Point Model

Using NNCF for model compression assumes that the user has a pre-trained model and a training pipeline.

NOTE: For the sake of simplicity of the tutorial, it is recommended to skip FP32 model training and load the weights that are provided.
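
If you prefer to train the FP32 model yourself instead of loading the provided weights, a minimal sketch could look like the following; the optimizer choice and the number of epochs here are illustrative assumptions, not the settings used to produce the provided checkpoint.

# Optional sketch: train the FP32 model from scratch instead of loading the checkpoint.
fp32_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),  # assumed starting learning rate
    loss=tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1),
    metrics=[tf.keras.metrics.CategoricalAccuracy(name='acc@1')]
)
fp32_model.fit(train_dataset, validation_data=validation_dataset, epochs=10)  # epochs are illustrative
fp32_model.save_weights(fp32_h5_path)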

# Load the floating-point weights.
fp32_model.load_weights(fp32_h5_path)

# Compile the floating-point model.
fp32_model.compile(
    loss=tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1),
    metrics=[tf.keras.metrics.CategoricalAccuracy(name='acc@1')]
)

# Validate the floating-point model.
test_loss, acc_fp32 = fp32_model.evaluate(
    validation_dataset,
    callbacks=tf.keras.callbacks.ProgbarLogger(stateful_metrics=['acc@1'])
)
print(f"\nAccuracy of FP32 model: {acc_fp32:.3f}")
2024-03-13 01:12:04.910372: I tensorflow/core/common_runtime/executor.cc:1197] [/device:CPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INVALID_ARGUMENT: You must feed a value for placeholder tensor 'Placeholder/_4' with dtype int64 and shape [1]
     [[{{node Placeholder/_4}}]]
2024-03-13 01:12:04.910741: I tensorflow/core/common_runtime/executor.cc:1197] [/device:CPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INVALID_ARGUMENT: You must feed a value for placeholder tensor 'Placeholder/_4' with dtype int64 and shape [1]
     [[{{node Placeholder/_4}}]]
4/4 [==============================] - 1s 254ms/sample - loss: 0.9807 - acc@1: 0.8220

Accuracy of FP32 model: 0.822

Create and Initialize Quantization

NNCF enables compression-aware training by integrating into regular training pipelines. The framework is designed so that modifications to your original training code are minor. Quantization is the simplest scenario and requires only 3 modifications.

  1. Configure NNCF parameters to specify compression.

nncf_config_dict = {
    "input_info": {"sample_size": [1, 3] + list(IMG_SIZE)},
    "log_dir": str(OUTPUT_DIR),  # The log directory for NNCF-specific logging outputs.
    "compression": {
        "algorithm": "quantization",  # Specify the algorithm here.
    },
}
nncf_config = NNCFConfig.from_dict(nncf_config_dict)
  2. Provide a data loader to initialize the quantization ranges and to determine, from the collected statistics, which activations should be signed or unsigned, using a given number of samples.

nncf_config = register_default_init_args(nncf_config=nncf_config,
                                         data_loader=train_dataset,
                                         batch_size=BATCH_SIZE)
  3. Create a wrapped model, ready for compression fine-tuning, from the pre-trained FP32 model and the configuration object.

compression_ctrl, int8_model = create_compressed_model(fp32_model, nncf_config)
2024-03-13 01:12:07.816084: I tensorflow/core/common_runtime/executor.cc:1197] [/device:CPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INVALID_ARGUMENT: You must feed a value for placeholder tensor 'Placeholder/_4' with dtype int64 and shape [1]
     [[{{node Placeholder/_4}}]]
2024-03-13 01:12:07.816469: I tensorflow/core/common_runtime/executor.cc:1197] [/device:CPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INVALID_ARGUMENT: You must feed a value for placeholder tensor 'Placeholder/_0' with dtype string and shape [1]
     [[{{node Placeholder/_0}}]]
2024-03-13 01:12:08.749313: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to dataset.cache().take(k).repeat(). You should use dataset.take(k).cache().repeat() instead.
2024-03-13 01:12:09.358441: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to dataset.cache().take(k).repeat(). You should use dataset.take(k).cache().repeat() instead.
2024-03-13 01:12:17.326475: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to dataset.cache().take(k).repeat(). You should use dataset.take(k).cache().repeat() instead.
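
After wrapping, you can optionally inspect what NNCF has inserted into the model. The call below follows the common NNCF controller API; treat it as a hedged sketch, since the exact report format depends on the NNCF version.

# Optional sketch: print a summary of the applied quantization (format depends on the NNCF version).
print(compression_ctrl.statistics().to_str())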

Evaluate the new model on the validation set after quantization initialization. For a simple case like the one demonstrated here, the accuracy should not be far from the accuracy of the floating-point FP32 model.

# Compile the INT8 model.
int8_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=LR),
    loss=tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1),
    metrics=[tf.keras.metrics.CategoricalAccuracy(name='acc@1')]
)

# Validate the INT8 model.
test_loss, test_acc = int8_model.evaluate(
    validation_dataset,
    callbacks=tf.keras.callbacks.ProgbarLogger(stateful_metrics=['acc@1'])
)
4/4 [==============================] - 1s 301ms/sample - loss: 0.9766 - acc@1: 0.8120

Fine-tune the Compressed Model

At this step, a regular fine-tuning process is applied to further improve the accuracy of the quantized model. Normally, several epochs of tuning with a small learning rate are required, the same learning rate that is usually used at the end of training the original model. No other changes to the training pipeline are required. Here is a simple example.

print(f"\nAccuracy of INT8 model after initialization: {test_acc:.3f}")

# Train the INT8 model.
int8_model.fit(train_dataset, epochs=2)

# Validate the INT8 model.
test_loss, acc_int8 = int8_model.evaluate(
    validation_dataset, callbacks=tf.keras.callbacks.ProgbarLogger(stateful_metrics=['acc@1']))
print(f"\nAccuracy of INT8 model after fine-tuning: {acc_int8:.3f}")
print(
    f"\nAccuracy drop of tuned INT8 model over pre-trained FP32 model: {acc_fp32 - acc_int8:.3f}")
Accuracy of INT8 model after initialization: 0.812
Epoch 1/2
101/101 [==============================] - 49s 415ms/step - loss: 0.7134 - acc@1: 0.9299

Epoch 2/2
101/101 [==============================] - 42s 417ms/step - loss: 0.6807 - acc@1: 0.9489

4/4 [==============================] - 1s 146ms/sample - loss: 0.9760 - acc@1: 0.8160

Accuracy of INT8 model after fine-tuning: 0.816

Accuracy drop of tuned INT8 model over pre-trained FP32 model: 0.006

Export Models to OpenVINO Intermediate Representation (IR)

Use the model conversion Python API to convert the models to OpenVINO IR.

For more information about model conversion, see this page.

Executing this command may take a while.

model_ir_fp32 = ov.convert_model(fp32_model)
WARNING:tensorflow:Please fix your imports. Module tensorflow.python.training.tracking.base has been moved to tensorflow.python.trackable.base. The old module will be deleted in version 2.11.
WARNING:tensorflow:Please fix your imports. Module tensorflow.python.training.tracking.base has been moved to tensorflow.python.trackable.base. The old module will be deleted in version 2.11.
model_ir_int8 = ov.convert_model(int8_model)
ov.save_model(model_ir_fp32, fp32_ir_path, compress_to_fp16=False)
ov.save_model(model_ir_int8, int8_ir_path, compress_to_fp16=False)
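
The settings above also define int8_pb_path for a Frozen Graph export, which is not used in the cells shown here. As a hedged sketch, the NNCF compression controller can typically export the compressed graph directly; the export_model call and the 'frozen_graph' format name below are assumptions and may differ between NNCF versions.

# Optional sketch (assumed NNCF API): export the compressed model as a TensorFlow frozen graph.
compression_ctrl.export_model(str(int8_pb_path), save_format='frozen_graph')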

Benchmark Model Performance by Computing Inference Time

Finally, measure the inference performance of the FP32 and INT8 models, using Benchmark Tool - an inference performance measurement tool in OpenVINO. By default, Benchmark Tool runs inference for 60 seconds in asynchronous mode on CPU. It returns inference speed as latency (milliseconds per image) and throughput (frames per second) values.

NOTE: This notebook runs benchmark_app for 15 seconds to give a quick indication of performance. For more accurate performance, it is recommended to run benchmark_app in a terminal/command prompt after closing other applications. Run benchmark_app -m model.xml -d CPU to benchmark async inference on CPU for one minute. Change CPU to GPU to benchmark on GPU. Run benchmark_app --help to see an overview of all command-line options.

Please select a benchmarking device using the dropdown list:

import ipywidgets as widgets

# Initialize OpenVINO runtime
core = ov.Core()
device = widgets.Dropdown(
    options=core.available_devices,
    value='CPU',
    description='Device:',
    disabled=False,
)

device
Dropdown(description='Device:', options=('CPU',), value='CPU')
def parse_benchmark_output(benchmark_output):
    parsed_output = [line for line in benchmark_output if 'FPS' in line]
    print(*parsed_output, sep='\n')


print('Benchmark FP32 model (IR)')
benchmark_output = ! benchmark_app -m $fp32_ir_path -d $device.value -api async -t 15 -shape [1,64,64,3]
parse_benchmark_output(benchmark_output)

print('\nBenchmark INT8 model (IR)')
benchmark_output = ! benchmark_app -m $int8_ir_path -d $device.value -api async -t 15 -shape [1,64,64,3]
parse_benchmark_output(benchmark_output)
Benchmark FP32 model (IR)
[ INFO ] Throughput:   2840.25 FPS

Benchmark INT8 model (IR)
[ INFO ] Throughput:   11202.29 FPS

Show Device Information for reference.

core = ov.Core()
core.get_property(device.value, "FULL_DEVICE_NAME")
'Intel(R) Core(TM) i9-10920X CPU @ 3.50GHz'
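
As a final, optional check that is not part of the original benchmark comparison, you can run the INT8 IR through the OpenVINO runtime on one validation batch and compare the predictions against the labels. This is a small sketch using the variables defined above.

import numpy as np

# Compile the INT8 IR for the selected device and run it on one preprocessed validation batch.
compiled_int8 = core.compile_model(str(int8_ir_path), device.value)
output_layer = compiled_int8.output(0)

images, labels = next(iter(validation_dataset))
probs = compiled_int8(images.numpy())[output_layer]
batch_accuracy = np.mean(np.argmax(probs, axis=1) == np.argmax(labels.numpy(), axis=1))
print(f"Accuracy of the INT8 IR on one validation batch: {batch_accuracy:.3f}")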