Training-time Optimization#

Training-time optimization offered by NNCF is based on model compression algorithms executed alongside the training process. This approach achieves a better balance between accuracy and performance than post-training quantization, because the model can learn to compensate for the errors introduced by compression. It also enables you to set a minimum acceptable accuracy for your optimized model, which determines how aggressively the model is compressed.

With a few lines of code, you can apply NNCF compression to a PyTorch or TensorFlow training script. Once the model is optimized, you can convert it to the OpenVINO IR format to get even better inference performance with OpenVINO Runtime. To optimize your model, you will need:

  • A PyTorch or TensorFlow floating-point model.

  • A training pipeline set up in the original framework (PyTorch or TensorFlow).

  • Training and validation datasets.

  • A JSON configuration file specifying which compression methods to use.
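As an example, a minimal JSON configuration enabling 8-bit quantization might look like the sketch below. The field names follow the NNCF configuration schema, but check the NNCF documentation for the exact options supported by your version; the input shape shown is a placeholder:

```json
{
    "input_info": {
        "sample_size": [1, 3, 224, 224]
    },
    "compression": {
        "algorithm": "quantization"
    }
}
```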

[Figure: NNCF workflow diagram]

Training-Time Compression Methods#

Quantization#

Uniform 8-bit quantization, the method officially supported by NNCF, converts all weights and activation values in a model from a high-precision format, such as 32-bit floating point, to a lower-precision format, such as 8-bit integer. During training, it inserts nodes into the model that simulate the effect of the lower precision. This way, the training algorithm treats quantization errors as part of the overall training loss and tries to minimize their impact.
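Conceptually, each inserted node performs a quantize-dequantize round trip, so the rest of the network sees the rounding error during training. The sketch below is an illustrative, framework-agnostic model of that operation for symmetric signed 8-bit quantization (the `scale` parameter is a stand-in for values NNCF calibrates automatically), not NNCF's actual implementation:

```python
def fake_quantize(x: float, scale: float, num_bits: int = 8) -> float:
    """Simulate low-precision storage: quantize to an integer grid,
    clamp to the representable range, then dequantize back to float."""
    qmin = -(2 ** (num_bits - 1))      # -128 for 8-bit
    qmax = 2 ** (num_bits - 1) - 1     # 127 for 8-bit
    q = max(qmin, min(qmax, round(x / scale)))
    return q * scale
```

For instance, with a scale of 0.01 the value 0.1234 becomes 0.12; the 0.0034 difference is the quantization error that the training loss can then account for, while values outside the representable range saturate at the clamping bounds.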

To learn more, see:

Filter pruning#

During fine-tuning, an importance criterion is used to find redundant convolutional filters that contribute little to the model's output. After fine-tuning, these filters are removed from the model.
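As an illustration, one common magnitude-based importance criterion scores each filter by the L2 norm of its weights and removes the lowest-scoring fraction. The plain-Python sketch below uses hypothetical helper names to show the idea; NNCF supports several criteria and applies them inside the training loop rather than as a one-off step:

```python
def filter_importance(filters: list[list[float]]) -> list[float]:
    # Magnitude criterion: L2 norm of each filter's weights.
    return [sum(w * w for w in f) ** 0.5 for f in filters]

def prune_filters(filters: list[list[float]], prune_ratio: float) -> list[list[float]]:
    # Drop the prune_ratio fraction of filters with the smallest norms,
    # keeping the remaining filters in their original order.
    scores = filter_importance(filters)
    n_prune = int(len(filters) * prune_ratio)
    order = sorted(range(len(filters)), key=lambda i: scores[i])
    keep = sorted(order[n_prune:])
    return [filters[i] for i in keep]
```

Because whole filters (and their output channels) are removed, the resulting model is genuinely smaller and faster, rather than merely sparse.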

For more information, see:

Experimental methods#

NNCF provides some state-of-the-art compression methods that are still in the experimental stages of development and are only recommended for expert developers. These include:

To learn more about these methods, see developer documentation of the NNCF repository.

Additional Resources#