This is a tunable variant of MinMaxQuantization and is usually used as part of a pipeline with auxiliary algorithms. The default recommended pipeline consists of the same algorithms as **DefaultQuantization** with MinMaxQuantization replaced with this one.
The algorithm accepts the following parameters:
- `tuning_scope` - determines which quantization configurations are returned to the optimizer as viable options. It is a list containing any of the following values:
    - `bits`, `mode`, `granularity`, `range_estimator` - used for the quantization configuration derivation described below,
    - `layer` - adds to the possible quantization configurations the option that a specific layer will not be quantized.

Quantization configuration derivation starts from a list of all quantization configurations supported by the target hardware. This list is then filtered using the base configuration (taken either from `preset` or from the previous best result) and `tuning_scope`: only those options are kept that differ from the base configuration solely in the values of the variables specified in `tuning_scope`.
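
The filtering step can be illustrated with a minimal Python sketch. This is not the library's implementation: the option values in `HARDWARE_OPTIONS`, the function name `derive_configurations`, and the use of `None` to represent an unquantized layer are all assumptions made for illustration. The real list of supported configurations comes from the target hardware description.

```python
from itertools import product

# Hypothetical option space (assumption); the real one is defined by the target hardware.
HARDWARE_OPTIONS = {
    "bits": [4, 8],
    "mode": ["symmetric", "asymmetric"],
    "granularity": ["pertensor", "perchannel"],
    "range_estimator": ["max", "quantile"],
}


def derive_configurations(base, tuning_scope, hardware_options=HARDWARE_OPTIONS):
    """Return the supported configurations that differ from `base`
    only in the fields listed in `tuning_scope` (hypothetical helper)."""
    fields = list(hardware_options)
    # Enumerate every configuration the (assumed) hardware supports.
    all_configs = [
        dict(zip(fields, values)) for values in product(*hardware_options.values())
    ]
    # Fields outside the tuning scope must keep the base configuration's value.
    fixed = [f for f in fields if f not in tuning_scope]
    candidates = [c for c in all_configs if all(c[f] == base[f] for f in fixed)]
    # "layer" adds the option of not quantizing the layer, modelled here as None.
    if "layer" in tuning_scope:
        candidates.append(None)
    return candidates


base = {
    "bits": 8,
    "mode": "symmetric",
    "granularity": "pertensor",
    "range_estimator": "max",
}
candidates = derive_configurations(base, ["bits", "mode"])
candidates_with_layer = derive_configurations(base, ["bits", "mode", "layer"])
```

In this sketch the base configuration itself remains among the candidates, because it differs from itself in no field at all. Fields outside the scope (`granularity` and `range_estimator` here) are identical across every returned candidate.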

Whether `preset` or the previous best result is used as the base configuration depends on the optimizer's `trials_load_method`:
- `cold_start` - `preset` determines the base quantization configuration,
- `fine_tune` - the `preset` option is ignored, and the quantization configuration that achieved the best result in the previous run is used as the base quantization configuration.
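
The choice of base configuration can be sketched as a small Python function. The function name `select_base_configuration` and its arguments are hypothetical and are not part of the tool's API. Only the two `trials_load_method` values described above are handled, and the error raised when no previous result exists is an assumption about reasonable behaviour.

```python
def select_base_configuration(trials_load_method, preset_config, previous_best_config=None):
    """Pick the base quantization configuration (hypothetical helper)."""
    if trials_load_method == "cold_start":
        # The preset determines the base configuration.
        return preset_config
    if trials_load_method == "fine_tune":
        # The preset is ignored; the best configuration from the previous run is used.
        if previous_best_config is None:
            # Assumption: fine-tuning without a previous result is treated as an error.
            raise ValueError("fine_tune requires a previous best configuration")
        return previous_best_config
    raise ValueError(f"unsupported trials_load_method: {trials_load_method!r}")


preset_config = {"bits": 8, "mode": "symmetric"}
previous_best = {"bits": 8, "mode": "asymmetric"}

cold_base = select_base_configuration("cold_start", preset_config, previous_best)
fine_base = select_base_configuration("fine_tune", preset_config, previous_best)
```

Here `cold_base` is the preset configuration and `fine_base` is the previous best one, so the subsequent filtering by `tuning_scope` starts from a different point in each case.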