Step 3. Main Transformations
Main transformations make up the majority of low precision transformations. They operate on dequantization operations. Main transformations include:
Let’s explore some main transformations on the example model. Original model:
Resulting model after main transformations:
Changes in the example model after main transformations:
- All FakeQuantize operations (fakeQuantize1, fakeQuantize2 and fakeQuantize3) were decomposed:
  - original FakeQuantize operations were replaced with new operations with other output intervals and output port precision,
  - dequantization operations.
- Dequantization operations were moved via precision preserved (concat1 and concat2) and quantized (convolution2) operations.
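The FakeQuantize decomposition described above can be checked numerically. The pure-Python sketch below is an illustration only, not the library's implementation: the helper names (fake_quantize, decompose), the [-1, 1] interval and the unsigned 8-bit target are assumptions chosen for the example. It assumes the usual element-wise FakeQuantize formula with 256 levels, and shows that a FakeQuantize with a [0, 255] output interval followed by a subtract-and-multiply dequantization reproduces the output of the original operation.

```python
def fake_quantize(x, in_low, in_high, out_low, out_high, levels=256):
    """Element-wise FakeQuantize applied to a single float value."""
    if x <= in_low:
        return out_low
    if x > in_high:
        return out_high
    # Snap the value to one of `levels` evenly spaced steps of the input interval.
    q = round((x - in_low) / (in_high - in_low) * (levels - 1))
    # Map the step index onto the output interval.
    return q / (levels - 1) * (out_high - out_low) + out_low


def decompose(in_low, in_high, out_low, out_high, levels=256):
    """Split one FakeQuantize into a quantize step and a dequantize step.

    Hypothetical helper: the quantize step is a FakeQuantize whose output
    interval is [0, levels - 1]; the dequantize step is Subtract + Multiply.
    """
    scale = (out_high - out_low) / (levels - 1)  # Multiply constant
    shift = -out_low / scale                     # Subtract constant

    def quantize(x):
        # New FakeQuantize with the integer output interval [0, levels - 1].
        value = fake_quantize(x, in_low, in_high, 0.0, float(levels - 1), levels)
        return int(round(value))

    def dequantize(q):
        # Convert to float, subtract the shift, multiply by the scale.
        return (float(q) - shift) * scale

    return quantize, dequantize


# Assumed example interval: the original FakeQuantize maps [-1, 1] to [-1, 1].
quantize, dequantize = decompose(-1.0, 1.0, -1.0, 1.0)
for x in (-1.5, -0.3, 0.0, 0.42, 2.0):
    original = fake_quantize(x, -1.0, 1.0, -1.0, 1.0)
    restored = dequantize(quantize(x))
    print(x, quantize(x), original, restored)
```

Because the quantize step emits integers in [0, 255], the operations that follow it can run in low precision, while the dequantization step carries the information needed to return to the original value range.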
Note
The left branch (branch #1) does not require per-tensor quantization. As a result, the fakeQuantize1 output interval is [0, 255]. However, the quantized convolution2 requires per-tensor quantization on the right branch (branch #2). Therefore, the intervals of all connected FakeQuantize operations (fakeQuantize1 and fakeQuantize2) are aligned to have per-tensor quantization after the concatenation (concat2) operation.
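The effect of interval alignment can be illustrated with a small calculation. The sketch below is a simplified model under stated assumptions, not the library's algorithm: the function names and the two float intervals are hypothetical, and only the unsigned case with a zero lower bound is handled. It contrasts quantizing each branch independently, which yields a different dequantization scale per branch, with aligning the intervals so that one scale is valid for the whole concatenated tensor.

```python
def independent_scales(intervals, q_max=255.0):
    """Dequantization scale of each branch when every branch uses the full [0, q_max] range."""
    return [(high - low) / q_max for low, high in intervals]


def align_intervals(intervals, q_max=255.0):
    """Choose one shared scale and express each interval in quantized units.

    Simplified model: every interval is assumed to start at 0, so no shift is needed.
    """
    shared_high = max(high for _, high in intervals)
    scale = shared_high / q_max  # single per-tensor dequantization scale
    quantized = [(low / scale, high / scale) for low, high in intervals]
    return quantized, scale


# Hypothetical float output intervals of two FakeQuantize operations feeding one concatenation.
intervals = [(0.0, 2.55), (0.0, 5.1)]

# Without alignment the scales differ, so the concatenated tensor needs per-channel dequantization.
print(independent_scales(intervals))

# With alignment both branches share one scale; the narrower branch uses only part of [0, 255].
aligned, shared_scale = align_intervals(intervals)
print(aligned, shared_scale)
```

In this model the branch with the narrower interval gives up part of its quantized range, and in exchange a single dequantization scale applies after the concatenation.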