Step 3. Main Transformations

Main transformations make up the majority of low precision transformations. They operate on dequantization operations. Main transformations include:

Let’s explore some main transformations on the example model. Original model:

Resulting model after main transformations:

Changes in the example model after main transformations:

All FakeQuantize operations (fakeQuantize1, fakeQuantize2 and fakeQuantize3) were decomposed into two parts:
- new FakeQuantize operations with different output intervals and output port precision, which replace the original ones,
- dequantization operations.
Dequantization operations were moved through the precision-preserved (concat1 and concat2) and quantized (convolution2) operations.
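
The decomposition can be illustrated with a minimal scalar sketch in plain Python. It is not the library implementation: the helper names (`fake_quantize`, `decompose`) and the choice of an unsigned `[0, levels - 1]` output interval are assumptions made for illustration. The sketch splits one FakeQuantize into a new FakeQuantize with a different output interval, followed by a dequantization step (subtract a zero point, then multiply by a scale).

```python
def fake_quantize(x, in_low, in_high, out_low, out_high, levels=256):
    """Scalar FakeQuantize: clamp to the input interval, snap to one of
    `levels` steps, then map the step onto the output interval."""
    if x <= min(in_low, in_high):
        return out_low
    if x > max(in_low, in_high):
        return out_high
    step = round((x - in_low) / (in_high - in_low) * (levels - 1))
    return step / (levels - 1) * (out_high - out_low) + out_low


def decompose(in_low, in_high, out_low, out_high, levels=256):
    """Hypothetical helper: split one FakeQuantize into
    (new FakeQuantize with output interval [0, levels - 1], dequantization)."""
    scale = (out_high - out_low) / (levels - 1)
    zero_point = -out_low / scale

    def quantize(x):
        # Same input interval, but a new output interval [0, levels - 1];
        # the results are integer-valued up to floating-point error.
        return fake_quantize(x, in_low, in_high, 0.0, float(levels - 1), levels)

    def dequantize(q):
        # Dequantization: subtract the zero point, then multiply by the scale.
        return (q - zero_point) * scale

    return quantize, dequantize


# Usage: the decomposed pair reproduces the original operation.
quantize, dequantize = decompose(-1.28, 1.27, -1.28, 1.27)
original = fake_quantize(0.5, -1.28, 1.27, -1.28, 1.27)
restored = dequantize(quantize(0.5))
print(abs(original - restored) < 1e-6)
```

Because the dequantization step is just a subtraction and a multiplication, it can be moved through operations that preserve precision, which is what the main transformations do with it.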

Note

The left branch (branch #1) does not require per-tensor quantization, so the fakeQuantize1 output interval is [0, 255]. However, the quantized convolution2 on the right branch (branch #2) requires per-tensor quantization. Therefore, the output intervals of all connected FakeQuantize operations (fakeQuantize1 and fakeQuantize2) are aligned so that quantization is per-tensor after the concatenation (concat2) operation.
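
Why alignment matters can be shown with a small sketch. It assumes each concatenated branch carries a single dequantization scale; the function names and the scale values are hypothetical and serve only as an illustration. When the branches use different scales, the dequantization after the concatenation has one scale per channel. When the intervals are aligned, all channels share one scale and the dequantization is per-tensor.

```python
def concat_dequantization_scales(branch_scales, channels_per_branch):
    """Hypothetical helper: per-channel dequantization scales after
    concatenating branches along the channel axis."""
    scales = []
    for scale, channels in zip(branch_scales, channels_per_branch):
        scales.extend([scale] * channels)
    return scales


def is_per_tensor(scales):
    """True when every channel shares the same dequantization scale."""
    return all(s == scales[0] for s in scales)


# Unaligned intervals: two branches with different scales (illustrative values).
unaligned = concat_dequantization_scales([0.01, 0.02], [2, 3])
print(is_per_tensor(unaligned))

# Aligned intervals: both branches share one scale.
aligned = concat_dequantization_scales([0.02, 0.02], [2, 3])
print(is_per_tensor(aligned))
```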