INT8 vs FP32 Comparison on Select Networks and Platforms

The table below shows the throughput speed-up gained by switching from the FP32 representation of an OpenVINO™ supported model to its INT8 representation.

Throughput speed-up, FP16-INT8 vs FP32

OpenVINO benchmark model name                     Dataset                Intel® Core™   Intel® Core™   Intel® Xeon®   Intel® Xeon®
                                                                         i7-8700T       i7-1185G7      W-1290P        Platinum 8270
bert-large-uncased-whole-word-masking-squad-0001  SQuAD                  1.6            3.1            1.5            2.5
brain-tumor-segmentation-0001-MXNET               BraTS                  1.6            2.0            1.8            1.8
deeplabv3-TF                                      VOC 2012 Segmentation  1.9            3.0            2.8            3.1
densenet-121-TF                                   ImageNet               1.8            3.5            1.9            3.8
facenet-20180408-102900-TF                        LFW                    2.1            3.6            2.2            3.7
faster_rcnn_resnet50_coco-TF                      MS COCO                1.9            3.7            2.0            3.4
inception-v3-TF                                   ImageNet               1.9            3.8            2.0            4.1
mobilenet-ssd-CF                                  VOC2012                1.6            3.1            1.9            3.6
mobilenet-v2-1.0-224-TF                           ImageNet               1.5            2.4            1.8            3.9
mobilenet-v2-pytorch                              ImageNet               1.7            2.4            1.9            4.0
resnet-18-pytorch                                 ImageNet               1.9            3.7            2.1            4.2
resnet-50-pytorch                                 ImageNet               1.9            3.6            2.0            3.9
resnet-50-TF                                      ImageNet               1.9            3.6            2.0            3.9
squeezenet1.1-CF                                  ImageNet               1.7            3.2            1.8            3.4
ssd_mobilenet_v1_coco-tf                          VOC2012                1.8            3.1            2.0            3.6
ssd300-CF                                         MS COCO                1.8            4.2            1.9            3.9
ssdlite_mobilenet_v2-TF                           MS COCO                1.7            2.5            2.4            3.5
yolo_v4-TF                                        MS COCO                1.9            3.6            2.0            3.4
unet-camvid-onnx-0001                             MS COCO                1.7            3.9            1.7            3.7
ssd-resnet34-1200-onnx                            MS COCO                1.7            4.0            1.7            3.4
googlenet-v4-tf                                   ImageNet               1.9            3.9            2.0            4.1
vgg19-caffe                                       ImageNet               1.9            4.7            2.0            4.5
yolo-v3-tiny-tf                                   MS COCO                1.7            3.4            1.9            3.5
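
As a rough illustration of how a speed-up factor like those in the table above can be derived, the sketch below divides an INT8 throughput by the corresponding FP32 throughput. It assumes the factor is a plain throughput ratio; the function name `speedup` and the frames-per-second figures are hypothetical and are not measured results from this document.

```python
def speedup(fp32_fps: float, int8_fps: float) -> float:
    """Return the INT8-over-FP32 throughput ratio (assumed definition)."""
    if fp32_fps <= 0:
        raise ValueError("FP32 throughput must be positive")
    return int8_fps / fp32_fps


# Hypothetical throughput figures in frames per second, not measured data.
example_factor = round(speedup(fp32_fps=100.0, int8_fps=190.0), 1)
print(example_factor)
```

A factor above 1.0 means the INT8 model processed more inputs per second than the FP32 model on the same platform.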

The following table shows the absolute accuracy drop, calculated as the difference in accuracy between the FP32 representation of a model and its INT8 representation.

Absolute Accuracy Drop, %

OpenVINO Benchmark Model Name                     Dataset                Metric Name                    Intel® Core™         Intel® Core™         Intel® Core™         Intel® Core™
                                                                                                        i9-10920X CPU        i9-9820X CPU         i7-6700K CPU         i7-1185G7 CPU
                                                                                                        @ 3.50GHz (VNNI)     @ 3.30GHz (AVX512)   @ 4.0GHz (AVX2)      @ 4.0GHz (TGL VNNI)
bert-large-uncased-whole-word-masking-squad-0001  SQuAD                  F1                             0.62                 0.71                 0.62                 0.62
brain-tumor-segmentation-0001-MXNET               BraTS                  Dice-index@Mean@Overall Tumor  0.08                 0.10                 0.10                 0.08
deeplabv3-TF                                      VOC 2012 Segmentation  mean_iou                       0.09                 0.41                 0.41                 0.09
densenet-121-TF                                   ImageNet               acc@top-1                      0.49                 0.56                 0.56                 0.49
facenet-20180408-102900-TF                        LFW                    pairwise_accuracy_subsets      0.05                 0.12                 0.12                 0.05
faster_rcnn_resnet50_coco-TF                      MS COCO                coco_precision                 0.09                 0.09                 0.09                 0.09
inception-v3-TF                                   ImageNet               acc@top-1                      0.02                 0.01                 0.01                 0.02
mobilenet-ssd-CF                                  VOC2012                mAP                            0.06                 0.04                 0.04                 0.06
mobilenet-v2-1.0-224-TF                           ImageNet               acc@top-1                      0.40                 0.76                 0.76                 0.40
mobilenet-v2-pytorch                              ImageNet               acc@top-1                      0.36                 0.52                 0.52                 0.36
resnet-18-pytorch                                 ImageNet               acc@top-1                      0.25                 0.25                 0.25                 0.25
resnet-50-pytorch                                 ImageNet               acc@top-1                      0.19                 0.21                 0.21                 0.19
resnet-50-TF                                      ImageNet               acc@top-1                      0.11                 0.11                 0.11                 0.11
squeezenet1.1-CF                                  ImageNet               acc@top-1                      0.64                 0.66                 0.66                 0.64
ssd_mobilenet_v1_coco-tf                          VOC2012                COCO mAP                       0.17                 2.96                 2.96                 0.17
ssd300-CF                                         MS COCO                COCO mAP                       0.18                 3.06                 3.06                 0.18
ssdlite_mobilenet_v2-TF                           MS COCO                COCO mAP                       0.11                 0.43                 0.43                 0.11
yolo_v4-TF                                        MS COCO                COCO mAP                       0.06                 0.03                 0.03                 0.06
unet-camvid-onnx-0001                             MS COCO                COCO mAP                       0.29                 0.29                 0.31                 0.29
ssd-resnet34-1200-onnx                            MS COCO                COCO mAP                       0.02                 0.03                 0.03                 0.02
googlenet-v4-tf                                   ImageNet               COCO mAP                       0.08                 0.06                 0.06                 0.06
vgg19-caffe                                       ImageNet               COCO mAP                       0.02                 0.04                 0.04                 0.02
yolo-v3-tiny-tf                                   MS COCO                COCO mAP                       0.02                 0.6                  0.6                  0.02
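
The absolute accuracy drop described above can be sketched as a simple subtraction of two metric values expressed in percent. This is a minimal illustration that assumes the drop is taken as FP32 minus INT8; the function name `absolute_accuracy_drop` and the metric values are hypothetical and are not taken from the table.

```python
def absolute_accuracy_drop(fp32_metric_pct: float, int8_metric_pct: float) -> float:
    """Return the FP32 metric minus the INT8 metric, in percentage points.

    Assumed sign convention: a positive result means INT8 scored lower.
    """
    return fp32_metric_pct - int8_metric_pct


# Hypothetical top-1 accuracy values in percent, not measured data.
example_drop = round(absolute_accuracy_drop(76.00, 75.60), 2)
print(example_drop)
```

Because the drop is absolute rather than relative, it is reported in percentage points of the metric, regardless of how high the FP32 baseline is.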

For more complete information about performance and benchmark results, visit www.intel.com/benchmarks. See also the Optimization Notice and Legal Information.