Model Accuracy

The following two tables present the absolute accuracy drop, calculated as the difference between OV-accuracy and the original framework accuracy for FP32, and likewise for the INT8, BF16 and FP16 representations of a model, on the platforms listed below. The third table presents GenAI model accuracies as absolute values. See the notes below the tables for more information.

  • A - Intel® Core™ Ultra 9-185H (AVX2), INT8 and FP32

  • B - Intel® Xeon® 6338 (VNNI), INT8 and FP32

  • C - Intel® Xeon® 6972P (VNNI, AMX), INT8, BF16, FP32

  • D - Intel® Arc-B60, INT8 and FP16
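
As a minimal sketch of the accuracy-drop calculation described above, the snippet below subtracts the original framework accuracy from the OpenVINO accuracy and formats the result the way the tables do. The function names and the input numbers are hypothetical and chosen only for illustration; they are not measurements from this page.

```python
def accuracy_drop(ov_accuracy: float, framework_accuracy: float) -> float:
    """Return the absolute accuracy difference in percentage points.

    Both inputs are accuracies expressed in percent. A negative result
    means the OpenVINO model scored lower than the original framework.
    """
    return ov_accuracy - framework_accuracy


def format_drop(drop: float) -> str:
    """Format a difference with two decimals and a percent sign."""
    return f"{drop:.2f}%"


# Hypothetical accuracies (percent), not taken from the tables.
drop = accuracy_drop(ov_accuracy=71.25, framework_accuracy=71.75)
print(format_drop(drop))
```

With these illustrative inputs the difference is negative, which the notes at the end of this page describe as an accuracy drop.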

Model Accuracy for INT8

| OpenVINO™ Model name | Dataset | Metric Name | A, INT8 | B, INT8 | C, INT8 | D, INT8 |
|---|---|---|---|---|---|---|
| bert-base-cased | SST-2_bert_cased_padded | spearman@cosine | 3.14% | 2.65% | 2.95% | 2.73% |
| mobilenet-v2 | ImageNet2012 | accuracy @ top1 | -0.94% | -0.87% | -0.94% | -1.07% |
| resnet-50 | ImageNet2012 | accuracy @ top1 | -0.16% | -0.16% | -0.16% | -0.20% |
| ssd-resnet34-1200 | COCO2017_detection_80cl_bkgr | map | -0.03% | 0.02% | -0.03% | 0.05% |

Model Accuracy for BF16, FP32 and FP16 (FP16: Arc only. BF16: Xeon® 6972P only)

| OpenVINO™ Model name | Dataset | Metric Name | A, FP32 | B, FP32 | C, FP32 | D, FP16 |
|---|---|---|---|---|---|---|
| bert-base-cased | SST-2_bert_cased_padded | spearman@cosine | 0.00% | 0.00% | 0.00% | 0.01% |
| mobilenet-v2 | ImageNet2012 | accuracy @ top1 | -0.00% | -0.00% | -0.00% | -0.01% |
| resnet-50 | ImageNet2012 | accuracy @ top1 | 0.00% | 0.00% | 0.00% | 0.00% |
| ssd-resnet34-1200 | COCO2017_detection_80cl_bkgr | map | 0.02% | 0.01% | 0.02% | 0.02% |

yolo_v11

COCO2017_detection_80cl

AP@0.5:0.05:0.95

-0.03%

-2.21%

-2.21%

yolo_v26

COCO2017_detection_80cl

AP@0.5:0.05:0.95

0.00%

0.00%

0.02%

Model Accuracy for AMX-FP16, AMX-INT4, Arc-FP16 and Arc-INT4 (Arc™ B-series)

OpenVINO™ Model name

dataset

Metric Name

A, AMX-FP16

B, AMX-INT4

C, Arc-FP16

D, Arc-INT4

DeepSeek-R1-Distill-Llama-8B

Data Default WWB

Similarity

98.8%

95.6%

99.8%

94.9%

Gemma-3-4B-it

Data Default WWB

Similarity

91.8%

85.3%

90.0%

86.0%

GPT-OSS-20B

Data Default WWB

Similarity

94.2%

92.3%

94.1%

GPT-OSS-120B

Data Default WWB

Similarity

98.1%

94.2%

Llama-2-7B-chat

Data Default WWB

Similarity

98.7%

94.4%

99.9%

93.4%

Llama-3.2-3b-instruct

Data Default WWB

Similarity

98.7%

91.9%

99.9%

94.3%

MiniCPM-V-2.6

Data Default WWB

Similarity

93.7%

90.3%

97.4%

90.4%

Phi4-mini-instruct

Data Default WWB

Similarity

97.2%

94.9%

99.5%

92.3%

Qwen2.5-VL-7B

Data Default WWB

Similarity

92.9%

89.5%

93.6%

90.2%

Qwen3-8B

Data Default WWB

Similarity

99.2%

92.7%

99.9%

90.8%

Qwen3-30B-A3B

Data Default WWB

Similarity

97.1%

93.0%

93.8%

Qwen3.6-27B

Data Default WWB

Similarity

98.2%

93.4%

Flux.1-schnell

Data Default WWB

Similarity

99.6%

97.5%

96.2%

Stable-Diffusion-V1-5

Data Default WWB

Similarity

94.9%

97.1%

94.3%

99.4%

LTX-VIDEO

Data Default WWB

Similarity

99.7%

94.6%

64.1%

57.6%

Notes: For all accuracy metrics, a “-” (minus sign) indicates an accuracy drop. The Similarity metric is the distance from “perfect” and as such is always positive. Similarity is cosine similarity: the dot product of two vectors divided by the product of their lengths.
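
The cosine similarity definition in the notes can be sketched as follows. This is an illustrative implementation over two plain numeric vectors; how the benchmark derives those vectors from model outputs is not described here, and the function name and sample vectors are assumptions made for the example.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of a and b divided by the product of their lengths."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of elements")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # The ratio is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical vectors, used only to exercise the formula.
score = cosine_similarity([3.0, 4.0], [4.0, 3.0])
print(f"{score * 100:.1f}%")
```

Multiplying the result by 100 expresses it as a percentage, matching how the Similarity column is reported.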

Results may vary. For more information, see F.A.Q. and Platforms, Configurations, Methodology. See Legal Information.