Model Accuracy#

The following two tables present the absolute accuracy drop calculated as the accuracy difference between OV-accuracy and the original frame work accuracy for FP32, and the same for INT8, BF16 and FP16 representations of a model on three platform architectures. The third table presents the GenAI model accuracies as absolute accuracy values. Please also refer to notes below the table for more information.

  • A - Intel® Core™ Ultra 9-185H (AVX2), INT8 and FP32

  • B - Intel® Xeon® 6338, (VNNI), INT8 and FP32

  • C - Intel® Xeon 6972P (VNNI, AMX), INT8, BF16, FP32

  • D - Intel® Arc-B580, INT8 and FP16

Model Accuracy for INT8#

OpenVINO™ Model name

dataset

Metric Name

A, INT8

B, INT8

C, INT8

D, INT8

bert-base-cased

SST-2_bert_cased_padded

spearman@cosine

2.57%

2.65%

2.54%

2.89%

Detectron-V2

COCO2017_detection_91cl_bkgr

coco_orig_precision

mobilenet-v2

ImageNet2012

accuracy @ top1

-0.93%

-0.91%

-1.03%

-0.95%

resnet-50

ImageNet2012

accuracy @ top1

-0.12%

-0.12%

-0.15%

-0.15%

ssd-resnet34-1200

COCO2017_detection_80cl_bkgr

map

0.00%

0.00%

-0.03%

0.07%

yolo_v11

COCO2017_detection_80cl

map

Model Accuracy for BF16, FP32 and FP16 (FP16: Arc only. BF16: Xeon® 6972P only)#

OpenVINO™ Model name

dataset

Metric Name

A, FP32

B, FP32

C, FP32

D, FP16

bert-base-cased

SST-2_bert_cased_padded

spearman@cosine

0.00%

0.00%

0.00%

0.02%

Detectron-V2

COCO2017_detection_91cl_bkgr

coco_orig_precision

mobilenet-v2

ImageNet2012

accuracy @ top1

0.00%

0.00%

0.00%

0.01%

resnet-50

ImageNet2012

accuracy @ top1

0.00%

0.00%

0.00%

0.01%

ssd-resnet34-1200

COCO2017_detection_80cl_bkgr

map

0.02%

0.02%

0.01%

-0.06%

yolo_v11

COCO2017_detection_80cl

map

-2.70%

Model Accuracy for AMX-FP16, AMX-INT4, Arc-FP16 and Arc-INT4 (Arc™ B-series)#

OpenVINO™ Model name

dataset

Metric Name

A, AMX-FP16

B, AMX-INT4

C, Arc-FP16

D, Arc-INT4

DeepSeek-R1-Distill-Llama-8B

Data Default WWB

Similarity

9.71%

21.25%

21.04%

DeepSeek-R1-Distill-Qwen-1.5B

Data Default WWB

Similarity

8.45%

34.5%

22.10%

32.02%

DeepSeek-R1-Distill-Qwen-7B

Data Default WWB

Similarity

25.5%

35.6%

3.9%

37.2%

Gemma-2-9B-it

Data Default WWB

Similarity

0.89%

3.99%

%

4.04%

GLM4-9B-Chat

Data Default WWB

Similarity

2.52%

8.48%

8.38%

Qwen-2.5-7B-instruct

Data Default WWB

Similarity

1.51%

8.3%

8.237%

Llama-2-7b-chat

Data Default WWB

Similarity

1.43%

7.46%

7.18%

Llama-3.2-3b-instruct

Data Default WWB

Similarity

2.75%

12.05%

0.52%

11.95%

Mistral-7b-instruct-V0.3

Data Default WWB

Similarity

2.46%

8.93%

3.17%

7.90%

Phi3-mini-4k-instruct

Data Default WWB

Similarity

4.55%

7.23%

1.39%

8.47%

Phi4-mini-instruct

Data Default WWB

Similarity

6.59%

12.17%

1.91%

12.03%

Qwen2-VL-7B

Data Default WWB

Similarity

1.29%

8.71%

4.22%

9.43%

Flux.1-schnell

Data Default WWB

Similarity

4.80%

3.80%

2.80%

Stable-Diffusion-V1-5

Data Default WWB

Similarity

3.00%

4.30%

0.50%

4.40%

Notes: For all accuracy metrics a “-”, (minus sign), indicates an accuracy drop. The Similarity metric is the distance from “perfect” and as such always positive. Similarity is cosine similarity - the dot product of two vectors divided by the product of their lengths.

Results may vary. For more information, see F.A.Q. and Platforms, Configurations, Methodology. See Legal Information.