Model Accuracy#

The following two tables present the absolute accuracy drop calculated as the accuracy difference between OV-accuracy and the original frame work accuracy for FP32, and the same for INT8, BF16 and FP16 representations of a model on three platform architectures. The third table presents the GenAI model accuracies as absolute accuracy values. Please also refer to notes below the table for more information.

  • A - Intel® Core™ Ultra 9-185H (AVX2), INT8 and FP32

  • B - Intel® Xeon® 6338, (VNNI), INT8 and FP32

  • C - Intel® Xeon 6972P (VNNI, AMX), INT8, BF16, FP32

  • D - Intel® Arc-B580, INT8 and FP16

Model Accuracy for INT8#

OpenVINO™ Model name

dataset

Metric Name

A, INT8

B, INT8

C, INT8

D, INT8

bert-base-cased

SST-2_bert_cased_padded

spearman@cosine

2.41%

2.78%

2.61%

2.84%

mask_rcnn_resnet50_atrous_coco

COCO2017_detection_91cl_bkgr

coco_orig_precision

mobilenet-v2

ImageNet2012

accuracy @ top1

-1.03%

-1.00%

-1.03%

-1.01%

resnet-50

ImageNet2012

accuracy @ top1

-0.17%

-0.17%

-0.18%

-0.17%

ssd-resnet34-1200

COCO2017_detection_80cl_bkgr

map

-0.01%

-0.01%

-0.04%

-0.04%

yolo_v8n

COCO2017_detection_80cl

map

-0.09%

-0.09%

-0.02%

-0.04%

Model Accuracy for BF16, FP32 and FP16 (FP16: Arc only. BF16: Xeon® 6972P only)#

OpenVINO™ Model name

dataset

Metric Name

A, FP32

B, FP32

C, FP32

C, BF16

D, FP16

bert-base-cased

SST-2_bert_cased_padded

spearman@cosine

0.00%

0.00%

0.00%

-0.01%

0.02%

mask_rcnn_resnet50_atrous_coco

COCO2017_detection_91cl_bkgr

coco_orig_precision

mobilenet-v2

ImageNet2012

accuracy @ top1

0.00%

0.00%

0.00%

-0.23%

-0.03%

resnet-50

ImageNet2012

accuracy @ top1

0.00%

0.00%

0.00%

0.06%

0.01%

ssd-resnet34-1200

COCO2017_detection_80cl_bkgr

map

0.02%

0.02%

0.01%

0.02%

0.06%

yolo_v8n

COCO2017_detection_80cl

map

0.01%

0.01%

0.01%

-0.03%

Model Accuracy for AMX-FP16, AMX-INT4, Arc-FP16 and Arc-INT4 (Arc™ B-series)#

OpenVINO™ Model name

dataset

Metric Name

A, AMX-FP16

B, AMX-INT4

C, Arc-FP16

D, Arc-INT4

GLM4-9B-Chat

Data Default WWB

Similarity

6.9%

3.8%

6.3%

15.1%

Qwen-2.5-7B-instruct

Data Default WWB

Similarity

7.97%

25.12%

0.09%

23.87%

Gemma-2-9B

Data Default WWB

Similarity

4.81%

10.25%

1.73%

10.24%

Llama-2-7b-chat

Data Default WWB

Similarity

1.80%

22.31%

0.13%

21.54%

Llama-3-8b

Data Default WWB

Similarity

2.26%

23.00%

0.12%

23.59%

Llama-3.2-3b-instruct

Data Default WWB

Similarity

2.40%

11.25%

0.00%

12.32%

Mistral-7b-instruct-V0.2

Data Default WWB

Similarity

2.94%

9.08%

0.37%

9.53%

Phi3-mini-4k-instruct

Data Default WWB

Similarity

8.08%

7.93%

0.00%

8.30%

Qwen-2-7B

Data Default WWB

Similarity

4.97%

18.97%

0.00%

22.38%

Flux.1-schnell

Data Default WWB

Similarity

4.60%

4.20%

5.00%

3.30%

Stable-Diffusion-V1-5

Data Default WWB

Similarity

2.50%

1.90%

2.10%

0.10%

Notes: For all accuracy metrics a “-”, (minus sign), indicates an accuracy drop. The Similarity metric is the distance from “perfect” and as such always positive. Similarity is cosine similarity - the dot product of two vectors divided by the product of their lengths.

Results may vary. For more information, see F.A.Q. and Platforms, Configurations, Methodology. See Legal Information.