Performance Information Frequently Asked Questions

The following questions (Q#) and answers (A) are related to published performance benchmarks.

Q1: How often do performance benchmarks get updated?

A : New performance benchmarks are typically published with every major.minor release of the Intel® Distribution of OpenVINO™ toolkit.

Q2: Where can I find the models used in the performance benchmarks?

A : All models used are included in the GitHub repository of Open Model Zoo.

Q3: Will there be any new models added to the list used for benchmarking?

A : The models used in the performance benchmarks were chosen based on general adoption and usage in deployment scenarios. New models that represent a diverse set of workloads and use cases are added periodically.

Q4: What does “CF” or “TF” in the graphs stand for?

A : “CF” stands for Caffe and “TF” stands for TensorFlow.

Q5: How can I run the benchmark results on my own?

A : All of the performance benchmarks were generated using benchmark_app, an open-source tool included in the Intel® Distribution of OpenVINO™ toolkit. The tool is available in both C++ and Python versions.
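
For illustration only, a throughput-oriented run and a latency-oriented run might look like the following sketch; the model path is a placeholder and the device name and run duration are just examples:

```sh
# Hypothetical model path; -d selects the device, -api selects async
# (throughput-oriented) or sync (latency-oriented) execution, and -t is
# the run duration in seconds.
benchmark_app -m models/resnet-50-tf.xml -d CPU -api async -t 30
benchmark_app -m models/resnet-50-tf.xml -d CPU -api sync -t 30
```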

Q6: What image sizes are used for the classification network models?

A : The image size used for inference depends on the benchmarked network. The table below lists the input size for each network model; a short sketch after the table shows how these sizes map to input tensor shapes:

| Model | Public Network | Task | Input Size (Height x Width) |
|---|---|---|---|
| bert-base-cased | BERT | question / answer | 124 |
| bert-large-uncased-whole-word-masking-squad-int8-0001 | BERT-large | question / answer | 384 |
| bert-small-uncased-whole-word-masking-squad-0002 | BERT-small | question / answer | 384 |
| brain-tumor-segmentation-0001-MXNET | brain-tumor-segmentation-0001 | semantic segmentation | 128x128x128 |
| brain-tumor-segmentation-0002-CF2 | brain-tumor-segmentation-0002 | semantic segmentation | 128x128x128 |
| deeplabv3-TF | DeepLab v3 TF | semantic segmentation | 513x513 |
| densenet-121-TF | DenseNet-121 TF | classification | 224x224 |
| efficientdet-d0 | EfficientDet | object detection | 512x512 |
| facenet-20180408-102900-TF | FaceNet TF | face recognition | 160x160 |
| Facedetection0200 | FaceDetection0200 | detection | 256x256 |
| faster_rcnn_resnet50_coco-TF | Faster RCNN TF | object detection | 600x1024 |
| forward-tacotron-duration-prediction | ForwardTacotron | text to speech | 241 |
| inception-v4-TF | Inception v4 TF (aka GoogleNet-V4) | classification | 299x299 |
| inception-v3-TF | Inception v3 TF | classification | 299x299 |
| mask_rcnn_resnet50_atrous_coco | Mask R-CNN ResNet50 Atrous | instance segmentation | 800x1365 |
| mobilenet-ssd-CF | SSD (MobileNet)_COCO-2017_Caffe | object detection | 300x300 |
| mobilenet-v2-1.0-224-TF | MobileNet v2 TF | classification | 224x224 |
| mobilenet-v2-pytorch | MobileNet V2 PyTorch | classification | 224x224 |
| Mobilenet-V3-small | MobileNet-V3-1.0-224 | classification | 224x224 |
| Mobilenet-V3-large | MobileNet-V3-1.0-224 | classification | 224x224 |
| pp-ocr-rec | PP-OCR | optical character recognition | 32x640 |
| pp-yolo | PP-YOLO | detection | 640x640 |
| resnet-18-pytorch | ResNet-18 PyTorch | classification | 224x224 |
| resnet-50-pytorch | ResNet-50 v1 PyTorch | classification | 224x224 |
| resnet-50-TF | ResNet-50_v1_ILSVRC-2012 | classification | 224x224 |
| yolo_v4-TF | YOLO v4 TF | object detection | 608x608 |
| ssd_mobilenet_v1_coco-TF | ssd_mobilenet_v1_coco | object detection | 300x300 |
| ssdlite_mobilenet_v2-TF | ssdlite_mobilenet_v2 | object detection | 300x300 |
| unet-camvid-onnx-0001 | U-Net | semantic segmentation | 368x480 |
| yolo-v3-tiny-tf | YOLO v3 Tiny | object detection | 416x416 |
| yolo-v3 | YOLO v3 | object detection | 416x416 |
| ssd-resnet34-1200-onnx | ssd-resnet34 ONNX model | object detection | 1200x1200 |
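
As an illustration only, the sketch below shows how a 224x224 entry from the table maps to an input tensor, assuming the common NCHW layout with a batch size of 1 and 3 color channels; the actual layout and channel count depend on the specific model:

```python
import numpy as np

# 224x224 (Height x Width) for a classification model such as resnet-50-TF,
# assuming NCHW layout: [batch, channels, height, width].
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)
print(dummy_input.shape)  # (1, 3, 224, 224)
```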

Q7: Where can I purchase the specific hardware used in the benchmarking?

A : Intel partners with vendors all over the world. For a list of hardware manufacturers, see the Intel® AI: In Production Partners & Solutions Catalog. For more details, see the Supported Devices documentation. Before purchasing any hardware, you can test and run models remotely using Intel® DevCloud for the Edge.

Q8: How can I optimize my models for better performance or accuracy?

A : A set of guidelines and recommendations for optimizing models is available in the optimization guide. Join the conversation in the Community Forum for further support.

Q9: Why are INT8 optimized models used for benchmarking on CPUs with no VNNI support?

A : The benefit of low-precision (INT8) optimization with the OpenVINO™ toolkit extends beyond processors that support VNNI through Intel® DL Boost. The reduced bit width of INT8 compared to FP32 allows Intel® CPUs to process data faster, so it offers better throughput on any converted model, regardless of whether the Intel® hardware natively supports low-precision instructions. For a comparison of the boost factors for different network models on a selection of Intel® CPU architectures, including AVX2 with the Intel® Core™ i7-8700T and AVX-512 (VNNI) with the Intel® Xeon® 5218T and Intel® Xeon® 8270, refer to the Model Accuracy for INT8 and FP32 Precision article.
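
To see the effect on your own hardware, you can run benchmark_app on an FP32 and an INT8 version of the same model and compare the reported throughput; the IR paths below are hypothetical:

```sh
# Hypothetical IR paths for the same network in FP32 and INT8 precision;
# compare the throughput (FPS) reported for each run on the same CPU.
benchmark_app -m resnet-50-tf/FP32/resnet-50-tf.xml -d CPU -api async -t 30
benchmark_app -m resnet-50-tf/INT8/resnet-50-tf.xml -d CPU -api async -t 30
```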

Q10: Where can I search for OpenVINO™ performance results based on HW-platforms?

A : The website format has changed to support the more common approach of searching for the performance results of a given neural network model across different HW platforms, rather than reviewing the performance of a given HW platform across different neural network models.

Q11: How is Latency measured?

A : Latency is measured by running the OpenVINO™ Runtime in synchronous mode. In this mode, each frame or image is processed through the entire set of stages (pre-processing, inference, post-processing) before the next frame or image is processed. This KPI is relevant for applications where inference on a single image is required, for example, the analysis of an ultrasound image in a medical application or of a seismic image in the oil & gas industry. Other use cases include real-time or near real-time applications, e.g. the response of an industrial robot to changes in its environment or obstacle avoidance for autonomous vehicles, where a quick response to the inference result is required.
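
A minimal sketch of measuring this KPI with the OpenVINO™ Runtime Python API (2022.1 or later) is shown below; the model path and input shape are placeholders, and benchmark_app reports the same metric when run with -api sync:

```python
import time
import numpy as np
from openvino.runtime import Core

# Hypothetical model path and input shape; each infer() call runs the full
# request synchronously, so the wall-clock time around it is the latency.
core = Core()
compiled = core.compile_model("model.xml", "CPU")
request = compiled.create_infer_request()
image = np.random.rand(1, 3, 224, 224).astype(np.float32)

latencies_ms = []
for _ in range(100):
    start = time.perf_counter()
    request.infer({0: image})  # blocks until inference completes
    latencies_ms.append((time.perf_counter() - start) * 1000)

print(f"Median latency: {np.median(latencies_ms):.2f} ms")
```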