Performance Information Frequently Asked Questions
The following questions and answers relate to published performance benchmarks.
Q1: How often do performance benchmarks get updated?
A: New performance benchmarks are typically published with every major.minor release of the Intel® Distribution of OpenVINO™ toolkit.
Q2: Where can I find the models used in the performance benchmarks?
A: All models used in the benchmarks are available in the Open Model Zoo GitHub repository.
Q3: Will there be any new models added to the list used for benchmarking?
A: The models used in the performance benchmarks were chosen for their broad adoption and common use in deployment scenarios. New models covering a diverse set of workloads and use cases are added periodically.
Q4: What does “CF” or “TF” in the graphs stand for?
A: “CF” stands for Caffe and “TF” stands for TensorFlow.
Q5: How can I run the benchmark results on my own?
A: All of the performance benchmarks were generated using benchmark_app, an open-source tool included in the Intel® Distribution of OpenVINO™ toolkit. The tool is available in both C++ and Python; a sketch of invoking it from Python follows.
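For example, benchmark_app can be launched directly from a script. The sketch below is a minimal illustration assuming benchmark_app is on your PATH (for instance, after installing the openvino-dev pip package); model.xml is a placeholder for your own model file.

```python
# Minimal sketch: invoke benchmark_app from Python via subprocess.
# Assumes benchmark_app is installed and on PATH; "model.xml" is a placeholder.
import subprocess

subprocess.run(
    [
        "benchmark_app",
        "-m", "model.xml",  # placeholder path to an OpenVINO IR model
        "-d", "CPU",        # target device
        "-t", "15",         # benchmark duration in seconds
    ],
    check=True,             # raise if the benchmark fails
)
```

The same flags work identically with the C++ build of the tool.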
Q6: What image sizes are used for the classification network models?
A: The image size used in inference depends on the benchmarked network. The table below lists the input size for each network model:
Model | Task | Input Size (Height x Width)
---|---|---
BERT | question / answer | 124
BERT-large | question / answer | 384
BERT-small | question / answer | 384
brain-tumor-segmentation-0001 | semantic segmentation | 128x128x128
brain-tumor-segmentation-0002 | semantic segmentation | 128x128x128
DeepLab v3 TF | semantic segmentation | 513x513
DenseNet-121 TF | classification | 224x224
EfficientDet | classification | 512x512
FaceNet TF | face recognition | 160x160
face-detection-0200 | detection | 256x256
Faster RCNN TF | object detection | 600x1024
ForwardTacotron | text to speech | 241
Inception v4 TF (aka GoogleNet-V4) | classification | 299x299
Inception v3 TF | classification | 299x299
Mask R-CNN ResNet50 Atrous | instance segmentation | 800x1365
SSD (MobileNet)_COCO-2017_Caffe | object detection | 300x300
MobileNet v2 TF | classification | 224x224
MobileNet v2 PyTorch | classification | 224x224
MobileNet-V3-1.0-224 | classification | 224x224
PP-OCR | optical character recognition | 32x640
PP-YOLO | detection | 640x640
ResNet-18 PyTorch | classification | 224x224
ResNet-50 v1 PyTorch | classification | 224x224
ResNet-50_v1_ILSVRC-2012 | classification | 224x224
YOLO v4 TF | object detection | 608x608
ssd_mobilenet_v1_coco | object detection | 300x300
ssdlite_mobilenet_v2 | object detection | 300x300
U-Net | semantic segmentation | 368x480
YOLO v3 Tiny | object detection | 416x416
YOLO v3 | object detection | 416x416
ssd-resnet34 (ONNX) | object detection | 1200x1200

For the sequence models (the BERT variants and ForwardTacotron), the input size is the input sequence length rather than height x width.
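Because these sizes must match each model's input layer, it can be useful to inspect, and where the model supports it, override, the input shape programmatically. Below is a minimal sketch using the OpenVINO™ Runtime Python API; model.xml is a placeholder for your own IR file.

```python
# Inspect and (optionally) override a model's input shape before compiling.
# "model.xml" is a placeholder; not every model supports reshaping.
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")
print("original input shape:", model.input(0).partial_shape)

# Reshape to a static NCHW shape, e.g. 1x3x224x224 for the classifiers above.
model.reshape([1, 3, 224, 224])
compiled = core.compile_model(model, "CPU")
print("compiled input shape:", compiled.input(0).shape)
```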
Q7: Where can I purchase the specific hardware used in the benchmarking?
A: Intel partners with vendors all over the world. For a list of hardware manufacturers, see the Intel® AI: In Production Partners & Solutions Catalog. For more details, see the Supported Devices documentation. Before purchasing any hardware, you can test and run models remotely using Intel® DevCloud for the Edge.
Q8: How can I optimize my models for better performance or accuracy?
A: A set of guidelines and recommendations for optimizing models is available in the optimization guide. Join the conversation in the Community Forum for further support.
Q9: Why are INT8 optimized models used for benchmarking on CPUs with no VNNI support?
A: The benefit of low-precision optimization with the OpenVINO™ toolkit extends beyond processors that support VNNI through Intel® DL Boost. Because INT8 uses a quarter of the bit width of FP32, Intel® CPUs can move and process the data faster, so INT8 offers better throughput on any quantized model regardless of the low-precision instructions intrinsically supported by the Intel® hardware. For a comparison of the speed-up factors for different network models on a selection of Intel® CPU architectures, including AVX2 with the Intel® Core™ i7-8700T and AVX-512 (VNNI) with the Intel® Xeon® 5218T and Intel® Xeon® 8270, refer to the Model Accuracy for INT8 and FP32 Precision article. One possible route to an INT8 model is sketched below.
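As an illustration, an INT8 model can be produced with post-training quantization; the sketch below uses NNCF's nncf.quantize API on an OpenVINO model. The model path and the random calibration data are placeholders, and this is one possible route rather than necessarily the pipeline used for the published benchmarks.

```python
# A minimal sketch of post-training INT8 quantization with NNCF.
# "model.xml" and the random calibration tensors are placeholders.
import numpy as np
import nncf
from openvino.runtime import Core, serialize

core = Core()
model = core.read_model("model.xml")

# Stand-in calibration set: a few random NCHW tensors. In practice, use a
# representative sample of real inputs.
data_source = [
    np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in range(32)
]
calibration_dataset = nncf.Dataset(data_source)

quantized_model = nncf.quantize(model, calibration_dataset)
serialize(quantized_model, "model_int8.xml")  # save the INT8 IR to disk
```

The resulting INT8 IR can then be benchmarked with benchmark_app exactly like the FP32 original.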
Q10: Where can I search for OpenVINO™ performance results based on HW-platforms?
A: The website format has changed to support the more common approach of searching for the performance results of a given neural network model across different HW-platforms, rather than reviewing the performance of a given HW-platform across different neural network models.
Q11: How is Latency measured?
A: Latency is measured by running the OpenVINO™ Runtime in synchronous mode. In this mode, each frame or image is processed through the entire pipeline (pre-processing, inference, post-processing) before the next frame or image is processed. This KPI is relevant for applications where inference on a single image is required, for example, the analysis of an ultrasound image in a medical application or of a seismic image in the oil & gas industry. Other use cases include real-time or near-real-time applications, such as an industrial robot responding to changes in its environment or obstacle avoidance in autonomous vehicles, where a quick response to the inference result is required. A minimal sketch of this measurement approach follows.
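The sketch below illustrates the synchronous measurement style described above, using the OpenVINO™ Runtime Python API with a placeholder model path and dummy input data.

```python
# Minimal synchronous latency measurement sketch.
# "model.xml" is a placeholder; the input tensor is random dummy data.
import time
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")
compiled = core.compile_model(model, "CPU")
request = compiled.create_infer_request()

# Dummy input matching the model's input shape (e.g., 1x3x224x224).
input_tensor = np.random.rand(*compiled.input(0).shape).astype(np.float32)

latencies = []
for _ in range(100):
    start = time.perf_counter()
    request.infer({0: input_tensor})  # synchronous: blocks until inference completes
    latencies.append((time.perf_counter() - start) * 1000)

print(f"median latency: {np.median(latencies):.2f} ms")
```

benchmark_app performs the same style of measurement (plus warm-up and statistics) when run with the synchronous API option.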