Overview of OpenVINO™ Toolkit Public Pre-Trained Models

The OpenVINO™ toolkit provides a set of public pre-trained models that you can use for learning and demo purposes, or for developing deep learning software. The most recent versions are available in the repository on GitHub. The public pre-trained models device support table summarizes which devices each model supports.

You can download models and convert them into the Inference Engine format (*.xml + *.bin) using the OpenVINO™ Model Downloader and other automation tools.
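
As a sketch of that download-and-convert flow: `omz_downloader` and `omz_converter` are the command-line entry points shipped with recent OpenVINO releases (older releases expose the same functionality as the `downloader.py`/`converter.py` scripts), and the model name comes from the "OMZ Model Name" column of the tables below.

```shell
# Download the original public model files (here: alexnet, from the tables below)
omz_downloader --name alexnet --output_dir models

# Convert the downloaded weights into Inference Engine IR (*.xml + *.bin)
omz_converter --name alexnet --download_dir models --output_dir ir
```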

Classification

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| AlexNet | Caffe\* | alexnet | 56.598%/79.812% | 1.5 | 60.965 |
| AntiSpoofNet | PyTorch\* | anti-spoof-mn3 | 3.81% | 0.15 | 3.02 |
| CaffeNet | Caffe\* | caffenet | 56.714%/79.916% | 1.5 | 60.965 |
| DenseNet 121 | Caffe\*<br>TensorFlow\*<br>Caffe2\* | densenet-121<br>densenet-121-tf<br>densenet-121-caffe2 | 74.42%/92.136%<br>74.46%/92.13%<br>74.904%/92.192% | 5.723~5.7287 | 7.971 |
| DenseNet 161 | Caffe\*<br>TensorFlow\* | densenet-161<br>densenet-161-tf | 77.55%/93.92%<br>76.446%/93.228% | 14.128~15.561 | 28.666 |
| DenseNet 169 | Caffe\*<br>TensorFlow\* | densenet-169<br>densenet-169-tf | 76.106%/93.106%<br>76.14%/93.12% | 6.788~6.7932 | 14.139 |
| DenseNet 201 | Caffe\*<br>TensorFlow\* | densenet-201<br>densenet-201-tf | 76.886%/93.556%<br>76.93%/93.56% | 8.673~8.6786 | 20.001 |
| DLA 34 | PyTorch\* | dla-34 | 74.64%/92.06% | 6.1368 | 15.7344 |
| EfficientNet B0 | TensorFlow\*<br>PyTorch\* | efficientnet-b0<br>efficientnet-b0-pytorch | 75.70%/92.76%<br>76.91%/93.21% | 0.819 | 5.268 |
| EfficientNet B0 AutoAugment | TensorFlow\* | efficientnet-b0_auto_aug | 76.43%/93.04% | 0.819 | 5.268 |
| EfficientNet B5 | TensorFlow\*<br>PyTorch\* | efficientnet-b5<br>efficientnet-b5-pytorch | 83.33%/96.67%<br>83.69%/96.71% | 21.252 | 30.303 |
| EfficientNet B7 | PyTorch\* | efficientnet-b7-pytorch | 84.42%/96.91% | 77.618 | 66.193 |
| EfficientNet B7 AutoAugment | TensorFlow\* | efficientnet-b7_auto_aug | 84.68%/97.09% | 77.618 | 66.193 |
| HBONet 1.0 | PyTorch\* | hbonet-1.0 | 73.1%/91.0% | 0.6208 | 4.5443 |
| HBONet 0.5 | PyTorch\* | hbonet-0.5 | 67.0%/86.9% | 0.1977 | 2.5287 |
| HBONet 0.25 | PyTorch\* | hbonet-0.25 | 57.3%/79.8% | 0.0758 | 1.9299 |
| Inception (GoogleNet) V1 | Caffe\*<br>TensorFlow\* | googlenet-v1<br>googlenet-v1-tf | 68.928%/89.144%<br>69.814%/89.6% | 3.016~3.266 | 6.619~6.999 |
| Inception (GoogleNet) V2 | Caffe\*<br>TensorFlow\* | googlenet-v2<br>googlenet-v2-tf | 72.024%/90.844%<br>74.084%/91.798% | 4.058 | 11.185 |
| Inception (GoogleNet) V3 | TensorFlow\*<br>PyTorch\* | googlenet-v3<br>googlenet-v3-pytorch | 77.904%/93.808%<br>77.69%/93.7% | 11.469 | 23.817 |
| Inception (GoogleNet) V4 | TensorFlow\* | googlenet-v4-tf | 80.204%/95.21% | 24.584 | 42.648 |
| Inception-ResNet V2 | TensorFlow\* | inception-resnet-v2-tf | 80.14%/95.10% | 22.227 | 30.223 |
| MixNet L | TensorFlow\* | mixnet-l | 78.30%/93.91% | 0.565 | 7.3 |
| MobileNet V1 0.25 128 | Caffe\* | mobilenet-v1-0.25-128 | 40.54%/65% | 0.028 | 0.468 |
| MobileNet V1 0.5 160 | Caffe\* | mobilenet-v1-0.50-160 | 59.86%/82.04% | 0.156 | 1.327 |
| MobileNet V1 0.5 224 | Caffe\* | mobilenet-v1-0.50-224 | 63.042%/84.934% | 0.304 | 1.327 |
| MobileNet V1 1.0 224 | Caffe\*<br>TensorFlow\* | mobilenet-v1-1.0-224<br>mobilenet-v1-1.0-224-tf | 69.496%/89.224%<br>71.03%/89.94% | 1.148 | 4.221 |
| MobileNet V2 1.0 224 | Caffe\*<br>TensorFlow\*<br>PyTorch\* | mobilenet-v2<br>mobilenet-v2-1.0-224<br>mobilenet-v2-pytorch | 71.218%/90.178%<br>71.85%/90.69%<br>71.81%/90.396% | 0.615~0.876 | 3.489 |
| MobileNet V2 1.4 224 | TensorFlow\* | mobilenet-v2-1.4-224 | 74.09%/91.97% | 1.183 | 6.087 |
| MobileNet V3 Small 1.0 | TensorFlow\* | mobilenet-v3-small-1.0-224-tf | 67.36%/87.45% | 0.121 | 2.537 |
| MobileNet V3 Large 1.0 | TensorFlow\* | mobilenet-v3-large-1.0-224-tf | 75.70%/92.76% | 0.4536 | 5.4721 |
| NFNet F0 | PyTorch\* | nfnet-f0 | 83.34%/96.56% | 24.8053 | 71.4444 |
| DenseNet 121, alpha=0.125 | MXNet\* | octave-densenet-121-0.125 | 76.066%/93.044% | 4.883 | 7.977 |
| RegNetX-3.2GF | PyTorch\* | regnetx-3.2gf | 78.17%/94.08% | 6.3893 | 15.2653 |
| ResNet 26, alpha=0.25 | MXNet\* | octave-resnet-26-0.25 | 76.076%/92.584% | 3.768 | 15.99 |
| ResNet 50, alpha=0.125 | MXNet\* | octave-resnet-50-0.125 | 78.19%/93.862% | 7.221 | 25.551 |
| ResNet 101, alpha=0.125 | MXNet\* | octave-resnet-101-0.125 | 79.182%/94.42% | 13.387 | 44.543 |
| ResNet 200, alpha=0.125 | MXNet\* | octave-resnet-200-0.125 | 79.99%/94.866% | 25.407 | 64.667 |
| ResNeXt 50, alpha=0.25 | MXNet\* | octave-resnext-50-0.25 | 78.772%/94.18% | 6.444 | 25.02 |
| ResNeXt 101, alpha=0.25 | MXNet\* | octave-resnext-101-0.25 | 79.556%/94.444% | 11.521 | 44.169 |
| SE-ResNet 50, alpha=0.125 | MXNet\* | octave-se-resnet-50-0.125 | 78.706%/94.09% | 7.246 | 28.082 |
| open-closed-eye-0001 | PyTorch\* | open-closed-eye-0001 | 95.84% | 0.0014 | 0.0113 |
| RepVGG A0 | PyTorch\* | repvgg-a0 | 72.40%/90.49% | 2.7286 | 8.3094 |
| RepVGG B1 | PyTorch\* | repvgg-b1 | 78.37%/94.09% | 23.6472 | 51.8295 |
| RepVGG B3 | PyTorch\* | repvgg-b3 | 80.50%/95.25% | 52.4407 | 110.9609 |
| ResNeSt 50 | PyTorch\* | resnest-50-pytorch | 81.11%/95.36% | 10.8148 | 27.4493 |
| ResNet 18 | PyTorch\* | resnet-18-pytorch | 69.754%/89.088% | 3.637 | 11.68 |
| ResNet 34 | PyTorch\* | resnet-34-pytorch | 73.30%/91.42% | 7.3409 | 21.7892 |
| ResNet 50 | PyTorch\*<br>Caffe2\*<br>TensorFlow\* | resnet-50-pytorch<br>resnet-50-caffe2<br>resnet-50-tf | 75.168%/92.212%<br>76.128%/92.858%<br>76.38%/93.188%<br>76.17%/92.98% | 6.996~8.216 | 25.53 |
| ReXNet V1 x1.0 | PyTorch\* | rexnet-v1-x1.0 | 77.86%/93.87% | 0.8325 | 4.7779 |
| SE-Inception | Caffe\* | se-inception | 75.996%/92.964% | 4.091 | 11.922 |
| SE-ResNet 50 | Caffe\* | se-resnet-50 | 77.596%/93.85% | 7.775 | 28.061 |
| SE-ResNet 101 | Caffe\* | se-resnet-101 | 78.252%/94.206% | 15.239 | 49.274 |
| SE-ResNet 152 | Caffe\* | se-resnet-152 | 78.506%/94.45% | 22.709 | 66.746 |
| SE-ResNeXt 50 | Caffe\* | se-resnext-50 | 78.968%/94.63% | 8.533 | 27.526 |
| SE-ResNeXt 101 | Caffe\* | se-resnext-101 | 80.168%/95.19% | 16.054 | 48.886 |
| Shufflenet V2 x1.0 | PyTorch\* | shufflenet-v2-x1.0 | 69.36%/88.32% | 0.2957 | 2.2705 |
| SqueezeNet v1.0 | Caffe\* | squeezenet1.0 | 57.684%/80.38% | 1.737 | 1.248 |
| SqueezeNet v1.1 | Caffe\*<br>Caffe2\* | squeezenet1.1<br>squeezenet1.1-caffe2 | 58.382%/81%<br>56.502%/79.576% | 0.785 | 1.236 |
| VGG 16 | Caffe\* | vgg16 | 70.968%/89.878% | 30.974 | 138.358 |
| VGG 19 | Caffe\*<br>Caffe2\* | vgg19<br>vgg19-caffe2 | 71.062%/89.832%<br>71.062%/89.832% | 39.3 | 143.667 |

Segmentation

Semantic segmentation is an extension of the object detection problem. Instead of returning bounding boxes, semantic segmentation models return a "painted" version of the input image, where the "color" of each pixel represents a certain class. These networks are much larger than their object detection counterparts, but they provide better (pixel-level) localization of objects and can detect areas with complex shapes.
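
The "painted" output described above can be sketched as mapping a per-pixel class-ID map through a color palette. Both the palette and the class IDs below are made up for illustration; real models have their own label sets.

```python
# class ID -> RGB color; an illustrative palette, not any model's actual one
PALETTE = {0: (0, 0, 0), 1: (128, 0, 0), 2: (0, 128, 0)}

def paint(class_map):
    """Turn an HxW map of class IDs into an HxW map of RGB tuples."""
    return [[PALETTE[c] for c in row] for row in class_map]

seg = [[0, 0, 1],
       [0, 2, 1]]
print(paint(seg)[1][1])  # pixel (1, 1) belongs to class 2 -> (0, 128, 0)
```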

Semantic Segmentation

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| DeepLab V3 | TensorFlow\* | deeplabv3 | 66.85% | 11.469 | 23.819 |
| HRNet V2 C1 Segmentation | PyTorch\* | hrnet-v2-c1-segmentation | 77.69% | 81.993 | 66.4768 |
| Fastseg MobileV3Large LR-ASPP, F=128 | PyTorch\* | fastseg-large | 72.67% | 140.9611 | 3.2 |
| Fastseg MobileV3Small LR-ASPP, F=128 | PyTorch\* | fastseg-small | 67.15% | 69.2204 | 1.1 |
| PSPNet R-50-D8 | PyTorch\* | pspnet-pytorch | 70.6% | 357.1719 | 46.5827 |

Instance Segmentation

Instance segmentation is an extension of the object detection and semantic segmentation problems. Instead of predicting a bounding box around each object instance, instance segmentation models output pixel-wise masks for all instances.
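
To relate the two output styles, a per-instance binary mask can always be reduced to a detection-style bounding box. A minimal sketch with a made-up mask:

```python
def mask_to_bbox(mask):
    """Derive the bounding box (xmin, ymin, xmax, ymax) of one instance
    from its binary pixel mask, given as a list of rows of 0/1."""
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    return min(xs), min(ys), max(xs), max(ys)

mask = [[0, 1, 1, 0],
        [0, 1, 1, 1],
        [0, 0, 1, 0]]
print(mask_to_bbox(mask))  # (1, 0, 3, 2)
```

The reverse is not possible: a box says nothing about the instance's shape, which is exactly the extra information these models provide.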

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| Mask R-CNN Inception ResNet V2 | TensorFlow\* | mask_rcnn_inception_resnet_v2_atrous_coco | 39.86%/35.36% | 675.314 | 92.368 |
| Mask R-CNN Inception V2 | TensorFlow\* | mask_rcnn_inception_v2_coco | 27.12%/21.48% | 54.926 | 21.772 |
| Mask R-CNN ResNet 50 | TensorFlow\* | mask_rcnn_resnet50_atrous_coco | 29.75%/27.46% | 294.738 | 50.222 |
| Mask R-CNN ResNet 101 | TensorFlow\* | mask_rcnn_resnet101_atrous_coco | 34.92%/31.30% | 674.58 | 69.188 |
| YOLACT ResNet 50 FPN | PyTorch\* | yolact-resnet50-fpn-pytorch | 28.0%/30.69% | 118.575 | 36.829 |

3D Semantic Segmentation

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| Brain Tumor Segmentation | MXNet\* | brain-tumor-segmentation-0001 | 92.4003% | 409.996 | 38.192 |
| Brain Tumor Segmentation 2 | PyTorch\* | brain-tumor-segmentation-0002 | 91.4826% | 300.801 | 4.51 |

Object Detection

Several detection models can be used to detect a set of the most popular objects, for example: faces, people, vehicles. Most of the networks are SSD-based and provide reasonable accuracy/performance trade-offs.
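
The accuracy figures for detection models are mAP-style metrics, which are built on box overlap: a predicted box counts as correct when its intersection-over-union (IoU) with a ground-truth box exceeds a threshold. A minimal IoU sketch, with hypothetical boxes:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1 / 7 ≈ 0.1429
```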

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| CTPN | TensorFlow\* | ctpn | 73.67% | 55.813 | 17.237 |
| CenterNet (CTDET with DLAV0) 384x384 | ONNX\* | ctdet_coco_dlav0_384 | 41.6105% | 34.994 | 17.911 |
| CenterNet (CTDET with DLAV0) 512x512 | ONNX\* | ctdet_coco_dlav0_512 | 44.2756% | 62.211 | 17.911 |
| EfficientDet-D0 | TensorFlow\* | efficientdet-d0-tf | 31.95% | 2.54 | 3.9 |
| EfficientDet-D1 | TensorFlow\* | efficientdet-d1-tf | 37.54% | 6.1 | 6.6 |
| FaceBoxes | PyTorch\* | faceboxes-pytorch | 83.565% | 1.8975 | 1.0059 |
| Face Detection Retail | Caffe\* | face-detection-retail-0044 | 83.00% | 1.067 | 0.588 |
| Faster R-CNN with Inception-ResNet v2 | TensorFlow\* | faster_rcnn_inception_resnet_v2_atrous_coco | 40.69% | 30.687 | 13.307 |
| Faster R-CNN with Inception v2 | TensorFlow\* | faster_rcnn_inception_v2_coco | 26.24% | 30.687 | 13.307 |
| Faster R-CNN with ResNet 50 | TensorFlow\* | faster_rcnn_resnet50_coco | 31.09% | 57.203 | 29.162 |
| Faster R-CNN with ResNet 101 | TensorFlow\* | faster_rcnn_resnet101_coco | 35.72% | 112.052 | 48.128 |
| MobileFace Detection V1 | MXNet\* | mobilefacedet-v1-mxnet | 78.7488% | 3.5456 | 7.6828 |
| MTCNN | Caffe\* | mtcnn:<br>mtcnn-p<br>mtcnn-r<br>mtcnn-o | 48.1308%/62.2625% | 3.3715<br>0.0031<br>0.0263 | 0.0066<br>0.1002<br>0.3890 |
| Pelee | Caffe\* | pelee-coco | 21.9761% | 1.290 | 5.98 |
| RetinaFace with ResNet 50 | PyTorch\* | retinaface-resnet50-pytorch | 91.78% | 88.8627 | 27.2646 |
| RetinaNet with ResNet 50 | TensorFlow\* | retinanet-tf | 33.15% | 238.9469 | 64.9706 |
| R-FCN with ResNet-101 | TensorFlow\* | rfcn-resnet101-coco-tf | 28.40%/45.02% | 53.462 | 171.85 |
| SSD 300 | Caffe\* | ssd300 | 87.09% | 62.815 | 26.285 |
| SSD 512 | Caffe\* | ssd512 | 91.07% | 180.611 | 27.189 |
| SSD with MobileNet | Caffe\*<br>TensorFlow\* | mobilenet-ssd<br>ssd_mobilenet_v1_coco | 67.00%<br>23.32% | 2.316~2.494 | 5.783~6.807 |
| SSD with MobileNet FPN | TensorFlow\* | ssd_mobilenet_v1_fpn_coco | 35.5453% | 123.309 | 36.188 |
| SSD with MobileNet V2 | TensorFlow\* | ssd_mobilenet_v2_coco | 24.9452% | 3.775 | 16.818 |
| SSD Lite with MobileNet V2 | TensorFlow\* | ssdlite_mobilenet_v2 | 24.2946% | 1.525 | 4.475 |
| SSD with ResNet-50 V1 FPN | TensorFlow\* | ssd_resnet50_v1_fpn_coco | 38.4557% | 178.6807 | 59.9326 |
| SSD with ResNet 34 1200x1200 | PyTorch\* | ssd-resnet34-1200-onnx | 20.7198%/39.2752% | 433.411 | 20.058 |
| Ultra Lightweight Face Detection RFB 320 | PyTorch\* | ultra-lightweight-face-detection-rfb-320 | 84.78% | 0.2106 | 0.3004 |
| Ultra Lightweight Face Detection slim 320 | PyTorch\* | ultra-lightweight-face-detection-slim-320 | 83.32% | 0.1724 | 0.2844 |
| Vehicle License Plate Detection Barrier | TensorFlow\* | vehicle-license-plate-detection-barrier-0123 | 99.52% | 0.271 | 0.547 |
| YOLO v1 Tiny | TensorFlow.js\* | yolo-v1-tiny-tf | 54.79% | 6.9883 | 15.8587 |
| YOLO v2 Tiny | Keras\* | yolo-v2-tiny-tf | 27.3443%/29.1184% | 5.4236 | 11.2295 |
| YOLO v2 | Keras\* | yolo-v2-tf | 53.1453%/56.483% | 63.0301 | 50.9526 |
| YOLO v3 | Keras\* | yolo-v3-tf | 62.2759%/67.7221% | 65.9843 | 61.9221 |
| YOLO v3 Tiny | Keras\* | yolo-v3-tiny-tf | 35.9%/39.7% | 5.582 | 8.848 |
| YOLO v4 | Keras\* | yolo-v4-tf | 71.23%/77.40%/50.26% | 129.5567 | 64.33 |
| YOLO v4 Tiny | Keras\* | yolo-v4-tiny-tf |  | 6.9289 | 6.0535 |

Face Recognition

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| FaceNet | TensorFlow\* | facenet-20180408-102900 | 99.14% | 2.846 | 23.469 |
| LResNet100E-IR, ArcFace@ms1m-refine-v2 | MXNet\* | face-recognition-resnet100-arcface-onnx | 99.68% | 24.2115 | 65.1320 |
| SphereFace | Caffe\* | Sphereface | 98.8321% | 3.504 | 22.671 |

Human Pose Estimation

The task of human pose estimation is to predict the pose of each person in an input image or video: a body skeleton consisting of keypoints and the connections between them. Keypoints are body joints, i.e. ears, eyes, nose, shoulders, knees, etc. There are two major groups of such methods: top-down and bottom-up. The first detects the persons in a given frame, crops or rescales the detections, and then runs a pose estimation network for each detection; these methods are very accurate. The second finds all keypoints in a given frame and then groups them by person instances, which makes it faster than the former, because the network runs only once.

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| human-pose-estimation-3d-0001 | PyTorch\* | human-pose-estimation-3d-0001 | 100.44437mm | 18.998 | 5.074 |
| single-human-pose-estimation-0001 | PyTorch\* | single-human-pose-estimation-0001 | 69.0491% | 60.125 | 33.165 |
| higher-hrnet-w32-human-pose-estimation | PyTorch\* | higher-hrnet-w32-human-pose-estimation | 64.64% | 92.8364 | 28.6180 |

Monocular Depth Estimation

The task of monocular depth estimation is to predict a depth (or inverse depth) map from a single input image. Since this task contains, in the general case, an inherent ambiguity, the resulting depth map is often defined only up to an unknown scale factor.
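
Because of that unknown scale factor, scale-ambiguous predictions are typically aligned to the ground truth before computing an error, e.g. with the least-squares scale \(s = \langle p, g\rangle / \langle p, p\rangle\). A minimal sketch with toy values:

```python
def align_scale(pred, gt):
    """Scalar s minimizing sum((s*pred - gt)^2), i.e. <pred,gt> / <pred,pred>."""
    num = sum(p * g for p, g in zip(pred, gt))
    den = sum(p * p for p in pred)
    return num / den

pred = [1.0, 2.0, 3.0]        # predicted depth, unknown scale
gt   = [2.0, 4.0, 6.0]        # ground truth happens to be exactly 2x
print(align_scale(pred, gt))  # 2.0
```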

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| midasnet | PyTorch\* | midasnet | 0.07071 | 207.25144 | 104.081 |
| FCRN ResNet50-Upproj | TensorFlow\* | fcrn-dp-nyu-depth-v2-tf | 0.573 | 63.5421 | 34.5255 |

Image Inpainting

The task of image inpainting is to estimate suitable pixel information to fill holes in images.

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| GMCNN Inpainting | TensorFlow\* | gmcnn-places2-tf | 33.47dB | 691.1589 | 12.7773 |

Style Transfer

The task of style transfer is to transfer the style of one image to another.

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| fast-neural-style-mosaic-onnx | ONNX\* | fast-neural-style-mosaic-onnx | 12.04dB | 15.518 | 1.679 |

Action Recognition

The task of action recognition is to predict the action that is being performed on a short video clip: a tensor formed by stacking sampled frames from the input video.
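
The frame-stacking step can be sketched as uniform temporal sampling of frame indices; the frame count and clip length below are illustrative, not the actual input shapes of these models:

```python
def sample_indices(num_frames, clip_len):
    """Uniformly sample `clip_len` frame indices from a video of `num_frames` frames.
    The frames at these indices are then stacked into the clip tensor."""
    step = num_frames / clip_len
    return [int(i * step) for i in range(clip_len)]

print(sample_indices(100, 8))  # [0, 12, 25, 37, 50, 62, 75, 87]
```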

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| RGB-I3D, pre-trained on ImageNet\* | TensorFlow\* | i3d-rgb-tf | 65.96%/86.01% | 278.9815 | 12.6900 |
| common-sign-language-0001 | PyTorch\* | common-sign-language-0001 | 93.58% | 4.2269 | 4.1128 |

Colorization

The task of colorization is to predict the colors of a scene from a grayscale image.

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| colorization-v2 | PyTorch\* | colorization-v2 | 26.99dB | 83.6045 | 32.2360 |
| colorization-siggraph | PyTorch\* | colorization-siggraph | 27.73dB | 150.5441 | 34.0511 |

Sound Classification

The task of sound classification is to predict what sounds are present in an audio clip.

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| ACLNet | PyTorch\* | aclnet | 86%/92% | 1.4 | 2.7 |
| ACLNet-int8 | PyTorch\* | aclnet-int8 | 87%/93% | 1.41 | 2.71 |

Speech Recognition

The task of speech recognition is to recognize spoken language and translate it into text.
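
The accuracy figures in the table below appear to be error rates rather than accuracies (lower is better), conventionally the word error rate (WER): the word-level edit distance divided by the reference length. A minimal sketch:

```python
def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance over reference length."""
    r, h = reference.split(), hypothesis.split()
    # d[i][j] = edits to turn the first i reference words into the first j hypothesis words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution / match
    return d[len(r)][len(h)] / len(r)

print(wer("the cat sat", "the cat sat down"))  # 1 insertion / 3 words ≈ 0.333
```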

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| DeepSpeech V0.6.1 | TensorFlow\* | mozilla-deepspeech-0.6.1 | 7.55% | 0.0472 | 47.2 |
| DeepSpeech V0.8.2 | TensorFlow\* | mozilla-deepspeech-0.8.2 | 6.13% | 0.0472 | 47.2 |
| QuartzNet | PyTorch\* | quartznet-15x5-en | 3.86% | 2.4195 | 18.8857 |

Image Translation

The task of image translation is to generate an output image based on an exemplar.

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| CoCosNet | PyTorch\* | cocosnet | 12.93dB | 1080.7032 | 167.9141 |

Optical Character Recognition

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| license-plate-recognition-barrier-0007 | TensorFlow\* | license-plate-recognition-barrier-0007 | 98% | 0.347 | 1.435 |

Place Recognition

The task of place recognition is to quickly and accurately recognize the location of a given query photograph.

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| NetVLAD | TensorFlow\* | netvlad-tf | 82.0321% | 36.6374 | 149.0021 |

Deblurring

The task is image deblurring.

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| DeblurGAN-v2 | PyTorch\* | deblurgan-v2 | 28.25dB | 80.8919 | 2.1083 |

Salient Object Detection

Salient object detection is a task based on visual attention mechanisms, in which algorithms aim to focus on the objects or regions that attract the most attention, rather than on the surrounding areas of the scene or image.

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| F3Net | PyTorch\* | f3net | 84.21% | 31.2883 | 25.2791 |

Text Recognition

Scene text recognition is the task of recognizing the text on a given image. Researchers compete to create algorithms that can recognize text of different shapes, fonts, and backgrounds. See details about the dataset here. The reported metric is collected on the alphanumeric subset of ICDAR13 (1015 images) in case-insensitive mode.
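
The "case-insensitive alphanumeric" protocol mentioned above can be sketched as normalizing both prediction and ground truth before comparing them; the sample strings below are made up:

```python
def normalize(text):
    """Keep only alphanumeric characters and lowercase them,
    mirroring the case-insensitive alphanumeric evaluation mode."""
    return "".join(c for c in text.lower() if c.isalnum())

def word_accuracy(preds, gts):
    """Fraction of samples whose normalized prediction matches the ground truth."""
    matches = sum(normalize(p) == normalize(g) for p, g in zip(preds, gts))
    return matches / len(gts)

print(word_accuracy(["Cafe!", "12a"], ["cafe", "12b"]))  # 0.5
```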

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| Resnet-FC | PyTorch\* | text-recognition-resnet-fc | 90.94% | 40.3704 | 177.9668 |

Text-to-Speech

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| ForwardTacotron | PyTorch\* | forward-tacotron:<br>forward-tacotron-duration-prediction<br>forward-tacotron-regression |  | 6.66<br>4.91 | 13.81<br>3.05 |
| WaveRNN | PyTorch\* | wavernn:<br>wavernn-upsampler<br>wavernn-rnn |  | 0.37<br>0.06 | 0.4<br>3.83 |

Named Entity Recognition

Named entity recognition (NER) is the task of tagging entities in text with their corresponding type.
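
Token-classification NER models such as bert-base-NER typically emit one tag per token in the standard B-XXX/I-XXX/O (BIO) scheme, which is then decoded into entity spans. A minimal decoder sketch; the tokens and tags below are illustrative:

```python
def decode_bio(tokens, tags):
    """Group per-token BIO tags into (entity_text, entity_type) spans."""
    entities, current, etype = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):          # a new entity begins
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [token], tag[2:]
        elif tag.startswith("I-") and current:
            current.append(token)         # continuation of the open entity
        else:                             # "O" (or stray "I-") closes any open entity
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [], None
    if current:
        entities.append((" ".join(current), etype))
    return entities

toks = ["Intel", "is", "based", "in", "Santa", "Clara"]
tags = ["B-ORG", "O", "O", "O", "B-LOC", "I-LOC"]
print(decode_bio(toks, tags))  # [('Intel', 'ORG'), ('Santa Clara', 'LOC')]
```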

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| bert-base-NER | PyTorch\* | bert-base-ner | 94.45% | 22.3874 | 107.4319 |

Vehicle Re-Identification

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
|---|---|---|---|---|---|
| vehicle-reid-0001 | PyTorch\* | vehicle-reid-0001 | 96.31%/85.15% | 2.643 | 2.183 |

See Also

Legal Information

[\*] Other names and brands may be claimed as the property of others.