
OpenVINO 2022.1 introduces a new version of the OpenVINO API (API 2.0). For more information on the changes and transition steps, see the transition guide.
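
For reference, a minimal API 2.0 inference pipeline in Python looks roughly like the sketch below. The model path, device name, and input assumptions (a single, static input) are illustrative only; see the transition guide for the authoritative steps.

    # Minimal OpenVINO API 2.0 sketch: read an IR model, compile it for CPU,
    # and run one inference. "model.xml" is a placeholder path.
    import numpy as np
    from openvino.runtime import Core

    core = Core()
    model = core.read_model("model.xml")
    compiled = core.compile_model(model, "CPU")

    # Dummy input matching the model's first declared input shape
    # (assumes a static, single-input model).
    input_data = np.zeros(tuple(model.input(0).shape), dtype=np.float32)

    request = compiled.create_infer_request()
    request.infer({0: input_data})
    print(request.get_output_tensor(0).data.shape)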

API 2.0

  • OpenVINO™ API 2.0 Transition Guide
    • Installation & Deployment
    • Inference Pipeline
    • Configuring Devices
    • Preprocessing
    • Model Creation in OpenVINO™ Runtime

Model preparation

  • Introduction to Model Processing
  • Supported Model Formats
    • Converting a TensorFlow Model
    • Converting an ONNX Model
    • Converting a PyTorch Model
    • Converting a PaddlePaddle Model
    • Converting an MXNet Model
    • Converting a Caffe Model
    • Converting a Kaldi Model
    • Model Conversion Tutorials
      • Converting a TensorFlow Attention OCR Model
      • Converting a TensorFlow BERT Model
      • Converting a TensorFlow CRNN Model
      • Converting a TensorFlow DeepSpeech Model
      • Converting TensorFlow EfficientDet Models
      • Converting TensorFlow FaceNet Models
      • Converting a TensorFlow GNMT Model
      • Converting a TensorFlow Language Model on One Billion Word Benchmark
      • Converting a TensorFlow Neural Collaborative Filtering Model
      • Converting TensorFlow Object Detection API Models
      • Converting a TensorFlow RetinaNet Model
      • Converting TensorFlow Slim Image Classification Model Library Models
      • Converting TensorFlow Wide and Deep Family Models
      • Converting a TensorFlow XLNet Model
      • Converting TensorFlow YOLO Models
      • Converting an ONNX Faster R-CNN Model
      • Converting an ONNX GPT-2 Model
      • Converting an ONNX Mask R-CNN Model
      • Converting a PyTorch BERT-NER Model
      • Converting a PyTorch Cascade RCNN R-101 Model
      • Converting a PyTorch F3Net Model
      • Converting a PyTorch QuartzNet Model
      • Converting a PyTorch RCAN Model
      • Converting a PyTorch RNN-T Model
      • Converting a PyTorch YOLACT Model
      • Converting MXNet GluonCV Models
      • Converting an MXNet Style Transfer Model
      • Converting a Kaldi ASpIRE Chain Time Delay Neural Network (TDNN) Model
  • Model Optimizer Usage
    • Model Inputs and Outputs, Shapes and Layouts
    • Setting Input Shapes
    • Model Optimization Techniques
    • Cutting Off Parts of a Model
    • Embedding Preprocessing Computation
    • Compressing a Model to FP16
    • Model Optimizer Frequently Asked Questions
  • Model Downloader and other automation tools

Running Inference

  • Inference with OpenVINO Runtime
    • Integrate OpenVINO™ with Your Application
      • Model Representation in OpenVINO™ Runtime
      • OpenVINO™ Inference Request
      • OpenVINO™ Python API Exclusives
    • Inference Modes
      • Automatic Device Selection
        • Debugging Auto-Device Plugin
      • Multi-device execution
      • Heterogeneous execution
      • Automatic Batching
    • Inference Device Support
      • Query Device Properties - Configuration
      • CPU Device
      • GPU Device
        • Remote Tensor API of GPU Plugin
      • VPU Devices
        • MYRIAD Device
        • HDDL Device
      • GNA Device
      • Arm® CPU Device
    • Changing Input Shapes
      • Troubleshooting Reshape Errors
    • Optimize Preprocessing
      • Preprocessing API - details
      • Layout API Overview
      • Use Case - Integrate and Save Preprocessing Steps Into IR
    • Dynamic Shapes
      • When Dynamic Shapes API is Not Applicable
    • High-level Performance Hints
    • Stateful models
      • The LowLatency2 Transformation
      • [DEPRECATED] The LowLatency Transformation
  • Compile Tool

Optimization and Performance

  • Introduction to Performance Optimization
  • Getting Performance Numbers
  • Model Optimization Guide
    • Quantizing Models Post-training
      • Quantizing Model
        • DefaultQuantization Method
      • Quantizing Model with Accuracy Control
        • AccuracyAwareQuantization Method
      • Quantization Best Practices
        • Saturation Issue
      • API Reference
      • Command-line Interface
        • Simplified Mode
        • Configuration File Description
      • Examples
        • API Examples
          • Quantizing Image Classification Model
          • Quantizing Object Detection Model with Accuracy Control
          • Quantizing Cascaded Model
          • Quantizing Semantic Segmentation Model
          • Quantizing 3D Segmentation Model
          • Quantizing for GNA Device
        • Command-line Example
      • Post-training Optimization Tool FAQ
    • Compressing Models During Training
      • Quantization-aware Training (QAT)
      • Filter Pruning of Convolutional Models
    • (Experimental) Protecting Model
  • Runtime Inference Optimizations
    • General Optimizations
    • Optimizing for Latency
      • Model Caching Overview
    • Optimizing for Throughput
    • Using Advanced Throughput Options: Streams and Batching
    • Further Low-Level Implementation Details
  • Tuning Utilities
    • Deep Learning accuracy validation framework
      • Adapters
      • Annotation Converters
      • Custom Evaluators for Accuracy Checker
      • Data Readers
      • How to configure Caffe launcher
      • How to configure G-API launcher
      • How to configure MXNet launcher
      • How to configure ONNX Runtime launcher
      • How to configure OpenCV launcher
      • How to configure OpenVINO™ launcher
      • How to configure PaddlePaddle launcher
      • How to configure PyTorch launcher
      • How to configure TensorFlow 2.0 launcher
      • How to configure TensorFlow Lite launcher
      • How to configure TensorFlow launcher
      • How to use predefined configuration files
      • Metrics
      • Postprocessors
      • Preprocessors
      • Sample
    • Dataset Preparation Guide
    • Cross Check Tool
  • Performance Benchmarks
    • Intel® Distribution of OpenVINO™ toolkit Benchmark Results
      • Performance Information Frequently Asked Questions
      • Model Accuracy and Performance for INT8 and FP32
      • Performance Data Spreadsheet (download xlsx)
    • OpenVINO™ Model Server Benchmark Results

Deploying Inference

  • Introduction to OpenVINO™ Deployment
  • Deploying Your Applications with OpenVINO™
    • Deploying Your Application with Deployment Manager
    • Libraries for Local Distribution

The Ecosystem

  • OpenVINO™ Ecosystem Overview
  • OpenVINO™ Model Server
    • Quickstart Guide
    • Architecture
    • Model Repository
    • Starting the Server
      • Single-Model Mode
      • Multiple-Model Mode with a Config File
      • Model Server in Docker Containers
      • Bare Metal and Virtual Hosts
      • Model Server Parameters
      • Using Cloud Storage as a Model Repository
      • Using AI Accelerators
      • Model Version Policy
      • Batch, Shape and Layout
      • Online Configuration Updates
      • Security Considerations
    • API Reference Guide
      • TensorFlow Serving compatible gRPC API
      • KServe compatible gRPC API
      • TensorFlow Serving compatible RESTful API
      • KServe compatible RESTful API
    • Clients
      • TensorFlow Serving API Clients
      • KServe API Clients
    • Directed Acyclic Graph (DAG) Scheduler
      • Demultiplexing in DAG
      • Custom Node Development Guide
    • Support for Binary Input Data
      • Input Shape and Layout Considerations
      • Predict on Binary Inputs via TensorFlow Serving API
      • Predict on Binary Inputs via KServe API
      • Convert TensorFlow Models to Accept Binary Inputs
    • Model Cache
    • Metrics
    • CPU Extensions
    • Dynamic Input Parameters
      • Dynamic batch size with OpenVINO™ Model Server Demultiplexer
      • Dynamic Batch Size with Automatic Model Reloading
      • Dynamic Shape with Automatic Model Reloading
      • Dynamic Shape with a Custom Node
      • Dynamic Shape with Binary Inputs
      • Dynamic Shape with dynamic IR/ONNX Model
    • Serving Stateful Models
    • Custom Model Loader
    • Performance tuning
    • Deploy Model Server in Kubernetes
      • Helm Deployment
      • Kubernetes Operator
      • OpenShift Operator
    • Demos
      • Age and Gender Recognition via REST API
      • Horizontal Text Detection in Real-Time
      • Optical Character Recognition with Directed Acyclic Graph
      • Face Detection Demo
      • Face Blur Pipeline Demo with OVMS
      • Single Face Analysis Pipeline Demo
      • Multi Faces Analysis Pipeline Demo
      • Model Ensemble Pipeline Demo
      • Image Classification Demos
        • Image Classification Demo (Python)
        • Image Classification Demo (C++)
        • Image Classification Demo (Go)
      • Prediction Example with an ONNX Model
      • Person, vehicle, bike detection with multiple data sources
      • Vehicle Analysis Pipeline Demo
      • Real Time Stream Analysis Demo
      • BERT Question Answering Demo
      • Speech Recognition on Kaldi Model
      • Benchmark Client
        • Benchmark Client (Python)
        • Benchmark Client (C++)
    • Troubleshooting
  • OpenVINO™ Security Add-on
  • OpenVINO™ integration with TensorFlow
  • OpenVINO™ Training Extensions
  • OpenVINO™ Deep Learning Workbench Overview
    • Installation
      • Prerequisites
      • Run the DL Workbench Locally
        • Advanced DL Workbench Configurations
        • Work with Docker Container
      • Run the DL Workbench in the Intel® DevCloud for the Edge
    • Get Started
      • Import Model
      • Create Project
      • Educational Resources about DL Workbench
        • DL Workbench Key Concepts
    • Tutorials
      • Object Detection Model (YOLOv4)
      • Object Detection Model (SSD_mobilenet)
      • Classification Model (mobilenet)
      • Classification Model (squeezenet)
      • Instance Segmentation Model (Mask R-CNN)
      • Semantic Segmentation Model (deeplab)
      • Style Transfer Model (fast-nst-onnx)
      • NLP Model (BERT)
    • User Guide
      • Obtain Models
        • Import Open Model Zoo Models
        • Import Original Model
          • Import Original Model Recommendations
      • Obtain Datasets
        • Dataset Types
          • Cut Datasets
      • Select Environment
        • Work with Remote Targets
          • Profile on Remote Machine
          • Set Up Remote Target
          • Register Remote Target in DL Workbench
          • Manipulate Remote Machines
      • Optimize Model Performance
      • Explore Inference Configurations
        • Run Inference
        • View Inference Results
        • Compare Performance between Two Versions of a Model
        • Visualize Model
      • Visualize Model Output
      • Create Accuracy Report
        • Accuracy Configuration
        • Set Accuracy Configuration
        • Interpret Accuracy Report Results
      • Create Deployment Package
        • Deploy and Integrate Performance Criteria into Application
      • Export Project
      • Learn OpenVINO in DL Workbench
        • Learn Model Inference with OpenVINO™ API in JupyterLab* Environment
      • Restore DL Workbench State
      • Run DL Workbench Securely
        • Enable Authentication in DL Workbench
        • Configure Transport Layer Security (TLS)
    • Troubleshooting
      • Troubleshooting for DL Workbench in the Intel® DevCloud for the Edge

OpenVINO Extensibility

  • OpenVINO Extensibility Mechanism
    • Custom OpenVINO™ Operations
    • Frontend Extensions
    • How to Implement Custom GPU Operations
    • How to Implement Custom Layers for VPU (Intel® Neural Compute Stick 2)
    • Model Optimizer Extensibility
      • Extending Model Optimizer with Caffe Python Layers
  • Overview of Transformations API
    • OpenVINO Model Pass
    • OpenVINO Matcher Pass
    • OpenVINO Graph Rewrite Pass
  • OpenVINO Plugin Developer Guide
    • Implement Plugin Functionality
    • Implement Executable Network Functionality
    • Implement Synchronous Inference Request
    • Implement Asynchronous Inference Request
    • Build Plugin Using CMake
    • Plugin Testing
    • Advanced Topics
      • Quantized networks compute and restrictions
      • OpenVINO™ Low Precision Transformations
        • Attributes
          • AvgPoolPrecisionPreserved
          • IntervalsAlignment
          • PrecisionPreserved
          • Precisions
          • QuantizationAlignment
          • QuantizationGranularity
        • Step 1. Prerequisites transformations
        • Step 2. Markup transformations
        • Step 3. Main transformations
        • Step 4. Cleanup transformations
    • Plugin API Reference
      • Inference Engine Plugin API
        • Asynchronous Inference Request base classes
          • AsyncInferRequestThreadSafeDefault
        • Blob creation and memory utilities
        • Error handling and debug helpers
          • DescriptionBuffer
        • Executable Network base classes
          • IExecutableNetworkInternal
          • ExecutableNetworkThreadSafeDefault
        • Execution graph utilities
          • ExecGraphInfoSerialization
          • ExecutionNode
        • FP16 to FP32 precision utilities
          • PrecisionUtils
        • File utilities
          • FileUtils
        • ITT profiling utilities
          • openvino
          • ScopedTask
          • TaskChain
        • Inference Request base classes
          • IInferRequestInternal
        • Plugin base classes
          • PluginConfigInternalParams
          • ICore
          • IInferencePlugin
        • Preprocessing API
        • System configuration utilities
        • Threading utilities
          • ExecutorManager
          • IStreamsExecutor
          • ITaskExecutor
          • CPUStreamsExecutor
          • ImmediateExecutor
        • Variable state base classes
          • IVariableStateInternal
        • XML helper utilities
          • XMLParseUtils
          • parse_result
      • Inference Engine Transformation API
        • Common optimization passes
          • ngraph
          • AddFakeQuantizeFusion
          • AddOldApiMapToParameters
          • AddTransformation
          • AlignQuantizationIntervals
          • AlignQuantizationParameters
          • AvgPoolPrecisionPreservedAttribute
          • AvgPoolTransformation
          • BatchToSpaceFusion
          • BidirectionalGRUSequenceDecomposition
          • BidirectionalLSTMSequenceDecomposition
          • BidirectionalRNNSequenceDecomposition
          • BidirectionalSequenceDecomposition
          • BinarizeWeights
          • BroadcastConstRangeReplacement
          • BroadcastElementwiseFusion
          • ClampFusion
          • ClampTransformation
          • CompressFloatConstants
          • CompressFloatConstantsImpl
          • ConcatReduceFusion
          • ConcatTransformation
          • ConvStridesPropagation
          • ConvToBinaryConv
          • ConvertBatchToSpace
          • ConvertCompressedOnlyToLegacy
          • ConvertDeformableConv8To1
          • ConvertDetectionOutput1ToDetectionOutput8
          • ConvertDetectionOutput8ToDetectionOutput1
          • ConvertGRUSequenceMatcher
          • ConvertGRUSequenceToTensorIterator
          • ConvertGather0D
          • ConvertGather1ToGather7
          • ConvertGather7ToGather1
          • ConvertGather7ToGather8
          • ConvertGather8ToGather7
          • ConvertInterpolate1ToInterpolate4
          • ConvertLSTMSequenceMatcher
          • ConvertLSTMSequenceToTensorIterator
          • ConvertMVN1ToMVN6
          • ConvertMaxPool1ToMaxPool8
          • ConvertMaxPool8ToMaxPool1
          • ConvertNmsGatherPathToUnsigned
          • ConvertPadToGroupConvolution
          • ConvertPriorBox8To0
          • ConvertQuantizeDequantize
          • ConvertRNNSequenceMatcher
          • ConvertRNNSequenceToTensorIterator
          • ConvertROIAlign3To9
          • ConvertROIAlign9To3
          • ConvertScatterElementsToScatter
          • ConvertSoftMax1ToSoftMax8
          • ConvertSoftMax8ToSoftMax1
          • ConvertSpaceToBatch
          • ConvertSubtractConstant
          • ConvertTensorIteratorToGRUSequence
          • ConvertTensorIteratorToLSTMSequence
          • ConvertTensorIteratorToRNNSequence
          • ConvolutionBackpropDataTransformation
          • ConvolutionTransformation
          • CreateAttribute
          • CreatePrecisionsDependentAttribute
          • DepthToSpaceFusion
          • DepthToSpaceTransformation
          • DilatedConvolutionConverter
          • DisableDecompressionConvertConstantFolding
          • DisableRandomUniformConstantFolding
          • DivideFusion
          • DivisionByZeroFP16Resolver
          • DropoutWithRandomUniformReplacer
          • EinsumDecomposition
          • EliminateConcat
          • EliminateConvert
          • EliminateConvertNonZero
          • EliminateEltwise
          • EliminateGatherUnsqueeze
          • EliminatePad
          • EliminateSplit
          • EliminateSqueeze
          • EliminateTranspose
          • EliminateUnsqueezeGather
          • EltwiseBaseTransformation
          • EnableDecompressionConvertConstantFolding
          • FakeQuantizeDecomposition
          • FakeQuantizeDecompositionTransformation
          • FakeQuantizeMulFusion
          • FakeQuantizeReshapeFusion
          • FakeQuantizeTransformation
          • FixRtInfo
          • FoldConvertTransformation
          • FoldFakeQuantizeTransformation
          • FuseConvertTransformation
          • FuseMultiplyToFakeQuantizeTransformation
          • FuseSubtractToFakeQuantizeTransformation
          • GRUCellDecomposition
          • GatherNegativeConstIndicesNormalize
          • GatherNopElimination
          • Gelu7Downgrade
          • GeluFusion
          • GeluFusionWithErfOne
          • GeluFusionWithErfThree
          • GeluFusionWithErfTwo
          • GeluFusionWithTanh
          • GroupConvolutionTransformation
          • GroupedGatherElimination
          • GroupedStridedSliceOptimizer
          • HSigmoidDecomposition
          • HSigmoidFusion
          • HSigmoidFusionWithClampDiv
          • HSigmoidFusionWithClampMul
          • HSigmoidFusionWithReluDiv
          • HSigmoidFusionWithReluMul
          • HSigmoidFusionWithoutRelu
          • HSwishDecomposition
          • HSwishFusion
          • HSwishFusionWithClamp
          • HSwishFusionWithHSigmoid
          • HSwishFusionWithReluDiv
          • HSwishFusionWithReluMul
          • InitConstMask
          • InitMasks
          • InitNodeInfo
          • InterpolateSequenceFusion
          • InterpolateTransformation
          • IntervalsAlignmentAttribute
          • IntervalsAlignmentSharedValue
          • LSTMCellDecomposition
          • LSTMStatesBroadcast
          • LayerTransformation
          • LeakyReluFusion
          • LinOpSequenceFusion
          • LogSoftmaxDecomposition
          • MVN6Decomposition
          • MVNFusion
          • MVNFusionWithConstantsInside
          • MVNFusionWithoutConstants
          • MVNTransformation
          • MarkPrecisionSensitiveDivides
          • MarkPrecisionSensitiveSubgraphs
          • MarkupAvgPoolPrecisionPreserved
          • MarkupCanBeQuantized
          • MarkupPrecisions
          • MarkupQuantizationGranularity
          • MatMulMultiplyFusion
          • MatMulTransformation
          • MaxPoolTransformation
          • MimicSetBatchSize
          • MishFusion
          • MulFakeQuantizeFusion
          • MultiplyConvolutionFusion
          • MultiplyToGroupConvolutionTransformation
          • MultiplyTransformation
          • NearestNeighborUpsamplingFusion
          • NormalizeL2Decomposition
          • NormalizeL2Fusion
          • NormalizeL2Transformation
          • PReluFusion
          • PReluFusionMultiplyAdd
          • PReluFusionMultiplySub
          • PReluFusionNegativeAdd
          • PReluFusionNegativeSub
          • PReluTransformation
          • PadFusionAvgPool
          • PadFusionConvolution
          • PadFusionConvolutionBackpropData
          • PadFusionGroupConvolution
          • PadFusionGroupConvolutionBackpropData
          • PadTransformation
          • PrecisionPreservedAttribute
          • PrecisionsAttribute
          • PropagateMasks
          • PropagatePrecisions
          • PropagateSharedValue
          • PropagateThroughPrecisionPreserved
          • PropagateToInput
          • Proposal1Scales
          • Pruning
          • PullReshapeThroughDequantization
          • PullSqueezeThroughEltwise
          • PullTransposeThroughDequantization
          • QuantizationAlignmentAttribute
          • QuantizationGranularityAttribute
          • RNNCellDecomposition
          • RandomUniformFusion
          • ReduceBaseTransformation
          • ReduceL1Decomposition
          • ReduceL2Decomposition
          • ReduceMaxTransformation
          • ReduceMeanTransformation
          • ReduceMerge
          • ReduceMinTransformation
          • ReduceSumTransformation
          • ReluFakeQuantizeFusion
          • ReluTransformation
          • RemoveConcatZeroDimInput
          • ReplaceConcatReduceByMinOrMax
          • ReshapeAMatMul
          • ReshapeSequenceFusion
          • ReshapeSinkingMatMul
          • ReshapeTo1D
          • ReshapeTransformation
          • ResolveNameCollisions
          • ReverseInputChannelsFusion
          • SetBatchSize
          • SharedShapeOf
          • SharedSqueeze
          • SharedStridedSliceEraser
          • ShrinkWeights
          • ShuffleChannelsFusion
          • ShuffleChannelsTransformation
          • SkipGatherBeforeTransposeAndReshape
          • SliceToStridedSlice
          • SoftPlusDecomposition
          • SoftPlusFusion
          • SoftPlusToMishFusion
          • SoftSignDecomposition
          • SoftmaxDecomposition
          • SoftmaxFusion
          • SpaceToBatchFusion
          • SplitConcatPairToInterpolateFusion
          • SplitSqueezeConcatFusion
          • SplitTransformation
          • SqueezeStridedSlice
          • SqueezeTransformation
          • StridedSliceOptimization
          • StridedSliceSqueeze
          • StridedSliceTransformation
          • StridesOptimization
          • SubtractFusion
          • SubtractTransformation
          • SupportedNodesStridesPropagation
          • SwishFusion
          • SwishFusionWithBeta
          • SwishFusionWithSigmoid
          • SwishFusionWithSigmoidWithBeta
          • SwishFusionWithoutBeta
          • TransformationContext
          • TransparentBaseTransformation
          • TransposeConvert
          • TransposeEltwise
          • TransposeFQReduction
          • TransposeFuse
          • TransposeReduction
          • TransposeReshapeEliminationForMatmul
          • TransposeSinking
          • TransposeToReshape
          • TransposeTransformation
          • UnrollIf
          • UnrollTensorIterator
          • UnsqueezeTransformation
          • UnsupportedNodesStridesPropagation
          • UpdateSharedPrecisionPreserved
          • UselessStridedSliceEraser
          • VariadicSplitTransformation
          • WeightableLayerTransformation
          • WeightsDequantizeToFakeQuantize
          • WrapInterpolateIntoTransposes
        • Conversion from opset2 to opset1
        • Conversion from opset3 to opset2
        • Runtime information
          • Decompression
          • DisableFP16Compression
          • FusedNames
          • Mask
          • NonconvertibleDivide
          • OldApiMapElementType
          • OldApiMapOrder

Use OpenVINO™ Toolkit Securely

  • Introduction to OpenVINO™ Security
  • Deep Learning Workbench Security
  • Using Encrypted Models with OpenVINO
  • OpenVINO™ Security Add-on

Media Processing and Computer Vision Libraries

  • Intel® Deep Learning Streamer
  • Introduction to OpenCV Graph API (G-API)
    • Graph API Kernel API
    • Implementing a Face Beautification Algorithm
    • Building a Face Analytics Pipeline
  • OpenCV Developer Guide
  • OpenCL™ Developer Guide
  • OneVPL Developer Guide

Tuning Utilities

  • Deep Learning accuracy validation framework
  • Dataset Preparation Guide
  • Cross Check Tool
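
As a rough illustration of what these utilities automate, the Python sketch below mirrors the core idea of the Cross Check Tool: run the same model on two devices and measure how much the outputs diverge. It is an illustration only, not the tool itself; the model path and the availability of both CPU and GPU devices are assumptions.

    # Sketch of the Cross Check Tool's core idea: infer the same model on two
    # devices and report the largest elementwise output difference.
    import numpy as np
    from openvino.runtime import Core

    core = Core()
    model = core.read_model("model.xml")  # placeholder path
    input_data = np.random.rand(*model.input(0).shape).astype(np.float32)

    outputs = {}
    for device in ("CPU", "GPU"):  # assumes both plugins are installed
        compiled = core.compile_model(model, device)
        request = compiled.create_infer_request()
        request.infer({0: input_data})
        outputs[device] = request.get_output_tensor(0).data.copy()

    print("max |CPU - GPU|:", np.max(np.abs(outputs["CPU"] - outputs["GPU"])))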
