OpenVINO™ Toolkit Overview


OpenVINO™ toolkit is a comprehensive toolkit for quickly developing applications and solutions that solve a variety of tasks, including emulation of human vision, automatic speech recognition, natural language processing, recommendation systems, and many others. Based on the latest generations of artificial neural networks, including Convolutional Neural Networks (CNNs) and recurrent and attention-based networks, the toolkit extends computer vision and non-vision workloads across Intel® hardware, maximizing performance. It accelerates applications with high-performance AI and deep learning inference deployed from edge to cloud.

OpenVINO™ toolkit:

  • Enables CNN-based deep learning inference on the edge
  • Supports heterogeneous execution across an Intel® CPU, Intel® Integrated Graphics, Intel® Neural Compute Stick 2 and Intel® Vision Accelerator Design with Intel® Movidius™ VPUs
  • Speeds time-to-market via an easy-to-use library of computer vision functions and pre-optimized kernels
  • Includes optimized calls for computer vision standards, including OpenCV* and OpenCL™

OpenVINO™ Toolkit Workflow

The following diagram illustrates the typical OpenVINO™ workflow:

Model Preparation, Conversion and Optimization

You can use your framework of choice to prepare and train a Deep Learning model, or download a pretrained model from the Open Model Zoo. The Open Model Zoo includes Deep Learning solutions to a variety of vision problems, including object recognition, face recognition, pose estimation, text detection, and action recognition, at a range of measured complexities. Several of these pretrained models are also used in the code samples and application demos. To download models from the Open Model Zoo, use the Model Downloader tool.

One of the core components of the OpenVINO™ toolkit is the Model Optimizer, a cross-platform command-line tool that converts a trained neural network from its source framework to an open-source, nGraph-compatible Intermediate Representation (IR) for use in inference operations. The Model Optimizer imports models trained in popular frameworks such as Caffe*, TensorFlow*, MXNet*, Kaldi*, and ONNX*, and performs a few optimizations to remove excess layers and, when possible, group operations into simpler, faster graphs.
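The download and conversion steps can be sketched as command lines assembled in Python. This is a minimal sketch, assuming the Open Model Zoo `downloader.py` script and the Model Optimizer `mo.py` script are available at the paths you pass in. The helper function names, the model name, and all paths are hypothetical placeholders. The flags shown (`--name`, `--output_dir`, `--input_model`, `--data_type`) should be checked against each tool's `--help` output for your release. The sketch also assumes the default Model Optimizer behavior of naming the IR files after the input model and writing an `.xml` topology file plus a `.bin` weights file.

```python
import subprocess
import sys
from pathlib import Path


def downloader_command(downloader_script, model_name, output_dir):
    """Build the command that fetches one model from the Open Model Zoo."""
    return [sys.executable, str(downloader_script),
            "--name", model_name,
            "--output_dir", str(output_dir)]


def optimizer_command(mo_script, input_model, output_dir, data_type="FP16"):
    """Build the command that converts a trained model to IR."""
    return [sys.executable, str(mo_script),
            "--input_model", str(input_model),
            "--output_dir", str(output_dir),
            "--data_type", data_type]


def expected_ir_files(input_model, output_dir):
    """Return the .xml and .bin paths the conversion is assumed to produce."""
    stem = Path(input_model).stem
    out = Path(output_dir)
    return out / f"{stem}.xml", out / f"{stem}.bin"


def run(command):
    """Run a command; raises subprocess.CalledProcessError on failure."""
    subprocess.run(command, check=True)
```

To use it, pass the result of `downloader_command(...)` and then `optimizer_command(...)` to `run(...)`, and check that the two paths returned by `expected_ir_files(...)` exist afterwards. Building the commands as lists rather than shell strings avoids quoting problems with paths that contain spaces.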

If your neural network model contains layers that are not in the list of known layers for supported frameworks, you can adjust the conversion and optimization process through use of Custom Layers.

Run the Accuracy Checker utility either against source topologies or against the output representation to evaluate the accuracy of inference. The Accuracy Checker is also part of the Deep Learning Workbench, an integrated web-based performance analysis studio.

Running and Tuning Inference

The other core component of OpenVINO™ is the Inference Engine, which manages the loading and compiling of the optimized neural network model, runs inference operations on input data, and outputs the results. The Inference Engine can execute synchronously or asynchronously, and its plugin architecture manages the appropriate compilations for execution on multiple Intel® devices, including both workhorse CPUs and specialized graphics and video processing platforms (see Packaging and Deployment below).
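The load, compile, and infer sequence, in both its synchronous and asynchronous forms, might look like the following. This is a minimal sketch, assuming the `openvino.inference_engine` Python API (`IECore`) that accompanies the Inference Engine, and assuming a model with a single input. The function names and the file paths you pass in are hypothetical. The `HETERO:` device string is how the heterogeneous plugin is selected, with devices listed in priority order.

```python
def hetero_device(*devices):
    """Compose a heterogeneous device string such as 'HETERO:GPU,CPU'."""
    if not devices:
        raise ValueError("at least one device name is required")
    return "HETERO:" + ",".join(devices)


def _load(xml_path, bin_path, device_name):
    """Read an IR model and compile it for the given device."""
    # Imported lazily so this module can be imported without OpenVINO installed.
    from openvino.inference_engine import IECore

    ie = IECore()
    net = ie.read_network(model=xml_path, weights=bin_path)
    input_name = next(iter(net.input_info))  # assumption: single-input model
    exec_net = ie.load_network(network=net, device_name=device_name,
                               num_requests=1)
    return exec_net, input_name


def infer_sync(xml_path, bin_path, input_data, device_name="CPU"):
    """Run one blocking inference; returns a dict of output name to array."""
    exec_net, input_name = _load(xml_path, bin_path, device_name)
    return exec_net.infer(inputs={input_name: input_data})


def infer_async(xml_path, bin_path, input_data, device_name="CPU"):
    """Start one non-blocking inference, then wait for its result."""
    exec_net, input_name = _load(xml_path, bin_path, device_name)
    request = exec_net.start_async(request_id=0,
                                   inputs={input_name: input_data})
    # The application could do other work here while the device is busy.
    request.wait(-1)  # -1 blocks until the result is ready
    return {name: blob.buffer for name, blob in request.output_blobs.items()}
```

For example, `infer_sync("model.xml", "model.bin", image, device_name=hetero_device("GPU", "CPU"))` would request GPU execution with CPU fallback, where `image` is an array already shaped to match the model input.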

You can use OpenVINO™ Tuning Utilities with the Inference Engine to trial and test inference on your model. The Benchmark utility uses an input model to run iterative tests for throughput or latency measures, and the Cross Check utility compares performance of differently configured inferences. The Post-Training Optimization Tool integrates a suite of quantization- and calibration-based tools to further streamline performance.
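The distinction between latency and throughput measures can be illustrated with a small timing loop. This is a sketch of the general idea only, not the Benchmark utility's implementation. The `measure` function, its parameters, and the `infer_once` callable are hypothetical names, and `infer_once` stands for any zero-argument function that performs one inference.

```python
import time


def measure(infer_once, iterations=100, warmup=10):
    """Time repeated calls and report median latency and overall throughput."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")

    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        infer_once()

    latencies_ms = []
    start = time.perf_counter()
    for _ in range(iterations):
        t0 = time.perf_counter()
        infer_once()
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    total_s = time.perf_counter() - start

    latencies_ms.sort()
    median_ms = latencies_ms[len(latencies_ms) // 2]  # upper median if even
    throughput = iterations / total_s if total_s > 0 else float("inf")
    return {
        "iterations": iterations,
        "median_latency_ms": median_ms,
        "throughput_per_s": throughput,
    }
```

Latency here is the time taken by a single call, while throughput is the number of calls completed per second over the whole run. With a single blocking request the two are simply reciprocal, but they diverge once several requests run in parallel, which is why they are reported separately.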

For a full browser-based studio that integrates these and other key tuning utilities, try the Deep Learning Workbench.

OpenVINO™ toolkit includes a set of inference code samples and application demos showing how inference is run and output processed for use in retail environments, classrooms, smart camera applications, and other solutions.

OpenVINO™ also makes use of open-source and Intel® tools for traditional graphics processing and performance management. Intel® Media SDK supports accelerated rich-media processing, including transcoding. OpenVINO™ optimizes calls to the rich OpenCV and OpenVX libraries for processing computer vision workloads. The new DL Streamer integration further accelerates video pipelining and performance.

Packaging and Deployment

The Intel® Distribution of OpenVINO™ toolkit outputs optimized inference runtimes for the following devices:

  • Intel® CPUs
  • Intel® Processor Graphics
  • Intel® Neural Compute Stick 2
  • Intel® Vision Accelerator Design with Intel® Movidius™ VPUs

The Inference Engine's plugin architecture can be extended to meet other specialized needs. Deployment Manager is a Python* command-line tool that assembles the tuned model, IR files, your application, and required dependencies into a runtime package for your target device. It outputs packages for CPU, GPU, and VPU on Linux* and Windows*, and Neural Compute Stick-optimized packages on Linux.
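A Deployment Manager invocation can be assembled in the same way as any other command line. This is a minimal sketch, assuming the tool is the `deployment_manager.py` script and that it accepts `--targets`, `--output_dir`, `--archive_name`, and `--user_data` options. Confirm the option names and the accepted target names with the tool's `--help` output for your release. The function name and all paths are hypothetical placeholders.

```python
import sys


def deployment_command(script, targets, output_dir, archive_name,
                       user_data=None):
    """Build the command that packages runtime files for the given targets."""
    if not targets:
        raise ValueError("at least one target device is required")
    command = [sys.executable, str(script),
               "--targets", *targets,
               "--output_dir", str(output_dir),
               "--archive_name", archive_name]
    # user_data is assumed to point at a folder holding your application,
    # model, and IR files to include in the package.
    if user_data is not None:
        command += ["--user_data", str(user_data)]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)` to create the package.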

OpenVINO™ Toolkit Components

Intel® Distribution of OpenVINO™ toolkit includes the following components:

The open-source version of the OpenVINO™ toolkit is available on GitHub. To build the Inference Engine from source code, see the build instructions.