Intel® Movidius™ VPUs Programming Guide for Use with Intel® Distribution of OpenVINO™ toolkit

This section explains how to distribute a model across all 8 VPUs to maximize performance.

Programming a C++ Application for the Accelerator

Declare a Structure to Track Requests

The structure should hold:

  1. A pointer to an inference request.
  2. An ID to keep track of the request.

    struct Request {
        InferenceEngine::InferRequest::Ptr inferRequest;
        int frameidx;
    };

Declare a Vector of Requests

// numRequests is the maximum number of in-flight frames, equal to the number of VPUs in use
std::vector<Request> request(numRequests);

Declare and initialize two mutex variables:

  1. One for each request
  2. One for signaling when all 8 requests are done

Declare a Condition Variable

The condition variable signals when all 8 requests have completed, ensuring that at most 8 requests are in flight at a time.

For inference requests, use the asynchronous IE API calls:

// Initialize the infer request pointer – consult the IE API documentation for more detail.
request[i].inferRequest = executable_network.CreateInferRequestPtr();
// Start inference asynchronously
request[i].inferRequest->StartAsync();

Create a Lambda Function

The lambda function parses and displays the results when a request completes.

Register the lambda as the completion callback:

request[i].inferRequest->SetCompletionCallback(
    [](InferenceEngine::IInferRequest::Ptr context,
       InferenceEngine::StatusCode status) {
        // Parse and display results here
    });
Additional Resources