The following section provides information on how to distribute a model across all 8 VPUs to maximize performance.
Programming a C++ Application for the Accelerator
Declare a Structure to Track Requests
The structure should hold:
- A pointer to an inference request.
- An ID to keep track of the request.
struct Request {
    InferenceEngine::InferRequest::Ptr inferRequest;  // pointer to the inference request
    int frameidx;                                     // ID to keep track of the request
};
Declare a Vector of Requests
std::vector<Request> request(numRequests);  // numRequests = 8, one per VPU
Declare and Initialize Two Mutex Variables
- One for each request.
- One for tracking when all 8 requests are done.
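The two kinds of mutexes might be declared as follows; the names requestMtx and allDoneMtx are illustrative:

```cpp
#include <mutex>
#include <vector>

constexpr int numRequests = 8;  // one inference request per VPU

// One mutex guarding each request's state (illustrative name)
std::vector<std::mutex> requestMtx(numRequests);

// One mutex guarding the "all 8 requests are done" bookkeeping
std::mutex allDoneMtx;
```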
Declare a Condition Variable
The condition variable signals when all 8 requests have completed.
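The mutex-and-condition-variable pairing above can be sketched in standard C++; the names numRequests, doneMtx, doneCv, and completedCount are illustrative, not part of the Inference Engine API:

```cpp
#include <condition_variable>
#include <mutex>

constexpr int numRequests = 8;      // one inference request per VPU
std::mutex doneMtx;                 // the "all requests done" mutex
std::condition_variable doneCv;
int completedCount = 0;             // guarded by doneMtx

// Each completed request records itself and signals the waiter.
void markDone() {
    std::lock_guard<std::mutex> lock(doneMtx);
    ++completedCount;
    doneCv.notify_one();
}

// Blocks until all 8 requests have reported completion.
void waitForAll() {
    std::unique_lock<std::mutex> lock(doneMtx);
    doneCv.wait(lock, [] { return completedCount == numRequests; });
}
```

The predicate form of `wait` guards against spurious wakeups: the waiter only returns once the count actually reaches 8.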
For inference requests, use the asynchronous Inference Engine (IE) API calls:
for (int i = 0; i < numRequests; i++) {
    request[i].inferRequest = executable_network.CreateInferRequestPtr();
    // fill the request's input blobs here, then start inference
    request[i].inferRequest->StartAsync();
}
Create a Lambda Function
The lambda function parses and displays the results. Pass it to the request's completion callback so it runs when inference finishes:
request[i].inferRequest->SetCompletionCallback(
    [&](InferenceEngine::IInferRequest::Ptr context, InferenceEngine::StatusCode status) {
        // parse and display the results here
    });
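Putting the pieces together, the callback-and-notify pattern can be sketched without the Inference Engine headers. FakeInferRequest and every other name below are stand-ins for illustration only; in the real application the lambda passed to SetCompletionCallback plays the role of onComplete:

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

constexpr int numRequests = 8;
std::mutex doneMtx;
std::condition_variable doneCv;
int completedCount = 0;  // guarded by doneMtx

// Stand-in for an inference request: runs its callback on a worker thread.
struct FakeInferRequest {
    std::function<void()> onComplete;
    std::thread worker;
    void SetCompletionCallback(std::function<void()> cb) { onComplete = std::move(cb); }
    void StartAsync() { worker = std::thread([this] { onComplete(); }); }
    void Wait() { if (worker.joinable()) worker.join(); }
};

// Dispatches all requests, waits until every callback has fired,
// and returns the number of completed requests.
int runAll() {
    completedCount = 0;
    std::vector<FakeInferRequest> request(numRequests);
    for (int i = 0; i < numRequests; ++i) {
        request[i].SetCompletionCallback([i] {
            // A real callback would parse and display results for frame i here.
            (void)i;
            std::lock_guard<std::mutex> lock(doneMtx);
            ++completedCount;
            doneCv.notify_one();
        });
        request[i].StartAsync();
    }
    {
        std::unique_lock<std::mutex> lock(doneMtx);
        doneCv.wait(lock, [] { return completedCount == numRequests; });
    }
    for (auto& r : request) r.Wait();  // join workers before requests are destroyed
    return completedCount;
}
```

Each callback increments the shared counter under the mutex and notifies the condition variable, so the main thread wakes only once all 8 simulated requests are done.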
Additional Resources