Inference Devices and Modes

The OpenVINO runtime offers multiple inference modes to enable the best hardware utilization under different conditions:

single-device inference
Define just one device responsible for the entire inference workload. This mode supports a range of processors by means of device plugins embedded in the Runtime library, such as the CPU and GPU plugins.

automated inference modes
Assume a certain level of automation in selecting devices for inference. They may increase your deployed solution's performance and portability. The automated modes are Automatic Device Selection (AUTO), Multi-Device Execution (MULTI), Heterogeneous Execution (HETERO), and Automatic Batching; the sketch below shows how a mode is selected.
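
In all cases, the mode is chosen simply by the device string passed at model compilation time. Here is a minimal sketch contrasting single-device inference with the AUTO mode, assuming a model file named sample.xml:

#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    std::shared_ptr<ov::Model> model = core.read_model("sample.xml");

    // Single-device inference: the whole workload runs on the named device.
    ov::CompiledModel on_cpu = core.compile_model(model, "CPU");

    // Automated mode: AUTO selects the most suitable available device.
    ov::CompiledModel on_auto = core.compile_model(model, "AUTO");
    return 0;
}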

Enumerating Available Devices

The OpenVINO Runtime API features dedicated methods for enumerating devices and their capabilities. Note that beyond the typical “CPU” or “GPU” device names, more qualified names are used when multiple instances of a device are available (the integrated GPU is always GPU.0). The output you receive may look like this (truncated to device names only; two GPUs are listed as an example):

./hello_query_device
Available devices:
    Device: CPU
...
    Device: GPU.0
...
    Device: GPU.1
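
The same device list can also be printed programmatically; a minimal sketch mirroring the output above:

#include <openvino/openvino.hpp>
#include <iostream>
#include <string>

int main() {
    ov::Core core;
    // get_available_devices() returns the qualified names, e.g. "CPU", "GPU.0".
    for (const std::string& device : core.get_available_devices()) {
        std::cout << "Device: " << device << std::endl;
    }
    return 0;
}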

You may see how to obtain this information in the Hello Query Device Sample. Here is an example of a simple programmatic way to enumerate the devices and use them with the multi-device mode:

#include <openvino/openvino.hpp>

ov::Core core;
std::shared_ptr<ov::Model> model = core.read_model("sample.xml");
std::vector<std::string> availableDevices = core.get_available_devices();

// Join all device names into a comma-separated priority list, e.g. "CPU,GPU.0,GPU.1".
std::string all_devices;
for (size_t i = 0; i < availableDevices.size(); ++i) {
    all_devices += availableDevices[i];
    if (i + 1 < availableDevices.size())
        all_devices += ",";
}

ov::CompiledModel compileModel = core.compile_model(model, "MULTI",
    ov::device::priorities(all_devices));
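
The resulting compiled model is used like any other; a minimal sketch of a synchronous inference call, assuming the input tensors have already been set:

ov::InferRequest infer_request = compileModel.create_infer_request();
// With MULTI, requests are dispatched to the devices in the priority list.
infer_request.infer();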

With two GPU devices used in one setup, the explicit configuration would be “MULTI:GPU.1,GPU.0”. Accordingly, the code that loops over only the available devices of the “GPU” type is as follows:

#include <openvino/openvino.hpp>

ov::Core core;

// Query only the instances of the "GPU" device type; the returned IDs
// are the suffixes of the qualified names, e.g. "0" and "1".
std::vector<std::string> GPUDevices = core.get_property("GPU", ov::available_devices);

// Build a priority list such as "GPU.0,GPU.1".
std::string all_devices;
for (size_t i = 0; i < GPUDevices.size(); ++i) {
    all_devices += std::string("GPU.") + GPUDevices[i];
    if (i + 1 < GPUDevices.size())
        all_devices += ",";
}

ov::CompiledModel compileModel = core.compile_model("sample.xml", "MULTI",
    ov::device::priorities(all_devices));
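
Alternatively, the explicit configuration mentioned above can be passed directly as the device string, skipping the enumeration code entirely; a minimal sketch, assuming both GPU devices are present:

ov::Core core;
// Device priorities are encoded in the device name itself.
ov::CompiledModel compiledModel = core.compile_model("sample.xml", "MULTI:GPU.1,GPU.0");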