Group Inference

group ov_runtime_cpp_api

The OpenVINO Inference C++ API provides the ov::Core, ov::CompiledModel, ov::InferRequest, and ov::Tensor classes.

Typedefs

using SupportedOpsMap = std::map<std::string, std::string>

This map type is used for the result of Core::query_model.

  • key means operation name

  • value means device name supporting this operation
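Because SupportedOpsMap is a plain std::map, a query result can be processed with ordinary container code. The following is a minimal sketch, not part of the OpenVINO API: the helper group_by_device and all operation names are hypothetical, and a hand-written map stands in for a real Core::query_model result.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Same alias as documented above.
using SupportedOpsMap = std::map<std::string, std::string>;

// Hypothetical helper: inverts the map so that each device name lists
// the operations reported as supported on it.
std::map<std::string, std::vector<std::string>> group_by_device(const SupportedOpsMap& ops) {
    std::map<std::string, std::vector<std::string>> grouped;
    for (const auto& entry : ops) {
        // entry.first is the operation name, entry.second is the device name.
        grouped[entry.second].push_back(entry.first);
    }
    return grouped;
}

// Hand-written stand-in for a query result; all names are made up.
void group_by_device_example() {
    const SupportedOpsMap ops = {{"conv1", "GPU"}, {"relu1", "GPU"}, {"topk1", "CPU"}};
    const auto grouped = group_by_device(ops);
    assert(grouped.at("GPU").size() == 2);
    assert(grouped.at("CPU").size() == 1);
}
```

Grouping by device in this way can help when deciding how a model would be split across devices.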

class Allocator
#include <allocator.hpp>

Wraps an allocator implementation to provide a safe way to store an allocator loaded from a shared library. If created without parameters, constructs a default allocator based on C++ new/delete calls. Accepts any std::pmr::memory_resource-like allocator.

Public Functions

~Allocator()

Destructor preserves unloading order of implementation object and reference to library.

Allocator()

Default constructor.

Allocator(const Allocator &other) = default

Default copy constructor.

Parameters:

other – other Allocator object

Allocator &operator=(const Allocator &other) = default

Default copy assignment operator.

Parameters:

other – other Allocator object

Returns:

reference to the current object

Allocator(Allocator &&other) = default

Default move constructor.

Parameters:

other – other Allocator object

Allocator &operator=(Allocator &&other) = default

Default move assignment operator.

Parameters:

other – other Allocator object

Returns:

reference to the current object

template<typename A, typename std::enable_if<!std::is_same<typename std::decay<A>::type, Allocator>::value && !std::is_abstract<typename std::decay<A>::type>::value && !std::is_convertible<typename std::decay<A>::type, std::shared_ptr<Base>>::value, bool>::type = true>
inline Allocator(A &&a)

Initializes the allocator using any allocator-like object.

Template Parameters:

A – Type of allocator

Parameters:

a – allocator object

void *allocate(const size_t bytes, const size_t alignment = alignof(max_align_t))

Allocates memory.

Parameters:
  • bytes – The minimum size in bytes to allocate

  • alignment – The alignment of storage

Throws:

Exception – if the specified size and alignment are not supported

Returns:

Handle to the allocated resource

void deallocate(void *ptr, const size_t bytes = 0, const size_t alignment = alignof(max_align_t))

Releases the handle and all associated memory resources, which invalidates the handle.

Parameters:
  • ptr – The handle to free

  • bytes – The size in bytes that was passed into allocate() method

  • alignment – The alignment of storage that was passed into allocate() method
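To make the allocate/deallocate contract concrete, here is a minimal sketch of an allocator-like type with the same method shapes. ArenaAllocator is a hypothetical name, and whether such a type can be passed directly to the templated Allocator constructor is an assumption this page does not confirm. It throws std::bad_alloc rather than an OpenVINO exception.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <vector>

// Hypothetical allocator-like type: hands out aligned chunks of one fixed buffer.
class ArenaAllocator {
public:
    explicit ArenaAllocator(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Returns an aligned pointer to at least `bytes` bytes, or throws if the
    // request does not fit into the remaining space.
    void* allocate(const std::size_t bytes, const std::size_t alignment = alignof(std::max_align_t)) {
        // std::align requires a power-of-two alignment.
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        void* cursor = buffer_.data() + offset_;
        std::size_t space = buffer_.size() - offset_;
        if (std::align(alignment, bytes, cursor, space) == nullptr) {
            throw std::bad_alloc();
        }
        offset_ = static_cast<std::size_t>(static_cast<unsigned char*>(cursor) - buffer_.data()) + bytes;
        return cursor;
    }

    // An arena releases everything at once on destruction, so this is a no-op.
    void deallocate(void*, const std::size_t = 0, const std::size_t = alignof(std::max_align_t)) noexcept {}

    // Memory from one arena can only be returned to that same arena.
    bool operator==(const ArenaAllocator& other) const noexcept { return this == &other; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t offset_;
};
```

An arena is only one possible design; a wrapper around aligned new/delete would follow the same method signatures.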

bool operator==(const Allocator &other) const

Compares with other Allocator.

Parameters:

other – Other instance of allocator

Returns:

true if and only if memory allocated from one Allocator can be deallocated from the other and vice versa

bool operator!() const noexcept

Checks if current Allocator object is not initialized.

Returns:

true if the current Allocator object is not initialized; false otherwise

explicit operator bool() const noexcept

Checks if current Allocator object is initialized.

Returns:

true if the current Allocator object is initialized; false otherwise

class Tensor
#include <tensor.hpp>

Tensor API holding host memory. It can throw exceptions safely for the application, where they are properly handled.

Subclassed by ov::RemoteTensor

Public Functions

Tensor() = default

Default constructor.

Tensor(const Tensor &other, const std::shared_ptr<void> &so)

Copy constructor with adding new shared object.

Parameters:
  • other – Original tensor

  • so – Shared object

Tensor(const Tensor &other) = default

Default copy constructor.

Parameters:

other – other Tensor object

Tensor &operator=(const Tensor &other) = default

Default copy assignment operator.

Parameters:

other – other Tensor object

Returns:

reference to the current object

Tensor(Tensor &&other) = default

Default move constructor.

Parameters:

other – other Tensor object

Tensor &operator=(Tensor &&other) = default

Default move assignment operator.

Parameters:

other – other Tensor object

Returns:

reference to the current object

~Tensor()

Destructor preserves unloading order of implementation object and reference to library.

Tensor(const element::Type &type, const Shape &shape, const Allocator &allocator = {})

Constructs Tensor using element type and shape. Allocates internal host storage using the default allocator.

Parameters:
  • type – Tensor element type

  • shape – Tensor shape

  • allocator – allocates memory for internal tensor storage

Tensor(const element::Type &type, const Shape &shape, void *host_ptr, const Strides &strides = {})

Constructs Tensor using element type and shape. Wraps allocated host memory.

Note

Does not perform memory allocation internally

Parameters:
  • type – Tensor element type

  • shape – Tensor shape

  • host_ptr – Pointer to pre-allocated host memory with initialized objects

  • strides – Optional strides in bytes. If omitted, strides are computed automatically from the shape and element size
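The automatic stride computation mentioned for the strides parameter can be sketched as follows. This assumes a densely packed row-major layout, which this page does not state explicitly. The helper dense_byte_strides is hypothetical and uses std::vector in place of the Shape and Strides types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: byte strides of a densely packed row-major tensor.
// The innermost dimension steps by one element; each outer dimension steps
// by the full byte size of everything inside it.
std::vector<std::size_t> dense_byte_strides(const std::vector<std::size_t>& shape,
                                            std::size_t element_size) {
    assert(element_size > 0);
    std::vector<std::size_t> strides(shape.size());
    std::size_t step = element_size;
    for (std::size_t i = shape.size(); i-- > 0;) {
        strides[i] = step;
        step *= shape[i];
    }
    return strides;
}
```

For a shape of {2, 3, 4} with 4-byte elements, this yields byte strides {48, 16, 4}.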

Tensor(const ov::Output<const ov::Node> &port, const Allocator &allocator = {})

Constructs Tensor using port from node. Allocates internal host storage using the default allocator.

Parameters:
  • port – port from node

  • allocator – allocates memory for internal tensor storage

Tensor(const ov::Output<const ov::Node> &port, void *host_ptr, const Strides &strides = {})

Constructs Tensor using port from node. Wraps allocated host memory.

Note

Does not perform memory allocation internally

Parameters:
  • port – port from node

  • host_ptr – Pointer to pre-allocated host memory with initialized objects

  • strides – Optional strides in bytes. If omitted, strides are computed automatically from the shape and element size

Tensor(const Tensor &other, const Coordinate &begin, const Coordinate &end)

Constructs a region of interest (ROI) tensor from another tensor.

Note

Does not perform memory allocation internally

Note

The number of dimensions in begin and end must match the number of dimensions in other.get_shape()

Parameters:
  • other – original tensor

  • begin – start coordinate of ROI object inside of the original object.

  • end – end coordinate of ROI object inside of the original object.
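The relationship between begin, end, and the parent tensor can be sketched without allocating anything, in line with the note above. This sketch assumes that end is exclusive, so the ROI extent per dimension is end minus begin, which this page does not spell out. RoiView and make_roi are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical description of an ROI: its shape and where it starts in the parent buffer.
struct RoiView {
    std::vector<std::size_t> shape;
    std::size_t byte_offset = 0;
};

// Derives the ROI shape and starting byte offset from the parent's byte strides.
// No memory is allocated for tensor data; the ROI reuses the parent's storage.
RoiView make_roi(const std::vector<std::size_t>& parent_byte_strides,
                 const std::vector<std::size_t>& begin,
                 const std::vector<std::size_t>& end) {
    if (begin.size() != parent_byte_strides.size() || end.size() != begin.size()) {
        throw std::invalid_argument("begin and end must have the parent's number of dimensions");
    }
    RoiView roi;
    for (std::size_t i = 0; i < begin.size(); ++i) {
        if (end[i] < begin[i]) {
            throw std::invalid_argument("end must not precede begin");
        }
        roi.shape.push_back(end[i] - begin[i]);
        roi.byte_offset += begin[i] * parent_byte_strides[i];
    }
    return roi;
}
```

For a 4x6 parent of 4-byte elements (byte strides {24, 4}), begin {1, 2} and end {3, 5} describe a 2x3 region starting 32 bytes into the parent buffer.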

void set_shape(const ov::Shape &shape)

Sets a new shape for the tensor. Memory is deallocated and reallocated if the new total size is bigger than the previous one.

Note

Memory allocation may happen

Parameters:

shape – A new shape
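The reallocation rule described for set_shape can be modelled with a small stand-in class. GrowOnlyStorage is hypothetical and is not the OpenVINO implementation; it only mirrors the documented rule that storage is replaced when the new shape needs more bytes than are currently held.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor's storage.
class GrowOnlyStorage {
public:
    explicit GrowOnlyStorage(std::size_t element_size) : element_size_(element_size) {
        assert(element_size > 0);
    }

    // Applies the new shape and returns true if the storage had to be reallocated.
    bool set_shape(const std::vector<std::size_t>& shape) {
        std::size_t bytes = element_size_;
        for (const std::size_t dim : shape) {
            bytes *= dim;
        }
        shape_ = shape;
        if (bytes <= storage_.size()) {
            return false;  // the existing block is large enough and is reused
        }
        storage_ = std::vector<unsigned char>(bytes);  // old block released, new one allocated
        return true;
    }

    std::size_t capacity_bytes() const { return storage_.size(); }

private:
    std::size_t element_size_;
    std::vector<std::size_t> shape_;
    std::vector<unsigned char> storage_;
};
```

In this model, shrinking the shape keeps the larger block, which is why pointers obtained before a growing set_shape call should be treated as invalid afterwards.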

const element::Type &get_element_type() const
Returns:

A tensor element type

const Shape &get_shape() const
Returns:

A tensor shape

void copy_to(ov::Tensor dst) const

Copies the tensor. The destination tensor should have the same element type and shape.

Parameters:

dst – destination tensor

bool is_continuous() const

Reports whether the tensor is continuous or not.

Returns:

true if tensor is continuous
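One way to picture this check is to compare the byte strides with those of a densely packed row-major layout. The following sketch rests on that assumption; the actual rule used by is_continuous may differ in edge cases, and is_densely_packed is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical check: true if the byte strides match a dense row-major layout.
bool is_densely_packed(const std::vector<std::size_t>& shape,
                       const std::vector<std::size_t>& byte_strides,
                       std::size_t element_size) {
    assert(shape.size() == byte_strides.size());
    std::size_t expected = element_size;
    for (std::size_t i = shape.size(); i-- > 0;) {
        if (byte_strides[i] != expected) {
            return false;  // there is a gap between consecutive rows or planes
        }
        expected *= shape[i];
    }
    return true;
}
```

A 2x3 region cut out of a wider 4x6 parent keeps the parent's row stride of 24 bytes instead of 12, so it would not count as dense under this model.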

size_t get_size() const

Returns the total number of elements (a product of all the dims or 1 for scalar)

Returns:

The total number of elements

size_t get_byte_size() const

Returns the size of the current Tensor in bytes.

Returns:

Tensor’s size in bytes
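The element count described for get_size is a product over the dimensions, with an empty shape (a scalar) giving 1. The sketch below also derives a byte size, assuming each element occupies a whole number of bytes, which does not hold for sub-byte element types. Both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Product of all dimensions; the initial value 1 covers the scalar case.
std::size_t element_count(const std::vector<std::size_t>& shape) {
    return std::accumulate(shape.begin(), shape.end(), std::size_t{1},
                           std::multiplies<std::size_t>());
}

// Byte size under the whole-byte-element assumption stated above.
std::size_t byte_size(const std::vector<std::size_t>& shape, std::size_t element_size) {
    assert(element_size > 0);
    return element_count(shape) * element_size;
}
```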

Strides get_strides() const
Returns:

Tensor’s strides in bytes

void *data(const element::Type &type = {}) const

Provides access to the underlying host memory.

Note

If the type parameter is specified, the method throws an exception if the fundamental type of the specified type does not match the fundamental type of the tensor element type

Parameters:

type – Optional type parameter.

Returns:

A host pointer to tensor memory

template<typename T, typename datatype = typename std::decay<T>::type>
inline T *data() const

Provides access to the underlying host memory cast to type T.

Note

Throws an exception if the specified type does not match the tensor element type

Returns:

A host pointer to tensor memory cast to the specified type T.

bool operator!() const noexcept

Checks if current Tensor object is not initialized.

Returns:

true if the current Tensor object is not initialized; false otherwise

explicit operator bool() const noexcept

Checks if current Tensor object is initialized.

Returns:

true if the current Tensor object is initialized; false otherwise

template<typename T>
inline std::enable_if<std::is_base_of<Tensor, T>::value, bool>::type is() const noexcept

Checks if the Tensor object can be cast to the type T.

Template Parameters:

T – Type to be checked. Must represent a class derived from the Tensor

Returns:

true if this object can be dynamically cast to the type const T*. Otherwise, false

template<typename T>
inline const std::enable_if<std::is_base_of<Tensor, T>::value, T>::type as() const

Casts this Tensor object to the type T.

Template Parameters:

T – Type to cast to. Must represent a class derived from the Tensor

Returns:

T object

Public Static Functions

static void type_check(const Tensor &tensor)

Checks the OpenVINO tensor type.

Parameters:

tensor – a tensor whose type will be checked

Throws:

Exception – if the type check of the specified tensor does not pass

class CompiledModel
#include <compiled_model.hpp>

This class represents a compiled model.

A model is compiled by a specific device by applying multiple optimization transformations, then mapping to compute kernels.

Public Functions

CompiledModel() = default

Default constructor.

~CompiledModel()

Destructor that preserves unloading order of an implementation object and reference to library.

std::shared_ptr<const Model> get_runtime_model() const

Gets runtime model information from a device. This object represents an internal device-specific model that is optimized for a particular accelerator. It contains device-specific nodes, runtime information and can be used only to understand how the source model is optimized and which kernels, element types, and layouts are selected for optimal inference.

Returns:

A model containing Executable Graph Info.

const std::vector<ov::Output<const ov::Node>> &inputs() const

Gets all inputs of a compiled model. Inputs are represented as a vector of outputs of the ov::op::v0::Parameter operations. They contain information about input tensors such as tensor shape, names, and element type.

Returns:

std::vector of model inputs.

const ov::Output<const ov::Node> &input() const

Gets a single input of a compiled model. The input is represented as an output of the ov::op::v0::Parameter operation. The input contains information about input tensor such as tensor shape, names, and element type.

Note

If a model has more than one input, this method throws ov::Exception.

Returns:

Compiled model input.

const ov::Output<const ov::Node> &input(size_t i) const

Gets input of a compiled model identified by i. The input contains information about input tensor such as tensor shape, names, and element type.

Note

The method throws ov::Exception if input with the specified index i is not found.

Parameters:

i – Index of input.

Returns:

Compiled model input.

const ov::Output<const ov::Node> &input(const std::string &tensor_name) const

Gets input of a compiled model identified by tensor_name. The input contains information about input tensor such as tensor shape, names, and element type.

Note

The method throws ov::Exception if input with the specified tensor name tensor_name is not found.

Parameters:

tensor_name – The input tensor name.

Returns:

Compiled model input.

const std::vector<ov::Output<const ov::Node>> &outputs() const

Gets all outputs of a compiled model. Outputs are represented as a vector of outputs from the ov::op::v0::Result operations. Outputs contain information about output tensors such as tensor shape, names, and element type.

Returns:

std::vector of model outputs.

const ov::Output<const ov::Node> &output() const

Gets a single output of a compiled model. The output is represented as an output from the ov::op::v0::Result operation. The output contains information about output tensor such as tensor shape, names, and element type.

Note

If a model has more than one output, this method throws ov::Exception.

Returns:

Compiled model output.

const ov::Output<const ov::Node> &output(size_t i) const

Gets output of a compiled model identified by index. The output contains information about output tensor such as tensor shape, names, and element type.

Note

The method throws ov::Exception if output with the specified index i is not found.

Parameters:

i – Index of output.

Returns:

Compiled model output.

const ov::Output<const ov::Node> &output(const std::string &tensor_name) const

Gets output of a compiled model identified by tensor_name. The output contains information about output tensor such as tensor shape, names, and element type.

Note

The method throws ov::Exception if output with the specified tensor name tensor_name is not found.

Parameters:

tensor_name – Output tensor name.

Returns:

Compiled model output.

InferRequest create_infer_request()

Creates an inference request object used to infer the compiled model. The created request has allocated input and output tensors (which can be changed later).

Returns:

InferRequest object

void export_model(std::ostream &model_stream)

Exports the current compiled model to an output stream std::ostream. The exported model can also be imported via the ov::Core::import_model method.

Parameters:

model_stream – Output stream to store the model to.

void set_property(const AnyMap &properties)

Sets properties for the current compiled model.

Parameters:

properties – Map of pairs: (property name, property value).

template<typename ...Properties>
inline util::EnableIfAllStringAny<void, Properties...> set_property(Properties&&... properties)

Sets properties for the current compiled model.

Template Parameters:

Properties – Should be the pack of std::pair<std::string, ov::Any> types.

Parameters:

properties – Optional pack of pairs: (property name, property value).
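The "pack of pairs" form can be pictured as a convenience over the map-based overload: the pairs are collected into one map. The sketch below uses std::any and a local PropertyMap alias as stand-ins for ov::Any and AnyMap, and make_property_map is a hypothetical helper; none of this is the library's own code. It requires C++17.

```cpp
#include <any>
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Stand-in for AnyMap, with std::any in place of ov::Any (assumption).
using PropertyMap = std::map<std::string, std::any>;

// Hypothetical helper: collects (property name, property value) pairs into one map.
// An empty pack produces an empty map.
template <typename... Properties>
PropertyMap make_property_map(Properties&&... properties) {
    return PropertyMap{std::forward<Properties>(properties)...};
}

// Usage with illustrative key strings and values.
void make_property_map_example() {
    const PropertyMap props = make_property_map(
        std::make_pair(std::string("PERFORMANCE_HINT"), std::any(std::string("LATENCY"))),
        std::make_pair(std::string("NUM_STREAMS"), std::any(4)));
    assert(props.size() == 2);
    assert(std::any_cast<int>(props.at("NUM_STREAMS")) == 4);
}
```

The same pattern applies to the other variadic overloads in this reference that take a Properties pack.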

Any get_property(const std::string &name) const

Gets properties for current compiled model.

The method is responsible for extracting information that affects compiled model inference. The list of supported configuration values can be extracted via CompiledModel::get_property with the ov::supported_properties key, but some of these keys cannot be changed dynamically, for example, ov::device::id cannot be changed if a compiled model has already been compiled for a particular device.

Parameters:

name – Property key, can be found in openvino/runtime/properties.hpp.

Returns:

Property value.

template<typename T, PropertyMutability mutability>
inline T get_property(const ov::Property<T, mutability> &property) const

Gets properties related to device behaviour.

The method extracts information that can be set via the set_property method.

Template Parameters:

T – Type of a returned value.

Parameters:

property – Property object.

Returns:

Value of property.

RemoteContext get_context() const

Returns pointer to device-specific shared context on a remote accelerator device that was used to create this CompiledModel.

Returns:

A context.

bool operator!() const noexcept

Checks if the current CompiledModel object is not initialized.

Returns:

true if the current CompiledModel object is not initialized; false, otherwise.

explicit operator bool() const noexcept

Checks if the current CompiledModel object is initialized.

Returns:

true if the current CompiledModel object is initialized; false, otherwise.

class Core
#include <core.hpp>

This class represents an OpenVINO runtime Core entity.

User applications can create several Core class instances, but in this case the underlying plugins are created multiple times and not shared between several Core instances. The recommended way is to have a single Core instance per application.

Unnamed Group

CompiledModel compile_model(const std::string &model_path, const AnyMap &properties = {})

Reads and loads a compiled model from the IR/ONNX/PDPD file to the default OpenVINO device selected by the AUTO plugin.

This can be more efficient than using the Core::read_model + Core::compile_model(model_in_memory_object) flow, especially for cases when caching is enabled and a cached model is available.

Parameters:
  • model_path – Path to a model.

  • properties – Optional map of pairs: (property name, property value) relevant only for this load operation.

Returns:

A compiled model.

template<typename ...Properties>
inline util::EnableIfAllStringAny<CompiledModel, Properties...> compile_model(const std::string &model_path, Properties&&... properties)

Reads and loads a compiled model from the IR/ONNX/PDPD file to the default OpenVINO device selected by the AUTO plugin.

This can be more efficient than using the read_model + compile_model(Model) flow, especially for cases when caching is enabled and a cached model is available.

Template Parameters:

Properties – Should be the pack of std::pair<std::string, ov::Any> types

Parameters:
  • model_path – Path to a model, given as a string or wstring

  • properties – Optional pack of pairs: (property name, property value) relevant only for this load operation

Returns:

A compiled model

CompiledModel compile_model(const std::string &model_path, const std::string &device_name, const AnyMap &properties = {})

Reads a model and creates a compiled model from the IR/ONNX/PDPD file.

This can be more efficient than using the Core::read_model + Core::compile_model(model_in_memory_object) flow, especially for cases when caching is enabled and a cached model is available.

Parameters:
  • model_path – Path to a model.

  • device_name – Name of a device to load a model to.

  • properties – Optional map of pairs: (property name, property value) relevant only for this load operation.

Returns:

A compiled model.

template<typename ...Properties>
inline util::EnableIfAllStringAny<CompiledModel, Properties...> compile_model(const std::string &model_path, const std::string &device_name, Properties&&... properties)

Reads a model and creates a compiled model from the IR/ONNX/PDPD file.

This can be more efficient than using read_model + compile_model(Model) flow especially for cases when caching is enabled and cached model is available.

Template Parameters:

Properties – Should be a pack of std::pair<std::string, ov::Any> types.

Parameters:
  • model_path – Path to a model.

  • device_name – Name of a device to load a model to.

  • properties – Optional pack of pairs: (property name, property value) relevant only for this load operation.

Returns:

A compiled model.

Public Functions

explicit Core(const std::string &xml_config_file = {})

Constructs an OpenVINO Core instance with devices and their plugins description.

There are two ways to configure device plugins:

  1. (default) Use XML configuration file in case of dynamic libraries build;

  2. Use strictly defined configuration in case of static libraries build.

Parameters:

xml_config_file – Path to the .xml file with plugins to load from. If the path contains only a file name with an extension, the file is searched for in the folder containing the OpenVINO Runtime shared library. If the XML configuration file is not specified, default OpenVINO Runtime plugins are loaded from:

  1. (dynamic build) default plugins.xml file located in the same folder as OpenVINO runtime shared library;

  2. (static build) statically defined configuration. In this case path to the .xml file is ignored.

std::map<std::string, Version> get_versions(const std::string &device_name) const

Returns device plugins version information. Device name can be complex and identify multiple devices at once like HETERO:CPU,GPU; in this case, std::map contains multiple entries, each per device.

Parameters:

device_name – Device name to identify a plugin.

Returns:

A map of versions, one entry per device.

std::shared_ptr<ov::Model> read_model(const std::string &model_path, const std::string &bin_path = {}) const

Reads models from IR / ONNX / PDPD / TF / TFLite file formats.

Parameters:
  • model_path – Path to a model.

  • bin_path – Path to a data file.

    For IR format (*.bin):

    • if bin_path is empty, the method tries to read a bin file with the same name as the xml file;

    • if the bin file with the same name is not found, the IR is loaded without weights.

    The bin_path parameter is not used for the following file formats:

    • ONNX format (*.onnx)

    • PDPD (*.pdmodel)

    • TF (*.pb)

    • TFLite (*.tflite)

Returns:

A model.
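The fallback described for an empty bin_path can be sketched as a path computation: keep the model file name and swap its extension for .bin. The helper resolve_bin_path is hypothetical and covers only the naming rule. It does not check whether the file exists, which is the step that decides whether an IR is loaded without weights.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns bin_path if given, otherwise the model path
// with its extension (if any) replaced by ".bin".
std::string resolve_bin_path(const std::string& model_path, const std::string& bin_path = "") {
    if (!bin_path.empty()) {
        return bin_path;
    }
    const std::size_t slash = model_path.find_last_of("/\\");
    const std::size_t dot = model_path.find_last_of('.');
    // Only a dot inside the file name counts as an extension separator.
    const bool has_extension =
        dot != std::string::npos && (slash == std::string::npos || dot > slash);
    return (has_extension ? model_path.substr(0, dot) : model_path) + ".bin";
}
```

For example, a model path of models/net.xml with no bin_path resolves to models/net.bin under this rule.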

std::shared_ptr<ov::Model> read_model(const std::string &model, const Tensor &weights) const

Reads models from IR / ONNX / PDPD / TF / TFLite formats.

Note

Created model object shares the weights with the weights object. Thus, do not create weights on temporary data that can be freed later, since the model constant data will point to an invalid memory.

Parameters:
  • model – String with a model in IR / ONNX / PDPD / TF / TFLite format.

  • weights – Constant tensor with weights. Reading ONNX / PDPD / TF / TFLite models does not support loading weights from the weights tensors.

Returns:

A model.

CompiledModel compile_model(const std::shared_ptr<const ov::Model> &model, const AnyMap &properties = {})

Creates and loads a compiled model from a source model to the default OpenVINO device selected by the AUTO plugin.

Users can create as many compiled models as they need and use them simultaneously (up to the limitation of the hardware resources).

Parameters:
  • model – Model object acquired from Core::read_model.

  • properties – Optional map of pairs: (property name, property value) relevant only for this load operation.

Returns:

A compiled model.

template<typename ...Properties>
inline util::EnableIfAllStringAny<CompiledModel, Properties...> compile_model(const std::shared_ptr<const ov::Model> &model, Properties&&... properties)

Creates and loads a compiled model from a source model to the default OpenVINO device selected by AUTO plugin.

Users can create as many compiled models as they need and use them simultaneously (up to the limitation of the hardware resources)

Template Parameters:

Properties – Should be the pack of std::pair<std::string, ov::Any> types

Parameters:
  • model – Model object acquired from Core::read_model

  • properties – Optional pack of pairs: (property name, property value) relevant only for this load operation

Returns:

A compiled model

CompiledModel compile_model(const std::shared_ptr<const ov::Model> &model, const std::string &device_name, const AnyMap &properties = {})

Creates a compiled model from a source model object.

Users can create as many compiled models as they need and use them simultaneously (up to the limitation of the hardware resources).

Parameters:
  • model – Model object acquired from Core::read_model.

  • device_name – Name of a device to load a model to.

  • properties – Optional map of pairs: (property name, property value) relevant only for this load operation.

Returns:

A compiled model.

template<typename ...Properties>
inline util::EnableIfAllStringAny<CompiledModel, Properties...> compile_model(const std::shared_ptr<const ov::Model> &model, const std::string &device_name, Properties&&... properties)

Creates a compiled model from a source model object.

Users can create as many compiled models as they need and use them simultaneously (up to the limitation of the hardware resources)

Template Parameters:

Properties – Should be the pack of std::pair<std::string, ov::Any> types

Parameters:
  • model – Model object acquired from Core::read_model

  • device_name – Name of device to load model to

  • properties – Optional pack of pairs: (property name, property value) relevant only for this load operation

Returns:

A compiled model

CompiledModel compile_model(const std::string &model, const ov::Tensor &weights, const std::string &device_name, const AnyMap &properties = {})

Reads a model and creates a compiled model from the IR/ONNX/PDPD memory.

Note

Created model object shares the weights with the weights object. Thus, do not create weights on temporary data that can be freed later, since the model constant data will point to an invalid memory.

Parameters:
  • model – String with a model in IR/ONNX/PDPD format.

  • weights – Shared pointer to a constant tensor with weights. Reading ONNX/PDPD models does not support loading weights from the weights tensors.

  • device_name – Name of a device to load a model to.

  • properties – Optional map of pairs: (property name, property value) relevant only for this load operation.

Returns:

A compiled model.

template<typename ...Properties>
inline util::EnableIfAllStringAny<CompiledModel, Properties...> compile_model(const std::string &model, const ov::Tensor &weights, const std::string &device_name, Properties&&... properties)

Reads a model and creates a compiled model from the IR/ONNX/PDPD memory.

Note

Created model object shares the weights with the weights object. Thus, do not create weights on temporary data that can be freed later, since the model constant data will point to an invalid memory.

Parameters:
  • model – String with a model in IR/ONNX/PDPD format.

  • weights – Shared pointer to a constant tensor with weights. Reading ONNX/PDPD models does not support loading weights from the weights tensors.

  • device_name – Name of a device to load a model to.

Template Parameters:

Properties – Should be a pack of std::pair<std::string, ov::Any> types.

Returns:

A compiled model.

CompiledModel compile_model(const std::shared_ptr<const ov::Model> &model, const RemoteContext &context, const AnyMap &properties = {})

Creates a compiled model from a source model within a specified remote context.

Parameters:
  • model – Model object acquired from Core::read_model.

  • context – A reference to a RemoteContext object.

  • properties – Optional map of pairs: (property name, property value) relevant only for this load operation.

Returns:

A compiled model object.

template<typename ...Properties>
inline util::EnableIfAllStringAny<CompiledModel, Properties...> compile_model(const std::shared_ptr<const ov::Model> &model, const RemoteContext &context, Properties&&... properties)

Creates a compiled model from a source model within a specified remote context.

Template Parameters:

Properties – Should be the pack of std::pair<std::string, ov::Any> types

Parameters:
  • model – Model object acquired from Core::read_model

  • context – A reference to a RemoteContext object

  • properties – Optional pack of pairs: (property name, property value) relevant only for this load operation

Returns:

A compiled model object

void add_extension(const std::string &library_path)

Registers an extension to a Core object.

Parameters:

library_path – Path to the library with ov::Extension.

void add_extension(const std::shared_ptr<ov::Extension> &extension)

Registers an extension to a Core object.

Parameters:

extension – Pointer to the extension.

void add_extension(const std::vector<std::shared_ptr<ov::Extension>> &extensions)

Registers extensions to a Core object.

Parameters:

extensions – Vector of loaded extensions.

template<class T, typename std::enable_if<std::is_base_of<ov::Extension, T>::value, bool>::type = true>
inline void add_extension(const T &extension)

Registers an extension to a Core object.

Parameters:

extension – Extension class that is inherited from the ov::Extension class.

template<class T, class ...Targs, typename std::enable_if<std::is_base_of<ov::Extension, T>::value, bool>::type = true>
inline void add_extension(const T &extension, Targs... args)

Registers extensions to a Core object.

Parameters:
  • extension – Extension class that is inherited from the ov::Extension class.

  • args – A list of extensions.

template<class T, typename std::enable_if<std::is_base_of<ov::op::Op, T>::value, bool>::type = true>
inline void add_extension()

Registers a custom operation inherited from ov::op::Op.

template<class T, class ...Targs, typename std::enable_if<std::is_base_of<ov::op::Op, T>::value && sizeof...(Targs), bool>::type = true>
inline void add_extension()

Registers custom operations inherited from ov::op::Op.

CompiledModel import_model(std::istream &model_stream, const std::string &device_name, const AnyMap &properties = {})

Imports a compiled model from the previously exported one.

Parameters:
  • model_stream – std::istream input stream containing a model previously exported using the ov::CompiledModel::export_model method.

  • device_name – Name of a device to import a compiled model for. Note: if the device_name device was not used to compile the original model, an exception is thrown.

  • properties – Optional map of pairs: (property name, property value) relevant only for this load operation.

Returns:

A compiled model.

template<typename ...Properties>
inline util::EnableIfAllStringAny<CompiledModel, Properties...> import_model(std::istream &model_stream, const std::string &device_name, Properties&&... properties)

Imports a compiled model from the previously exported one.

Template Parameters:

Properties – Should be the pack of std::pair<std::string, ov::Any> types.

Parameters:
  • model_stream – Model stream.

  • device_name – Name of a device to import a compiled model for. Note: if the device_name device was not used to compile the original model, an exception is thrown.

  • properties – Optional pack of pairs: (property name, property value) relevant only for this load operation.

Returns:

A compiled model.

CompiledModel import_model(std::istream &model_stream, const RemoteContext &context, const AnyMap &properties = {})

Imports a compiled model from the previously exported one with the specified remote context.

Parameters:
  • model_stream – std::istream input stream containing a model previously exported from ov::CompiledModel::export_model

  • context – A reference to a RemoteContext object. Note: if the device from context was not used to compile the original model, an exception is thrown.

  • properties – Optional map of pairs: (property name, property value) relevant only for this load operation.

Returns:

A compiled model.

template<typename ...Properties>
inline util::EnableIfAllStringAny<CompiledModel, Properties...> import_model(std::istream &model_stream, const RemoteContext &context, Properties&&... properties)

Imports a compiled model from the previously exported one with the specified remote context.

Template Parameters:

Properties – Should be the pack of std::pair<std::string, ov::Any> types.

Parameters:
  • model_stream – Model stream.

  • context – A reference to a RemoteContext object.

  • properties – Optional pack of pairs: (property name, property value) relevant only for this load operation.

Returns:

A compiled model.

SupportedOpsMap query_model(const std::shared_ptr<const ov::Model> &model, const std::string &device_name, const AnyMap &properties = {}) const

Queries a device if it supports the specified model with the specified properties.

Parameters:
  • device_name – Name of a device to query.

  • model – Model object to query.

  • properties – Optional map of pairs: (property name, property value).

Returns:

An object containing a map of pairs: operation name -> name of a device supporting this operation.

template<typename ...Properties>
inline util::EnableIfAllStringAny<SupportedOpsMap, Properties...> query_model(const std::shared_ptr<const ov::Model> &model, const std::string &device_name, Properties&&... properties) const

Queries a device if it supports the specified model with the specified properties.

Template Parameters:

Properties – Should be the pack of std::pair<std::string, ov::Any> types.

Parameters:
  • device_name – Name of a device to query.

  • model – Model object to query.

  • properties – Optional pack of pairs: (property name, property value) relevant only for this query operation.

Returns:

An object containing a map of pairs: operation name -> name of a device supporting this operation.

void set_property(const AnyMap &properties)

Sets properties for all the registered devices. Acceptable keys can be found in openvino/runtime/properties.hpp.

Parameters:

properties – Map of pairs: (property name, property value).

template<typename ...Properties>
inline util::EnableIfAllStringAny<void, Properties...> set_property(Properties&&... properties)

Sets properties for all the registered devices. Acceptable keys can be found in openvino/runtime/properties.hpp.

Template Parameters:

Properties – Should be a pack of std::pair<std::string, ov::Any> types.

Parameters:

properties – Optional pack of pairs: (property name, property value).

void set_property(const std::string &device_name, const AnyMap &properties)

Sets properties for a device. Acceptable keys can be found in openvino/runtime/properties.hpp.

Parameters:
  • device_name – Name of a device.

  • properties – Map of pairs: (property name, property value).

template<typename ...Properties>
inline util::EnableIfAllStringAny<void, Properties...> set_property(const std::string &device_name, Properties&&... properties)

Sets properties for a device. Acceptable keys can be found in openvino/runtime/properties.hpp.

Template Parameters:

Properties – Should be the pack of std::pair<std::string, ov::Any> types.

Parameters:
  • device_name – Name of a device.

  • properties – Optional pack of pairs: (property name, property value).

Any get_property(const std::string &device_name, const std::string &name) const

Gets properties related to device behaviour.

The method extracts information that can be set via the set_property method.

Parameters:
  • device_name – Name of a device to get a property value.

  • name – Property name.

Returns:

Value of a property corresponding to the property name.

Any get_property(const std::string &device_name, const std::string &name, const AnyMap &arguments) const

Gets properties related to device behaviour.

The method extracts information that can be set via the set_property method.

Parameters:
  • device_name – Name of a device to get a property value.

  • name – Property name.

  • arguments – Additional arguments to get a property.

Returns:

Value of a property corresponding to the property name.

inline Any get_property(const std::string &name) const

Gets properties related to core behaviour.

The method extracts information that can be set via the set_property method.

Parameters:

name – Property name.

Returns:

Value of a property corresponding to the property name.

template<typename T, PropertyMutability M>
inline T get_property(const std::string &device_name, const ov::Property<T, M> &property) const

Gets properties related to device behaviour.

The method is needed to request common device or system properties, such as the device name, temperature, and other device-specific values.

Template Parameters:
  • T – Type of a returned value.

  • M – Property mutability.

Parameters:
  • device_name – Name of a device to get a property value.

  • property – Property object.

Returns:

Property value.

template<typename T, PropertyMutability M>
inline T get_property(const std::string &device_name, const ov::Property<T, M> &property, const AnyMap &arguments) const

Gets properties related to device behaviour.

The method is needed to request common device or system properties, such as the device name, temperature, and other device-specific values.

Template Parameters:
  • T – Type of a returned value.

  • M – Property mutability.

Parameters:
  • device_name – Name of a device to get a property value.

  • property – Property object.

  • arguments – Additional arguments to get a property.

Returns:

Property value.

template<typename T, PropertyMutability M, typename ...Args>
inline util::EnableIfAllStringAny<T, Args...> get_property(const std::string &device_name, const ov::Property<T, M> &property, Args&&... args) const

Gets properties related to device behaviour.

The method is needed to request common device or system properties, such as the device name, temperature, and other device-specific values.

Template Parameters:
  • T – Type of a returned value.

  • M – Property mutability.

  • Args – Set of additional arguments ended with property object variable.

Parameters:
  • device_name – Name of a device to get a property value.

  • property – Property object.

  • args – Optional pack of pairs: (argument name, argument value) ended with property object.

Returns:

Property value.

std::vector<std::string> get_available_devices() const

Returns devices available for inference. Core objects go over all registered plugins and ask about available devices.

Returns:

A vector of devices. The devices are returned as { CPU, GPU.0, GPU.1, NPU }. If there is more than one device of a specific type, they are enumerated with the .# suffix. Such enumerated device can later be used as a device name in all Core methods like Core::compile_model, Core::query_model, Core::set_property and so on.
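
The enumerated names described above can be split into a device type and an index with plain string handling. This is only a sketch: split_device_name is a hypothetical helper, and it handles just the TYPE and TYPE.# forms shown in the example list.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: splits "GPU.1" into {"GPU", "1"}.
// A name without the ".#" suffix yields an empty second element.
std::pair<std::string, std::string> split_device_name(const std::string& name) {
    const auto dot = name.find('.');
    if (dot == std::string::npos) {
        return {name, ""};
    }
    return {name.substr(0, dot), name.substr(dot + 1)};
}
```

For instance, a caller could use the first element to count how many devices of each type were reported.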

void register_plugin(const std::string &plugin, const std::string &device_name, const ov::AnyMap &config = {})

Register a new device and plugin that enables this device inside OpenVINO Runtime.

Note

For security purposes, it is suggested to specify an absolute path when registering a plugin.

Parameters:
  • plugin – Path (absolute or relative) or name of a plugin. Depending on platform, plugin is wrapped with shared library suffix and prefix to identify library full name. For example, on Linux platform, plugin name specified as plugin_name will be wrapped as libplugin_name.so. Plugin search algorithm:

    • If plugin points to an exact library path (absolute or relative), it will be used.

    • If plugin specifies file name (libplugin_name.so) or plugin name (plugin_name), it will be searched by file name (libplugin_name.so) in CWD or in paths pointed by PATH/LD_LIBRARY_PATH/DYLD_LIBRARY_PATH environment variables depending on the platform.

  • device_name – Device name to register a plugin for.

  • config – Plugin configuration options
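
The Linux naming example above can be illustrated with a few lines of string handling. This is a sketch under assumptions: linux_library_name is a hypothetical helper, it covers only the Linux prefix and suffix mentioned in the text, and the way the runtime itself detects an already complete file name may differ.

```cpp
#include <string>

// Hypothetical helper mirroring the Linux example: "plugin_name" -> "libplugin_name.so".
// A name that already has both the "lib" prefix and the ".so" suffix is returned unchanged.
std::string linux_library_name(const std::string& plugin) {
    const std::string prefix = "lib";
    const std::string suffix = ".so";
    const bool has_prefix = plugin.compare(0, prefix.size(), prefix) == 0;
    const bool has_suffix = plugin.size() >= suffix.size() &&
                            plugin.compare(plugin.size() - suffix.size(), suffix.size(), suffix) == 0;
    if (has_prefix && has_suffix) {
        return plugin;
    }
    return prefix + plugin + suffix;
}
```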

void unload_plugin(const std::string &device_name)

Unloads the previously loaded plugin identified by device_name from OpenVINO Runtime. The method is needed to remove loaded plugin instance and free its resources. If plugin for a specified device has not been created before, the method throws an exception.

Note

This method does not remove plugin from the plugins known to OpenVINO Core object.

Parameters:

device_name – Device name identifying plugin to remove from OpenVINO Runtime.

void register_plugins(const std::string &xml_config_file)

Registers a device plugin to the OpenVINO Runtime Core instance using an XML configuration file with plugins description.

The XML file has the following structure:

<ie>
    <plugins>
        <plugin name="" location="">
            <extensions>
                <extension location=""/>
            </extensions>
            <properties>
                <property key="" value=""/>
            </properties>
        </plugin>
    </plugins>
</ie>

  • name identifies name of a device enabled by a plugin.

  • location specifies absolute path to dynamic library with a plugin. The path can also be relative to XML file directory. It allows having common config for different systems with different configurations.

  • properties are set to a plugin via the ov::Core::set_property method.

  • extensions are set to a plugin via the ov::Core::add_extension method.

Note

For security purposes, it is suggested to specify an absolute path when registering a plugin.

Parameters:

xml_config_file – A path to .xml file with plugins to register.

RemoteContext create_context(const std::string &device_name, const AnyMap &remote_properties)

Creates a new remote shared context object on the specified accelerator device using specified plugin-specific low-level device API parameters (device handle, pointer, context, etc.).

Parameters:
  • device_name – Name of a device to create a new shared context on.

  • remote_properties – Map of device-specific shared context remote properties.

Returns:

Reference to a created remote context.

template<typename ...Properties>
inline util::EnableIfAllStringAny<RemoteContext, Properties...> create_context(const std::string &device_name, Properties&&... remote_properties)

Creates a new shared context object on the specified accelerator device using the specified plugin-specific low-level device API properties (device handle, pointer, etc.).

Template Parameters:

Properties – Should be the pack of std::pair<std::string, ov::Any> types.

Parameters:
  • device_name – Name of a device to create new shared context on.

  • remote_properties – Pack of device-specific shared context remote properties.

Returns:

A created remote context.

RemoteContext get_default_context(const std::string &device_name)

Gets a pointer to default (plugin-supplied) shared context object for the specified accelerator device.

Parameters:

device_name – Name of a device to get a default shared context from.

Returns:

Reference to a default remote context.

class Cancelled : public ov::Exception
#include <exception.hpp>

Thrown in case of cancelled asynchronous operation.

class Busy : public ov::Exception
#include <exception.hpp>

Thrown in case of calling the InferRequest methods while the request is busy with compute operation.

class InferRequest
#include <infer_request.hpp>

This is a class of infer request that can be run in asynchronous or synchronous manners.

Public Functions

InferRequest() = default

Default constructor.

InferRequest(const InferRequest &other) = default

Default copy constructor.

Parameters:

other – Another InferRequest object.

InferRequest &operator=(const InferRequest &other) = default

Default copy assignment operator.

Parameters:

other – Another InferRequest object.

Returns:

Reference to the current object.

InferRequest(InferRequest &&other) = default

Default move constructor.

Parameters:

other – Another InferRequest object.

InferRequest &operator=(InferRequest &&other) = default

Default move assignment operator.

Parameters:

other – Another InferRequest object.

Returns:

Reference to the current object.

~InferRequest()

Destructor that preserves unloading order of implementation object and reference to the library.

Note

To preserve destruction order inside the default generated assignment operator, _impl is stored before _so. The destructor explicitly removes the implementation object before the reference to the library.

void set_tensor(const std::string &tensor_name, const Tensor &tensor)

Sets an input/output tensor to infer on.

Parameters:
  • tensor_name – Name of the input or output tensor.

  • tensor – Reference to the tensor. The element_type and shape of the tensor must match the model’s input/output element_type and size.

void set_tensor(const ov::Output<const ov::Node> &port, const Tensor &tensor)

Sets an input/output tensor to infer.

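Parameters:
  • port – Port of the input or output tensor.

  • tensor – Reference to the tensor. The element_type and shape of the tensor must match the model’s input/output element_type and size.
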
void set_tensor(const ov::Output<ov::Node> &port, const Tensor &tensor)

Sets an input/output tensor to infer.

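Parameters:
  • port – Port of the input or output tensor.

  • tensor – Reference to the tensor. The element_type and shape of the tensor must match the model’s input/output element_type and size.
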
void set_tensors(const std::string &tensor_name, const std::vector<Tensor> &tensors)

Sets a batch of tensors for input data to infer by tensor name. Model input must have batch dimension, and the number of tensors must match the batch size. The current version supports setting tensors to model inputs only. If tensor_name is associated with output (or any other non-input node), an exception is thrown.

Parameters:
  • tensor_name – Name of the input tensor.

  • tensors – Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors must match the input size.
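
The shape constraints above can be expressed as a small pre-check. This is a sketch under two loudly stated assumptions that the text does not confirm: the batch dimension is dimension 0, and each supplied tensor carries a batch of 1. Shape is a local stand-in and batch_matches is a hypothetical helper.

```cpp
#include <cstddef>
#include <vector>

// Local stand-in for a tensor shape, so the sketch builds without OpenVINO headers.
using Shape = std::vector<std::size_t>;

// Hypothetical check. ASSUMES the batch dimension is dimension 0 and that
// every supplied tensor has a batch of 1.
bool batch_matches(const Shape& input_shape, const std::vector<Shape>& tensor_shapes) {
    // The number of tensors must match the batch size of the model input.
    if (input_shape.empty() || tensor_shapes.size() != input_shape[0]) {
        return false;
    }
    for (const Shape& shape : tensor_shapes) {
        if (shape.size() != input_shape.size() || shape[0] != 1) {
            return false;
        }
        // All non-batch dimensions must equal those of the model input.
        for (std::size_t i = 1; i < shape.size(); ++i) {
            if (shape[i] != input_shape[i]) {
                return false;
            }
        }
    }
    return true;
}
```

Running such a check before calling set_tensors is optional; the method itself is documented to throw when the constraints are violated.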

void set_tensors(const ov::Output<const ov::Node> &port, const std::vector<Tensor> &tensors)

Sets a batch of tensors for input data to infer by input port. Model input must have batch dimension, and the number of tensors must match the batch size. The current version supports setting tensors to model inputs only. If port is associated with output (or any other non-input node), an exception is thrown.

Parameters:
  • port – Port of the input tensor.

  • tensors – Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors must match the input size.

void set_input_tensor(size_t idx, const Tensor &tensor)

Sets an input tensor to infer.

Parameters:
  • idx – Index of the input tensor. If idx is greater than the number of model inputs, an exception is thrown.

  • tensor – Reference to the tensor. The element_type and shape of the tensor must match the model’s input/output element_type and size.

void set_input_tensor(const Tensor &tensor)

Sets an input tensor to infer models with single input.

Note

If model has several inputs, an exception is thrown.

Parameters:

tensor – Reference to the input tensor.

void set_input_tensors(const std::vector<Tensor> &tensors)

Sets a batch of tensors for single input data. Model input must have batch dimension, and the number of tensors must match the batch size.

Parameters:

tensors – Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors must match the input size.

void set_input_tensors(size_t idx, const std::vector<Tensor> &tensors)

Sets a batch of tensors for input data to infer by input index. Model input must have batch dimension, and the number of tensors must match the batch size.

Parameters:
  • idx – Index of the input tensor.

  • tensors – Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors must match the input size.

void set_output_tensor(size_t idx, const Tensor &tensor)

Sets an output tensor to infer.

Note

The index of the output is preserved across ov::Model, ov::CompiledModel, and ov::InferRequest.

Parameters:
  • idx – Index of the output tensor.

  • tensor – Reference to the output tensor. The type of the tensor must match the model output element type and shape.

void set_output_tensor(const Tensor &tensor)

Sets an output tensor to infer models with single output.

Note

If model has several outputs, an exception is thrown.

Parameters:

tensor – Reference to the output tensor.

Tensor get_tensor(const std::string &tensor_name)

Gets an input/output tensor for inference by tensor name.

Parameters:

tensor_name – Name of a tensor to get.

Returns:

The tensor with name tensor_name. If the tensor is not found, an exception is thrown.

Tensor get_tensor(const ov::Output<const ov::Node> &port)

Gets an input/output tensor for inference.

Note

If the tensor with the specified port is not found, an exception is thrown.

Parameters:

port – Port of the tensor to get.

Returns:

Tensor for the port port.

Tensor get_tensor(const ov::Output<ov::Node> &port)

Gets an input/output tensor for inference.

Note

If the tensor with the specified port is not found, an exception is thrown.

Parameters:

port – Port of the tensor to get.

Returns:

Tensor for the port port.

Tensor get_input_tensor(size_t idx)

Gets an input tensor for inference.

Parameters:

idx – Index of the tensor to get.

Returns:

Tensor with the input index idx. If the tensor with the specified idx is not found, an exception is thrown.

Tensor get_input_tensor()

Gets an input tensor for inference.

Returns:

The input tensor for the model. If model has several inputs, an exception is thrown.

Tensor get_output_tensor(size_t idx)

Gets an output tensor for inference.

Parameters:

idx – Index of the tensor to get.

Returns:

Tensor with the output index idx. If the tensor with the specified idx is not found, an exception is thrown.

Tensor get_output_tensor()

Gets an output tensor for inference.

Returns:

Output tensor for the model. If model has several outputs, an exception is thrown.

void infer()

Infers specified input(s) in synchronous mode.

Note

It blocks all methods of InferRequest while the request is ongoing (running or waiting in a queue). Calling any method leads to throwing the ov::Busy exception.

void cancel()

Cancels inference request.

std::vector<ProfilingInfo> get_profiling_info() const

Queries performance measures per layer to identify the most time-consuming operation.

Note

Not all plugins provide meaningful data.

Returns:

Vector of profiling information for operations in a model.

void start_async()

Starts inference of specified input(s) in asynchronous mode.

Note

It returns immediately, and inference also starts immediately. Calling any method while the request is in a running state leads to throwing the ov::Busy exception.

void wait()

Waits for the result to become available. Blocks until the result becomes available.

bool wait_for(const std::chrono::milliseconds timeout)

Waits for the result to become available. Blocks until the specified timeout has elapsed or the result becomes available, whichever comes first.

Parameters:

timeout – Maximum duration, in milliseconds, to block for.

Returns:

True if the inference request is ready; false, otherwise.

void set_callback(std::function<void(std::exception_ptr)> callback)

Sets a callback std::function that is called on success or failure of an asynchronous request.

Warning

Do not capture strong references to OpenVINO runtime objects in the callback. In particular, the following objects should not be captured:

  • ov::InferRequest

  • ov::ExecutableNetwork

  • ov::Core

Because these objects implement the shared reference concept, do not capture them by value. Doing so can lead to memory leaks or undefined behaviour. Use weak references or pointers instead.

Parameters:

callback – Callback object that is called when inference finishes.
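
The warning about strong references can be illustrated with standard-library types only. In this sketch ToyRequest is a hypothetical stand-in for an object that stores its own completion callback (it is not the OpenVINO class), and install_weak_callback is a hypothetical helper. The callback type matches the one taken by set_callback.

```cpp
#include <exception>
#include <functional>
#include <memory>

// Hypothetical stand-in for an object that owns a completion callback.
struct ToyRequest {
    std::function<void(std::exception_ptr)> callback;
    int completed = 0;
};

// Installs a callback that holds only a weak reference to its owner,
// so the stored callback does not keep the owner alive.
void install_weak_callback(const std::shared_ptr<ToyRequest>& request) {
    std::weak_ptr<ToyRequest> weak = request;
    request->callback = [weak](std::exception_ptr error) {
        if (error) {
            return;  // this sketch ignores failures; a real handler would report them
        }
        if (auto strong = weak.lock()) {
            ++strong->completed;  // the owner is still alive, safe to use
        }
    };
}
```

Had the lambda captured the shared_ptr by value, the object would hold a callback that in turn holds the object, and the reference count could never drop to zero.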

std::vector<VariableState> query_state()

Gets state control interface for the given infer request.

State control is essential for recurrent models.

Returns:

Vector of Variable State objects.

void reset_state()

Resets all internal variable states for relevant infer request to a value specified as default for the corresponding ReadValue node.

CompiledModel get_compiled_model()

Returns the compiled model that created this inference request.

Returns:

Compiled model object.

bool operator!() const noexcept

Checks if the current InferRequest object is not initialized.

Returns:

True if the current InferRequest object is not initialized; false, otherwise.

explicit operator bool() const noexcept

Checks if the current InferRequest object is initialized.

Returns:

True if the current InferRequest object is initialized; false, otherwise.

bool operator!=(const InferRequest &other) const noexcept

Compares whether this request wraps the same impl underneath.

Parameters:

other – Another inference request.

Returns:

True if the current InferRequest object does not wrap the same impl as the operator’s arg.

bool operator==(const InferRequest &other) const noexcept

Compares whether this request wraps the same impl underneath.

Parameters:

other – Another inference request.

Returns:

True if the current InferRequest object wraps the same impl as the operator’s arg.
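
The initialization checks and comparison operators above follow a common handle pattern. The class below is a hypothetical stand-in that assumes a shared implementation pointer; the actual InferRequest internals are not shown in this reference.

```cpp
#include <memory>
#include <utility>

// Hypothetical handle type that wraps a shared implementation object.
class Handle {
public:
    Handle() = default;
    explicit Handle(std::shared_ptr<int> impl) : impl_(std::move(impl)) {}

    // True when no implementation is attached (not initialized).
    bool operator!() const noexcept { return !impl_; }
    // True when an implementation is attached (initialized).
    explicit operator bool() const noexcept { return static_cast<bool>(impl_); }
    // Two handles are equal when they wrap the same implementation object.
    bool operator==(const Handle& other) const noexcept { return impl_ == other.impl_; }
    bool operator!=(const Handle& other) const noexcept { return !(*this == other); }

private:
    std::shared_ptr<int> impl_;
};
```

In this model a copy compares equal to its source because both share one implementation, while two separately created handles compare unequal even if their contents match.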

class RemoteContext
#include <remote_context.hpp>

This class represents an abstraction for a remote (non-CPU) accelerator device-specific inference context. Such a context represents a scope on the device within which compiled models and remote memory tensors can exist, function, and exchange data.

Subclassed by ov::intel_gpu::ocl::ClContext

Public Functions

RemoteContext() = default

Default constructor.

RemoteContext(const RemoteContext &other) = default

Default copy constructor.

Parameters:

other – Another RemoteContext object.

RemoteContext &operator=(const RemoteContext &other) = default

Default copy assignment operator.

Parameters:

other – Another RemoteContext object.

Returns:

Reference to the current object.

RemoteContext(RemoteContext &&other) = default

Default move constructor.

Parameters:

other – Another RemoteContext object.

RemoteContext &operator=(RemoteContext &&other) = default

Default move assignment operator.

Parameters:

other – Another RemoteContext object.

Returns:

Reference to the current object.

operator bool() const noexcept

Checks if current RemoteContext object is initialized.

Returns:

True if the current RemoteContext object is initialized; false, otherwise.

~RemoteContext()

Destructor that preserves unloading order of implementation object and reference to the library.

template<typename T>
inline bool is() const noexcept

Checks if the RemoteContext object can be cast to the type T.

Template Parameters:

T – Type to be checked. Must represent a class derived from RemoteContext.

Returns:

True if this object can be dynamically cast to the type T*; false, otherwise.

template<typename T>
inline const T as() const

Casts this RemoteContext object to the type T.

Template Parameters:

T – Type to cast to. Must represent a class derived from RemoteContext.

Returns:

T Object.

RemoteTensor create_tensor(const element::Type &type, const Shape &shape, const AnyMap &params = {})

Allocates memory tensor in device memory or wraps user-supplied memory handle using the specified tensor description and low-level device-specific parameters. Returns a pointer to the object that implements the RemoteTensor interface.

Parameters:
  • type – Defines the element type of the tensor.

  • shape – Defines the shape of the tensor.

  • params – Map of the low-level tensor object parameters.

Returns:

Pointer to a plugin object that implements the RemoteTensor interface.

AnyMap get_params() const

Returns a map of device-specific parameters required for low-level operations with the underlying object. Parameters include device/context handles, access flags, etc. Content of the returned map depends on a remote execution context that is currently set on the device (working scenario). Abstract method.

Returns:

A map of name/parameter elements.

Tensor create_host_tensor(const element::Type type, const Shape &shape)

This method is used to create a host tensor object friendly for the device in current context. For example, GPU context may allocate USM host memory (if corresponding extension is available), which could be more efficient than regular host memory.

Parameters:
  • type – Defines the element type of the tensor.

  • shape – Defines the shape of the tensor.

Returns:

A tensor instance with device friendly memory.

Public Static Functions

static void type_check(const RemoteContext &remote_context, const std::map<std::string, std::vector<std::string>> &type_info = {})

Internal method: checks remote type.

Parameters:
  • remote_context – Remote context whose type is checked.

  • type_info – Map with remote object runtime info.

Throws:

Exception – if type check with the specified parameters failed.

class RemoteTensor : public ov::Tensor
#include <remote_tensor.hpp>

Remote memory access and interoperability API.

Subclassed by ov::intel_gpu::ocl::ClBufferTensor, ov::intel_gpu::ocl::ClImage2DTensor, ov::intel_gpu::ocl::USMTensor

Public Functions

void *data(const element::Type) = delete

Access to host memory is not available for RemoteTensor. To access a device-specific memory, cast to a specific RemoteTensor derived object and work with its properties or parse device memory properties via RemoteTensor::get_params.

Returns:

Nothing, throws an exception.

ov::AnyMap get_params() const

Returns a map of device-specific parameters required for low-level operations with underlying object. Parameters include device/context/surface/buffer handles, access flags, etc. Content of the returned map depends on remote execution context that is currently set on the device (working scenario). Abstract method.

Returns:

A map of name/parameter elements.

Public Static Functions

static void type_check(const Tensor &tensor, const std::map<std::string, std::vector<std::string>> &type_info = {})

Checks OpenVINO remote type.

Parameters:
  • tensor – Tensor whose type is checked.

  • type_info – Map with remote object runtime info.

Throws:

Exception – if type check with specified parameters failed.

class VariableState
#include <variable_state.hpp>

VariableState class.

Public Functions

VariableState() = default

Default constructor.

~VariableState()

Destructor that preserves unloading order of implementation object and reference to the library.

void reset()

Resets internal variable state for relevant infer request to a value specified as default for the corresponding ReadValue node.

std::string get_name() const

Gets the name of the current variable state. If length of an array is not enough, the name is truncated by len, null terminator is inserted as well. variable_id from the corresponding ReadValue is used as variable state name.

Returns:

A string representing state name.

Tensor get_state() const

Returns the value of the variable state.

Returns:

A tensor representing a state.

void set_state(const Tensor &state)

Sets the new state for the next inference.

Parameters:

state – The current state to set.

struct ProfilingInfo
#include <profiling_info.hpp>

Represents basic inference profiling information per operation.

If the operation is executed using tiling, the sum of the times for all tiles is reported as the total execution time. Due to parallel execution, the total execution time for all nodes might be greater than the total inference time.

Public Types

enum class Status

Defines the general status of a node.

Values:

enumerator NOT_RUN

A node is not executed.

enumerator OPTIMIZED_OUT

A node is optimized out during graph optimization phase.

enumerator EXECUTED

A node is executed.

Public Members

Status status

Defines the node status.

std::chrono::microseconds real_time

The absolute time, in microseconds, that the node ran (in total).

std::chrono::microseconds cpu_time

The net host CPU time that the node ran.

std::string node_name

Name of a node.

std::string exec_type

Execution type of a unit.

std::string node_type

Node type.
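
A vector of such records can be scanned for the slowest executed node. In this sketch the struct is a local mirror of the fields documented above, declared so that the snippet builds without OpenVINO headers, and slowest_executed is a hypothetical helper rather than part of the API.

```cpp
#include <chrono>
#include <string>
#include <vector>

// Local mirror of the documented ProfilingInfo members.
struct ProfilingInfo {
    enum class Status { NOT_RUN, OPTIMIZED_OUT, EXECUTED };
    Status status;
    std::chrono::microseconds real_time;
    std::chrono::microseconds cpu_time;
    std::string node_name;
    std::string exec_type;
    std::string node_type;
};

// Hypothetical helper: returns a pointer to the executed node with the largest
// real_time, or nullptr when no node has the EXECUTED status.
const ProfilingInfo* slowest_executed(const std::vector<ProfilingInfo>& infos) {
    const ProfilingInfo* slowest = nullptr;
    for (const auto& info : infos) {
        if (info.status != ProfilingInfo::Status::EXECUTED) {
            continue;  // skip nodes that did not run or were optimized out
        }
        if (slowest == nullptr || info.real_time > slowest->real_time) {
            slowest = &info;
        }
    }
    return slowest;
}
```

The returned pointer refers into the vector passed in, so it is valid only while that vector is alive and unmodified.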