openvino.runtime.InferRequest

class openvino.runtime.InferRequest(other: openvino._pyopenvino.InferRequest)

Bases: openvino.runtime.utils.data_helpers.wrappers._InferRequestWrapper

InferRequest class represents an infer request which can be run in synchronous or asynchronous mode.

__init__(self: openvino._pyopenvino.InferRequest, other: openvino._pyopenvino.InferRequest) None

Methods

__delattr__(name, /)

Implement delattr(self, name).

__dir__()

Default dir() implementation.

__eq__(value, /)

Return self==value.

__format__(format_spec, /)

Default object formatter.

__ge__(value, /)

Return self>=value.

__getattribute__(name, /)

Return getattr(self, name).

__gt__(value, /)

Return self>value.

__hash__()

Return hash(self).

__init__(self, other)

__init_subclass__

This method is called when a class is subclassed.

__le__(value, /)

Return self<=value.

__lt__(value, /)

Return self<value.

__ne__(value, /)

Return self!=value.

__new__(**kwargs)

__reduce__()

Helper for pickle.

__reduce_ex__(protocol, /)

Helper for pickle.

__repr__(self)

__setattr__(name, value, /)

Implement setattr(self, name, value).

__sizeof__()

Size of object in memory, in bytes.

__str__()

Return str(self).

__subclasshook__

Abstract classes can override this to customize issubclass().

_is_single_input()

cancel(self)

Cancels inference request.

get_compiled_model()

Gets the compiled model this InferRequest is using.

get_input_tensor(*args, **kwargs)

Overloaded function.

get_output_tensor(*args, **kwargs)

Overloaded function.

get_profiling_info(self)

Queries performance, measured per layer, to get feedback on the most time-consuming operation. Not all plugins provide meaningful data.

get_tensor(*args, **kwargs)

Overloaded function.

infer([inputs, share_inputs, share_outputs, ...])

Infers specified input(s) in synchronous mode.

query_state(self)

Gets state control interface for given infer request.

reset_state(self)

Resets all internal variable states of the infer request to the default value specified in the corresponding ReadValue node.

set_callback(self, callback, userdata)

Sets a callback function that will be called on success or failure of asynchronous InferRequest.

set_input_tensor(*args, **kwargs)

Overloaded function.

set_input_tensors(*args, **kwargs)

Overloaded function.

set_output_tensor(*args, **kwargs)

Overloaded function.

set_output_tensors(self, outputs)

Set output tensors using given indexes.

set_tensor(*args, **kwargs)

Overloaded function.

set_tensors(*args, **kwargs)

Overloaded function.

start_async([inputs, userdata, ...])

Starts inference of specified input(s) in asynchronous mode.

wait(self)

Waits for the result to become available.

wait_for(self, timeout)

Waits for the result to become available.

Attributes

input_tensors

Gets all input tensors of this InferRequest.

inputs

Gets all input tensors of this InferRequest.

latency

Gets latency of this InferRequest.

model_inputs

Gets all inputs of a compiled model which was used to create this InferRequest.

model_outputs

Gets all outputs of a compiled model which was used to create this InferRequest.

output_tensors

Gets all output tensors of this InferRequest.

outputs

Gets all output tensors of this InferRequest.

profiling_info

Performance is measured per layer to get feedback on the most time-consuming operation.

results

Gets all output tensors of this InferRequest.

userdata

Gets currently held userdata.

__class__

alias of pybind11_builtins.pybind11_type

__delattr__(name, /)

Implement delattr(self, name).

__dir__()

Default dir() implementation.

__eq__(value, /)

Return self==value.

__format__(format_spec, /)

Default object formatter.

__ge__(value, /)

Return self>=value.

__getattribute__(name, /)

Return getattr(self, name).

__gt__(value, /)

Return self>value.

__hash__()

Return hash(self).

__init__(self: openvino._pyopenvino.InferRequest, other: openvino._pyopenvino.InferRequest) None
__init_subclass__()

This method is called when a class is subclassed.

The default implementation does nothing. It may be overridden to extend subclasses.

__le__(value, /)

Return self<=value.

__lt__(value, /)

Return self<value.

__ne__(value, /)

Return self!=value.

__new__(**kwargs)
__reduce__()

Helper for pickle.

__reduce_ex__(protocol, /)

Helper for pickle.

__repr__(self: openvino._pyopenvino.InferRequest) str
__setattr__(name, value, /)

Implement setattr(self, name, value).

__sizeof__()

Size of object in memory, in bytes.

__str__()

Return str(self).

__subclasshook__()

Abstract classes can override this to customize issubclass().

This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).

_is_single_input() bool
cancel(self: openvino._pyopenvino.InferRequest) None

Cancels inference request.

get_compiled_model() openvino.runtime.ie_api.CompiledModel

Gets the compiled model this InferRequest is using.

Returns

a CompiledModel object

Return type

openvino.runtime.ie_api.CompiledModel

get_input_tensor(*args, **kwargs)

Overloaded function.

  1. get_input_tensor(self: openvino._pyopenvino.InferRequest, index: int) -> openvino._pyopenvino.Tensor

    Gets input tensor of InferRequest.

    param idx

    An index of tensor to get.

    type idx

    int

    return

    An input Tensor with index idx for the model. If a tensor with the specified idx is not found, an exception is thrown.

    rtype

    openvino.runtime.Tensor

  2. get_input_tensor(self: openvino._pyopenvino.InferRequest) -> openvino._pyopenvino.Tensor

    Gets input tensor of InferRequest.

    return

    An input Tensor for the model. If model has several inputs, an exception is thrown.

    rtype

    openvino.runtime.Tensor

get_output_tensor(*args, **kwargs)

Overloaded function.

  1. get_output_tensor(self: openvino._pyopenvino.InferRequest, index: int) -> openvino._pyopenvino.Tensor

    Gets output tensor of InferRequest.

    param idx

    An index of tensor to get.

    type idx

    int

    return

    An output Tensor with index idx for the model. If a tensor with specified idx is not found, an exception is thrown.

    rtype

    openvino.runtime.Tensor

  2. get_output_tensor(self: openvino._pyopenvino.InferRequest) -> openvino._pyopenvino.Tensor

    Gets output tensor of InferRequest.

    return

    An output Tensor for the model. If model has several outputs, an exception is thrown.

    rtype

    openvino.runtime.Tensor

get_profiling_info(self: openvino._pyopenvino.InferRequest) List[ov::ProfilingInfo]

Queries performance, measured per layer, to get feedback on the most time-consuming operation. Not all plugins provide meaningful data.

GIL is released while running this function.

Returns

List of profiling information for operations in model.

Return type

List[openvino.runtime.ProfilingInfo]

get_tensor(*args, **kwargs)

Overloaded function.

  1. get_tensor(self: openvino._pyopenvino.InferRequest, name: str) -> openvino._pyopenvino.Tensor

    Gets input/output tensor of InferRequest.

    param name

    Name of tensor to get.

    type name

    str

    return

    A Tensor object with given name.

    rtype

    openvino.runtime.Tensor

  2. get_tensor(self: openvino._pyopenvino.InferRequest, port: openvino._pyopenvino.ConstOutput) -> openvino._pyopenvino.Tensor

    Gets input/output tensor of InferRequest.

    param port

    Port of tensor to get.

    type port

    openvino.runtime.ConstOutput

    return

    A Tensor object for the port.

    rtype

    openvino.runtime.Tensor

  3. get_tensor(self: openvino._pyopenvino.InferRequest, port: openvino._pyopenvino.Output) -> openvino._pyopenvino.Tensor

    Gets input/output tensor of InferRequest.

    param port

    Port of tensor to get.

    type port

    openvino.runtime.Output

    return

    A Tensor object for the port.

    rtype

    openvino.runtime.Tensor

infer(inputs: Optional[Any] = None, share_inputs: bool = False, share_outputs: bool = False, *, shared_memory: Optional[Any] = None) openvino.runtime.utils.data_helpers.wrappers.OVDict

Infers specified input(s) in synchronous mode.

Blocks all methods of InferRequest while the request is running; calling any method will throw an exception.

The allowed types of keys in the inputs dictionary are:

  1. int

  2. str

  3. openvino.runtime.ConstOutput

The allowed types of values in the inputs are:

  1. numpy.ndarray and all the types that are castable to it, e.g. torch.Tensor

  2. openvino.runtime.Tensor

Can be called with a single openvino.runtime.Tensor or numpy.ndarray; this works only with one-input models. If the model has more inputs, an exception is thrown.

Parameters
  • inputs (Any, optional) – Data to be set on input tensors.

  • share_inputs (bool, optional) –

    Enables share_inputs mode. Controls memory usage on inference’s inputs.

    If set to False, the data dispatcher will safely copy data to existing Tensors (including up- or down-casting according to data type and resizing of the input Tensor) and keeps Tensor inputs “as-is”.

    If set to True, the data dispatcher tries to provide “zero-copy” Tensors for every input in the form of:

      • numpy.ndarray and all the types that are castable to it, e.g. torch.Tensor

    Data that is going to be copied:

      • numpy.ndarray which is not C contiguous and/or not writable (WRITEABLE flag is set to False)

      • inputs whose data types do not match the InferRequest’s inputs

      • inputs that should be in BF16 data type

      • scalar inputs (i.e. np.float_/str/bytes/int/float)

      • lists of simple data types (i.e. str/bytes/int/float)

    Tensor inputs are kept “as-is”.

    Note: Use with extra care, shared data can be modified during runtime! Note: Using share_inputs may result in extra memory overhead.

    Default value: False

  • share_outputs (bool, optional) –

    Enables share_outputs mode. Controls memory usage on inference’s outputs.

    If set to False outputs will safely copy data to numpy arrays.

    If set to True the data will be returned in form of views of output Tensors. This mode still returns the data in format of numpy arrays but lifetime of the data is connected to OpenVINO objects.

    Note: Use with extra care, shared data can be modified or lost during runtime!

    Default value: False

  • shared_memory (bool, optional) –

    Deprecated. Works like share_inputs mode.

    If not specified, function uses share_inputs value.

    Note: Will be removed in the 2024.0 release! Note: This is a keyword-only argument.

    Default value: None

Returns

Dictionary of results from output tensors with port/int/str keys.

Return type

OVDict

property input_tensors

Gets all input tensors of this InferRequest.

Return type

List[openvino.runtime.Tensor]

property inputs

Gets all input tensors of this InferRequest.

Return type

List[openvino.runtime.Tensor]

property latency

Gets latency of this InferRequest.

Return type

float

property model_inputs

Gets all inputs of a compiled model which was used to create this InferRequest.

Return type

List[openvino.runtime.ConstOutput]

property model_outputs

Gets all outputs of a compiled model which was used to create this InferRequest.

Return type

List[openvino.runtime.ConstOutput]

property output_tensors

Gets all output tensors of this InferRequest.

Return type

List[openvino.runtime.Tensor]

property outputs

Gets all output tensors of this InferRequest.

Return type

List[openvino.runtime.Tensor]

property profiling_info

Performance is measured per layer to get feedback on the most time-consuming operation. Not all plugins provide meaningful data!

GIL is released while running this function.

Returns

List of profiling information for operations in the model.

Return type

List[openvino.runtime.ProfilingInfo]

query_state(self: openvino._pyopenvino.InferRequest) List[ov::VariableState]

Gets state control interface for given infer request.

GIL is released while running this function.

Returns

List of VariableState objects.

Return type

List[openvino.runtime.VariableState]

reset_state(self: openvino._pyopenvino.InferRequest) None

Resets all internal variable states of the infer request to the default value specified in the corresponding ReadValue node.

property results: openvino.runtime.utils.data_helpers.wrappers.OVDict

Gets all output tensors of this InferRequest.

Returns

Dictionary of results from output tensors with ports as keys.

Return type

Dict[openvino.runtime.ConstOutput, numpy.array]

set_callback(self: openvino._pyopenvino.InferRequest, callback: function, userdata: object) None

Sets a callback function that will be called on success or failure of asynchronous InferRequest.

Parameters
  • callback (function) – Function defined in Python.

  • userdata (Any) – Any data that will be passed inside callback call.

set_input_tensor(*args, **kwargs)

Overloaded function.

  1. set_input_tensor(self: openvino._pyopenvino.InferRequest, index: int, tensor: openvino._pyopenvino.Tensor) -> None

    Sets input tensor of InferRequest.

    param idx

    Index of input tensor. If idx is greater than number of model’s inputs, an exception is thrown.

    type idx

    int

    param tensor

    Tensor object. The element_type and shape of a tensor must match the model’s input element_type and shape.

    type tensor

    openvino.runtime.Tensor

  2. set_input_tensor(self: openvino._pyopenvino.InferRequest, tensor: openvino._pyopenvino.Tensor) -> None

    Sets input tensor of InferRequest with single input. If model has several inputs, an exception is thrown.

    param tensor

    Tensor object. The element_type and shape of a tensor must match the model’s input element_type and shape.

    type tensor

    openvino.runtime.Tensor

set_input_tensors(*args, **kwargs)

Overloaded function.

  1. set_input_tensors(self: openvino._pyopenvino.InferRequest, inputs: dict) -> None

    Set input tensors using given indexes.

    param inputs

    Data to set on input tensors.

    type inputs

    Dict[int, openvino.runtime.Tensor]

  2. set_input_tensors(self: openvino._pyopenvino.InferRequest, tensors: List[openvino._pyopenvino.Tensor]) -> None

    Sets a batch of tensors for single input data. The model input needs to have a batch dimension, and the number of tensors must match the batch size.

    param tensors

    Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors needs to match with input’s size.

    type tensors

    List[openvino.runtime.Tensor]

  3. set_input_tensors(self: openvino._pyopenvino.InferRequest, idx: int, tensors: List[openvino._pyopenvino.Tensor]) -> None

    Sets a batch of tensors for single input data to infer by index. The model input needs to have a batch dimension, and the number of tensors must match the batch size.

    param idx

    Index of input tensor.

    type idx

    int

    param tensors

    Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors needs to match with input’s size.

set_output_tensor(*args, **kwargs)

Overloaded function.

  1. set_output_tensor(self: openvino._pyopenvino.InferRequest, index: int, tensor: openvino._pyopenvino.Tensor) -> None

    Sets output tensor of InferRequest.

    param idx

    Index of output tensor.

    type idx

    int

    param tensor

    Tensor object. The element_type and shape of a tensor must match the model’s output element_type and shape.

    type tensor

    openvino.runtime.Tensor

  2. set_output_tensor(self: openvino._pyopenvino.InferRequest, tensor: openvino._pyopenvino.Tensor) -> None

    Sets output tensor of InferRequest with single output. If model has several outputs, an exception is thrown.

    param tensor

    Tensor object. The element_type and shape of a tensor must match the model’s output element_type and shape.

    type tensor

    openvino.runtime.Tensor

set_output_tensors(self: openvino._pyopenvino.InferRequest, outputs: dict) None

Set output tensors using given indexes.

Parameters

outputs (Dict[int, openvino.runtime.Tensor]) – Data to set on output tensors.

set_tensor(*args, **kwargs)

Overloaded function.

  1. set_tensor(self: openvino._pyopenvino.InferRequest, name: str, tensor: openvino._pyopenvino.Tensor) -> None

    Sets input/output tensor of InferRequest.

    param name

    Name of input/output tensor.

    type name

    str

    param tensor

    Tensor object. The element_type and shape of a tensor must match the model’s input/output element_type and shape.

    type tensor

    openvino.runtime.Tensor

  2. set_tensor(self: openvino._pyopenvino.InferRequest, port: openvino._pyopenvino.ConstOutput, tensor: openvino._pyopenvino.Tensor) -> None

    Sets input/output tensor of InferRequest.

    param port

    Port of input/output tensor.

    type port

    openvino.runtime.ConstOutput

    param tensor

    Tensor object. The element_type and shape of a tensor must match the model’s input/output element_type and shape.

    type tensor

    openvino.runtime.Tensor

  3. set_tensor(self: openvino._pyopenvino.InferRequest, port: openvino._pyopenvino.Output, tensor: openvino._pyopenvino.Tensor) -> None

    Sets input/output tensor of InferRequest.

    param port

    Port of input/output tensor.

    type port

    openvino.runtime.Output

    param tensor

    Tensor object. The element_type and shape of a tensor must match the model’s input/output element_type and shape.

    type tensor

    openvino.runtime.Tensor

set_tensors(*args, **kwargs)

Overloaded function.

  1. set_tensors(self: openvino._pyopenvino.InferRequest, inputs: dict) -> None

    Set tensors using given keys.

    param inputs

    Data to set on tensors.

    type inputs

    Dict[Union[int, str, openvino.runtime.ConstOutput], openvino.runtime.Tensor]

  2. set_tensors(self: openvino._pyopenvino.InferRequest, tensor_name: str, tensors: List[openvino._pyopenvino.Tensor]) -> None

    Sets a batch of tensors for input data to infer by tensor name. The model input needs to have a batch dimension, and the number of tensors must match the batch size. The current version supports setting tensors for model inputs only. If tensor_name is associated with an output (or any other non-input node), an exception is thrown.

    param tensor_name

    Name of input tensor.

    type tensor_name

    str

    param tensors

    Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors needs to match with input’s size.

    type tensors

    List[openvino.runtime.Tensor]

  3. set_tensors(self: openvino._pyopenvino.InferRequest, port: openvino._pyopenvino.ConstOutput, tensors: List[openvino._pyopenvino.Tensor]) -> None

    Sets a batch of tensors for input data to infer by port. The model input needs to have a batch dimension, and the number of tensors must match the batch size. The current version supports setting tensors for model inputs only. If port is associated with an output (or any other non-input node), an exception is thrown.

    param port

    Port of input tensor.

    type port

    openvino.runtime.ConstOutput

    param tensors

    Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors needs to match with input’s size.

    type tensors

    List[openvino.runtime.Tensor]

    rtype

    None

start_async(inputs: Optional[Any] = None, userdata: Optional[Any] = None, share_inputs: bool = False, *, shared_memory: Optional[Any] = None) None

Starts inference of specified input(s) in asynchronous mode.

Returns immediately; inference also starts immediately. Calling any method on the InferRequest object while the request is running will throw an exception.

The allowed types of keys in the inputs dictionary are:

  1. int

  2. str

  3. openvino.runtime.ConstOutput

The allowed types of values in the inputs are:

  1. numpy.ndarray and all the types that are castable to it, e.g. torch.Tensor

  2. openvino.runtime.Tensor

Can be called with a single openvino.runtime.Tensor or numpy.ndarray; this works only with one-input models. If the model has more inputs, an exception is thrown.

Parameters
  • inputs (Any, optional) – Data to be set on input tensors.

  • userdata (Any) – Any data that will be passed inside the callback.

  • share_inputs (bool, optional) –

    Enables share_inputs mode. Controls memory usage on inference’s inputs.

    If set to False, the data dispatcher will safely copy data to existing Tensors (including up- or down-casting according to data type and resizing of the input Tensor) and keeps Tensor inputs “as-is”.

    If set to True, the data dispatcher tries to provide “zero-copy” Tensors for every input in the form of:

      • numpy.ndarray and all the types that are castable to it, e.g. torch.Tensor

    Data that is going to be copied:

      • numpy.ndarray which is not C contiguous and/or not writable (WRITEABLE flag is set to False)

      • inputs whose data types do not match the InferRequest’s inputs

      • inputs that should be in BF16 data type

      • scalar inputs (i.e. np.float_/str/bytes/int/float)

      • lists of simple data types (i.e. str/bytes/int/float)

    Tensor inputs are kept “as-is”.

    Note: Use with extra care, shared data can be modified during runtime! Note: Using share_inputs may result in extra memory overhead.

    Default value: False

  • shared_memory (bool, optional) –

    Deprecated. Works like share_inputs mode.

    If not specified, function uses share_inputs value.

    Note: Will be removed in the 2024.0 release! Note: This is a keyword-only argument.

    Default value: None

property userdata

Gets currently held userdata.

Return type

Any

wait(self: openvino._pyopenvino.InferRequest) None

Waits for the result to become available. Blocks until the result becomes available.

GIL is released while running this function.

wait_for(self: openvino._pyopenvino.InferRequest, timeout: int) bool

Waits for the result to become available. Blocks until specified timeout has elapsed or the result becomes available, whichever comes first.

GIL is released while running this function.

Parameters

timeout (int) – Maximum duration in milliseconds (ms) of blocking call.

Returns

True if InferRequest is ready, False otherwise.

Return type

bool