openvino.InferRequest#

class openvino.InferRequest(other: InferRequest)#

Bases: _InferRequestWrapper

The InferRequest class represents an inference request that can be run synchronously or asynchronously.

__init__(self: openvino._pyopenvino.InferRequest, other: openvino._pyopenvino.InferRequest) None#

Methods

__delattr__(name, /)

Implement delattr(self, name).

__dir__()

Default dir() implementation.

__eq__(value, /)

Return self==value.

__format__(format_spec, /)

Default object formatter.

__ge__(value, /)

Return self>=value.

__getattribute__(name, /)

Return getattr(self, name).

__gt__(value, /)

Return self>value.

__hash__()

Return hash(self).

__init__(self, other)

__init_subclass__

This method is called when a class is subclassed.

__le__(value, /)

Return self<=value.

__lt__(value, /)

Return self<value.

__ne__(value, /)

Return self!=value.

__new__(**kwargs)

__reduce__()

Helper for pickle.

__reduce_ex__(protocol, /)

Helper for pickle.

__repr__(self)

__setattr__(name, value, /)

Implement setattr(self, name, value).

__sizeof__()

Size of object in memory, in bytes.

__str__()

Return str(self).

__subclasshook__

Abstract classes can override this to customize issubclass().

_is_single_input()

cancel(self)

Cancels inference request.

get_compiled_model()

Gets the compiled model this InferRequest is using.

get_input_tensor(*args, **kwargs)

Overloaded function.

get_output_tensor(*args, **kwargs)

Overloaded function.

get_profiling_info(self)

Queries per-layer performance measurements to identify the most time-consuming operations. Not all plugins provide meaningful data.

get_tensor(*args, **kwargs)

Overloaded function.

infer([inputs, share_inputs, share_outputs, ...])

Infers specified input(s) in synchronous mode.

query_state(self)

Gets state control interface for given infer request.

reset_state(self)

Resets all internal variable states for the relevant infer request to the value specified as default for the corresponding ReadValue node.

set_callback(self, callback, userdata)

Sets a callback function that will be called on success or failure of asynchronous InferRequest.

set_input_tensor(*args, **kwargs)

Overloaded function.

set_input_tensors(*args, **kwargs)

Overloaded function.

set_output_tensor(*args, **kwargs)

Overloaded function.

set_output_tensors(self, outputs)

Set output tensors using given indexes.

set_tensor(*args, **kwargs)

Overloaded function.

set_tensors(*args, **kwargs)

Overloaded function.

start_async([inputs, userdata, share_inputs])

Starts inference of specified input(s) in asynchronous mode.

wait(self)

Waits for the result to become available.

wait_for(self, timeout)

Waits for the result to become available.

Attributes

input_tensors

Gets all input tensors of this InferRequest.

latency

Gets latency of this InferRequest.

model_inputs

Gets all inputs of a compiled model which was used to create this InferRequest.

model_outputs

Gets all outputs of a compiled model which was used to create this InferRequest.

output_tensors

Gets all output tensors of this InferRequest.

profiling_info

Performance is measured per layer to get feedback on the most time-consuming operation.

results

Gets all output tensors of this InferRequest.

userdata

Gets currently held userdata.

__class__#

alias of pybind11_type

__delattr__(name, /)#

Implement delattr(self, name).

__dir__()#

Default dir() implementation.

__eq__(value, /)#

Return self==value.

__format__(format_spec, /)#

Default object formatter.

__ge__(value, /)#

Return self>=value.

__getattribute__(name, /)#

Return getattr(self, name).

__gt__(value, /)#

Return self>value.

__hash__()#

Return hash(self).

__init__(self: openvino._pyopenvino.InferRequest, other: openvino._pyopenvino.InferRequest) None#
__init_subclass__()#

This method is called when a class is subclassed.

The default implementation does nothing. It may be overridden to extend subclasses.

__le__(value, /)#

Return self<=value.

__lt__(value, /)#

Return self<value.

__ne__(value, /)#

Return self!=value.

__new__(**kwargs)#
__reduce__()#

Helper for pickle.

__reduce_ex__(protocol, /)#

Helper for pickle.

__repr__(self: openvino._pyopenvino.InferRequest) str#
__setattr__(name, value, /)#

Implement setattr(self, name, value).

__sizeof__()#

Size of object in memory, in bytes.

__str__()#

Return str(self).

__subclasshook__()#

Abstract classes can override this to customize issubclass().

This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).

_is_single_input() bool#
cancel(self: openvino._pyopenvino.InferRequest) None#

Cancels inference request.

get_compiled_model() CompiledModel#

Gets the compiled model this InferRequest is using.

Returns:

a CompiledModel object

Return type:

openvino.runtime.ie_api.CompiledModel

get_input_tensor(*args, **kwargs)#

Overloaded function.

  1. get_input_tensor(self: openvino._pyopenvino.InferRequest, index: int) -> openvino._pyopenvino.Tensor

    Gets input tensor of InferRequest.

    param index:

    An index of tensor to get.

    type index:

    int

    return:

    An input Tensor with the given index for the model. If a tensor with the specified index is not found, an exception is thrown.

    rtype:

    openvino.runtime.Tensor

  2. get_input_tensor(self: openvino._pyopenvino.InferRequest) -> openvino._pyopenvino.Tensor

    Gets input tensor of InferRequest.

    return:

    An input Tensor for the model. If model has several inputs, an exception is thrown.

    rtype:

    openvino.runtime.Tensor

get_output_tensor(*args, **kwargs)#

Overloaded function.

  1. get_output_tensor(self: openvino._pyopenvino.InferRequest, index: int) -> openvino._pyopenvino.Tensor

    Gets output tensor of InferRequest.

    param index:

    An index of tensor to get.

    type index:

    int

    return:

    An output Tensor with the given index for the model. If a tensor with the specified index is not found, an exception is thrown.

    rtype:

    openvino.runtime.Tensor

  2. get_output_tensor(self: openvino._pyopenvino.InferRequest) -> openvino._pyopenvino.Tensor

    Gets output tensor of InferRequest.

    return:

    An output Tensor for the model. If model has several outputs, an exception is thrown.

    rtype:

    openvino.runtime.Tensor

get_profiling_info(self: openvino._pyopenvino.InferRequest) list[ov::ProfilingInfo]#

Queries per-layer performance measurements to identify the most time-consuming operations. Not all plugins provide meaningful data.

GIL is released while running this function.

Returns:

List of profiling information for operations in model.

Return type:

List[openvino.runtime.ProfilingInfo]
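A minimal sketch of ranking the returned entries, assuming request is an InferRequest created elsewhere and that each ProfilingInfo entry exposes node_name and real_time attributes. The helper name slowest_operations is hypothetical, and the sketch does not show how profiling is enabled on the device, which is a separate setup step.

```python
from typing import Any, List, Tuple


def slowest_operations(request: Any, top_n: int = 5) -> List[Tuple[str, Any]]:
    """Return (node_name, real_time) pairs for the slowest operations."""
    infos = request.get_profiling_info()
    # Sort by measured time per operation, longest first.
    # Assumption: every entry has comparable `real_time` values.
    ranked = sorted(infos, key=lambda info: info.real_time, reverse=True)
    return [(info.node_name, info.real_time) for info in ranked[:top_n]]
```

Because not all plugins provide meaningful data, the returned list may be empty or uninformative on some devices.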

get_tensor(*args, **kwargs)#

Overloaded function.

  1. get_tensor(self: openvino._pyopenvino.InferRequest, name: str) -> openvino._pyopenvino.Tensor

    Gets input/output tensor of InferRequest.

    param name:

    Name of tensor to get.

    type name:

    str

    return:

    A Tensor object with given name.

    rtype:

    openvino.runtime.Tensor

  2. get_tensor(self: openvino._pyopenvino.InferRequest, port: openvino._pyopenvino.ConstOutput) -> openvino._pyopenvino.Tensor

    Gets input/output tensor of InferRequest.

    param port:

    Port of tensor to get.

    type port:

    openvino.runtime.ConstOutput

    return:

    A Tensor object for the port.

    rtype:

    openvino.runtime.Tensor

  3. get_tensor(self: openvino._pyopenvino.InferRequest, port: openvino._pyopenvino.Output) -> openvino._pyopenvino.Tensor

    Gets input/output tensor of InferRequest.

    param port:

    Port of tensor to get.

    type port:

    openvino.runtime.Output

    return:

    A Tensor object for the port.

    rtype:

    openvino.runtime.Tensor
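As an illustration of the port-based overloads, the sketch below walks over model_outputs and fetches each tensor by its port. It assumes request is an existing InferRequest and that Tensor objects expose an iterable shape attribute; the helper name output_tensor_shapes is hypothetical.

```python
from typing import Any, List


def output_tensor_shapes(request: Any) -> List[List[int]]:
    """Collect the shape of every output tensor, addressed by port."""
    shapes = []
    for port in request.model_outputs:
        # get_tensor accepts a tensor name or a port object.
        tensor = request.get_tensor(port)
        shapes.append(list(tensor.shape))
    return shapes
```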

infer(inputs: Any | None = None, share_inputs: bool = False, share_outputs: bool = False, *, decode_strings: bool = True) OVDict#

Infers specified input(s) in synchronous mode.

Blocks all methods of InferRequest while the request is running. Calling any method during that time raises an exception.

The allowed types of keys in the inputs dictionary are:

  1. int

  2. str

  3. openvino.runtime.ConstOutput

The allowed types of values in the inputs are:

  1. numpy.ndarray and all the types that are castable to it, e.g. torch.Tensor

  2. openvino.runtime.Tensor

Can be called with a single openvino.runtime.Tensor or numpy.ndarray, but only for one-input models. If the model has more inputs, the function raises an error.

Parameters:
  • inputs (Any, optional) – Data to be set on input tensors.

  • share_inputs (bool, optional) –

    Enables share_inputs mode. Controls memory usage on inference’s inputs.

    If set to False, the data dispatcher safely copies data to existing Tensors (including up- or down-casting according to data type and resizing of the input Tensor). Keeps Tensor inputs “as-is”.

    If set to True, the data dispatcher tries to provide “zero-copy” Tensors for every input in the form of:

      • numpy.ndarray and all the types that are castable to it, e.g. torch.Tensor

    Data that is going to be copied:

      • numpy.ndarray objects which are not C contiguous and/or not writable (WRITEABLE flag is set to False)

      • inputs whose data types do not match the Infer Request’s inputs

      • inputs that should be in BF16 data type

      • scalar inputs (i.e. np.float_/str/bytes/int/float)

      • lists of simple data types (i.e. str/bytes/int/float)

    Keeps Tensor inputs “as-is”.

    Note: Use with extra care, shared data can be modified during runtime!

    Note: Using share_inputs may result in extra memory overhead.

    Default value: False

  • share_outputs (bool, optional) –

    Enables share_outputs mode. Controls memory usage on inference’s outputs.

    If set to False, output data is safely copied to numpy arrays.

    If set to True, the data is returned as views of the output Tensors. This mode still returns numpy arrays, but the lifetime of the data is tied to OpenVINO objects.

    Note: Use with extra care, shared data can be modified or lost during runtime!

    Note: String/textual data will always be copied!

    Default value: False

  • decode_strings (bool, optional, keyword-only) –

    Controls decoding of text-based outputs.

    If set to True, string outputs are returned as numpy arrays of U kind.

    If set to False, string outputs are returned as numpy arrays of S kind.

    Default value: True

Returns:

Dictionary of results from output tensors with port/int/str keys.

Return type:

OVDict
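A minimal sketch of a synchronous call, assuming request is an InferRequest obtained elsewhere (for example from CompiledModel.create_infer_request()). The helper name run_sync is hypothetical; it passes an inputs dictionary and reads the first output from the returned OVDict by position.

```python
from typing import Any, Mapping


def run_sync(request: Any, inputs: Mapping[Any, Any]) -> Any:
    """Run one blocking inference and return the first output."""
    # Keys may be int indexes, str tensor names, or ConstOutput ports;
    # values may be numpy arrays (or castable types) or Tensor objects.
    results = request.infer(inputs)
    # The returned OVDict accepts port, int, or str keys; 0 selects the first output.
    return results[0]
```

With the default share_outputs=False the returned data is a copy, so it stays valid after the request is reused.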

property input_tensors#

Gets all input tensors of this InferRequest.

Return type:

List[openvino.runtime.Tensor]

property latency#

Gets latency of this InferRequest.

Return type:

float

property model_inputs#

Gets all inputs of a compiled model which was used to create this InferRequest.

Return type:

List[openvino.runtime.ConstOutput]

property model_outputs#

Gets all outputs of a compiled model which was used to create this InferRequest.

Return type:

List[openvino.runtime.ConstOutput]

property output_tensors#

Gets all output tensors of this InferRequest.

Return type:

List[openvino.runtime.Tensor]

property profiling_info#

Performance is measured per layer to get feedback on the most time-consuming operation. Not all plugins provide meaningful data!

GIL is released while running this function.

Returns:

List of profiling information for operations in model.

Return type:

List[openvino.runtime.ProfilingInfo]

query_state(self: openvino._pyopenvino.InferRequest) list[ov::VariableState]#

Gets state control interface for given infer request.

GIL is released while running this function.

Returns:

List of VariableState objects.

Return type:

List[openvino.runtime.VariableState]

reset_state(self: openvino._pyopenvino.InferRequest) None#

Resets all internal variable states for the relevant infer request to the value specified as default for the corresponding ReadValue node.
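For stateful models, query_state and reset_state are often used together between independent input sequences. The sketch below assumes request is an existing InferRequest and that each VariableState exposes a name attribute; the helper name reset_all_states is hypothetical.

```python
from typing import Any, List


def reset_all_states(request: Any) -> List[str]:
    """Reset every variable state and return the names that were reset."""
    # query_state returns one VariableState object per model variable.
    names = [state.name for state in request.query_state()]
    # Restore each state to the default given by its ReadValue node.
    request.reset_state()
    return names
```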

property results: OVDict#

Gets all output tensors of this InferRequest.

Returns:

Dictionary of results from output tensors with ports as keys.

Return type:

Dict[openvino.runtime.ConstOutput, numpy.ndarray]
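Since the keys are ports rather than strings, a small conversion step can make the results easier to log. This sketch assumes request is an existing InferRequest that has already completed an inference, and that every output port has at least one tensor name so that get_any_name() succeeds; the helper name results_by_name is hypothetical.

```python
from typing import Any, Dict


def results_by_name(request: Any) -> Dict[str, Any]:
    """Re-key the results dictionary from ports to tensor names."""
    named = {}
    for port, array in request.results.items():
        # Assumption: the port has a name; unnamed ports would fail here.
        named[port.get_any_name()] = array
    return named
```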

set_callback(self: openvino._pyopenvino.InferRequest, callback: Callable, userdata: object) None#

Sets a callback function that will be called on success or failure of asynchronous InferRequest.

Parameters:
  • callback (function) – Function defined in Python.

  • userdata (Any) – Any data that will be passed inside callback call.
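The following sketch shows one way to combine set_callback with start_async and wait. It assumes request is an existing InferRequest and that the callback is invoked with the userdata object as its only argument; the helper name infer_with_callback and the tag parameter are hypothetical.

```python
from typing import Any, List


def infer_with_callback(request: Any, inputs: Any, tag: Any) -> List[Any]:
    """Start an asynchronous inference and record what the callback receives."""
    completed: List[Any] = []

    def on_done(userdata: Any) -> None:
        # Assumption: the callback receives the userdata object as its only argument.
        completed.append(userdata)

    request.set_callback(on_done, tag)
    request.start_async(inputs)
    # Block until the request finishes; in this sketch the callback is
    # expected to have run by the time wait() returns.
    request.wait()
    return completed
```

Keeping the callback short is a reasonable design choice here, since it runs as part of completing the request.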

set_input_tensor(*args, **kwargs)#

Overloaded function.

  1. set_input_tensor(self: openvino._pyopenvino.InferRequest, index: int, tensor: openvino._pyopenvino.Tensor) -> None

    Sets input tensor of InferRequest.

    param index:

    Index of input tensor. If index is greater than the number of the model’s inputs, an exception is thrown.

    type index:

    int

    param tensor:

    Tensor object. The element_type and shape of a tensor must match the model’s input element_type and shape.

    type tensor:

    openvino.runtime.Tensor

  2. set_input_tensor(self: openvino._pyopenvino.InferRequest, tensor: openvino._pyopenvino.Tensor) -> None

    Sets input tensor of InferRequest with single input. If model has several inputs, an exception is thrown.

    param tensor:

    Tensor object. The element_type and shape of a tensor must match the model’s input element_type and shape.

    type tensor:

    openvino.runtime.Tensor
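A minimal sketch of the single-input overload, assuming request is an existing InferRequest for a model with exactly one input and one output, and that tensor is a Tensor whose element type and shape already match the model input. The helper name infer_from_tensor is hypothetical.

```python
from typing import Any


def infer_from_tensor(request: Any, tensor: Any) -> Any:
    """Bind a prepared tensor to the only input, run, and return the output data."""
    # Single-argument overload: valid only for one-input models.
    request.set_input_tensor(tensor)
    # No inputs are passed, so the tensor set above is used.
    request.infer()
    # Assumption: the output Tensor exposes its contents through `data`.
    return request.get_output_tensor().data
```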

set_input_tensors(*args, **kwargs)#

Overloaded function.

  1. set_input_tensors(self: openvino._pyopenvino.InferRequest, inputs: dict) -> None

    Set input tensors using given indexes.

    param inputs:

    Data to set on input tensors.

    type inputs:

    Dict[int, openvino.runtime.Tensor]

  2. set_input_tensors(self: openvino._pyopenvino.InferRequest, tensors: list[openvino._pyopenvino.Tensor]) -> None

    Sets a batch of tensors for the input of a single-input model. The model input must have a batch dimension, and the number of tensors must match the batch size.

    param tensors:

    Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors needs to match with input’s size.

    type tensors:

    List[openvino.runtime.Tensor]

  3. set_input_tensors(self: openvino._pyopenvino.InferRequest, idx: int, tensors: list[openvino._pyopenvino.Tensor]) -> None

    Sets a batch of tensors for a single input, selected by index. The model input must have a batch dimension, and the number of tensors must match the batch size.

    param idx:

    Index of input tensor.

    type idx:

    int

    param tensors:

    Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors needs to match with input’s size.

    type tensors:

    List[openvino.runtime.Tensor]

set_output_tensor(*args, **kwargs)#

Overloaded function.

  1. set_output_tensor(self: openvino._pyopenvino.InferRequest, index: int, tensor: openvino._pyopenvino.Tensor) -> None

    Sets output tensor of InferRequest.

    param index:

    Index of output tensor.

    type index:

    int

    param tensor:

    Tensor object. The element_type and shape of a tensor must match the model’s output element_type and shape.

    type tensor:

    openvino.runtime.Tensor

  2. set_output_tensor(self: openvino._pyopenvino.InferRequest, tensor: openvino._pyopenvino.Tensor) -> None

    Sets output tensor of InferRequest with single output. If model has several outputs, an exception is thrown.

    param tensor:

    Tensor object. The element_type and shape of a tensor must match the model’s output element_type and shape.

    type tensor:

    openvino.runtime.Tensor

set_output_tensors(self: openvino._pyopenvino.InferRequest, outputs: dict) None#

Set output tensors using given indexes.

Parameters:

outputs (Dict[int, openvino.runtime.Tensor]) – Data to set on output tensors.

set_tensor(*args, **kwargs)#

Overloaded function.

  1. set_tensor(self: openvino._pyopenvino.InferRequest, name: str, tensor: RemoteTensorWrapper) -> None

    Sets input/output tensor of InferRequest.

    param name:

    Name of input/output tensor.

    type name:

    str

    param tensor:

    RemoteTensor object. The element_type and shape of a tensor must match the model’s input/output element_type and shape.

    type tensor:

    openvino.runtime.RemoteTensor

  2. set_tensor(self: openvino._pyopenvino.InferRequest, name: str, tensor: openvino._pyopenvino.Tensor) -> None

    Sets input/output tensor of InferRequest.

    param name:

    Name of input/output tensor.

    type name:

    str

    param tensor:

    Tensor object. The element_type and shape of a tensor must match the model’s input/output element_type and shape.

    type tensor:

    openvino.runtime.Tensor

  3. set_tensor(self: openvino._pyopenvino.InferRequest, port: openvino._pyopenvino.ConstOutput, tensor: openvino._pyopenvino.Tensor) -> None

    Sets input/output tensor of InferRequest.

    param port:

    Port of input/output tensor.

    type port:

    openvino.runtime.ConstOutput

    param tensor:

    Tensor object. The element_type and shape of a tensor must match the model’s input/output element_type and shape.

    type tensor:

    openvino.runtime.Tensor

  4. set_tensor(self: openvino._pyopenvino.InferRequest, port: openvino._pyopenvino.Output, tensor: openvino._pyopenvino.Tensor) -> None

    Sets input/output tensor of InferRequest.

    param port:

    Port of input/output tensor.

    type port:

    openvino.runtime.Output

    param tensor:

    Tensor object. The element_type and shape of a tensor must match the model’s input/output element_type and shape.

    type tensor:

    openvino.runtime.Tensor

set_tensors(*args, **kwargs)#

Overloaded function.

  1. set_tensors(self: openvino._pyopenvino.InferRequest, inputs: dict) -> None

    Set tensors using given keys.

    param inputs:

    Data to set on tensors.

    type inputs:

    Dict[Union[int, str, openvino.runtime.ConstOutput], openvino.runtime.Tensor]

  2. set_tensors(self: openvino._pyopenvino.InferRequest, tensor_name: str, tensors: list[openvino._pyopenvino.Tensor]) -> None

    Sets a batch of tensors for input data, selected by tensor name. The model input must have a batch dimension, and the number of tensors must match the batch size. The current version supports setting tensors on model inputs only. If tensor_name is associated with an output (or any other non-input node), an exception is thrown.

    param tensor_name:

    Name of input tensor.

    type tensor_name:

    str

    param tensors:

    Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors needs to match with input’s size.

    type tensors:

    List[openvino.runtime.Tensor]

  3. set_tensors(self: openvino._pyopenvino.InferRequest, port: openvino._pyopenvino.ConstOutput, tensors: list[openvino._pyopenvino.Tensor]) -> None

    Sets a batch of tensors for input data, selected by port. The model input must have a batch dimension, and the number of tensors must match the batch size. The current version supports setting tensors on model inputs only. If port is associated with an output (or any other non-input node), an exception is thrown.

    param port:

    Port of input tensor.

    type port:

    openvino.runtime.ConstOutput

    param tensors:

    Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors needs to match with input’s size.

    type tensors:

    List[openvino.runtime.Tensor]

    rtype:

    None

start_async(inputs: Any | None = None, userdata: Any | None = None, share_inputs: bool = False) None#

Starts inference of specified input(s) in asynchronous mode.

Returns immediately; inference also starts immediately. Calling any method on the InferRequest object while the request is running raises an exception.

The allowed types of keys in the inputs dictionary are:

  1. int

  2. str

  3. openvino.runtime.ConstOutput

The allowed types of values in the inputs are:

  1. numpy.ndarray and all the types that are castable to it, e.g. torch.Tensor

  2. openvino.runtime.Tensor

Can be called with a single openvino.runtime.Tensor or numpy.ndarray, but only for one-input models. If the model has more inputs, the function raises an error.

Parameters:
  • inputs (Any, optional) – Data to be set on input tensors.

  • userdata (Any) – Any data that will be passed inside the callback.

  • share_inputs (bool, optional) –

    Enables share_inputs mode. Controls memory usage on inference’s inputs.

    If set to False, the data dispatcher safely copies data to existing Tensors (including up- or down-casting according to data type and resizing of the input Tensor). Keeps Tensor inputs “as-is”.

    If set to True, the data dispatcher tries to provide “zero-copy” Tensors for every input in the form of:

      • numpy.ndarray and all the types that are castable to it, e.g. torch.Tensor

    Data that is going to be copied:

      • numpy.ndarray objects which are not C contiguous and/or not writable (WRITEABLE flag is set to False)

      • inputs whose data types do not match the Infer Request’s inputs

      • inputs that should be in BF16 data type

      • scalar inputs (i.e. np.float_/str/bytes/int/float)

      • lists of simple data types (i.e. str/bytes/int/float)

    Keeps Tensor inputs “as-is”.

    Note: Use with extra care, shared data can be modified during runtime!

    Note: Using share_inputs may result in extra memory overhead.

    Default value: False
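The copy conditions for share_inputs can be checked up front. The sketch below is a rough pre-check, not the dispatcher's actual logic: it assumes array is a numpy.ndarray (using its flags and dtype attributes) and covers only the contiguity, writability, and dtype conditions, not the BF16, scalar, or list cases. The helper name start_async_shared and the expected_dtype parameter are hypothetical.

```python
from typing import Any


def start_async_shared(request: Any, array: Any, expected_dtype: Any) -> bool:
    """Start async inference with share_inputs=True.

    Returns True when the array looks eligible for zero-copy sharing.
    """
    flags = array.flags
    # Arrays that are not C contiguous, not writable, or of a mismatched
    # dtype are listed as data that will be copied instead of shared.
    shareable = bool(
        flags.c_contiguous and flags.writeable and array.dtype == expected_dtype
    )
    request.start_async(array, share_inputs=True)
    return shareable
```

When the function returns True, the caller should avoid modifying the array until the request has finished, since shared data can be modified during runtime.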

property userdata#

Gets currently held userdata.

Return type:

Any

wait(self: openvino._pyopenvino.InferRequest) None#

Waits for the result to become available. Blocks until the result becomes available.

GIL is released while running this function.

wait_for(self: openvino._pyopenvino.InferRequest, timeout: int) bool#

Waits for the result to become available. Blocks until specified timeout has elapsed or the result becomes available, whichever comes first.

GIL is released while running this function.

Parameters:

timeout (int) – Maximum duration in milliseconds (ms) of blocking call.

Returns:

True if InferRequest is ready, False otherwise.

Return type:

bool
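A short sketch of using the boolean result of wait_for to enforce a deadline, assuming request is an InferRequest on which start_async has already been called. The helper name wait_or_cancel is hypothetical, and what state the request is left in after cancel is not covered here.

```python
from typing import Any


def wait_or_cancel(request: Any, timeout_ms: int) -> bool:
    """Wait up to timeout_ms milliseconds; cancel the request if it is not ready."""
    ready = request.wait_for(timeout_ms)
    if not ready:
        # The deadline passed without a result, so give up on this request.
        request.cancel()
    return ready
```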