openvino.inference_engine.InferRequest

class openvino.inference_engine.InferRequest

Bases: object

OpenVINO Inference Engine Python API is deprecated and will be removed in the 2024.0 release. For instructions on transitioning to the new API, please refer to https://docs.openvino.ai/latest/openvino_2_0_transition_guide.html

This class provides an interface to the infer requests of an ExecutableNetwork. It handles infer request execution and is used to set input data and get output data.

__init__(): There is no explicit class constructor. To obtain a valid InferRequest instance, call the IECore.load_network() method with the desired number of requests. It returns an ExecutableNetwork instance, which stores the infer requests.
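
As a minimal sketch of the route described above, the helper below reads a model, loads it with a given number of requests, and returns the ExecutableNetwork whose `requests` attribute holds the InferRequest objects. The function name `load_exec_network`, the file paths, and the defaults are illustrative assumptions, not part of the API. The OpenVINO import is deferred into the function so the snippet can be imported even where the deprecated package is not installed.

```python
def load_exec_network(model_xml, model_bin, device_name="CPU", num_requests=2):
    """Load a model and return the ExecutableNetwork that owns the infer requests.

    Illustrative helper (an assumption, not part of the Inference Engine API).
    """
    # Deferred import: the Inference Engine package is only needed at call time.
    from openvino.inference_engine import IECore

    ie_core = IECore()
    net = ie_core.read_network(model=model_xml, weights=model_bin)
    # num_requests controls how many InferRequest objects are created.
    return ie_core.load_network(network=net, device_name=device_name,
                                num_requests=num_requests)
```

A call such as `exec_net = load_exec_network("./model.xml", "./model.bin")` followed by `request = exec_net.requests[0]` would then give access to a single request. Returning the ExecutableNetwork rather than the bare request list is a conservative choice here, since the network is the object that stores the requests.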

Methods

`__delattr__`(name, /)	Implement delattr(self, name).
`__dir__`()	Default dir() implementation.
`__eq__`(value, /)	Return self==value.
`__format__`(format_spec, /)	Default object formatter.
`__ge__`(value, /)	Return self>=value.
`__getattribute__`(name, /)	Return getattr(self, name).
`__gt__`(value, /)	Return self>value.
`__hash__`()	Return hash(self).
`__init__`	There is no explicit class constructor.
`__init_subclass__`	This method is called when a class is subclassed.
`__le__`(value, /)	Return self<=value.
`__lt__`(value, /)	Return self<value.
`__ne__`(value, /)	Return self!=value.
`__new__`(**kwargs)
`__reduce__`	InferRequest.__reduce_cython__(self)
`__reduce_ex__`(protocol, /)	Helper for pickle.
`__repr__`()	Return repr(self).
`__setattr__`(name, value, /)	Implement setattr(self, name, value).
`__setstate__`	InferRequest.__setstate_cython__(self, __pyx_state)
`__sizeof__`()	Size of object in memory, in bytes.
`__str__`()	Return str(self).
`__subclasshook__`	Abstract classes can override this to customize issubclass().
`_fill_inputs`(self, inputs)
`_get_blob_buffer`(self, string blob_name)
`async_infer`(self[, inputs])	Starts asynchronous inference of the infer request and fills the outputs array
`get_perf_counts`(self)	Queries per-layer performance measures to identify the most time-consuming layer.
`infer`(self[, inputs])	Starts synchronous inference of the infer request and fills the outputs array
`query_state`(self)	Gets the state control interface for the given infer request.
`set_blob`(self, unicode blob_name, Blob blob)	Sets a user-defined Blob for the infer request
`set_completion_callback`(self, py_callback[, ...])	Sets a callback function that is called on success or failure of an asynchronous request
`wait`(self[, timeout])	Waits for the result to become available.

Attributes

`__pyx_vtable__`
`_inputs_list`	_inputs_list: object
`_outputs_list`	_outputs_list: object
`_py_callback`	_py_callback: object
`_py_data`	_py_data: object
`_user_blobs`	_user_blobs: object
`input_blobs`	Dictionary that maps input layer names to corresponding Blobs
`latency`	Current infer request inference time in milliseconds
`output_blobs`	Dictionary that maps output layer names to corresponding Blobs
`preprocess_info`	Dictionary that maps input layer names to corresponding preprocessing information

__class__: alias of type

__delattr__(name, /): Implement delattr(self, name).

__dir__(): Default dir() implementation.

__eq__(value, /): Return self==value.

__format__(format_spec, /): Default object formatter.

__ge__(value, /): Return self>=value.

__getattribute__(name, /): Return getattr(self, name).

__gt__(value, /): Return self>value.

__hash__(): Return hash(self).

__init__(): There is no explicit class constructor. To obtain a valid InferRequest instance, call the IECore.load_network() method with the desired number of requests. It returns an ExecutableNetwork instance, which stores the infer requests.

__init_subclass__()

This method is called when a class is subclassed.

The default implementation does nothing. It may be overridden to extend subclasses.

__le__(value, /): Return self<=value.

__lt__(value, /): Return self<value.

__ne__(value, /): Return self!=value.

__new__(**kwargs)

__pyx_vtable__ = <capsule object NULL>

__reduce__(): InferRequest.__reduce_cython__(self)

__reduce_ex__(protocol, /): Helper for pickle.

__repr__(): Return repr(self).

__setattr__(name, value, /): Implement setattr(self, name, value).

__setstate__(): InferRequest.__setstate_cython__(self, __pyx_state)

__sizeof__(): Size of object in memory, in bytes.

__str__(): Return str(self).

__subclasshook__()

Abstract classes can override this to customize issubclass().

This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).

_fill_inputs(self, inputs)

_get_blob_buffer(self, string blob_name) → BlobBuffer

_inputs_list: object

_outputs_list: object

_py_callback: object

_py_data: object

_user_blobs: object

async_infer(self, inputs=None)

Starts asynchronous inference of the infer request and fills the outputs array

Parameters: inputs – A dictionary that maps input layer names to numpy.ndarray objects of proper shape with input data for the layer
Returns: None

Usage example:

exec_net = ie_core.load_network(network=net, device_name="CPU", num_requests=2)
exec_net.requests[0].async_infer({input_blob: image})
request_status = exec_net.requests[0].wait()
res = exec_net.requests[0].output_blobs['prob']
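
When a network is loaded with several requests (num_requests=2 in the example above), asynchronous inference can keep all of them busy at once. The sketch below shows one possible scheduling loop under stated assumptions: `run_pipelined`, `start`, and `finish` are hypothetical names, and `FakeRequest` is a stand-in so the snippet runs without OpenVINO. With real requests, `start` would call `async_infer()` and `finish` would call `wait()` and then read `output_blobs`.

```python
from collections import deque


def run_pipelined(requests, batches, start, finish):
    """Feed batches through a pool of requests; return results in submission order.

    start(request, batch) launches asynchronous inference on a request.
    finish(request) blocks until that request completes and returns its result.
    """
    results = []
    in_flight = deque()  # busy requests, oldest submission first
    for index, batch in enumerate(batches):
        request = requests[index % len(requests)]
        if len(in_flight) == len(requests):
            # Every request is busy; the oldest one is the slot about to be reused.
            results.append(finish(in_flight.popleft()))
        start(request, batch)
        in_flight.append(request)
    # Collect the results that are still pending.
    while in_flight:
        results.append(finish(in_flight.popleft()))
    return results


class FakeRequest:
    """Hypothetical stand-in for InferRequest; it only remembers its last input."""

    def __init__(self):
        self.pending = None


def fake_start(request, batch):
    request.pending = batch      # real code: request.async_infer({input_blob: batch})


def fake_finish(request):
    return request.pending * 2   # real code: request.wait(), then read output_blobs


pool = [FakeRequest(), FakeRequest()]
print(run_pipelined(pool, [1, 2, 3, 4, 5], fake_start, fake_finish))
# prints [2, 4, 6, 8, 10]
```

Passing `start` and `finish` as callables keeps the scheduling logic separate from the inference calls, which is a design choice of this sketch rather than anything the API requires.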

get_perf_counts(self)

Queries per-layer performance measures to identify the most time-consuming layer.

Note

Performance counter data and format depend on the plugin.

Returns: Dictionary containing per-layer execution information.

Usage example:

exec_net = ie_core.load_network(network=net, device_name="CPU", num_requests=2)
exec_net.requests[0].infer({input_blob: image})
exec_net.requests[0].get_perf_counts()
#  {'Conv2D': {'exec_type': 'jit_avx2_1x1',
#              'real_time': 154,
#              'cpu_time': 154,
#              'status': 'EXECUTED',
#              'layer_type': 'Convolution'},
#   'Relu6':  {'exec_type': 'undef',
#              'real_time': 0,
#              'cpu_time': 0,
#              'status': 'NOT_RUN',
#              'layer_type': 'Clamp'}
#   ...
#  }
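
To act on the returned dictionary, the layers can be ranked by their reported time. The following is a small sketch, assuming the dictionary has the shape shown in the example above (the `status` and `real_time` keys); the helper name `slowest_layers` is hypothetical, and the sample data is copied from that example.

```python
def slowest_layers(perf_counts, top=3):
    """Return (layer_name, real_time) pairs for executed layers, slowest first."""
    executed = [
        (name, info["real_time"])
        for name, info in perf_counts.items()
        if info.get("status") == "EXECUTED"  # skip layers that did not run
    ]
    executed.sort(key=lambda item: item[1], reverse=True)
    return executed[:top]


# Sample data in the format shown in the usage example above.
sample = {
    "Conv2D": {"exec_type": "jit_avx2_1x1", "real_time": 154, "cpu_time": 154,
               "status": "EXECUTED", "layer_type": "Convolution"},
    "Relu6": {"exec_type": "undef", "real_time": 0, "cpu_time": 0,
              "status": "NOT_RUN", "layer_type": "Clamp"},
}
print(slowest_layers(sample))
# prints [('Conv2D', 154)]
```

Because the counter format depends on the plugin, the key names used here may need adjusting for other devices.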

infer(self, inputs=None)

Starts synchronous inference of the infer request and fills the outputs array

Parameters: inputs – A dictionary that maps input layer names to numpy.ndarray objects of proper shape with input data for the layer
Returns: None

Usage example:

exec_net = ie_core.load_network(network=net, device_name="CPU", num_requests=2)
exec_net.requests[0].infer({input_blob: image})
res = exec_net.requests[0].output_blobs['prob'].buffer  # Blob contents as a numpy.ndarray
np.flip(np.sort(np.squeeze(res)), 0)  # scores in descending order

# array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
#         5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
#         2.26027006e-03, 2.12283316e-03 ...])
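
The sorted array in the example above shows the scores but not which class each score belongs to. A minimal sketch of keeping the class index alongside each score follows. The helper name `top_k` is hypothetical, and a plain Python list stands in for the flattened output buffer so the snippet runs without NumPy or OpenVINO.

```python
import heapq


def top_k(scores, k=5):
    """Return the k highest (class_index, score) pairs, best first."""
    return heapq.nlargest(k, enumerate(scores), key=lambda pair: pair[1])


# Stand-in for a flattened 1-D output of a classification model.
scores = [0.05, 0.49, 0.12, 0.17, 0.02]
print(top_k(scores, k=3))
# prints [(1, 0.49), (3, 0.17), (2, 0.12)]
```

With real data, a one-dimensional array obtained by squeezing the output could be passed in place of `scores`, since `enumerate` works on any one-dimensional sequence.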

input_blobs: Dictionary that maps input layer names to corresponding Blobs

latency: Current infer request inference time in milliseconds

output_blobs: Dictionary that maps output layer names to corresponding Blobs

preprocess_info: Dictionary that maps input layer names to corresponding preprocessing information

query_state(self)

Gets the state control interface for the given infer request. State control is essential for recurrent networks.

Returns: A vector of Memory State objects
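
One common use of state control in recurrent networks is clearing the memory states between independent input sequences. The sketch below illustrates that idea under an explicit assumption: each object returned by `query_state()` is taken to provide a `reset()` method, which this page does not document. The helper name `reset_all_states` is hypothetical.

```python
def reset_all_states(request):
    """Reset every memory state reported by request.query_state().

    Returns the number of states that were reset.
    """
    states = list(request.query_state())
    for state in states:
        state.reset()  # assumption: memory state objects expose reset()
    return len(states)
```

A typical place to call such a helper would be just before feeding the first element of a new sequence to the request.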

set_blob(self, unicode blob_name: str, Blob blob: Blob)

Sets a user-defined Blob for the infer request

Parameters

blob_name – Name of the input blob
blob – Blob object to set for the infer request
preprocess_info – PreProcessInfo object to set for the infer request.

Returns

None

Usage example:

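import numpy as np
from openvino.inference_engine import Blob, IECore, TensorDesc

ie = IECore()
net = ie.read_network(model="./model.xml", weights="./model.bin")
exec_net = ie.load_network(net, "CPU", num_requests=2)
# Describe the tensor: precision, dimensions, and layout.
td = TensorDesc("FP32", (1, 3, 224, 224), "NCHW")
blob_data = np.ones(shape=(1, 3, 224, 224), dtype=np.float32)
blob = Blob(td, blob_data)
# "input_blob_name" is a placeholder for the model's actual input name.
exec_net.requests[0].set_blob(blob_name="input_blob_name", blob=blob)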

set_completion_callback(self, py_callback, py_data=None)

Sets a callback function that is called on success or failure of an asynchronous request

Parameters

py_callback – Any defined or lambda function
py_data – Data that is passed to the callback function

Returns

None

Usage example:

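from openvino.inference_engine import IECore

# The callback receives the request status and the user data passed as py_data.
callback = lambda status, py_data: print(f"Request with id {py_data} finished with status {status}")
ie = IECore()
net = ie.read_network(model="./model.xml", weights="./model.bin")
exec_net = ie.load_network(net, "CPU", num_requests=4)
for request_id, req in enumerate(exec_net.requests):
    req.set_completion_callback(py_callback=callback, py_data=request_id)

# img is assumed to be a numpy.ndarray matching the shape of the "data" input.
for req in exec_net.requests:
    req.async_infer({"data": img})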
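
Printing from the callback is enough for a demonstration, but an application usually needs to know when all requests have finished. The following is a sketch of one way to collect the statuses, assuming the callback may be invoked from a thread other than the caller's (hence the lock). `CompletionTracker` is a hypothetical helper, and the completions are simulated by calling the callback directly so the snippet runs without OpenVINO.

```python
import threading


class CompletionTracker:
    """Collects completion statuses from several asynchronous requests."""

    def __init__(self, expected):
        self._expected = expected
        self._lock = threading.Lock()
        self._all_done = threading.Event()
        self.statuses = {}

    def callback(self, status, py_data):
        # Same (status, py_data) signature as the lambda in the example above.
        with self._lock:
            self.statuses[py_data] = status
            if len(self.statuses) >= self._expected:
                self._all_done.set()

    def wait_all(self, timeout=None):
        """Block until every expected request has reported; True on success."""
        return self._all_done.wait(timeout)


# Simulated completions. A real run would instead register the method with
# req.set_completion_callback(py_callback=tracker.callback, py_data=request_id).
tracker = CompletionTracker(expected=2)
tracker.callback(0, 0)
tracker.callback(0, 1)
print(tracker.wait_all(timeout=1.0), tracker.statuses)
# prints True {0: 0, 1: 0}
```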

wait(self, timeout=None)

Waits for the result to become available. Blocks until the specified timeout elapses or the result becomes available, whichever comes first.

Parameters: timeout – Time to wait in milliseconds, or one of the special values (0, -1) described in the note below. If not specified, the timeout is set to -1.
Returns: Request status code.

Note

There are special values of the timeout parameter:

0 - Immediately returns the inference status. It does not block or interrupt execution. For the meaning of each status, refer to enum_InferenceEngine_StatusCode in the Inference Engine C++ documentation.
-1 - Waits until the inference result becomes available (default value)

Usage example: See the InferRequest.async_infer() method.
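
The non-blocking mode (timeout 0) described in the note above lends itself to polling while other work continues. The sketch below shows such a loop under stated assumptions. The numeric values 0 for OK and -9 for RESULT_NOT_READY are recalled from the C++ StatusCode enum and should be verified against the Inference Engine headers. `poll_until_done` and `FakeRequest` are hypothetical names, with the latter standing in for a real request so the snippet runs without OpenVINO.

```python
import time

OK = 0                 # assumed value of StatusCode OK
RESULT_NOT_READY = -9  # assumed value of StatusCode RESULT_NOT_READY


def poll_until_done(request, interval_s=0.01, max_polls=1000):
    """Poll request.wait(0) until it stops reporting RESULT_NOT_READY."""
    for _ in range(max_polls):
        status = request.wait(0)  # timeout 0: return the current status at once
        if status != RESULT_NOT_READY:
            return status
        time.sleep(interval_s)    # other work could be done here instead
    raise TimeoutError("request did not finish within the polling budget")


class FakeRequest:
    """Hypothetical stand-in that becomes ready after a few polls."""

    def __init__(self, polls_until_ready):
        self._remaining = polls_until_ready

    def wait(self, timeout=-1):
        if self._remaining > 0:
            self._remaining -= 1
            return RESULT_NOT_READY
        return OK


print(poll_until_done(FakeRequest(polls_until_ready=3), interval_s=0.0))
# prints 0
```

When nothing else needs to run in the meantime, calling `wait()` with the default timeout of -1 is simpler than polling.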