openvino.inference_engine.InferRequest¶
- class openvino.inference_engine.InferRequest¶
Bases: object
This class provides an interface to infer requests of ExecutableNetwork and serves to handle infer requests execution and to set and get output data.
- __init__()¶
There is no explicit class constructor. To make a valid InferRequest instance, use the IECore.load_network() method of the IECore class with a specified number of requests to get an ExecutableNetwork instance which stores infer requests.
Methods
__delattr__(name, /): Implement delattr(self, name).
__dir__(): Default dir() implementation.
__eq__(value, /): Return self==value.
__format__(format_spec, /): Default object formatter.
__ge__(value, /): Return self>=value.
__getattribute__(name, /): Return getattr(self, name).
__gt__(value, /): Return self>value.
__hash__(): Return hash(self).
__init__(): There is no explicit class constructor.
__init_subclass__(): This method is called when a class is subclassed.
__le__(value, /): Return self<=value.
__lt__(value, /): Return self<value.
__ne__(value, /): Return self!=value.
__new__(**kwargs)
__reduce__(): InferRequest.__reduce_cython__(self)
__reduce_ex__(protocol, /): Helper for pickle.
__repr__(): Return repr(self).
__setattr__(name, value, /): Implement setattr(self, name, value).
__setstate__(): InferRequest.__setstate_cython__(self, __pyx_state)
__sizeof__(): Size of object in memory, in bytes.
__str__(): Return str(self).
__subclasshook__(): Abstract classes can override this to customize issubclass().
_fill_inputs(self, inputs)
_get_blob_buffer(self, string blob_name)
async_infer(self[, inputs]): Starts asynchronous inference of the infer request and fills the outputs array.
get_perf_counts(self): Queries performance measures per layer to get feedback on which layer is the most time-consuming.
infer(self[, inputs]): Starts synchronous inference of the infer request and fills the outputs array.
query_state(self): Gets the state control interface for the given infer request. State control is essential for recurrent networks. Returns a list of memory state objects.
set_batch(self, size): Sets a new batch size for a certain infer request when dynamic batching is enabled in the executable network that created this request.
set_blob(self, blob_name, blob, ...): Sets a user-defined Blob for the infer request.
set_completion_callback(self, py_callback[, ...]): Sets a callback function that is called on success or failure of an asynchronous request.
wait(self[, timeout]): Waits for the result to become available.
Attributes
_inputs_list: object
_outputs_list: object
_py_callback: object
_py_data: object
_user_blobs: object
input_blobs: Dictionary that maps input layer names to corresponding Blobs
latency: Current infer request inference time in milliseconds
output_blobs: Dictionary that maps output layer names to corresponding Blobs
preprocess_info: Dictionary that maps input layer names to corresponding preprocessing information
- __class__¶
alias of type
- __delattr__(name, /)¶
Implement delattr(self, name).
- __dir__()¶
Default dir() implementation.
- __eq__(value, /)¶
Return self==value.
- __format__(format_spec, /)¶
Default object formatter.
- __ge__(value, /)¶
Return self>=value.
- __getattribute__(name, /)¶
Return getattr(self, name).
- __gt__(value, /)¶
Return self>value.
- __hash__()¶
Return hash(self).
- __init__()¶
There is no explicit class constructor. To make a valid InferRequest instance, use the IECore.load_network() method of the IECore class with a specified number of requests to get an ExecutableNetwork instance which stores infer requests.
- __init_subclass__()¶
This method is called when a class is subclassed.
The default implementation does nothing. It may be overridden to extend subclasses.
- __le__(value, /)¶
Return self<=value.
- __lt__(value, /)¶
Return self<value.
- __ne__(value, /)¶
Return self!=value.
- __new__(**kwargs)¶
- __pyx_vtable__ = <capsule object NULL>¶
- __reduce__()¶
InferRequest.__reduce_cython__(self)
- __reduce_ex__(protocol, /)¶
Helper for pickle.
- __repr__()¶
Return repr(self).
- __setattr__(name, value, /)¶
Implement setattr(self, name, value).
- __setstate__()¶
InferRequest.__setstate_cython__(self, __pyx_state)
- __sizeof__()¶
Size of object in memory, in bytes.
- __str__()¶
Return str(self).
- __subclasshook__()¶
Abstract classes can override this to customize issubclass().
This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).
- _fill_inputs(self, inputs)¶
- _get_blob_buffer(self, string blob_name) → BlobBuffer¶
- _inputs_list¶
_inputs_list: object
- _outputs_list¶
_outputs_list: object
- _py_callback¶
_py_callback: object
- _py_data¶
_py_data: object
- _user_blobs¶
_user_blobs: object
- async_infer(self, inputs=None)¶
Starts asynchronous inference of the infer request and fills the outputs array.
- Parameters
inputs – A dictionary that maps input layer names to numpy.ndarray objects of proper shape with input data for the layer
- Returns
None
Usage example:
exec_net = ie_core.load_network(network=net, device_name="CPU", num_requests=2)
exec_net.requests[0].async_infer({input_blob: image})
request_status = exec_net.requests[0].wait()
res = exec_net.requests[0].output_blobs['prob']
- get_perf_counts(self)¶
Queries performance measures per layer to get feedback on which layer is the most time-consuming.
Note
Performance counters data and format depend on the plugin.
- Returns
Dictionary containing per-layer execution information.
Usage example:
exec_net = ie_core.load_network(network=net, device_name="CPU", num_requests=2)
exec_net.requests[0].infer({input_blob: image})
exec_net.requests[0].get_perf_counts()
# {'Conv2D': {'exec_type': 'jit_avx2_1x1',
#             'real_time': 154,
#             'cpu_time': 154,
#             'status': 'EXECUTED',
#             'layer_type': 'Convolution'},
#  'Relu6':  {'exec_type': 'undef',
#             'real_time': 0,
#             'cpu_time': 0,
#             'status': 'NOT_RUN',
#             'layer_type': 'Clamp'}
#  ...
# }
- infer(self, inputs=None)¶
Starts synchronous inference of the infer request and fills the outputs array.
- Parameters
inputs – A dictionary that maps input layer names to numpy.ndarray objects of proper shape with input data for the layer
- Returns
None
Usage example:
exec_net = ie_core.load_network(network=net, device_name="CPU", num_requests=2)
exec_net.requests[0].infer({input_blob: image})
res = exec_net.requests[0].output_blobs['prob']
np.flip(np.sort(np.squeeze(res)), 0)
# array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
#        5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
#        2.26027006e-03, 2.12283316e-03 ...])
- input_blobs¶
Dictionary that maps input layer names to corresponding Blobs
- latency¶
Current infer request inference time in milliseconds
- output_blobs¶
Dictionary that maps output layer names to corresponding Blobs
- preprocess_info¶
Dictionary that maps input layer names to corresponding preprocessing information
- query_state(self)¶
Gets the state control interface for the given infer request. State control is essential for recurrent networks.
- Returns
A vector of Memory State objects
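Unlike the other methods, query_state() has no usage example here. A minimal sketch of resetting the memory states of a recurrent network between independent input sequences, assuming (as in the Inference Engine Python API) that each returned state object exposes a reset() method:

```python
def reset_request_states(request):
    """Reset every memory state of the given infer request.

    Typically done between unrelated input sequences so that the
    hidden state accumulated for one sequence does not leak into
    the next. Assumes each object returned by query_state() has a
    reset() method.
    """
    for state in request.query_state():
        state.reset()  # clear the state before the next sequence
```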
- set_batch(self, size)¶
Sets a new batch size for a certain infer request when dynamic batching is enabled in the executable network that created this request.
Note
Support of dynamic batch size depends on the target plugin.
- Parameters
size – New batch size to be used by all the following inference calls for this request
- Returns
None
Usage example:
ie = IECore()
net = ie.read_network(model=path_to_xml_file, weights=path_to_bin_file)
# Set max batch size
net.batch = 10
ie.set_config(config={"DYN_BATCH_ENABLED": "YES"}, device_name=device)
exec_net = ie.load_network(network=net, device_name=device)
# Set batch size for certain network.
# NOTE: Input data shape will not be changed, but will be used partially
# in inference, which increases performance
exec_net.requests[0].set_batch(2)
- set_blob(self, blob_name: str, blob: Blob, preprocess_info: PreProcessInfo = None)¶
Sets a user-defined Blob for the infer request.
- Parameters
blob_name – A name of the input blob
blob – Blob object to set for the infer request
preprocess_info – PreProcessInfo object to set for the infer request
- Returns
None
Usage example:
ie = IECore()
net = IENetwork("./model.xml", "./model.bin")
exec_net = ie.load_network(net, "CPU", num_requests=2)
td = TensorDesc("FP32", (1, 3, 224, 224), "NCHW")
blob_data = np.ones(shape=(1, 3, 224, 224), dtype=np.float32)
blob = Blob(td, blob_data)
exec_net.requests[0].set_blob(blob_name="input_blob_name", blob=blob)
- set_completion_callback(self, py_callback, py_data=None)¶
Sets a callback function that is called on success or failure of an asynchronous request.
- Parameters
py_callback – Any defined or lambda function
py_data – Data that is passed to the callback function
- Returns
None
Usage example:
callback = lambda status, py_data: print(f"Request with id {py_data} finished with status {status}")

ie = IECore()
net = ie.read_network(model="./model.xml", weights="./model.bin")
exec_net = ie.load_network(net, "CPU", num_requests=4)

for id, req in enumerate(exec_net.requests):
    req.set_completion_callback(py_callback=callback, py_data=id)

for req in exec_net.requests:
    req.async_infer({"data": img})
- wait(self, timeout=None)¶
Waits for the result to become available. Blocks until the specified timeout elapses or the result becomes available, whichever comes first.
- Parameters
timeout – Time to wait in milliseconds, or one of the special (0, -1) values described below. If not specified, timeout is set to -1 by default.
- Returns
Request status code.
Note
There are special values of the timeout parameter:
0 - Immediately returns the inference status. It does not block or interrupt execution. For the meaning of the statuses, refer to the InferenceEngine::StatusCode enum in the Inference Engine C++ documentation.
-1 - Waits until the inference result becomes available (default value).
Usage example: See the InferRequest.async_infer() method of the InferRequest class.
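The two special timeout values lend themselves to a polling pattern: check readiness with a non-blocking wait(0), do other work, and only block once the result is ready. A minimal sketch, assuming a request object with the wait() semantics described above; the RESULT_NOT_READY code of -9 is taken from the InferenceEngine StatusCode enum and should be checked against your OpenVINO version:

```python
import time

OK = 0                 # InferenceEngine StatusCode: OK
RESULT_NOT_READY = -9  # InferenceEngine StatusCode: RESULT_NOT_READY
                       # (assumed value; verify against your version)

def poll_request(request, interval_s=0.005):
    """Poll an asynchronous request with non-blocking wait(0) calls,
    then return the final request status code."""
    while request.wait(0) == RESULT_NOT_READY:
        time.sleep(interval_s)  # a real app would do useful work here
    return request.wait(-1)     # result is ready; returns the status code
```

Compared with a single blocking wait(-1), this pattern keeps the calling thread free to service other requests or UI work between polls.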