class ov::InferRequest

Overview

This class represents an inference request that can be run in synchronous or asynchronous mode.

#include <infer_request.hpp>

class InferRequest
{
public:
    // construction

    InferRequest();
    InferRequest(const InferRequest& other);
    InferRequest(InferRequest&& other);

    // methods

    InferRequest& operator = (const InferRequest& other);
    InferRequest& operator = (InferRequest&& other);
    void set_tensor(const std::string& tensor_name, const Tensor& tensor);
    void set_tensor(const ov::Output<const ov::Node>& port, const Tensor& tensor);
    void set_tensor(const ov::Output<ov::Node>& port, const Tensor& tensor);

    void set_tensors(
        const std::string& tensor_name,
        const std::vector<Tensor>& tensors
        );

    void set_tensors(
        const ov::Output<const ov::Node>& port,
        const std::vector<Tensor>& tensors
        );

    void set_input_tensor(size_t idx, const Tensor& tensor);
    void set_input_tensor(const Tensor& tensor);
    void set_input_tensors(const std::vector<Tensor>& tensors);
    void set_input_tensors(size_t idx, const std::vector<Tensor>& tensors);
    void set_output_tensor(size_t idx, const Tensor& tensor);
    void set_output_tensor(const Tensor& tensor);
    Tensor get_tensor(const std::string& tensor_name);
    Tensor get_tensor(const ov::Output<const ov::Node>& port);
    Tensor get_tensor(const ov::Output<ov::Node>& port);
    Tensor get_input_tensor(size_t idx);
    Tensor get_input_tensor();
    Tensor get_output_tensor(size_t idx);
    Tensor get_output_tensor();
    void infer();
    void cancel();
    std::vector<ProfilingInfo> get_profiling_info() const;
    void start_async();
    void wait();
    bool wait_for(const std::chrono::milliseconds timeout);
    void set_callback(std::function<void(std::exception_ptr)> callback);
    std::vector<VariableState> query_state();
    CompiledModel get_compiled_model();
    bool operator ! () const;
    operator bool () const;
    bool operator != (const InferRequest& other) const;
    bool operator == (const InferRequest& other) const;
};

Detailed Documentation

This class represents an inference request that can be run in synchronous or asynchronous mode.

Construction

InferRequest()

Default constructor.

InferRequest(const InferRequest& other)

Default copy constructor.

Parameters:

other

Another InferRequest object.

InferRequest(InferRequest&& other)

Default move constructor.

Parameters:

other

Another InferRequest object.

Methods

InferRequest& operator = (const InferRequest& other)

Default copy assignment operator.

Parameters:

other

Another InferRequest object.

Returns:

Reference to the current object.

InferRequest& operator = (InferRequest&& other)

Default move assignment operator.

Parameters:

other

Another InferRequest object.

Returns:

Reference to the current object.

void set_tensor(const std::string& tensor_name, const Tensor& tensor)

Sets an input/output tensor to infer on.

Parameters:

tensor_name

Name of the input or output tensor.

tensor

Reference to the tensor. The element_type and shape of the tensor must match the model's input/output element_type and shape.
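A minimal usage sketch, assuming a model with an f32 input named "input" of shape {1, 3, 224, 224} (both the name and the shape are hypothetical):

#include <openvino/openvino.hpp>

// Sketch: set an input by tensor name. "input" and the shape are
// hypothetical; substitute the names and shapes of your own model.
void set_by_name(ov::InferRequest& request) {
    ov::Tensor tensor(ov::element::f32, ov::Shape{1, 3, 224, 224});
    request.set_tensor("input", tensor);
}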

void set_tensor(const ov::Output<const ov::Node>& port, const Tensor& tensor)

Sets an input/output tensor to infer.

Parameters:

port

Port of the input or output tensor. Use ov::Model::input(), ov::Model::inputs(), ov::Model::output(), ov::Model::outputs(), ov::CompiledModel::input(), ov::CompiledModel::inputs(), ov::CompiledModel::output(), or ov::CompiledModel::outputs() to get the ports.

tensor

Reference to a tensor. The element_type and shape of the tensor must match the model's input/output element_type and shape.

void set_tensor(const ov::Output<ov::Node>& port, const Tensor& tensor)

Sets an input/output tensor to infer.

Parameters:

port

Port of the input or output tensor. Use ov::Model::input(), ov::Model::inputs(), ov::Model::output(), ov::Model::outputs(), ov::CompiledModel::input(), ov::CompiledModel::inputs(), ov::CompiledModel::output(), or ov::CompiledModel::outputs() to get the ports.

tensor

Reference to a tensor. The element_type and shape of the tensor must match the model's input/output element_type and shape.
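A sketch of the port-based overloads, assuming a single-input compiled model with a static input shape:

#include <openvino/openvino.hpp>

// Sketch: allocate a tensor that matches the first input port and bind it.
// Assumes the model has exactly one input and a static shape.
void set_by_port(ov::CompiledModel& compiled, ov::InferRequest& request) {
    ov::Output<const ov::Node> port = compiled.input();
    ov::Tensor tensor(port.get_element_type(), port.get_shape());
    request.set_tensor(port, tensor);
}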

void set_tensors(
    const std::string& tensor_name,
    const std::vector<Tensor>& tensors
    )

Sets a batch of tensors for input data to infer by tensor name. The model input must have a batch dimension, and the number of tensors must match the batch size. The current version supports setting tensors to model inputs only. If tensor_name is associated with an output (or any other non-input node), an exception is thrown.

Parameters:

tensor_name

Name of the input tensor.

tensors

Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors must match the input size.

void set_tensors(
    const ov::Output<const ov::Node>& port,
    const std::vector<Tensor>& tensors
    )

Sets a batch of tensors for input data to infer by input port. The model input must have a batch dimension, and the number of tensors must match the batch size. The current version supports setting tensors to model inputs only. If port is associated with an output (or any other non-input node), an exception is thrown.

Parameters:

port

Port of the input tensor.

tensors

Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors must match the input size.
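As an illustration, a sketch that supplies a batch of two as two separate per-item tensors; the shapes are hypothetical, and the device must support this batching mode:

#include <openvino/openvino.hpp>
#include <vector>

// Sketch: feed a {2, 3, 224, 224} input as two {1, 3, 224, 224} tensors.
// Shapes are hypothetical; throws if the port is not a model input.
void set_batched(ov::InferRequest& request, const ov::Output<const ov::Node>& port) {
    std::vector<ov::Tensor> batch;
    batch.emplace_back(ov::element::f32, ov::Shape{1, 3, 224, 224});
    batch.emplace_back(ov::element::f32, ov::Shape{1, 3, 224, 224});
    request.set_tensors(port, batch);
}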

void set_input_tensor(size_t idx, const Tensor& tensor)

Sets an input tensor to infer.

Parameters:

idx

Index of the input tensor. If idx is greater than the number of model inputs, an exception is thrown.

tensor

Reference to the tensor. The element_type and shape of the tensor must match the model's input element_type and shape.

void set_input_tensor(const Tensor& tensor)

Sets an input tensor for models with a single input.

If the model has several inputs, an exception is thrown.

Parameters:

tensor

Reference to the input tensor.
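A sketch for the single-input case; the f32 element type and shape are assumptions:

#include <openvino/openvino.hpp>
#include <algorithm>

// Sketch: build and bind the only input of a single-input model.
// The f32 type and {1, 10} shape are hypothetical.
void fill_single_input(ov::InferRequest& request) {
    ov::Tensor input(ov::element::f32, ov::Shape{1, 10});
    std::fill_n(input.data<float>(), input.get_size(), 1.0f);
    request.set_input_tensor(input);  // throws if the model has several inputs
}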

void set_input_tensors(const std::vector<Tensor>& tensors)

Sets a batch of tensors for a single input. The model input must have a batch dimension, and the number of tensors must match the batch size.

Parameters:

tensors

Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors must match the input size.

void set_input_tensors(size_t idx, const std::vector<Tensor>& tensors)

Sets a batch of tensors for input data to infer by input index. The model input must have a batch dimension, and the number of tensors must match the batch size.

Parameters:

idx

Index of the input tensor.

tensors

Input tensors for batched infer request. The type of each tensor must match the model input element type and shape (except batch dimension). Total size of tensors must match the input size.

void set_output_tensor(size_t idx, const Tensor& tensor)

Sets an output tensor to infer.

The index of the output is preserved across ov::Model, ov::CompiledModel, and ov::InferRequest.

Parameters:

idx

Index of the output tensor.

tensor

Reference to the output tensor. The type of the tensor must match the model output element type and shape.

void set_output_tensor(const Tensor& tensor)

Sets an output tensor for models with a single output.

If the model has several outputs, an exception is thrown.

Parameters:

tensor

Reference to the output tensor.
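One common use is writing results directly into caller-owned memory; a sketch, assuming a single f32 output of shape {1, 1000}:

#include <openvino/openvino.hpp>
#include <vector>

// Sketch: wrap a caller-owned buffer so inference writes into it directly.
// The f32 type and {1, 1000} shape are hypothetical.
void use_own_output_buffer(ov::InferRequest& request) {
    std::vector<float> buffer(1000);
    ov::Tensor output(ov::element::f32, ov::Shape{1, 1000}, buffer.data());
    request.set_output_tensor(output);  // throws if the model has several outputs
    request.infer();                    // results land in buffer
}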

Tensor get_tensor(const std::string& tensor_name)

Gets an input/output tensor for inference by tensor name.

Parameters:

tensor_name

Name of a tensor to get.

Returns:

The tensor with name tensor_name. If the tensor is not found, an exception is thrown.
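A sketch of in-place access through the request's own buffer; the tensor name "data" and the f32 element type are hypothetical:

#include <openvino/openvino.hpp>
#include <algorithm>

// Sketch: fetch the request's buffer by name and zero-fill it in place.
// "data" is a hypothetical tensor name; f32 is an assumed element type.
void fill_by_name(ov::InferRequest& request) {
    ov::Tensor input = request.get_tensor("data");  // throws if not found
    std::fill_n(input.data<float>(), input.get_size(), 0.0f);
}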

Tensor get_tensor(const ov::Output<const ov::Node>& port)

Gets an input/output tensor for inference.

If the tensor with the specified port is not found, an exception is thrown.

Parameters:

port

Port of the tensor to get.

Returns:

Tensor for the port port.

Tensor get_tensor(const ov::Output<ov::Node>& port)

Gets an input/output tensor for inference.

If the tensor with the specified port is not found, an exception is thrown.

Parameters:

port

Port of the tensor to get.

Returns:

Tensor for the port port.

Tensor get_input_tensor(size_t idx)

Gets an input tensor for inference.

Parameters:

idx

Index of the tensor to get.

Returns:

Tensor with the input index idx. If the tensor with the specified idx is not found, an exception is thrown.

Tensor get_input_tensor()

Gets an input tensor for inference.

Returns:

The input tensor for the model. If the model has several inputs, an exception is thrown.
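A sketch that visits every input by index; the f32 element type is an assumption:

#include <openvino/openvino.hpp>
#include <algorithm>

// Sketch: zero-fill each input tensor by index. Assumes all inputs are f32.
void clear_inputs(ov::CompiledModel& compiled, ov::InferRequest& request) {
    for (size_t i = 0; i < compiled.inputs().size(); ++i) {
        ov::Tensor input = request.get_input_tensor(i);
        std::fill_n(input.data<float>(), input.get_size(), 0.0f);
    }
}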

Tensor get_output_tensor(size_t idx)

Gets an output tensor for inference.

Parameters:

idx

Index of the tensor to get.

Returns:

Tensor with the output index idx. If the tensor with the specified idx is not found, an exception is thrown.

Tensor get_output_tensor()

Gets an output tensor for inference.

Returns:

Output tensor for the model. If the model has several outputs, an exception is thrown.

void infer()

Infers specified input(s) in synchronous mode.

It blocks all methods of InferRequest while the request is ongoing (running or waiting in a queue); calling any method during that time throws an ov::Busy exception.
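A minimal end-to-end synchronous sketch; "model.xml" and "CPU" are placeholder values, and the model is assumed to have one f32 input and one f32 output:

#include <openvino/openvino.hpp>
#include <algorithm>

int main() {
    ov::Core core;
    ov::CompiledModel compiled = core.compile_model("model.xml", "CPU");
    ov::InferRequest request = compiled.create_infer_request();

    // Fill the preallocated input buffer (single-input model assumed).
    ov::Tensor input = request.get_input_tensor();
    std::fill_n(input.data<float>(), input.get_size(), 0.0f);

    request.infer();  // blocks until results are ready

    ov::Tensor output = request.get_output_tensor();
    // ... consume output.data<float>() ...
    return 0;
}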

void cancel()

Cancels inference request.

std::vector<ProfilingInfo> get_profiling_info() const

Queries performance measures per layer to identify the most time-consuming operation.

Not all plugins provide meaningful data.

Returns:

Vector of profiling information for operations in a model.
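A sketch that prints per-operation timings; note that profiling data is typically only collected when the model is compiled with ov::enable_profiling(true):

#include <openvino/openvino.hpp>
#include <iostream>

// Sketch: dump per-operation timings after an inference has completed.
// The vector may be empty if profiling was not enabled at compile time.
void print_profiling(ov::InferRequest& request) {
    for (const ov::ProfilingInfo& info : request.get_profiling_info()) {
        std::cout << info.node_name << " [" << info.node_type << "]: "
                  << info.real_time.count() << " us\n";
    }
}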

void start_async()

Starts inference of specified input(s) in asynchronous mode.

It returns immediately, and inference also starts immediately. Calling any method while the request is in the running state throws an ov::Busy exception.

void wait()

Waits for the result to become available. Blocks the calling thread until inference is complete.
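A sketch of the basic asynchronous pattern, overlapping host-side work with inference:

#include <openvino/openvino.hpp>

// Sketch: start inference, do unrelated work, then block for the result.
void async_infer(ov::InferRequest& request) {
    request.start_async();  // returns immediately
    // ... do other work on this thread while the device computes ...
    request.wait();         // blocks until the result is available
}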

bool wait_for(const std::chrono::milliseconds timeout)

Waits for the result to become available. Blocks until the specified timeout has elapsed or the result becomes available, whichever comes first.

Parameters:

timeout

Maximum duration, in milliseconds, to block for.

Returns:

True if the inference request is ready; false otherwise.
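A sketch that polls in 10 ms slices instead of blocking outright (the interval is arbitrary):

#include <openvino/openvino.hpp>
#include <chrono>

// Sketch: poll for completion so the thread can do housekeeping in between.
void poll_until_done(ov::InferRequest& request) {
    request.start_async();
    while (!request.wait_for(std::chrono::milliseconds(10))) {
        // not ready yet: update progress, check a cancellation flag, ...
    }
}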

void set_callback(std::function<void(std::exception_ptr)> callback)

Sets a callback std::function that is called on success or failure of an asynchronous request.

Do not capture strong references to OpenVINO runtime objects in the callback. The following objects must not be captured by value:

  • ov::InferRequest

  • ov::ExecutableNetwork

  • ov::Core

Because these objects implement the shared-reference concept, capturing them by value can lead to memory leaks or undefined behavior. Use weak references or pointers instead.

Parameters:

callback

Callback object that is called when inference finishes.
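A sketch that follows the warning above: the request is captured by raw pointer rather than by value, and the caller keeps it alive until the callback has run:

#include <openvino/openvino.hpp>
#include <iostream>

// Sketch: report success or failure from the completion callback.
// Capturing the request by pointer avoids a strong reference cycle.
void infer_with_callback(ov::InferRequest& request) {
    auto* req = &request;
    request.set_callback([req](std::exception_ptr ex) {
        if (ex) {
            try { std::rethrow_exception(ex); }
            catch (const std::exception& e) { std::cerr << "Failed: " << e.what() << '\n'; }
            return;
        }
        std::cout << "Done; output bytes: " << req->get_output_tensor().get_byte_size() << '\n';
    });
    request.start_async();
    request.wait();  // guarantees the callback has finished before req dangles
}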

std::vector<VariableState> query_state()

Gets the state control interface for the given infer request.

State control is essential for recurrent models.

Returns:

Vector of VariableState objects.
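A sketch of resetting all states between independent sequences, as is typical for stateful (recurrent) models:

#include <openvino/openvino.hpp>
#include <vector>

// Sketch: return every variable state to its initial value.
void reset_states(ov::InferRequest& request) {
    std::vector<ov::VariableState> states = request.query_state();
    for (ov::VariableState& state : states) {
        state.reset();
    }
}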

CompiledModel get_compiled_model()

Returns the compiled model that created this inference request.

Returns:

Compiled model object.

bool operator ! () const

Checks if the current InferRequest object is not initialized.

Returns:

True if the current InferRequest object is not initialized; false otherwise.

operator bool () const

Checks if the current InferRequest object is initialized.

Returns:

True if the current InferRequest object is initialized; false otherwise.
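A small sketch of the validity checks; a default-constructed request stays uninitialized until it is assigned from ov::CompiledModel::create_infer_request():

#include <openvino/openvino.hpp>

// Sketch: guard against using an empty (default-constructed) request.
void run_if_valid(ov::InferRequest& request) {
    if (!request) {
        return;  // not backed by an implementation yet
    }
    request.infer();
}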

bool operator != (const InferRequest& other) const

Compares whether this request wraps the same impl underneath.

Parameters:

other

Another inference request.

Returns:

True if the current InferRequest object does not wrap the same impl as other; false otherwise.

bool operator == (const InferRequest& other) const

Compares whether this request wraps the same impl underneath.

Parameters:

other

Another inference request.

Returns:

True if the current InferRequest object wraps the same impl as other; false otherwise.