ie_transformations.hpp File Reference

This header file defines the list of public transformations. More...

#include <ie_api.h>
#include <cpp/ie_cnn_network.h>


Functions

void InferenceEngine::LowLatency (InferenceEngine::CNNNetwork &network)
 The transformation finds all TensorIterator layers in the network, processes all back edges that describe a connection between the Result and Parameter of the TensorIterator body, inserts a ReadValue layer between the Parameter and the layers that follow it, and inserts an Assign layer between the layers that precede the Result and the Result itself. Supported platforms: CPU, GNA. More...
 

Detailed Description

This header file defines the list of public transformations.

Function Documentation

◆ LowLatency()

void InferenceEngine::LowLatency ( InferenceEngine::CNNNetwork & network)

The transformation finds all TensorIterator layers in the network, processes all back edges that describe a connection between the Result and Parameter of the TensorIterator body, inserts a ReadValue layer between the Parameter and the layers that follow it, and inserts an Assign layer between the layers that precede the Result and the Result itself. Supported platforms: CPU, GNA.

The example below describes the changes to the inner part (body, back edges) of the TensorIterator layer.

[] - TensorIterator body
() - new layer

before applying the transformation:

back_edge_1 -> [Parameter -> some layers ... -> Result ] -> back_edge_1

after applying the transformation:

back_edge_1 -> [Parameter -> (ReadValue layer) -> some layers ... -> (Assign layer) ]
                                                                       \
                                                                        -> Result ] -> back_edge_1

It is recommended to use this transformation together with the Reshape feature, to set the sequence dimension to 1, and with the UnrollTensorIterator transformation. For convenience, the UnrollTensorIterator transformation is executed unconditionally whenever the LowLatency transformation is applied for the CPU and GNA plugins, so no extra action is required. After applying both transformations, the resulting network can be inferred step by step, and the states are stored between inferences.

An illustrative example, not real API:

network->reshape(...)  // Set sequence dimension to 1, recalculating shapes. Optional, depends on the network.
LowLatency(network)    // Applying LowLatency and UnrollTensorIterator transformations.
network->infer(...)    // Calculating new values for states.
// All states are stored between inferences via Assign, ReadValue layers.
network->infer(...)    // Using stored states, calculating new values for states.

Parameters

network	A network to apply the LowLatency transformation to