There are several reasons why the Model Optimizer may fail to generate an Intermediate Representation for a model. However, in some cases, the Intermediate Representation can be generated after providing certain hints to the tool. The examples of hints below mostly relate to TensorFlow*, but could apply to models created in any framework:
Detailed solutions for the examples above are given later; the next subsection shows what all three examples have in common.
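In these cases, a sub-graph (or a single node) of the initial graph is replaced with a new sub-graph (or a single node). The sub-graph replacement consists of the following steps:
1. Identify an existing sub-graph for replacement.
2. Generate a new sub-graph.
3. Connect the new sub-graph to the graph (create input edges to the new sub-graph).
4. Create output edges from the new sub-graph to the graph.
5. Handle the original sub-graph (for example, remove it).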
Model Optimizer provides several ways to perform most of the sub-graph replacement steps. The next subsections describe these methods.
For example, TensorFlow has an operation SquaredDifference that calculates $(a - b)^2$, where $a$ and $b$ are input tensors. Inference Engine does not support this operation. However, SquaredDifference can be expressed using two Power operations and one Eltwise Add. The Power operation calculates $(\text{scale} \cdot a + \text{shift})^{\text{power}}$, where $a$ is a tensor and scale, power, and shift are float values. The first Power operation negates the value of tensor $b$. The second one squares the result of $a + (-b)$, which is calculated using the Eltwise Add operation applied to tensor $a$ and tensor $-b$.
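The decomposition can be checked numerically. The following is a minimal sketch in plain Python, not Model Optimizer code: power_op and eltwise_add are hypothetical stand-ins for the Power and Eltwise Add operations, and the Power formula (scale * a + shift) ** power is an assumption of the sketch.

```python
def power_op(a, scale=1.0, power=1.0, shift=0.0):
    """Hypothetical stand-in for Power: elementwise (scale * x + shift) ** power."""
    return [(scale * x + shift) ** power for x in a]


def eltwise_add(a, b):
    """Hypothetical stand-in for Eltwise Add: elementwise sum of two tensors."""
    return [x + y for x, y in zip(a, b)]


def squared_difference(a, b):
    """Reference result: elementwise (a - b) ** 2."""
    return [(x - y) ** 2 for x, y in zip(a, b)]


def squared_difference_replaced(a, b):
    """Same result expressed with two Power operations and one Eltwise Add."""
    negated_b = power_op(b, scale=-1.0)    # first Power: -b
    summed = eltwise_add(a, negated_b)     # Eltwise Add: a + (-b)
    return power_op(summed, power=2.0)     # second Power: squares the sum


a = [1.0, 2.5, -3.0]
b = [4.0, 0.5, -1.0]
assert squared_difference_replaced(a, b) == squared_difference(a, b)
```

The values are chosen so that every intermediate result is exactly representable as a float, which makes the exact comparison safe.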
Given that, all SquaredDifference operations in the initial model can be replaced with two Power operations and one Eltwise operation. The replacer is implemented in the following file
The Model Optimizer internal representation of the graph uses the networkx module.
The class FrontReplacementOp is used to replace an operation of a particular type with a new sub-graph. This class performs the first step of the sub-graph replacement (identifies an existing sub-graph for replacement). Note that the replacement happens before shape inference and before the creation of data nodes representing tensors with values. At this stage of the model conversion pipeline, all nodes in the graph are operation nodes or nodes of type Const, which produce a tensor with a fixed value embedded into the node.
The class Node represents a single node in the computation graph.
The operation classes Power and Eltwise are inherited from the base class mo.ops.Op, which represents an operation and stores its attributes.
The class SquaredDifference is inherited from FrontReplacementOp. This is a replacer class that is automatically registered and executed by the Model Optimizer. Since the class is located in the common (not framework-specific) directory <INSTALL_DIR>/deployment_tools/model_optimizer/extensions/front, it is used for replacement in all supported frameworks.
The class attribute op stores the name of the operation to be replaced. In this case, it is SquaredDifference.
The class attribute enabled controls whether the replacer is enabled. The only function that must be implemented in the class is replace_op. It receives the graph to operate on and an instance of a node of the desired operation (SquaredDifference in this case). This function performs steps two and three of the sub-graph replacement (generates a new sub-graph and connects it to the graph).
The create_node method of the Op class generates a Node from the Op and takes a single mandatory argument: the list of input nodes (represented as instances of the Node class) used to create input edges to the node being generated. Inputs of the SquaredDifference node are retrieved using the node.in_node(0) and node.in_node(1) method calls. The Eltwise Add node gets the first input of the SquaredDifference node as its first input; the second input of add is the negated second input of SquaredDifference: [add.create_node([node.in_node(0), negate.create_node([node.in_node(1)])])]. Then the result of the Add node is squared. The out_node node performs this calculation.
The replace_op function returns a list of node names used to create output edges of the sub-graph and connect it with the rest of the graph. Each element of the list describes the mapping between an old output edge of the matched node and a new sub-graph node and output edge index. The i-th element of the list corresponds to the i-th output tensor of the matched node. In this case, SquaredDifference produces a single tensor through output port 0, so the returned list contains a single element. In general, each element is a tuple, where the first element is the name of a new node producing the required tensor and the second is the output port for that tensor. If the output port is 0, a shortcut is possible: just the name of the node instead of a tuple. Line 26 uses this shortcut. The returned value is used to create the new sub-graph output edges (step 4 of the sub-graph replacement).
The default implementation of the FrontReplacementOp class removes the matched node and all its input/output edges (step 5 of the sub-graph replacement).
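The return-value convention can be illustrated with a short sketch. This is plain Python, not the Model Optimizer implementation: normalize_outputs and output_edge_mapping are hypothetical helpers that show how a list mixing bare node names and (name, port) tuples could be expanded into explicit mappings.

```python
def normalize_outputs(returned):
    """Expand the port-0 shortcut: a bare name becomes (name, 0)."""
    normalized = []
    for item in returned:
        if isinstance(item, tuple):
            name, port = item
        else:
            name, port = item, 0
        normalized.append((name, port))
    return normalized


def output_edge_mapping(matched_node, returned):
    """Map the i-th output port of the matched node to the i-th returned element."""
    return {
        (matched_node, i): target
        for i, target in enumerate(normalize_outputs(returned))
    }


# Node names below are illustrative only.
mapping = output_edge_mapping("squared_difference", ["squared"])
assert mapping == {("squared_difference", 0): ("squared", 0)}
```

A replacer for an operation with two outputs could return, for example, ["first", ("second", 1)], which this sketch would expand to port 0 of "first" and port 1 of "second".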
Another example of this kind of replacement is in the file <INSTALL_DIR>/deployment_tools/model_optimizer/extensions/front/Sub.py, where all instances of Sub operations are replaced with two operations: Power to negate the second argument and Eltwise to perform the elementwise addition.
The previous example considered the case when a single node of a specific type is replaced. To replace a sub-graph of operations, you must tell the Model Optimizer how to identify this sub-graph. There are three ways to achieve that:
- Use the graph isomorphism pattern of the networkx module
- Use a node name pattern to identify the scope (according to TensorFlow terminology) to be replaced
- Use sets of start and end node names to match all nodes "between" them
The next sections explain each option using real examples.
The networkx Python* module provides methods to find a graph isomorphic to a given one using node and edge matching: for example, networkx.algorithms.isomorphism.categorical_multiedge_match. The Model Optimizer uses these methods and provides a simple API for this feature.
For example, Caffe* has a layer called Mean-Variance Normalization (MVN), which is also supported by the Inference Engine. In TensorFlow, this layer is implemented with low-level operations such as Mean, StopGradient, and FusedBatchNorm. The Model Optimizer should replace the sub-graph of these operations with a single Inference Engine layer of type MVN. The file <INSTALL_DIR>/deployment_tools/model_optimizer/extensions/front/tf/mvn.py performs such a replacement. The first part of the file is:
The class MVN is inherited from the class FrontReplacementSubgraph, which performs sub-graph replacement using a sub-graph isomorphism pattern.
The class attribute enabled is set to True, meaning that this replacer is enabled.
The function pattern defines the sub-graph constraints to be matched. It returns a dictionary with four keys:
nodes defines a list of nodes to be matched. Each element in the list is a tuple. The first element is the alias name assigned to the matched node; the second element is a dictionary with the desired attributes of the node.
edges defines a list of edges to be matched. Each element in the list is a tuple. The first and second elements are the alias names of the start and end nodes of the edge, respectively. The third element is a dictionary with the desired edge attributes.
node_attrs contains the names of node attributes to use during the sub-graph isomorphism search.
edge_attrs contains the names of edge attributes to use during the sub-graph isomorphism search.
The sub-graph is matched if all provided constraints are satisfied. If at least one node with desired attributes is missing or at least one defined edge is absent, the sub-graph is not matched.
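To make the pattern semantics concrete, here is a minimal brute-force matcher in plain Python. It is only a sketch under simplifying assumptions: it is not the networkx-based search that the Model Optimizer uses, the graph is a pair of plain dictionaries, and find_matches is a hypothetical helper. The four pattern keys follow the description above.

```python
from itertools import permutations


def find_matches(graph_nodes, graph_edges, pattern):
    """Return every alias -> node-name mapping that satisfies the pattern.

    graph_nodes: dict of node name -> attribute dict
    graph_edges: dict of (src name, dst name) -> attribute dict
    """
    aliases = [alias for alias, _ in pattern["nodes"]]
    wanted = dict(pattern["nodes"])
    matches = []
    for candidate in permutations(graph_nodes, len(aliases)):
        mapping = dict(zip(aliases, candidate))
        # Every listed node attribute must have the desired value.
        nodes_ok = all(
            graph_nodes[mapping[alias]].get(key) == wanted[alias][key]
            for alias in aliases
            for key in pattern["node_attrs"]
            if key in wanted[alias]
        )
        if not nodes_ok:
            continue
        # Every pattern edge must exist with the desired attributes.
        edges_ok = all(
            (mapping[src], mapping[dst]) in graph_edges
            and all(
                graph_edges[(mapping[src], mapping[dst])].get(key) == attrs[key]
                for key in pattern["edge_attrs"]
                if key in attrs
            )
            for src, dst, attrs in pattern["edges"]
        )
        if edges_ok:
            matches.append(mapping)
    return matches


# Illustrative graph: node names "x", "m", "sg" are made up for this sketch.
graph_nodes = {"x": {"op": "Placeholder"}, "m": {"op": "Mean"}, "sg": {"op": "StopGradient"}}
graph_edges = {("x", "m"): {"in": 0}, ("m", "sg"): {"in": 0}}
pattern = {
    "nodes": [("mean", {"op": "Mean"}), ("stop_grad", {"op": "StopGradient"})],
    "edges": [("mean", "stop_grad", {"in": 0})],
    "node_attrs": ["op"],
    "edge_attrs": ["in"],
}
assert find_matches(graph_nodes, graph_edges, pattern) == [{"mean": "m", "stop_grad": "sg"}]
```

The brute-force search is exponential in the pattern size and is meant only to show which constraints a match must satisfy.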
One constraint in the pattern requires that the sub-graph contain a node whose attribute op has the value Mean. The matched node gets the alias name mean. In the same way, line 10 adds a constraint for the node StopGradient; the matched node gets the alias name stop_grad. An edge is defined from the node with the alias name mean to the node with the alias name stop_grad, with the attribute in equal to 0. This means that the output of the node mean is connected to the node stop_grad as its first input (the Model Optimizer uses zero-based indexing, which is why in is 0). Another example of defining edge constraints is in line 25, where the edge from squeeze_mean is connected to the fbn node as its fourth input.
Now that the Model Optimizer knows how to find the sub-graph (step 1 of the sub-graph replacement), it is necessary to implement the function that performs the actual sub-graph replacement (steps 2 and 3). The code for this function is:
The function accepts two arguments: the graph and the dictionary match. The keys in the dictionary are the alias names of matched nodes (defined in the nodes list in the function pattern), and the values are the matched nodes of the graph (instances of the Node object).
The function generates a new sub-graph with a node of type MVN and two nodes of type Eltwise that calculate the sum and the product. How the graph is generated and the mathematics behind it are straightforward, so the discussion focuses on two aspects of this function.
The first one is the call to the function replace_node in line 36. The FusedBatchNorm node is replaced with the output node of the generated sub-graph: all input edges of the FusedBatchNorm node are re-connected to the new_subgraph node, and all consumers of the FusedBatchNorm node are updated to get inputs from the new_subgraph node. This action connects the newly generated sub-graph with the existing graph (step 4 of the sub-graph replacement).
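The effect of such a node replacement can be sketched on a toy edge list. This is not the Model Optimizer replace_node function: the helper below is a hypothetical simplification that stores edges as (source, destination) pairs and ignores ports and attributes.

```python
def replace_node_in_edges(edges, old, new):
    """Return a new edge list in which every edge touching `old` points to `new`.

    Input edges (src, old) become (src, new); output edges (old, dst) become (new, dst).
    """
    return [
        (new if src == old else src, new if dst == old else dst)
        for src, dst in edges
    ]


# "fbn" and "squeeze_mean" follow the aliases in the text; "relu" is a made-up consumer.
edges = [("squeeze_mean", "fbn"), ("fbn", "relu")]
rewired = replace_node_in_edges(edges, "fbn", "new_subgraph")
assert rewired == [("squeeze_mean", "new_subgraph"), ("new_subgraph", "relu")]
```

After the rewiring, the old node has no edges left, which is what later allows it to be removed as dead code.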
The second one is that the default implementation of the inference function for the MVN operation is overwritten. In line 16, the default implementation of the inference function for MVN is saved to the attribute old_infer. In line 17, the new inference function is saved to the instance of the MVN operation class. The new inference function code looks as follows:
The infer function is needed to infer the value of the node (if possible) and to infer the shapes of the output tensors of the node (mandatory). The custom infer function performs additional checks that describe limitations of the MVN layer implementation in the Inference Engine. For example, reduction indices for mean and variance must be constants (line 10), while in TensorFlow they could be computed during model inference. In addition, the function removes two edges from the graph (lines 17 and 18) because all required information is already stored in the MVN node attributes. This is due to the different MVN layer implementations in Inference Engine and TensorFlow*: mean and variance are attributes of the node in Inference Engine, while in TensorFlow they are input tensors. The edges are not removed in the replace_sub_graph function because they are used in the infer function (lines 7-12).
The last action in the infer method (line 19) is to call the default infer function for MVN, which is saved in the attribute old_infer of the node, to infer the output tensor shapes.
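The save-and-wrap pattern for the inference function can be sketched independently of the Model Optimizer classes. In the sketch below, ToyNode, default_infer, custom_infer, and the attributes in_shape, out_shape, and reduction_indices are hypothetical; only the attribute name old_infer comes from the description above.

```python
class ToyNode:
    """Hypothetical node object that stores its attributes as plain fields."""

    def __init__(self, **attrs):
        self.__dict__.update(attrs)


def default_infer(node):
    """Assumed default behaviour: the output shape equals the input shape."""
    node.out_shape = node.in_shape


def custom_infer(node):
    """Perform extra checks, then delegate to the saved default function."""
    if node.reduction_indices is None:
        raise ValueError("reduction indices must be constant")
    node.old_infer(node)


mvn = ToyNode(in_shape=(1, 3, 8, 8), reduction_indices=(1, 2), infer=default_infer)
mvn.old_infer = mvn.infer   # save the default inference function
mvn.infer = custom_infer    # install the custom one
mvn.infer(mvn)
assert mvn.out_shape == (1, 3, 8, 8)
```

Keeping the default function in an attribute lets the custom function add checks without duplicating the shape calculation.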
In step 5 of the sub-graph replacement, the six matched nodes are removed automatically during the dead code elimination pass, which is performed after the custom sub-graph replacements are applied. The six matched nodes are no longer connected to the inputs of the network after the node fbn is replaced with the newly created sub-graph node. Since they are not marked as output nodes (using the --output command line parameter), they can be removed.
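Dead code elimination can be sketched as a reachability check. The following is a simplified illustration, not the Model Optimizer pass: it assumes that a node is kept only if some network output depends on it, and eliminate_dead_nodes is a hypothetical helper working on a plain edge list.

```python
def eliminate_dead_nodes(nodes, edges, outputs):
    """Keep only the nodes that some output node depends on (directly or not)."""
    producers = {}
    for src, dst in edges:
        producers.setdefault(dst, []).append(src)

    alive = set()
    stack = list(outputs)
    while stack:
        node = stack.pop()
        if node in alive:
            continue
        alive.add(node)
        # Everything that feeds a live node is live as well.
        stack.extend(producers.get(node, []))
    return [node for node in nodes if node in alive]


# Illustrative graph: "fbn" and "squeeze_mean" lost their link to the output.
nodes = ["input", "squeeze_mean", "fbn", "new_subgraph", "result"]
edges = [("input", "squeeze_mean"), ("input", "new_subgraph"), ("new_subgraph", "result")]
assert eliminate_dead_nodes(nodes, edges, ["result"]) == ["input", "new_subgraph", "result"]
```

Marking a node as an output adds it to the starting set of the traversal, which is why such nodes survive the pass.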
The replacement works for all sub-graph isomorphism instances found in the network.
TensorFlow uses the scope mechanism to group related operation nodes. It is good practice to put nodes performing a particular task into a scope. This approach divides a graph into logical blocks that are easier to review in TensorBoard*. In fact, a scope just defines a common prefix for the names of the nodes in it.
For example, Inception topologies contain several types of so-called "Inception blocks". Some of them are identical to each other but located in different places of the network. For example, Inception V4 from the tensorflow.contrib.slim module has Inception blocks, such as Mixed_5b and Mixed_5d, with exactly the same nodes and the same attributes.
Now consider a situation in which someone has implemented these Inception blocks extremely efficiently as a single Inference Engine custom layer called InceptionBlock and would like to replace the blocks with instances of this layer to decrease inference time. The Model Optimizer provides a mechanism to replace a sub-graph of operations defined by regular expressions for the node name prefixes (scope). In this particular case, the patterns include .*InceptionV4/Mixed_5d. Each pattern starts with .*, because a prefix InceptionV4 is added to all node names during a model freeze.
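Selecting nodes by a scope pattern amounts to a regular-expression match on node names. The sketch below uses the standard re module; nodes_in_scope is a hypothetical helper, the node names are illustrative, and appending "/" to the pattern so that it matches a whole scope component is an assumption of the sketch, not documented Model Optimizer behaviour.

```python
import re


def nodes_in_scope(node_names, scope_pattern):
    """Return the node names whose prefix matches the scope pattern."""
    regex = re.compile(scope_pattern + "/")
    # re.match anchors at the start of the name; the leading ".*" in the
    # pattern absorbs any additional prefix.
    return [name for name in node_names if regex.match(name)]


node_names = [
    "InceptionV4/Mixed_5d/Branch_0/Conv2d_0a_1x1/Conv2D",
    "import/InceptionV4/Mixed_5d/concat",
    "InceptionV4/Mixed_5a/concat",
]
selected = nodes_in_scope(node_names, ".*InceptionV4/Mixed_5d")
assert selected == node_names[:2]
```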
Sub-graph replacement using a node name pattern is a bit trickier than the replacement of a single operation or the networkx isomorphism pattern described above. Compared with the previously described replacements, you need to perform the following additional steps:
Consider the following possible configuration file for the Inception Block replacer:
The .json file contains a list of dictionaries. Each dictionary defines one replacement. Each replacement is defined with several keys:
id (mandatory) is a unique identifier of the replacer. It is used in the Python* code that implements the sub-graph replacement to link the class and the replacement description from the configuration file.
match_kind (mandatory) is a string that specifies which matching algorithm is used. Currently supported values are scope and points. In this example, the first one is considered. The points match kind is described below.
instances (mandatory) specifies instances of the sub-graph to be matched. It contains a list of node name prefix patterns for the match kind scope.
custom_attributes (optional) is a dictionary with static attributes of the layer to be dumped to the Inference Engine Intermediate Representation.
op (optional) is used only if the sub-graph replacement Python code is not needed, because the sub-graph should be replaced with a single node of type op. If this attribute is not set, it is necessary to implement Python code with the sub-graph generation code. Both options are considered in this example.
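The keys described above can be assembled into a configuration with the standard json module. This is an illustrative sketch, not the original configuration file: the custom_attributes content is made up, and check_replacement is a hypothetical helper that only verifies that the mandatory keys are present.

```python
import json

MANDATORY_KEYS = ("id", "match_kind", "instances")


def check_replacement(replacement):
    """Return the list of mandatory keys missing from a replacement description."""
    return [key for key in MANDATORY_KEYS if key not in replacement]


replacement = {
    "id": "InceptionBlockReplacer",
    "match_kind": "scope",
    "instances": [".*InceptionV4/Mixed_5d"],
    # Hypothetical static attribute; real attributes depend on the custom layer.
    "custom_attributes": {"attr1_key": "attr1_value"},
    "op": "InceptionBlock",
}

assert check_replacement(replacement) == []
config_text = json.dumps([replacement], indent=4)  # the file holds a list of dictionaries
assert json.loads(config_text)[0]["match_kind"] == "scope"
```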
When the configuration file is ready, run the Model Optimizer with the regular command line parameters pointing to the model file and input shapes (if necessary), plus the additional parameter --tensorflow_custom_operations_config_update pointing to the generated configuration file. If the file is correct, the Model Optimizer adds two keys to the replacement description, inputs and outputs, with the following content:
The value for the key inputs is a list of lists describing the input tensors of the sub-graph. Each element of the top-level list corresponds to one unique input tensor of the sub-graph. Each internal list describes the nodes consuming this tensor and the port numbers where the tensor is consumed. The Model Optimizer generates regular expressions for the input node names to uniquely identify them in each instance of the sub-graph defined by instances. These nodes are referred to as input nodes of the sub-graph.
In the InceptionV4 topology, the InceptionV4/Mixed_5b block has four input tensors from outside of the sub-graph, but all of them are produced by the node InceptionV4/Mixed_5a/concat. Therefore, the top-level list of inputs contains one list corresponding to this tensor. Four input nodes of the sub-graph consume the tensor produced by the InceptionV4/Mixed_5a/concat node. In this case, all four input nodes consume the input tensor at port 0.
The order of items in the internal list describing nodes does not matter, but the order of elements in the top-level list is important. This order defines the order in which the Model Optimizer attaches input tensors to a newly generated node if the sub-graph is replaced with a single node. The i-th input node of the sub-graph is obtained using the call match.single_input_node(i) in the sub-graph replacer code. More information about the API is given below. If you need to change the order of input tensors, you can edit the configuration file in a text editor.
The value for the key outputs is a list describing the nodes of the sub-graph that produce a tensor going outside of the sub-graph or that do not have child nodes. These nodes are referred to as output nodes of the sub-graph. The order of elements in the list is important. The i-th element of the list describes the i-th output tensor of the sub-graph, which can be obtained using the call match.output_node(i). The order of elements can be changed manually in the configuration file. The Model Optimizer uses this order to connect output edges if the sub-graph is replaced with a single node.
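How such inputs and outputs lists could be derived from a graph is sketched below. This is a simplified illustration, not the Model Optimizer algorithm: describe_sub_graph is a hypothetical helper, the dictionary keys node and port are assumptions about the generated format, each producer is assumed to have a single output tensor, nodes without children are not handled, and plain node names are emitted instead of regular expressions.

```python
import re


def describe_sub_graph(edges, scope_pattern):
    """Compute (inputs, outputs) for the sub-graph selected by a scope pattern.

    edges: list of (producer name, consumer name, consumer input port) triples
    """
    inside = re.compile(scope_pattern + "/").match
    inputs = {}    # producer outside the scope -> consumers inside the scope
    outputs = []   # nodes inside the scope feeding nodes outside it
    for producer, consumer, port in edges:
        if not inside(producer) and inside(consumer):
            inputs.setdefault(producer, []).append({"node": consumer, "port": port})
        elif inside(producer) and not inside(consumer) and producer not in outputs:
            outputs.append(producer)
    # One internal list per unique input tensor, in first-seen order.
    return list(inputs.values()), outputs


# Node names inside Mixed_5b and Mixed_5c are illustrative.
edges = [
    ("InceptionV4/Mixed_5a/concat", "InceptionV4/Mixed_5b/Branch_0/Conv2D", 0),
    ("InceptionV4/Mixed_5a/concat", "InceptionV4/Mixed_5b/Branch_1/Conv2D", 0),
    ("InceptionV4/Mixed_5b/Branch_0/Conv2D", "InceptionV4/Mixed_5b/concat", 0),
    ("InceptionV4/Mixed_5b/Branch_1/Conv2D", "InceptionV4/Mixed_5b/concat", 1),
    ("InceptionV4/Mixed_5b/concat", "InceptionV4/Mixed_5c/Branch_0/Conv2D", 0),
]
inputs, outputs = describe_sub_graph(edges, ".*InceptionV4/Mixed_5b")
assert len(inputs) == 1 and len(inputs[0]) == 2
assert outputs == ["InceptionV4/Mixed_5b/concat"]
```

In this toy graph both input nodes consume the same outside tensor at port 0, so the top-level inputs list has a single element, mirroring the Mixed_5b case described above.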
Now that the meaning of the inputs and outputs attributes is clear, return to the replacer implementation. The replacer InceptionBlockReplacer contains the attribute op with the value InceptionBlock, which means that the identified sub-graph should be replaced with a single layer of type InceptionBlock. This layer is not known to the Model Optimizer, so it is necessary to define it. See Extending the Model Optimizer with New Primitives. You must create the file extension/ops/InceptionBlock.py with the following content:
The shape inference function is not defined. In this case, Model Optimizer uses TensorFlow fallback to calculate shapes of the sub-graph output tensors.
Run the Model Optimizer with the regular command line parameters, the path to the model file and the input shape (if necessary), and the parameter --tensorflow_use_custom_operations_config pointing to the created configuration file. The Model Optimizer generates an Intermediate Representation .xml file with three sequential layers of type InceptionBlock, as in the following example:
The implementation of the sub-graph replacement by scope with a single layer is complete. The next subsection explains how the Model Optimizer replaces a sub-graph identified by start/end nodes (points) with another sub-graph.
In this scenario, the user defines the sub-graph for the matching algorithm via a set of "start" and "end" nodes. Given the set, the Model Optimizer performs the following steps:
This algorithm finds all nodes "between" the start and end nodes. It also adds the nodes needed to calculate non-input nodes of the matched sub-graph; these nodes produce constant values because they do not depend on the input of the network. This sub-graph match has a limitation: each start node must have only one input. Therefore, it is not possible to specify, for example, a convolution node as an input (start) node, because it has two inputs: a data tensor and a tensor with weights.
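The idea of matching "between" start and end nodes can be sketched with two graph traversals. This is an assumption-based illustration rather than the Model Optimizer algorithm: match_by_points is a hypothetical helper, the graph is a plain list of (source, destination) edges, and no validity checks are performed on the result.

```python
def match_by_points(edges, start_nodes, end_nodes):
    """Return the set of nodes between the start and end nodes, plus their constant feeders."""
    consumers, producers = {}, {}
    for src, dst in edges:
        consumers.setdefault(src, []).append(dst)
        producers.setdefault(dst, []).append(src)

    # Forward pass: follow edges from the start nodes and stop at the end nodes.
    matched = set()
    stack = list(start_nodes)
    while stack:
        node = stack.pop()
        if node in matched:
            continue
        matched.add(node)
        if node not in end_nodes:
            stack.extend(consumers.get(node, []))

    # Backward pass: from every non-start node, pull in the nodes it still
    # needs (for example, constants) that the forward pass did not visit.
    stack = [node for node in matched if node not in start_nodes]
    while stack:
        node = stack.pop()
        for producer in producers.get(node, []):
            if producer not in matched:
                matched.add(producer)
                stack.append(producer)
    return matched


# Illustrative graph: all node names are made up.
edges = [("input", "start"), ("start", "mul"), ("const", "mul"), ("mul", "end"), ("end", "next")]
assert match_by_points(edges, {"start"}, {"end"}) == {"start", "mul", "const", "end"}
```

The backward pass is not run from the start nodes, so the producer of the start node ("input" here) stays outside the matched sub-graph.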
For an example of replacement with points, refer to the case study of converting SSD models created with the TensorFlow Object Detection API.