Versioned name: MatMul-1
Category: Matrix multiplication
Short description: Generalized matrix multiplication
Detailed description
The MatMul operation takes two tensors and performs matrix-matrix, matrix-vector, or vector-matrix multiplication, depending on the argument shapes. Input tensors can have any rank >= 1. The two right-most axes of each tensor are interpreted as the matrix row and column dimensions, while all axes to their left (if present) are interpreted as a multi-dimensional batch: [BATCH_DIM_1, BATCH_DIM_2,..., BATCH_DIM_K, ROW_INDEX_DIM, COL_INDEX_DIM]. The operation supports the usual broadcast semantics for batch dimensions, which makes it possible to multiply a batch of matrix pairs in a single operation.
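To illustrate the layout described above, here is a minimal Python sketch that splits a shape into batch, row, and column dimensions and broadcasts two batch shapes. The helper names split_matrix_shape and broadcast_batch are hypothetical, and the broadcasting is assumed to follow NumPy-style right-aligned rules. Neither is part of the operation specification.

```python
from itertools import zip_longest


def split_matrix_shape(shape):
    """Split a shape of rank >= 2 into (batch_dims, rows, cols)."""
    if len(shape) < 2:
        raise ValueError(f"expected rank >= 2, got shape {list(shape)}")
    *batch, rows, cols = shape
    return batch, rows, cols


def broadcast_batch(batch_a, batch_b):
    """Broadcast two batch shapes, aligned from the right.

    Assumption: NumPy-style rules, where a dimension of size 1 stretches
    to match the other operand and missing leading dimensions count as 1.
    """
    result = []
    pairs = zip_longest(reversed(batch_a), reversed(batch_b), fillvalue=1)
    for da, db in pairs:
        if da != db and 1 not in (da, db):
            raise ValueError(f"batch dimensions {da} and {db} cannot be broadcast")
        result.append(db if da == 1 else da)
    return result[::-1]


# Example: A is a [5, 1] batch of 10x1024 matrices, B is a [7] batch of 1024x1000 matrices.
batch_a, rows_a, cols_a = split_matrix_shape([5, 1, 10, 1024])
batch_b, rows_b, cols_b = split_matrix_shape([7, 1024, 1000])
print(broadcast_batch(batch_a, batch_b) + [rows_a, cols_b])  # [5, 7, 10, 1000]
```

Aligning from the right means a tensor with fewer batch dimensions behaves as if it had leading dimensions of size 1.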
Before matrix multiplication, there is an implicit shape alignment for input arguments. It consists of the following steps:
1. Transpositions specified by the optional transpose_a and transpose_b attributes are applied. Only the two right-most dimensions are transposed; other dimensions remain the same. Transpose attributes are ignored for 1D tensors.
2. One-dimensional tensors are unsqueezed, for each input independently. The axes inserted in this step are not included in the output shape.
   - If the rank of the first input is equal to 1, it is always unsqueezed to a 2D row vector (regardless of transpose_a) by adding an axis of size 1 at ROW_INDEX_DIM, to the left of the shape. For example, [S] will be reshaped to [1, S].
   - If the rank of the second input is equal to 1, it is always unsqueezed to a 2D column vector (regardless of transpose_b) by adding an axis of size 1 at COL_INDEX_DIM, to the right of the shape. For example, [S] will be reshaped to [S, 1].
3. If the ranks of the input arguments differ after steps 1 and 2, the tensor with the smaller rank is unsqueezed from the left side of its shape by as many axes as necessary to give both shapes the same rank.
4. The usual broadcasting rules are applied to the batch dimensions.
Temporary axes inserted in step 2 are removed from the final output shape after multiplying. After vector-matrix multiplication, the temporary axis inserted at ROW_INDEX_DIM is removed. After matrix-vector multiplication, the temporary axis inserted at COL_INDEX_DIM is removed. The output shape of the multiplication of two 1D tensors, [S] x [S], is squeezed to a scalar.
Output shape inference logic examples (ND here means bigger than 1D):
- 1D x 1D: [X] x [X] -> [1, X] x [X, 1] -> [1, 1] => [] (scalar)
- 1D x ND: [X] x [B, ..., X, Y] -> [1, X] x [B, ..., X, Y] -> [B, ..., 1, Y] => [B, ..., Y]
- ND x 1D: [B, ..., X, Y] x [Y] -> [B, ..., X, Y] x [Y, 1] -> [B, ..., X, 1] => [B, ..., X]
- ND x ND: [B, ..., X, Y] x [B, ..., Y, Z] => [B, ..., X, Z]
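The alignment steps and the shape examples above can be sketched as a small shape-inference function. This is a minimal illustration in plain Python, not the reference implementation. The function name matmul_output_shape is hypothetical, and NumPy-style broadcasting is assumed for the batch dimensions.

```python
def matmul_output_shape(a_shape, b_shape, transpose_a=False, transpose_b=False):
    """Infer the MatMul output shape from two input shapes (lists of ints)."""
    a, b = list(a_shape), list(b_shape)
    if not a or not b:
        raise ValueError("both inputs must have rank >= 1")
    a_was_1d, b_was_1d = len(a) == 1, len(b) == 1

    # Step 1: swap the two right-most dimensions; ignored for 1D inputs.
    if transpose_a and not a_was_1d:
        a[-2], a[-1] = a[-1], a[-2]
    if transpose_b and not b_was_1d:
        b[-2], b[-1] = b[-1], b[-2]

    # Step 2: unsqueeze 1D inputs to a row vector (first) or column vector (second).
    if a_was_1d:
        a = [1] + a
    if b_was_1d:
        b = b + [1]

    # Step 3: pad the lower-rank shape with leading 1s.
    rank = max(len(a), len(b))
    a = [1] * (rank - len(a)) + a
    b = [1] * (rank - len(b)) + b

    if a[-1] != b[-2]:
        raise ValueError(f"inner dimensions do not match: {a[-1]} vs {b[-2]}")

    # Step 4: broadcast batch dimensions (assumed NumPy-style).
    batch = []
    for da, db in zip(a[:-2], b[:-2]):
        if da != db and 1 not in (da, db):
            raise ValueError(f"batch dimensions {da} and {db} cannot be broadcast")
        batch.append(db if da == 1 else da)

    out = batch + [a[-2], b[-1]]

    # Remove the temporary axes inserted in step 2.
    if a_was_1d:
        del out[-2]  # temporary ROW_INDEX_DIM axis
    if b_was_1d:
        del out[-1]  # temporary COL_INDEX_DIM axis
    return out


print(matmul_output_shape([3], [3]))                      # []
print(matmul_output_shape([1024], [1024, 1000]))          # [1000]
print(matmul_output_shape([1000, 1024], [1024]))          # [1000]
print(matmul_output_shape([5, 10, 1024], [1024, 1000]))   # [5, 10, 1000]
```

The row axis is removed before the column axis so that the 1D x 1D case, where both temporary axes are present, reduces [1, 1] to an empty shape.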
Two attributes, transpose_a and transpose_b, specify embedded transposition of the two right-most dimensions of the first and the second input tensor, respectively. This means swapping ROW_INDEX_DIM and COL_INDEX_DIM in the corresponding input tensor. Batch dimensions and 1D tensors are not affected by these attributes.
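The effect of the two attributes on the computed values can be shown with a small plain-Python sketch for the 2D case. The helpers transpose and matmul_2d are hypothetical names used only for illustration. The sketch operates on nested lists and does not cover batch dimensions or 1D inputs.

```python
def transpose(m):
    """Swap rows and columns of a 2D nested list."""
    return [list(row) for row in zip(*m)]


def matmul_2d(a, b, transpose_a=False, transpose_b=False):
    """Multiply two 2D nested lists, optionally transposing each input first."""
    if transpose_a:
        a = transpose(a)
    if transpose_b:
        b = transpose(b)
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    # Each output element is the dot product of a row of a and a column of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


a = [[1, 2, 3], [4, 5, 6]]        # shape [2, 3]
b = [[7, 8], [9, 10], [11, 12]]   # shape [3, 2]

plain = matmul_2d(a, b)
# Storing B transposed (shape [2, 3]) and setting transpose_b gives the same result.
with_flag = matmul_2d(a, transpose(b), transpose_b=True)
print(plain)               # [[58, 64], [139, 154]]
print(plain == with_flag)  # True
```

This mirrors the common case where weights are stored as [outputs, inputs] and transpose_b is set instead of physically transposing the data.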
Attributes
- transpose_a
  - Description: transposes dimensions ROW_INDEX_DIM and COL_INDEX_DIM of the 1st input; false means no transpose, true means transpose. It is ignored if the first input is a 1D tensor.
  - Range of values: false or true
  - Type: boolean
  - Default value: false
  - Required: no
- transpose_b
  - Description: transposes dimensions ROW_INDEX_DIM and COL_INDEX_DIM of the 2nd input; false means no transpose, true means transpose. It is ignored if the second input is a 1D tensor.
  - Range of values: false or true
  - Type: boolean
  - Default value: false
  - Required: no
Inputs
- 1: Input batch of matrices A. Rank >= 1. Required.
- 2: Input batch of matrices B. Rank >= 1. Required.
Outputs
- 1: Tensor with results of the multiplication.
Examples
Vector-matrix multiplication
<layer ... type="MatMul">
    <input>
        <port id="0">
            <dim>1024</dim>
        </port>
        <port id="1">
            <dim>1024</dim>
            <dim>1000</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>1000</dim>
        </port>
    </output>
</layer>
Matrix-vector multiplication
<layer ... type="MatMul">
    <input>
        <port id="0">
            <dim>1000</dim>
            <dim>1024</dim>
        </port>
        <port id="1">
            <dim>1024</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>1000</dim>
        </port>
    </output>
</layer>
Matrix-matrix multiplication (like FullyConnected with batch size 1)
<layer ... type="MatMul">
    <input>
        <port id="0">
            <dim>1</dim>
            <dim>1024</dim>
        </port>
        <port id="1">
            <dim>1024</dim>
            <dim>1000</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>1</dim>
            <dim>1000</dim>
        </port>
    </output>
</layer>
Vector-matrix multiplication with embedded transposition of the second matrix
<layer ... type="MatMul">
    <data transpose_b="true"/>
    <input>
        <port id="0">
            <dim>1024</dim>
        </port>
        <port id="1">
            <dim>1000</dim>
            <dim>1024</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>1000</dim>
        </port>
    </output>
</layer>
Matrix-matrix multiplication (like FullyConnected with batch size 10)
<layer ... type="MatMul">
    <input>
        <port id="0">
            <dim>10</dim>
            <dim>1024</dim>
        </port>
        <port id="1">
            <dim>1024</dim>
            <dim>1000</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>10</dim>
            <dim>1000</dim>
        </port>
    </output>
</layer>
Multiplication of a batch of 5 matrices by a single matrix with broadcasting
<layer ... type="MatMul">
    <input>
        <port id="0">
            <dim>5</dim>
            <dim>10</dim>
            <dim>1024</dim>
        </port>
        <port id="1">
            <dim>1024</dim>
            <dim>1000</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>5</dim>
            <dim>10</dim>
            <dim>1000</dim>
        </port>
    </output>
</layer>