BatchNormInference#
Versioned name: BatchNormInference-5
Category: Normalization
Short description: BatchNormInference performs the Batch Normalization operation described in the article Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.
Detailed Description
BatchNormInference performs the following operations on a given data batch input tensor data:

1. Normalizes each activation x^{(k)} by the mean and variance:

   \hat{x}^{(k)} = \frac{x^{(k)} - E[x^{(k)}]}{\sqrt{Var(x^{(k)}) + \epsilon}}

   where E[x^{(k)}] and Var(x^{(k)}) are the mean and variance, calculated per channel axis of the data input, and correspond to the mean and variance inputs, respectively. Additionally, \epsilon is a value added to the variance for numerical stability and corresponds to the epsilon attribute.

2. Performs a linear transformation of each normalized activation based on the gamma and beta inputs, representing the scaling factor and shift, respectively:

   \hat{y}^{(k)} = \gamma^{(k)} \hat{x}^{(k)} + \beta^{(k)}

   where \gamma^{(k)} and \beta^{(k)} are learnable parameters, calculated per channel axis, and correspond to the gamma and beta inputs.
Mathematical Formulation
Let x be a d-dimensional input, x = (x^{(1)}, \ldots, x^{(d)}). For a particular activation x^{(k)}, consider a mini-batch \mathcal{B} of m values.

Input: values of x over a mini-batch: \mathcal{B} = \{x_{1 \ldots m}\}
Parameters to learn: \gamma, \beta
Output: \{y_i = BN_{\gamma, \beta}(x_i)\}

Mini-batch mean:
   \mu_{\mathcal{B}} \leftarrow \frac{1}{m} \sum_{i=1}^{m} x_i
Mini-batch variance:
   \sigma_{\mathcal{B}}^{2} \leftarrow \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu_{\mathcal{B}})^{2}
Normalize:
   \hat{x}_i \leftarrow \frac{x_i - \mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^{2} + \epsilon}}
Scale and shift:
   y_i \leftarrow \gamma \hat{x}_i + \beta \equiv BN_{\gamma, \beta}(x_i)
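The normalize and scale-and-shift steps above can be sketched in NumPy. This is an illustrative reimplementation, not the OpenVINO API; note that at inference time the statistics come from the precomputed mean and variance inputs rather than being computed over the mini-batch:

```python
import numpy as np

def batch_norm_inference(data, gamma, beta, mean, variance, epsilon=1e-5):
    """Illustrative inference-mode Batch Normalization for a 2D (N, C) input.

    mean/variance are the precomputed per-channel statistics supplied as the
    mean and variance inputs; they are not derived from the current batch.
    """
    x_hat = (data - mean) / np.sqrt(variance + epsilon)  # normalize per channel
    return gamma * x_hat + beta                          # scale and shift

# 2 samples, 3 channels (the channel axis is the second dimension)
data = np.array([[1.0, 2.0, 3.0],
                 [3.0, 4.0, 5.0]])
mean = np.array([2.0, 3.0, 4.0])
variance = np.array([1.0, 1.0, 1.0])
gamma = np.array([1.0, 1.0, 2.0])
beta = np.array([0.0, 1.0, 0.0])

out = batch_norm_inference(data, gamma, beta, mean, variance, epsilon=0.0)
# With unit variance and epsilon=0, out is gamma * (data - mean) + beta.
```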
Attributes:
epsilon
Description: epsilon is a constant added to the variance for numerical stability.
Range of values: a floating-point number greater than or equal to zero
Type:
float
Required: yes
Inputs
1: data - A tensor of type T and at least rank 2. The second dimension represents the channel axis and must have a span of at least 1. Required.
2: gamma - Scaling factor for the normalized value. A 1D tensor of type T with the same span as the data channel axis. Required.
3: beta - Bias added to the scaled normalized value. A 1D tensor of type T with the same span as the data channel axis. Required.
4: mean - Value for mean normalization. A 1D tensor of type T with the same span as the data channel axis. Required.
5: variance - Value for variance normalization. A 1D tensor of type T with the same span as the data channel axis. Required.
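For inputs of rank above 2, the four 1D per-channel inputs must be applied along the channel axis (the second dimension of data). A minimal NumPy sketch of that broadcasting, with hypothetical names and not the OpenVINO API:

```python
import numpy as np

def batch_norm_inference_nd(data, gamma, beta, mean, variance, epsilon=1e-5):
    """Illustrative Batch Normalization for rank >= 2 input, channel axis 1.

    Each 1D per-channel input of length C is reshaped to (1, C, 1, ..., 1)
    so it broadcasts across the batch and spatial dimensions.
    """
    c = data.shape[1]
    shape = (1, c) + (1,) * (data.ndim - 2)  # e.g. (1, C, 1, 1) for NCHW
    g, b = gamma.reshape(shape), beta.reshape(shape)
    m, v = mean.reshape(shape), variance.reshape(shape)
    return g * (data - m) / np.sqrt(v + epsilon) + b

# NCHW input: 1 x 3 x 2 x 2, with per-channel parameters of length 3
data = np.arange(12, dtype=np.float64).reshape(1, 3, 2, 2)
gamma = np.ones(3)
beta = np.zeros(3)
mean = np.array([1.5, 5.5, 9.5])   # per-channel means of this data
variance = np.ones(3)
out = batch_norm_inference_nd(data, gamma, beta, mean, variance, epsilon=0.0)
# out has the same shape as data; each channel is centered by its own mean.
```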
Outputs
1: The result of the element-wise Batch Normalization operation applied to the input tensor data. A tensor of type T and the same shape as the data input tensor.
Types
T: any supported floating-point type.
Examples
Example: 2D input tensor data
<layer ... type="BatchNormInference" ...>
<data epsilon="9.99e-06" />
<input>
<port id="0"> <!-- input -->
<dim>10</dim>
<dim>128</dim>
</port>
<port id="1"> <!-- gamma -->
<dim>128</dim>
</port>
<port id="2"> <!-- beta -->
<dim>128</dim>
</port>
<port id="3"> <!-- mean -->
<dim>128</dim>
</port>
<port id="4"> <!-- variance -->
<dim>128</dim>
</port>
</input>
<output>
<port id="5">
<dim>10</dim>
<dim>128</dim>
</port>
</output>
</layer>
Example: 4D input tensor data
<layer ... type="BatchNormInference" ...>
<data epsilon="9.99e-06" />
<input>
<port id="0"> <!-- input -->
<dim>1</dim>
<dim>3</dim>
<dim>224</dim>
<dim>224</dim>
</port>
<port id="1"> <!-- gamma -->
<dim>3</dim>
</port>
<port id="2"> <!-- beta -->
<dim>3</dim>
</port>
<port id="3"> <!-- mean -->
<dim>3</dim>
</port>
<port id="4"> <!-- variance -->
<dim>3</dim>
</port>
</input>
<output>
<port id="5">
<dim>1</dim>
<dim>3</dim>
<dim>224</dim>
<dim>224</dim>
</port>
</output>
</layer>