LSTMCell
Versioned name: LSTMCell-1
Category: Sequence processing
Short description: LSTMCell operation represents a single LSTM cell. It computes the output using the formula described in the original paper Long Short-Term Memory.
Detailed description: LSTMCell computes the hidden state Ht and the cell state Ct for the current time step using the following formulas:
Formula:
  *  - matrix multiplication
 (.) - Hadamard product (element-wise)
 [,] - concatenation
 f, g, h - activation functions
     it = f(Xt*(Wi^T) + Ht-1*(Ri^T) + Wbi + Rbi)
     ft = f(Xt*(Wf^T) + Ht-1*(Rf^T) + Wbf + Rbf)
     ct = g(Xt*(Wc^T) + Ht-1*(Rc^T) + Wbc + Rbc)
     Ct = ft (.) Ct-1 + it (.) ct
     ot = f(Xt*(Wo^T) + Ht-1*(Ro^T) + Wbo + Rbo)
     Ht = ot (.) h(Ct)
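The formulas above can be transcribed almost line by line into NumPy. The following is a minimal sketch, not the reference implementation. The function name lstm_cell_step and the per-gate dictionaries W, R, and B (keyed by "i", "f", "c", "o") are illustrative assumptions; B[gate] stands for the summed bias Wb + Rb of that gate, and clipping is left out.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, the default choice for activation f."""
    return 1.0 / (1.0 + np.exp(-x))


def lstm_cell_step(X, H_prev, C_prev, W, R, B, f=sigmoid, g=np.tanh, h=np.tanh):
    """Compute one LSTM step.

    X:      [batch_size, input_size]
    H_prev: [batch_size, hidden_size]  (Ht-1)
    C_prev: [batch_size, hidden_size]  (Ct-1)
    W[k]:   [hidden_size, input_size]  for k in "ifco" (hypothetical layout)
    R[k]:   [hidden_size, hidden_size] for k in "ifco" (hypothetical layout)
    B[k]:   [hidden_size], the sum Wbk + Rbk
    """
    def pre(gate):
        # Xt*(W^T) + Ht-1*(R^T) + Wb + Rb for one gate
        return X @ W[gate].T + H_prev @ R[gate].T + B[gate]

    i_t = f(pre("i"))                  # input gate
    f_t = f(pre("f"))                  # forget gate
    c_t = g(pre("c"))                  # candidate cell value
    C_t = f_t * C_prev + i_t * c_t     # Hadamard products
    o_t = f(pre("o"))                  # output gate
    H_t = o_t * h(C_t)
    return H_t, C_t
```

With all weights and biases set to zero, every gate evaluates to f(0) = 0.5 and the candidate to g(0) = 0, so Ct is half of Ct-1, which makes a convenient sanity check.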
Attributes
- hidden_size
  - Description: hidden_size specifies the hidden state size.
  - Range of values: a positive integer
  - Type: int
  - Required: yes
 
- activations
  - Description: activations specifies the activation functions used in the formulas. Three functions are required, one each for f, g, and h.
  - Range of values: any combination of relu, sigmoid, tanh
  - Type: a list of strings
  - Default value: sigmoid for f, tanh for g, tanh for h
  - Required: no
 
- activations_alpha, activations_beta
  - Description: activations_alpha and activations_beta are parameters of the activation functions; whether they apply, and what they mean, depends on the chosen activation functions.
  - Range of values: a list of floating-point numbers
  - Type: float[]
  - Default value: None
  - Required: no
 
- clip
  - Description: clip specifies the bound C of the interval [-C, C] used for tensor clipping. Clipping is performed before the activations are applied.
  - Range of values: a positive floating-point number
  - Type: float
  - Default value: infinity, which means that clipping is not applied
  - Required: no
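The interplay of the activations and clip attributes can be sketched as follows. This is an illustration under assumptions, not the actual implementation: the helper apply_gate and the ACTIVATIONS table are hypothetical names, the sketch assumes that clipping is applied to the value passed into an activation, and activations_alpha and activations_beta are not modeled.

```python
import numpy as np

# Hypothetical lookup table for the three supported activation names.
ACTIVATIONS = {
    "relu": lambda x: np.maximum(x, 0.0),
    "sigmoid": lambda x: 1.0 / (1.0 + np.exp(-x)),
    "tanh": np.tanh,
}

# Defaults from the attribute description: sigmoid for f, tanh for g, tanh for h.
DEFAULT_ACTIVATIONS = ("sigmoid", "tanh", "tanh")


def apply_gate(pre_activation, name, clip=np.inf):
    """Clip the pre-activation to [-clip, clip], then apply the named activation."""
    if clip <= 0:
        raise ValueError("clip must be a positive number")
    clipped = np.clip(pre_activation, -clip, clip)
    return ACTIVATIONS[name](clipped)


# With clip=1.0, inputs outside [-1, 1] saturate before tanh is applied.
print(apply_gate(np.array([-5.0, 0.0, 5.0]), "tanh", clip=1.0))
```

With the default clip of infinity, np.clip leaves the values unchanged, which matches the statement that clipping is not applied.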
 
Inputs
- 1: X - 2D tensor of type T and shape [batch_size, input_size], input data. Required.
- 2: initial_hidden_state - 2D tensor of type T and shape [batch_size, hidden_size]. Required.
- 3: initial_cell_state - 2D tensor of type T and shape [batch_size, hidden_size]. Required.
- 4: W - 2D tensor of type T and shape [4 * hidden_size, input_size], the weights for matrix multiplication, gate order: fico. Required.
- 5: R - 2D tensor of type T and shape [4 * hidden_size, hidden_size], the recurrence weights for matrix multiplication, gate order: fico. Required.
- 6: B - 1D tensor of type T and shape [4 * hidden_size], the sum of biases (for weights and recurrence weights). If not specified, it is assumed to be 0. Optional.
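The packed W, R, and B inputs hold all four gates along their first axis. A minimal sketch of how they might be separated is shown below. It assumes that the gates occupy contiguous, equally sized blocks in the stated fico order; split_gates and bias_or_zeros are hypothetical helper names, not part of any API.

```python
import numpy as np

GATE_ORDER = "fico"  # gate order stated for W and R; assumed here to apply to B as well


def split_gates(packed, hidden_size):
    """Split a tensor whose first axis is 4 * hidden_size into per-gate blocks."""
    if packed.shape[0] != 4 * hidden_size:
        raise ValueError(
            f"first dimension must be {4 * hidden_size}, got {packed.shape[0]}"
        )
    blocks = np.split(packed, 4, axis=0)
    return dict(zip(GATE_ORDER, blocks))


def bias_or_zeros(B, hidden_size, dtype=np.float32):
    """Return B, or a zero bias when the optional input is not provided."""
    if B is None:
        return np.zeros(4 * hidden_size, dtype=dtype)
    return B


hidden_size, input_size = 2, 3
W = np.arange(4 * hidden_size * input_size, dtype=np.float32).reshape(
    4 * hidden_size, input_size
)
gates = split_gates(W, hidden_size)
print({name: block.shape for name, block in gates.items()})
```

Each resulting block has shape [hidden_size, input_size] for W, [hidden_size, hidden_size] for R, and [hidden_size] for B.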
Outputs
- 1: Ho - 2D tensor of type T and shape [batch_size, hidden_size], the last output value of the hidden state.
- 2: Co - 2D tensor of type T and shape [batch_size, hidden_size], the last output value of the cell state.
Types
- T: any supported floating-point type. 
Example
<layer ... type="LSTMCell" ...>
    <data hidden_size="128"/>
    <input>
        <port id="0">
            <dim>1</dim>
            <dim>16</dim>
        </port>
        <port id="1">
            <dim>1</dim>
            <dim>128</dim>
        </port>
        <port id="2">
            <dim>1</dim>
            <dim>128</dim>
        </port>
        <port id="3">
            <dim>512</dim>
            <dim>16</dim>
        </port>
        <port id="4">
            <dim>512</dim>
            <dim>128</dim>
        </port>
        <port id="5">
            <dim>512</dim>
        </port>
    </input>
    <output>
        <port id="6">
            <dim>1</dim>
            <dim>128</dim>
        </port>
        <port id="7">
            <dim>1</dim>
            <dim>128</dim>
        </port>
    </output>
</layer>
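The port dimensions in the example follow from batch_size = 1, input_size = 16, and hidden_size = 128. The short sketch below derives them from the shape rules in the Inputs and Outputs sections; lstm_cell_shapes is a hypothetical helper written for illustration only.

```python
def lstm_cell_shapes(batch_size, input_size, hidden_size):
    """Return the expected shape of every LSTMCell input and output, in port order."""
    return {
        "X": (batch_size, input_size),
        "initial_hidden_state": (batch_size, hidden_size),
        "initial_cell_state": (batch_size, hidden_size),
        "W": (4 * hidden_size, input_size),
        "R": (4 * hidden_size, hidden_size),
        "B": (4 * hidden_size,),
        "Ho": (batch_size, hidden_size),
        "Co": (batch_size, hidden_size),
    }


# Dimensions copied from ports 0-7 of the example layer.
example_ports = [
    (1, 16), (1, 128), (1, 128), (512, 16), (512, 128), (512,), (1, 128), (1, 128),
]

shapes = lstm_cell_shapes(batch_size=1, input_size=16, hidden_size=128)
for (name, shape), port_dims in zip(shapes.items(), example_ports):
    assert shape == port_dims, name
    print(name, shape)
```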