Versioned name: LSTMCell-1
Category: Sequence processing
Short description: The LSTMCell operation represents a single LSTM cell. It computes the output using the formula described in the original paper, "Long Short-Term Memory".
Detailed description
Formula:
* - matrix multiplication
(.) - element-wise multiplication
[,] - concatenation
sigm - 1/(1 + e^{-x})
tanh - (e^{2x} - 1)/(e^{2x} + 1)
f = sigm(Wf*[Hi, X] + Bf)
i = sigm(Wi*[Hi, X] + Bi)
c = tanh(Wc*[Hi, X] + Bc)
o = sigm(Wo*[Hi, X] + Bo)
Co = f (.) Ci + i (.) c
Ho = o (.) tanh(Co)
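The formula above can be sketched for a single sample (batch_size = 1) in plain Python. This is an illustrative sketch, not the reference implementation: the function names (lstm_cell, matvec, sigm) and the per-gate argument layout are assumptions made for the example, and each per-gate matrix is assumed to have shape [hidden_size, hidden_size + input_size] so that it can multiply the concatenation [Hi, X].

```python
import math

def sigm(x):
    # Same value as 1/(1 + e^{-x}), arranged to avoid overflow for large negative x.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def matvec(M, v):
    # Matrix-vector product for a list-of-rows matrix.
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lstm_cell(X, Hi, Ci, Wf, Wi, Wc, Wo, Bf, Bi, Bc, Bo):
    """One LSTM step for one sample; X, Hi, Ci are lists of floats (hypothetical layout)."""
    hx = Hi + X  # [Hi, X] concatenation

    def gate(W, B, act):
        return [act(z + b) for z, b in zip(matvec(W, hx), B)]

    f = gate(Wf, Bf, sigm)
    i = gate(Wi, Bi, sigm)
    c = gate(Wc, Bc, math.tanh)
    o = gate(Wo, Bo, sigm)
    Co = [fk * ci + ik * ck for fk, ci, ik, ck in zip(f, Ci, i, c)]  # Co = f (.) Ci + i (.) c
    Ho = [ok * math.tanh(ck) for ok, ck in zip(o, Co)]               # Ho = o (.) tanh(Co)
    return Ho, Co
```

With all weights and biases set to zero, every sigmoid gate evaluates to 0.5 and c to 0, so Co reduces to 0.5 * Ci, which makes a convenient sanity check.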
Attributes
- hidden_size
- Description: hidden_size specifies hidden state size.
- Range of values: a positive integer
- Type: int
- Default value: None
- Required: yes
- activations
- Description: activations specifies the activation functions for the gates. There are three gates, so three activation functions should be specified as the value of this attribute.
- Range of values: any combination of relu, sigmoid, tanh
- Type: a list of strings
- Default value: sigmoid,tanh,tanh
- Required: no
- activations_alpha, activations_beta
- Description: activations_alpha, activations_beta are attributes of the activation functions; the applicability and meaning of these attributes depend on the chosen activation functions.
- Range of values: a list of floating-point numbers
- Type: float[]
- Default value: None
- Required: no
- clip
- Description: clip specifies bound values [-C, C] for tensor clipping. Clipping is performed before activations.
- Range of values: a positive floating-point number
- Type: float
- Default value: infinity, which means that clipping is not applied
- Required: no
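The interaction of clip and activations can be illustrated with a small plain-Python sketch. The helper names (clip, ACTIVATIONS, apply_gate) are hypothetical and chosen only for this example; the sketch assumes that clipping clamps each pre-activation value to [-C, C] and that the named activation is then applied element-wise. It does not model activations_alpha or activations_beta.

```python
import math

def clip(values, C=math.inf):
    # Clamp each value to [-C, C]; with the default C = infinity nothing changes.
    return [max(-C, min(C, v)) for v in values]

def _sigmoid(x):
    # Overflow-safe form of 1/(1 + e^{-x}).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

# The three activation names listed in the attribute's range of values.
ACTIVATIONS = {
    "sigmoid": _sigmoid,
    "tanh": math.tanh,
    "relu": lambda x: max(0.0, x),
}

def apply_gate(pre_activation, name, C=math.inf):
    # Clipping is performed before the activation, as the clip attribute describes.
    act = ACTIVATIONS[name]
    return [act(v) for v in clip(pre_activation, C)]
```

For example, apply_gate([3.0, -3.0], "relu", 2.0) first clamps the values to [2.0, -2.0] and then applies relu.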
Inputs
- 1: X - 2D ([batch_size, input_size]) input data. Required.
- 2: initial_hidden_state - 2D ([batch_size, hidden_size]) input hidden state data. Required.
- 3: initial_cell_state - 2D ([batch_size, hidden_size]) input cell state data. Required.
- 4: W - 2D tensor with weights for matrix multiplication operation, shape is [4 * hidden_size, input_size], gate order: fico
- 5: R - 2D tensor with weights for matrix multiplication operation, shape is [4 * hidden_size, hidden_size], gate order: fico
- 6: B - 1D tensor with biases, shape is [4 * hidden_size]
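The packed layout of W, R and B can be illustrated with a short plain-Python sketch. It rests on two assumptions that the text above does not state explicitly: the four gate blocks are stacked along the first axis in the order f, i, c, o, each block being hidden_size entries long, and the per-gate matrix that multiplies [Hi, X] in the formula is obtained by joining each row of R with the matching row of W. The helper names split_gates and concat_gate_weights are hypothetical.

```python
def split_gates(rows, hidden_size):
    """Split 4 * hidden_size entries into per-gate blocks, assumed order f, i, c, o."""
    if len(rows) != 4 * hidden_size:
        raise ValueError("expected 4 * hidden_size entries")
    return {g: rows[k * hidden_size:(k + 1) * hidden_size]
            for k, g in enumerate("fico")}

def concat_gate_weights(W, R, hidden_size):
    """Build per-gate matrices acting on [Hi, X] from R (hidden part) and W (input part)."""
    Wg = split_gates(W, hidden_size)
    Rg = split_gates(R, hidden_size)
    # Hidden-state columns come first so that they line up with [Hi, X].
    return {g: [r + w for r, w in zip(Rg[g], Wg[g])] for g in "fico"}
```

With hidden_size = 1 and input_size = 2, W has four rows of two values and R has four rows of one value, so each resulting per-gate matrix has a single row of three values.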
Outputs
- 1: Ho - 2D ([batch_size, hidden_size]) output hidden state.
- 2: Co - 2D ([batch_size, hidden_size]) output cell state.
Example
<layer ... type="LSTMCell" ... >
<input> ... </input>
<output> ... </output>
</layer>