CTCGreedyDecoderSeqLen
Versioned name : CTCGreedyDecoderSeqLen-6
Category : Sequence processing
Short description : CTCGreedyDecoderSeqLen performs greedy decoding of the logits provided as the first input. The sequence lengths are provided as the second input.
Detailed description :
This operation is similar to the TensorFlow CTCGreedyDecoder.
The operation CTCGreedyDecoderSeqLen implements best path decoding. Decoding is done in two steps:
Concatenate the most probable classes per time-step which yields the best path.
Remove duplicate consecutive elements if the attribute merge_repeated is true and then remove all blank elements.
Sequences in the batch can have different lengths. The lengths are given by the second input, the integer tensor sequence_length.
The main difference between CTCGreedyDecoder and CTCGreedyDecoderSeqLen is in the second input. CTCGreedyDecoder uses a 2D floating-point input tensor with sequence masks for each sequence in the batch, while CTCGreedyDecoderSeqLen uses a 1D integer tensor with sequence lengths.
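The two decoding steps described above can be sketched for a single sequence as follows. This is a minimal illustration in plain Python, not the reference implementation: the function name `greedy_decode_sequence` is hypothetical, the logits are passed as a list of per-time-step score lists, and ties in the argmax are resolved in favor of the lowest class index, which is an assumption.

```python
def greedy_decode_sequence(logits, seq_len, blank_index, merge_repeated=True):
    """Best-path decoding of one sequence.

    logits: list of T rows, each row holding C class scores (hypothetical layout).
    seq_len: number of valid time steps, must be <= T.
    """
    # Step 1: take the most probable class at each valid time step.
    best_path = [
        max(range(len(step)), key=step.__getitem__)
        for step in logits[:seq_len]
    ]

    # Step 2: optionally skip consecutive repeats, and always skip blanks.
    decoded = []
    previous = None
    for cls in best_path:
        is_repeat = merge_repeated and cls == previous
        if cls != blank_index and not is_repeat:
            decoded.append(cls)
        previous = cls  # a blank resets the "repeat" state
    return decoded


# Three classes, with class 2 acting as the blank.
example_logits = [
    [0.9, 0.05, 0.05],  # class 0
    [0.1, 0.8, 0.1],    # class 1
    [0.1, 0.8, 0.1],    # class 1 (consecutive repeat)
    [0.0, 0.1, 0.9],    # blank
    [0.2, 0.7, 0.1],    # class 1 (new element after a blank)
]
print(greedy_decode_sequence(example_logits, 5, blank_index=2))
```

Time steps at or beyond `seq_len` are ignored, which is how the sketch models sequences of different lengths within one batch.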
Attributes
merge_repeated
Description : merge_repeated is a flag for merging repeated labels during the CTC calculation. If the value is false, the sequence ABB*B*B (where ‘*’ is the blank class) will look like ABBBB. But if the value is true, the sequence will be ABBB.
Range of values : true or false
Type : boolean
Default value : true
Required : No
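The ABB*B*B example can be checked with a short symbol-level sketch. The helper name `collapse` is hypothetical, and the path is represented as a string of class symbols purely for illustration; the operation itself works on class indices.

```python
from itertools import groupby


def collapse(path, blank="*", merge_repeated=True):
    """Collapse a best path given as a string of class symbols."""
    if merge_repeated:
        # groupby yields one key per run of identical consecutive symbols,
        # so each run contributes at most one output element.
        return "".join(key for key, _ in groupby(path) if key != blank)
    # Without merging, only the blank symbols are dropped.
    return path.replace(blank, "")


print(collapse("ABB*B*B", merge_repeated=False))  # ABBBB
print(collapse("ABB*B*B", merge_repeated=True))   # ABBB
```

With merging enabled, only the adjacent pair BB is reduced to one B; the later B symbols are separated by blanks and are therefore kept.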
classes_index_type
Description : the type of the output tensor with class indices
Range of values : “i64” or “i32”
Type : string
Default value : “i32”
Required : No
sequence_length_type
Description : the type of the output tensor with sequence lengths
Range of values : “i64” or “i32”
Type : string
Default value : “i32”
Required : No
Inputs
1 : data - input tensor of type T_F of shape [N, T, C] with a batch of sequences, where T is the maximum sequence length, N is the batch size and C is the number of classes. Required.
2 : sequence_length - input tensor of type T_I of shape [N] with sequence lengths. The values of sequence length must be less than or equal to T. Required.
3 : blank_index - scalar or 1D tensor with 1 element of type T_I. Specifies the class index to use for the blank class. Regardless of the value of the merge_repeated attribute, if the output index for a given batch and time step corresponds to the blank_index, no new element is emitted. Default value is C-1. Optional.
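The input constraints above can be expressed as a small validation sketch. The helper `resolve_blank_index` is hypothetical and is not part of any library API; it only mirrors the rules stated in this section (shape [N, T, C], one length per batch element, lengths not exceeding T, and a default blank index of C-1). The lower bound of 0 on the lengths is an added assumption.

```python
def resolve_blank_index(data_shape, sequence_length, blank_index=None):
    """Check the documented input constraints and return the blank index to use."""
    if len(data_shape) != 3:
        raise ValueError("data must have shape [N, T, C]")
    n, t, c = data_shape

    if len(sequence_length) != n:
        raise ValueError("sequence_length must have shape [N]")
    # Assumption: lengths are non-negative; the upper bound T is from the spec.
    if any(length < 0 or length > t for length in sequence_length):
        raise ValueError("sequence lengths must be in the range [0, T]")

    # When the optional third input is absent, the blank class is C-1.
    return c - 1 if blank_index is None else blank_index


print(resolve_blank_index((8, 20, 128), [20] * 8))       # 127
print(resolve_blank_index((8, 20, 128), [20] * 8, 120))  # 120
```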
Output
1 : Output tensor of type T_IND1 and shape [N, T] containing the decoded classes. All elements that do not code sequence classes are filled with -1.
2 : Output tensor of type T_IND2 and shape [N] containing the length of the decoded class sequence for each sequence in the batch.
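The layout of the two outputs can be illustrated with a short sketch that packs already decoded sequences into a padded [N, T] structure and a [N] list of lengths. The helper `pack_outputs` is hypothetical, and nested Python lists stand in for tensors.

```python
def pack_outputs(decoded_sequences, max_len):
    """Build the [N, T] class output (padded with -1) and the [N] length output."""
    classes = []
    lengths = []
    for seq in decoded_sequences:
        if len(seq) > max_len:
            raise ValueError("a decoded sequence cannot be longer than T")
        # Positions that carry no decoded class are filled with -1.
        classes.append(list(seq) + [-1] * (max_len - len(seq)))
        lengths.append(len(seq))
    return classes, lengths


classes, lengths = pack_outputs([[0, 1], [3], []], max_len=4)
print(classes)  # [[0, 1, -1, -1], [3, -1, -1, -1], [-1, -1, -1, -1]]
print(lengths)  # [2, 1, 0]
```

A decoded sequence is never longer than its input sequence, so T positions per row are always sufficient.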
Types
T_F : any supported floating point type.
T_I : int32 or int64.
T_IND1 : int32 or int64 and depends on the classes_index_type attribute.
T_IND2 : int32 or int64 and depends on the sequence_length_type attribute.
Example
<layer type="CTCGreedyDecoderSeqLen" merge_repeated="true" classes_index_type="i64" sequence_length_type="i64">
    <input>
        <port id="0">
            <dim>8</dim>
            <dim>20</dim>
            <dim>128</dim>
        </port>
        <port id="1">
            <dim>8</dim>
        </port>
        <port id="2"/> <!-- blank_index = 120 -->
    </input>
    <output>
        <port id="0" precision="I64">
            <dim>8</dim>
            <dim>20</dim>
        </port>
        <port id="1" precision="I64">
            <dim>8</dim>
        </port>
    </output>
</layer>