This is a network for the text recognition scenario. It consists of a VGG16-like backbone and a bidirectional LSTM encoder-decoder. The network recognizes school marks in the format `<digit>` or `<digit>.<digit>` (e.g. `4` or `3.5`).
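The mark format described above can be checked on decoded strings with a small validator. This is a minimal sketch, not part of the model: the names `MARK_PATTERN` and `is_valid_mark` are hypothetical, and it assumes only ASCII digits count as valid.

```python
import re

# Hypothetical helper: accepts "<digit>" or "<digit>.<digit>", e.g. "4" or "3.5".
MARK_PATTERN = re.compile(r"[0-9](\.[0-9])?")


def is_valid_mark(text):
    """Return True if the whole string is a mark in one of the two formats."""
    return MARK_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["4", "3.5", "3.", "35", "3_5"]:
        print(candidate, is_valid_mark(candidate))
```

Such a check may be useful for rejecting decoded strings that contain the `_` placeholder or an incomplete fraction.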
[Image: handwritten mark] -> `2.5`
Metric | Value |
---|---|
Accuracy (internal test set) | 98.83% |
Text location requirements | Tight aligned crop |
GFlops | 0.792 |
MParams | 5.555 |
Source framework | TensorFlow |
Shape: [1x1x32x64] - An input image in the format [BxCxHxW], where:

- B - batch size
- C - number of channels
- H - image height
- W - image width
Note that the source image should be a tightly aligned crop of the detected text, converted to grayscale.
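The input layout can be illustrated with a dependency-free sketch that turns an already cropped image into a nested list of shape [1][1][32][64]. Everything here is an assumption made for illustration: the function names are hypothetical, the source is taken to be an H x W list of RGB tuples, grayscale uses the common luminance weights, and resizing is nearest-neighbor. Whether the model expects any further value scaling is not stated in this description, so none is applied. In practice an image library would normally handle these steps.

```python
def to_grayscale(rgb):
    """Convert an H x W list of (R, G, B) tuples to an H x W list of luminance values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def resize_nearest(gray, out_h=32, out_w=64):
    """Nearest-neighbor resize of an H x W plane to out_h x out_w."""
    in_h, in_w = len(gray), len(gray[0])
    return [
        [gray[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def make_input_blob(rgb):
    """Wrap the resized grayscale plane as [B=1][C=1][H=32][W=64]."""
    plane = resize_nearest(to_grayscale(rgb))
    return [[plane]]


if __name__ == "__main__":
    # Synthetic 40 x 100 white crop standing in for a detected text region.
    crop = [[(255, 255, 255)] * 100 for _ in range(40)]
    blob = make_input_blob(crop)
    print(len(blob), len(blob[0]), len(blob[0][0]), len(blob[0][0][0]))
```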
The net outputs a blob with the shape [16, 1, 13] in the format [WxBxL], where:

- W - output sequence length
- B - batch size
- L - confidence distribution across the supported symbols `"0123456789._#"`, where `#` is the special blank character for the CTC decoding algorithm and `_` replaces all non-numeric symbols

The network output can be decoded by a CTC Greedy Decoder or a CTC Beam Search decoder.
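Greedy CTC decoding of such an output can be sketched in plain Python as follows. This is an illustrative sketch rather than the reference decoder: the function name `ctc_greedy_decode` is hypothetical, the output is assumed to be a nested list indexed as [W][B][L], and the highest score at each step is assumed to mark the most likely symbol.

```python
ALPHABET = "0123456789._#"
BLANK = ALPHABET.index("#")  # index of the CTC blank symbol


def ctc_greedy_decode(output, batch_index=0):
    """Decode a [W][B][L] score array: take the best symbol at each step,
    collapse consecutive repeats, then drop blanks."""
    decoded = []
    previous = None
    for step in output:
        scores = step[batch_index]
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit a symbol only when it differs from the previous step and is not blank.
        if best != previous and best != BLANK:
            decoded.append(ALPHABET[best])
        previous = best
    return "".join(decoded)


if __name__ == "__main__":
    # Synthetic one-hot output of shape [16][1][13] spelling "3.5" with blanks and repeats.
    indices = [12, 3, 3, 12, 10, 5, 5] + [12] * 9
    fake_output = [[[1.0 if i == k else 0.0 for i in range(13)]] for k in indices]
    print(ctc_greedy_decode(fake_output))
```

A blank between two identical symbols keeps both, so genuinely repeated characters survive the collapse step. Beam search decoding would instead track several candidate sequences per step.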