Use Case and High-Level Description
Text detector based on the FCOS architecture with a MobileNetV2-like backbone, intended for indoor and outdoor scenes with approximately horizontal text.
The key benefit of this model compared to the base model is its smaller size and faster inference.
F-measure (harmonic mean of precision and recall on ICDAR2013)
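For reference, the harmonic mean mentioned in this metric can be sketched as follows. The function name and the sample precision and recall values are illustrative assumptions, not figures reported for this model.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0.0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical values chosen only to illustrate the formula.
example_score = f_measure(precision=0.9, recall=0.8)
```

Because the harmonic mean is pulled toward the lower of the two inputs, a detector cannot reach a high F-measure by maximizing only precision or only recall.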
Image with the shape 1, 3, 704, 704 in the format 1, C, H, W, where:
C - number of channels
H - image height
W - image width
Expected color order -
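A minimal sketch of how an image might be arranged into the 1, C, H, W layout described above, using NumPy only. The function name, the nearest-neighbour resize, and the float32 output type are assumptions made for illustration; the sketch does not reorder color channels or apply any normalization, so those steps would need to follow the model's actual requirements.

```python
import numpy as np

INPUT_HEIGHT = 704  # H from the input shape above
INPUT_WIDTH = 704   # W from the input shape above


def to_input_blob(image_hwc: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 image into a 1 x 3 x 704 x 704 array."""
    src_h, src_w = image_hwc.shape[:2]
    # Nearest-neighbour resize with plain indexing (assumption: a real
    # pipeline would likely use an image library's resize instead).
    rows = np.arange(INPUT_HEIGHT) * src_h // INPUT_HEIGHT
    cols = np.arange(INPUT_WIDTH) * src_w // INPUT_WIDTH
    resized = image_hwc[rows[:, None], cols[None, :]]
    # Reorder H, W, C -> C, H, W and add the leading batch dimension.
    chw = np.transpose(resized, (2, 0, 1))
    # float32 is an assumed element type, not stated in the text above.
    return chw[np.newaxis, ...].astype(np.float32)


# Hypothetical 480 x 640 image filled with zeros, used only as a stand-in.
blob = to_input_blob(np.zeros((480, 640, 3), dtype=np.uint8))
```

Resizing directly to 704 x 704 changes the aspect ratio of non-square images, so any detected coordinates have to be scaled back separately along each axis.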
boxes is a blob with the shape 100, 5 in the format N, 5, where N is the number of detected bounding boxes. For each detection, the description has the format: [x_min, y_min, x_max, y_max, conf], where:
(x_min, y_min) - coordinates of the top left bounding box corner
(x_max, y_max) - coordinates of the bottom right bounding box corner
conf - confidence for the predicted class
labels is a blob with the shape 100 in the format N, where N is the number of detected bounding boxes. In case of text detection, it is equal to 0 for each detected box.
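The two outputs can be combined into a list of detections along the following lines. This is a sketch under stated assumptions: each row of boxes is taken to be ordered x_min, y_min, x_max, y_max, conf; the function name, the 0.5 confidence threshold, and the scale_x and scale_y parameters (for mapping coordinates back to the original image size) are hypothetical and not part of the model description.

```python
import numpy as np


def decode_detections(boxes, labels, conf_threshold=0.5, scale_x=1.0, scale_y=1.0):
    """Return a list of (x_min, y_min, x_max, y_max, conf, label) tuples."""
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 5)
    labels = np.asarray(labels).reshape(-1)
    detections = []
    for (x_min, y_min, x_max, y_max, conf), label in zip(boxes, labels):
        # Skip low-confidence rows (the threshold is an assumed value).
        if conf < conf_threshold:
            continue
        detections.append((
            float(x_min * scale_x), float(y_min * scale_y),
            float(x_max * scale_x), float(y_max * scale_y),
            float(conf), int(label),
        ))
    return detections


# Hypothetical outputs: two rows stand in for the 100 the model returns.
sample_boxes = np.array([[10, 20, 110, 60, 0.9], [0, 0, 5, 5, 0.1]], dtype=np.float32)
sample_labels = np.array([0, 0])
kept = decode_detections(sample_boxes, sample_labels, conf_threshold=0.5)
```

If the input image was resized to 704 x 704 before inference, passing scale_x as original width / 704 and scale_y as original height / 704 would map the boxes back to the original image, assuming the coordinates are expressed in input pixels.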
The OpenVINO Training Extensions provide a training pipeline that allows fine-tuning the model on a custom dataset.
The model can be used in the following demos provided by the Open Model Zoo to show its capabilities:
[*] Other names and brands may be claimed as the property of others.