Horizontal Text Detection in Real-Time

This demo presents a use case with a client written in Python that captures camera frames and performs text spotting analysis via gRPC requests to OVMS. The client visualizes the results as bounding boxes drawn on the original image frames using OpenCV in real time. Thanks to image data compression and parallel execution of requests for multiple frames, the client also works efficiently over slow internet connections with high latency.

[Figure: horizontal text detection]

Download the horizontal text detection model from the OpenVINO Model Zoo

curl -L --create-dirs \
  https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/horizontal-text-detection-0001/FP32/horizontal-text-detection-0001.bin -o horizontal-text-detection-0001/1/horizontal-text-detection-0001.bin \
  https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/horizontal-text-detection-0001/FP32/horizontal-text-detection-0001.xml -o horizontal-text-detection-0001/1/horizontal-text-detection-0001.xml
tree horizontal-text-detection-0001
horizontal-text-detection-0001
└── 1
    ├── horizontal-text-detection-0001.bin
    └── horizontal-text-detection-0001.xml

Start the OVMS container:

docker run -d -u $(id -u):$(id -g) -v $(pwd)/horizontal-text-detection-0001:/model -p 9000:9000 openvino/model_server:latest \
--model_path /model --model_name text --port 9000 --layout NHWC:NCHW
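Optionally, before starting the client, you can confirm that the model is being served and that the layout change took effect. The following is a minimal sketch using the ovmsclient package, which is an assumption here (it is not part of the demo itself; install it with pip3 install ovmsclient):

from ovmsclient import make_grpc_client

# Connect to the gRPC port published by the container above
client = make_grpc_client("localhost:9000")

# The model should be reported as AVAILABLE for version 1
print(client.get_model_status(model_name="text"))

# Metadata should show an NHWC input shape such as (1, 704, 704, 3)
print(client.get_model_metadata(model_name="text"))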

Run the client

Clone the repository and enter the horizontal_text_detection/python directory:

git clone https://github.com/openvinotoolkit/model_server.git
cd model_server/demos/horizontal_text_detection/python

Install required packages:

pip3 install -r requirements.txt

Start the client

python3 horizontal_text_detection.py --grpc_address localhost --grpc_port 9000

Initializing requesting thread index: 0
Initializing requesting thread index: 1
Initializing requesting thread index: 2
Initializing requesting thread index: 3
Launching requesting thread index: 0
Launching requesting thread index: 1
Launching requesting thread index: 2
Launching requesting thread index: 3
ThreadID:   0; Current FPS:    31.25; Average FPS:    25.64; Average latency:   140.98ms
ThreadID:   1; Current FPS:    31.23; Average FPS:    25.67; Average latency:   136.36ms
ThreadID:   2; Current FPS:    29.41; Average FPS:    25.70; Average latency:   130.88ms
ThreadID:   3; Current FPS:    30.30; Average FPS:    25.73; Average latency:   135.65ms
...

You can also change the camera ID:

python3 horizontal_text_detection.py --grpc_address localhost --grpc_port 9000 --video_source 0

Or use a video file instead:

python3 horizontal_text_detection.py --grpc_address localhost --grpc_port 9000 --video_source ~/video.mp4

Note

The video source is cropped to 704x704 resolution to match the model input size.
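For reference, the following is a minimal single-frame sketch of what the demo client does in each of its request threads: grab a frame, fit it to the 704x704 model input, JPEG-compress it to reduce the transferred data, send a gRPC request, and draw the returned boxes. This is a simplified, assumption-based example using the ovmsclient package rather than the demo's own code; it assumes the server accepts JPEG-encoded binary input for the NHWC model, that the model input is named image, and that the outputs are named boxes and labels as documented in the Open Model Zoo; the 0.5 confidence threshold is arbitrary.

import cv2
from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")

# Grab a single frame from camera 0 (a video file path works here as well)
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()
if not ok:
    raise RuntimeError("Could not read a frame from the video source")

# Fit the frame to the 704x704 model input (the demo client crops instead of resizing)
frame = cv2.resize(frame, (704, 704))

# JPEG-compress the frame and send it as binary input to reduce the transferred data
_, encoded = cv2.imencode(".jpg", frame)
outputs = client.predict(inputs={"image": encoded.tobytes()}, model_name="text")

# Each detection is [x_min, y_min, x_max, y_max, confidence] in input-resolution pixels
for x_min, y_min, x_max, y_max, conf in outputs["boxes"]:
    if conf > 0.5:
        cv2.rectangle(frame, (int(x_min), int(y_min)), (int(x_max), int(y_max)), (0, 255, 0), 2)
cv2.imwrite("annotated_frame.jpg", frame)

The demo script itself runs several such request loops in parallel threads, which is where the per-thread FPS and latency figures shown above come from.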

Recognize Detected Text with OCR Pipeline

An Optical Character Recognition (OCR) pipeline, based on the horizontal text detection model and a text recognition model combined with a custom node implementation, can be used with the same Python script as before. The OCR pipeline provides the locations of the detected text boxes on the image and, additionally, the recognized text for each box.

[Figure: horizontal text detection using OCR pipeline]

Prepare workspace to run the demo

To successfully deploy the OCR pipeline you need a workspace that contains the horizontal-text-detection-0001 and text-recognition-0014 models, the custom node library for image processing, and the pipeline configuration file.

Clone the repository and enter the horizontal_text_detection/python directory:

git clone https://github.com/openvinotoolkit/model_server.git
cd model_server/demos/horizontal_text_detection/python

You can prepare a workspace containing all of the above by simply running

make

Final directory structure

Once the make procedure is finished, you should have a workspace directory ready with the following content.

workspace/
├── config.json
├── horizontal-text-detection-0001
│   └── 1
│       ├── horizontal-text-detection-0001.bin
│       └── horizontal-text-detection-0001.xml
├── lib
│   └── libcustom_node_horizontal_ocr.so
└── text-recognition-0014
    └── 1
        ├── text-recognition-0014.bin
        └── text-recognition-0014.xml

Deploying OVMS

Deploy OVMS with the OCR pipeline using the following command:

docker run -p 9000:9000 -d -v ${PWD}/workspace:/workspace openvino/model_server:latest --config_path /workspace/config.json --port 9000
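You can verify that the pipeline is exposed by the server before sending inference requests. The sketch below again assumes the ovmsclient package; the pipeline name "ocr" is only a placeholder, so use the pipeline name defined in workspace/config.json:

from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")

# DAG pipelines answer the same metadata calls as single models;
# replace "ocr" with the pipeline name from workspace/config.json
print(client.get_model_metadata(model_name="ocr"))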

Sending Requests to the Server

Install python dependencies:

pip3 install -r requirements.txt

Start the client

python3 horizontal_text_detection.py --grpc_address localhost --grpc_port 9000 --use_case ocr

Initializing requesting thread index: 0
Initializing requesting thread index: 1
Initializing requesting thread index: 2
Initializing requesting thread index: 3
Launching requesting thread index: 0
Launching requesting thread index: 1
Launching requesting thread index: 2
Launching requesting thread index: 3
ThreadID:   0; Current FPS:    31.25; Average FPS:    25.64; Average latency:   140.98ms
ThreadID:   1; Current FPS:    31.23; Average FPS:    25.67; Average latency:   136.36ms
ThreadID:   2; Current FPS:    29.41; Average FPS:    25.70; Average latency:   130.88ms
ThreadID:   3; Current FPS:    30.30; Average FPS:    25.73; Average latency:   135.65ms
...