Real Time Stream Analysis Demo

Overview

This demo shows how to write an application that runs AI analysis using OpenVINO Model Server. Video analysis can deal with various forms of source content. Here, you will see how to take the video source from a local USB camera, from a saved encoded video file, and from an encoded video stream.

The client application is expected to read the video source and send every frame for analysis to the OpenVINO Model Server via a gRPC connection. The analysis can be fully delegated to the model server endpoint, with the complete processing pipeline arranged via a MediaPipe graph or a DAG. The remote analysis can also be reduced to just inference execution, but in that case the video frame preprocessing and the postprocessing of the results must be implemented on the client side.

In this demo, reading the video content from a local USB camera or an encoded video file is straightforward using the OpenCV library. The use case with an encoded network stream might require more explanation: we will present an RTSP stream transferred via a server component and encoded using the FFmpeg utility.

Such a configuration is depicted below:

[Figure: RTSP stream analysis configuration]
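For illustration, all three input sources can be opened with the same OpenCV call; a minimal sketch, where only the argument to cv2.VideoCapture changes:

import cv2

# The same API opens all three source types; only the argument differs:
cap = cv2.VideoCapture(0)                                   # local USB camera with ID 0
# cap = cv2.VideoCapture("video.mp4")                       # saved encoded video file
# cap = cv2.VideoCapture("rtsp://localhost:8080/channel1")  # encoded RTSP network stream

while cap.isOpened():
    success, frame = cap.read()  # frame is a BGR numpy array, ready to be sent for analysis
    if not success:
        break
cap.release()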

All the client scenarios described below can read the input content from any of the three sources mentioned above and can send the results to any of three destinations: the local screen, an encoded video file, or an RTSP output stream.
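A minimal sketch of the three output sinks on the client side; the RTSP output is assumed to be realized by piping raw frames to an FFmpeg subprocess, which is one common approach rather than the only one:

import subprocess
import cv2
import numpy as np

frame = np.zeros((704, 704, 3), dtype=np.uint8)  # placeholder for a processed BGR frame

# 1. Local screen
cv2.imshow("output", frame)
cv2.waitKey(1)

# 2. Encoded video file (frame size and rate must match the stream)
writer = cv2.VideoWriter("output.mp4", cv2.VideoWriter_fourcc(*"mp4v"), 24, (704, 704))
writer.write(frame)
writer.release()

# 3. RTSP output stream: pipe raw frames to an FFmpeg subprocess
ffmpeg = subprocess.Popen(
    ["ffmpeg", "-f", "rawvideo", "-pix_fmt", "bgr24", "-s", "704x704", "-r", "24",
     "-i", "-", "-f", "rtsp", "-rtsp_transport", "tcp",
     "rtsp://localhost:8080/channel2"],
    stdin=subprocess.PIPE)
ffmpeg.stdin.write(frame.tobytes())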

The demo uses two gRPC communication patterns; each can be advantageous depending on the scenario:

  • gRPC streaming - recommended for MediaPipe graphs, especially for stateful analysis

  • gRPC unary calls - recommended for inference-only requests and DAG pipelines

Requirements

  • on the client side: Windows, macOS, or Linux; Python 3.7+ is required, and FFmpeg should be preinstalled in order to follow the RTSP scenario

  • the server can be deployed on Linux, macOS (CPU execution on x86_64 only), or inside WSL on Windows

  • images sent over gRPC are not encoded, so good network connectivity between the client and the server is required - at least 100 Mb/s for real-time video analysis at a high frame rate (an uncompressed 640x480 BGR frame is about 0.9 MB, so even 15 such frames per second consume roughly 110 Mb/s)

gRPC streaming with MediaPipe graphs

A gRPC stream connection is supported for served MediaPipe graphs. It allows sending asynchronous calls to the endpoint, all linked in a single session context. Responses are sent back via the stream and processed in a callback function. The helper class StreamClient provides a mechanism for flow control and for tracking the sequence of requests and responses. The streaming mode is selected during StreamClient initialization with the parameter streaming_api=True; a sketch follows the list below.

Using the streaming API has the following advantages:

  • good performance thanks to asynchronous calls and sharing the graph execution for multiple calls

  • support for stateful pipelines like object tracking when the response is dependent on the sequence of requests
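Below is a minimal sketch of the streaming flow. Only streaming_api=True is taken from this demo's description; the import path, the remaining constructor parameters, and the send/callback names are assumptions made for illustration, so check the helper's source in demos/common/stream_client for the exact interface:

import cv2
from stream_client import StreamClient  # helper class shipped with this demo; path may differ

# Called for every response returned on the gRPC stream (signature assumed).
def on_response(frame, result):
    cv2.imshow("output", result)
    cv2.waitKey(1)

# streaming_api=True selects the gRPC streaming mode; the other
# parameter names here are illustrative assumptions, not the exact API.
client = StreamClient(grpc_address="localhost:9000", streaming_api=True, callback=on_response)

cap = cv2.VideoCapture(0)
while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break
    client.send(frame)  # asynchronous call; responses arrive through the callback
cap.release()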

Preparing the model server for gRPC streaming with a Holistic graph

The holistic graph expects an IMAGE object on the input and returns an IMAGE on the output. As such, it doesn't require any preprocessing or postprocessing. In this demo the returned stream will simply be visualized or sent to the target sink.

The model server with the holistic use case can be deployed with the following steps:

git clone https://github.com/openvinotoolkit/model_server.git
cd model_server/demos/mediapipe/holistic_tracking
./prepare_server.sh
docker run -d -v $PWD/mediapipe:/mediapipe -v $PWD/ovms:/models -p 9000:9000 openvino/model_server:latest --config_path /models/config_holistic.json --port 9000

Check more info about this use case in the holistic tracking demo.

Note: All graphs with an image on the input and output can be used here without any changes to the client application.

Start the client with real time stream analysis

Prepare the Python environment by installing the required dependencies:

cd ../../real_time_stream_analysis/python/
pip install -r ../../common/stream_client/requirements.txt

For the use case with the RTSP client, also install the FFmpeg component on the host.

Alternatively, build a docker image with the client using the following command:

docker build ../../common/stream_client/ -t rtsp_client
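The containerized client can then be used in place of python3 client.py; a sketch, assuming the image entrypoint launches the client and that host networking is acceptable:

docker run --rm --network=host rtsp_client --grpc_address localhost:9000 --input_stream 'rtsp://localhost:8080/channel1' --output_stream 'rtsp://localhost:8080/channel2'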

Client parameters:

python3 client.py --help
usage: client.py [-h] [--grpc_address GRPC_ADDRESS]
                      [--input_stream INPUT_STREAM]
                      [--output_stream OUTPUT_STREAM]
                      [--model_name MODEL_NAME] [--input_name INPUT_NAME]
                      [--verbose] [--benchmark]
                      [--limit_stream_duration LIMIT_STREAM_DURATION]
                      [--limit_frames LIMIT_FRAMES]

options:
  -h, --help            show this help message and exit
  --grpc_address GRPC_ADDRESS
                        Specify url to grpc service
  --input_stream INPUT_STREAM
                        Url of input rtsp stream
  --output_stream OUTPUT_STREAM
                        Url of output rtsp stream
  --model_name MODEL_NAME
                        Name of the model
  --input_name INPUT_NAME
                        Name of the model's input
  --verbose             Should client dump debug information
  --benchmark           Should client collect processing times
  --limit_stream_duration LIMIT_STREAM_DURATION
                        Limit how long client should run
  --limit_frames LIMIT_FRAMES
                        Limit how many frames should be processed

Reading from the local camera and visualizing on the screen

python3 client.py --grpc_address localhost:9000 --input_stream 0 --output_stream screen

The parameter --input_stream 0 selects the local camera with ID 0.

Reading from the encoded video file and saving results to a file

wget -O video.mp4 "https://www.pexels.com/download/video/3044127/?fps=24.0&h=1080&w=1920"
python3 client.py --grpc_address localhost:9000 --input_stream 'video.mp4' --output_stream 'output.mp4'

Inference using an RTSP stream

The RTSP client app needs access to an RTSP stream to read from and write to. Below are the steps to simulate such a stream using video.mp4 as the content source.

Start an example RTSP server, mediamtx:

docker run --rm -d -p 8080:8554 -e RTSP_PROTOCOLS=tcp bluenviron/mediamtx:latest

Then write to the server using FFmpeg; the examples below use the looped video file or a camera (the dshow device example applies to Windows):

ffmpeg -stream_loop -1 -i ./video.mp4 -f rtsp -rtsp_transport tcp rtsp://localhost:8080/channel1
ffmpeg -f dshow -i video="HP HD Camera" -f rtsp -rtsp_transport tcp rtsp://localhost:8080/channel1

While the RTSP stream is active, run the client to read it and publish the output stream:

python3 client.py --grpc_address localhost:9000 --input_stream 'rtsp://localhost:8080/channel1' --output_stream 'rtsp://localhost:8080/channel2'

The results can be examined with the ffplay utility, which reads and displays the altered content:

ffplay -pixel_format yuv420p -video_size 704x704 -rtsp_transport tcp rtsp://localhost:8080/channel2

Using gRPC unary calls

The helper class StreamClient also supports unary gRPC calls. In that case it should be initialized with the parameter streaming_api=False. Frames are still sent to the model server asynchronously, but each request is stateless and can be processed independently. The key advantage of this mode is easier load balancing and scalability, because each request can be routed to a different instance of the model server or a different compute node.
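To show what a single stateless call looks like independently of the StreamClient helper, here is a minimal sketch using the ovmsclient package; the input name input, the tensor shape, and the model name model are placeholders to adapt to the actually served model or graph:

import numpy as np
from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")

# Placeholder preprocessed frame; shape and dtype depend on the served model.
frame = np.zeros((1, 704, 704, 3), dtype=np.float32)

# Each unary call is self-contained, so it could be routed to any
# model server replica behind a load balancer.
output = client.predict(inputs={"input": frame}, model_name="model")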

Such a use case, applying unary calls to horizontal text analysis, can be followed based on this document.

Note: Depending on the output format, a custom postprocessing function implementation might be needed.
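As an illustration of what such a function might look like, the sketch below draws a hypothetical detection output onto the frame; the assumed layout (rows of [x_min, y_min, x_max, y_max, confidence] in pixel coordinates) is an example, not the format of any particular model:

import cv2

def postprocess(frame, output):
    # Assumed layout: each row is [x_min, y_min, x_max, y_max, confidence];
    # adapt this to the real output of the served model or graph.
    for x_min, y_min, x_max, y_max, confidence in output:
        if confidence > 0.5:
            cv2.rectangle(frame, (int(x_min), int(y_min)),
                          (int(x_max), int(y_max)), (0, 255, 0), 2)
    return frame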