Real Time Stream Analysis Demo¶
This demo shows how to build an application that runs AI analysis using OpenVINO Model Server. Video analysis can deal with various forms of source content. Here, you will see how to take the video source from a local USB camera, a saved encoded video file, or an encoded network video stream.
The client application is expected to read the video source and send every frame for analysis to the OpenVINO Model Server over a gRPC connection. The analysis can be fully delegated to the model server endpoint, with the complete processing pipeline arranged as a MediaPipe graph or a DAG. The remote analysis can also be reduced to inference execution only, but in that case the video frame preprocessing and the postprocessing of the results must be implemented on the client side.
In this demo, reading the video content from a local USB camera or an encoded video file is straightforward with the OpenCV library. The use case with an encoded network stream might require more explanation. We will present an RTSP stream transferred by a server component and encoded with the FFMPEG utility.
Such a configuration is depicted in the diagram below:
All the client scenarios mentioned below can read the input content from any of the three sources and send the results to three destinations: the local screen, an encoded video file, or an RTSP output stream.
The demo uses two gRPC communication patterns, each advantageous depending on the scenario:
gRPC streaming - recommended for MediaPipe graphs especially for stateful analysis
gRPC unary calls - recommended for inference only on DAG graphs
On the client side, Windows, Mac, or Linux can be used. FFMPEG should be preinstalled in order to follow the scenario with the RTSP client. Python 3.7+ is needed.
The server can be deployed on Linux, macOS (only with CPU execution on x86_64 architecture), or inside WSL on the Windows operating system.
Images sent over gRPC are not encoded, so there should be good network connectivity between the client and the server: at least 100 Mb/s for real-time video analysis at a high frame rate.
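To see why that order of bandwidth is needed, the required throughput for raw frames can be estimated with a short back-of-envelope calculation; the resolution and frame rate below are illustrative assumptions, not values mandated by the demo:

```python
# Back-of-envelope estimate of the network bandwidth needed for raw
# (unencoded) 8-bit frames sent over gRPC.
def required_bandwidth_mbps(width, height, channels=3, fps=30):
    """Return the bandwidth in Mb/s for uncompressed 8-bit frames."""
    bytes_per_frame = width * height * channels
    return bytes_per_frame * fps * 8 / 1_000_000

# Even a modest 640x480 BGR stream at 15 fps needs over 100 Mb/s:
print(round(required_bandwidth_mbps(640, 480, fps=15)))  # → 111
```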
gRPC streaming with MediaPipe graphs¶
A gRPC stream connection is allowed for served MediaPipe graphs. It allows sending asynchronous calls to the endpoint, all linked in a single session context. Responses are sent back via the stream and processed in a callback function.
The helper class StreamClient provides a mechanism for flow control and for tracking the sequence of requests and responses. In the StreamClient initialization, the streaming mode is set via a dedicated parameter.
Using the streaming API has the following advantages:
good performance thanks to asynchronous calls and sharing the graph execution for multiple calls
support for stateful pipelines like object tracking when the response is dependent on the sequence of requests
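The streaming pattern described above can be sketched conceptually as follows: requests are sent asynchronously, responses arrive via a callback, and a sequence number pairs each response with its request. This is a simplified stand-in for the StreamClient behavior, not the real gRPC API:

```python
# Conceptual sketch of gRPC streaming flow control: asynchronous sends,
# responses delivered to a callback, sequence numbers preserved.
import queue
import threading

class StreamingSketch:
    def __init__(self, on_response):
        self.on_response = on_response
        self._seq = 0
        self._requests = queue.Queue()
        self._worker = threading.Thread(target=self._serve, daemon=True)
        self._worker.start()

    def send(self, frame):
        # Asynchronous: enqueue and return immediately.
        self._seq += 1
        self._requests.put((self._seq, frame))

    def _serve(self):
        # Stand-in for the server side of the stream: echo each frame
        # back with its sequence id, preserving order.
        while True:
            seq, frame = self._requests.get()
            if frame is None:
                break
            self.on_response(seq, frame)

    def close(self):
        self._requests.put((None, None))
        self._worker.join()

results = []
client = StreamingSketch(lambda seq, frame: results.append(seq))
for frame in ["f1", "f2", "f3"]:
    client.send(frame)
client.close()
print(results)  # → [1, 2, 3]
```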
Preparing the model server for gRPC streaming with a Holistic graph¶
The holistic graph expects an IMAGE object on the input and returns an IMAGE on the output. As such, it doesn't require any preprocessing or postprocessing. In this demo, the returned stream will be just visualized or sent to the target sink.
The model server with the holistic use case can be deployed with the following steps:
git clone https://github.com/openvinotoolkit/model_server.git
docker run -d -v $PWD/mediapipe:/mediapipe -v $PWD/ovms:/models -p 9000:9000 openvino/model_server:latest --config_path /models/config_holistic.json --port 9000
See more info about this use case.
Note: All graphs with an image on the input and output can be applied here without any changes to the client application.
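Before starting the client, it can be handy to confirm that the deployed container accepts connections on the published gRPC port. The snippet below is a generic TCP reachability check, not an OVMS API call:

```python
# Generic readiness check: verify that the model server's gRPC port
# (9000, as published in the docker run command above) accepts TCP
# connections before the client is started.
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(port_open("localhost", 9000))
```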
Start the client with real time stream analysis¶
Prepare the python environment by installing required dependencies:
pip install -r ../../common/stream_client/requirements.txt
For the use case with the RTSP client, also install the FFMPEG component on the host.
Alternatively, build a docker image with the client using the following command:
docker build ../../common/stream_client/ -t rtsp_client
python3 client.py --help
usage: client.py [-h] [--grpc_address GRPC_ADDRESS]
                 [--input_stream INPUT_STREAM] [--output_stream OUTPUT_STREAM]
                 [--model_name MODEL_NAME] [--input_name INPUT_NAME]
                 [--verbose] [--benchmark]

  -h, --help       show this help message and exit
  --grpc_address   Specify url to grpc service
  --input_stream   Url of input rtsp stream
  --output_stream  Url of output rtsp stream
  --model_name     Name of the model
  --input_name     Name of the model's input
  --verbose        Should client dump debug information
  --benchmark      Should client collect processing times
  …                Limit how long client should run
  …                Limit how many frames should be processed
Reading from the local camera and visualization on the screen¶
python3 client.py --grpc_address localhost:9000 --input_stream 0 --output_stream screen
--input_stream 0 indicates the local camera device ID
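A minimal sketch of how the three input sources can be dispatched: a numeric string is interpreted as a camera device ID, while anything else is treated as a file path or an RTSP url (cv2.VideoCapture accepts an int for a camera and a string for a file or url). The helper name below is hypothetical, not the client's actual function:

```python
# Hypothetical source-dispatch helper: "0" → camera ID 0 (int),
# "video.mp4" → file path, "rtsp://..." → network stream. The result
# can be passed directly to cv2.VideoCapture.
def parse_input_stream(input_stream: str):
    if input_stream.isdigit():
        return int(input_stream)   # local camera device ID
    return input_stream            # file path or rtsp url

print(parse_input_stream("0"))          # → 0
print(parse_input_stream("video.mp4"))  # → video.mp4
```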
Reading from the encoded video file and saving results to a file¶
wget -O video.mp4 "https://www.pexels.com/download/video/3044127/?fps=24.0&h=1080&w=1920"
python3 client.py --grpc_address localhost:9000 --input_stream 'video.mp4' --output_stream 'output.mp4'
Inference using RTSP stream¶
The RTSP client app needs access to an RTSP stream to read from and write to. Below are the steps to simulate such a stream with video.mp4 as the content source.
Start an example RTSP server, mediamtx:
docker run --rm -d -p 8080:8554 -e RTSP_PROTOCOLS=tcp bluenviron/mediamtx:latest
Then write to the server using FFMPEG, for example from a video file or from a camera (the dshow input below applies to Windows):
ffmpeg -stream_loop -1 -i ./video.mp4 -f rtsp -rtsp_transport tcp rtsp://localhost:8080/channel1
ffmpeg -f dshow -i video="HP HD Camera" -f rtsp -rtsp_transport tcp rtsp://localhost:8080/channel1
While the RTSP stream is active, run the client to read it and send the output stream:
python3 client.py --grpc_address localhost:9000 --input_stream 'rtsp://localhost:8080/channel1' --output_stream 'rtsp://localhost:8080/channel2'
The results can be examined with the ffplay utility, which reads and displays the altered content.
ffplay -pixel_format yuv420p -video_size 704x704 -rtsp_transport tcp rtsp://localhost:8080/channel2
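For reference, below is a hedged sketch of how raw frames can be pushed to the RTSP output url with an FFMPEG subprocess, which is one way a client can implement --output_stream. The resolution, frame rate, and url defaults are illustrative assumptions, not the client's exact values:

```python
# Build an ffmpeg command that reads raw BGR frames from stdin and
# publishes them to an RTSP url over TCP. Defaults match the 704x704
# size used with ffplay above, but are only an assumption here.
import subprocess

def rtsp_writer_cmd(url, width=704, height=704, fps=24):
    return [
        "ffmpeg", "-f", "rawvideo", "-pix_fmt", "bgr24",
        "-s", f"{width}x{height}", "-r", str(fps),
        "-i", "-",                     # raw frames arrive on stdin
        "-f", "rtsp", "-rtsp_transport", "tcp", url,
    ]

# Usage (requires FFMPEG installed and a reachable RTSP server):
# writer = subprocess.Popen(rtsp_writer_cmd("rtsp://localhost:8080/channel2"),
#                           stdin=subprocess.PIPE)
# writer.stdin.write(frame.tobytes())  # frame: height x width x 3, uint8
```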
Using gRPC unary calls¶
The helper class StreamClient also supports unary gRPC calls. In that case, it should be initialized with the corresponding parameter.
It sends the frames to the model server asynchronously, but each call is stateless and each request can be processed independently.
The key advantage of that mode is easier load balancing and scalability, because each request could be routed to a different instance of the model server or a different compute node.
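Conceptually, the unary mode can be sketched as a fan-out of independent requests, optionally round-robined across several server addresses. The infer function and endpoint names below are placeholders for real unary gRPC calls, used only to illustrate the load-balancing idea:

```python
# Fan out stateless frames to a worker pool, round-robining across
# two assumed server addresses. Results come back in submission order.
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle

endpoints = cycle(["server-a:9000", "server-b:9000"])  # assumed addresses

def infer(endpoint, frame):
    # Placeholder for a real unary gRPC inference call.
    return f"{endpoint}:{frame}"

frames = ["f0", "f1", "f2", "f3"]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(infer, (next(endpoints) for _ in frames), frames))
print(results)
# → ['server-a:9000:f0', 'server-b:9000:f1',
#    'server-a:9000:f2', 'server-b:9000:f3']
```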
Such a use case, using unary calls with horizontal text analysis, can be followed based on this document.
Note: Depending on the output format, a custom postprocessing function implementation might be needed.