Real Time Stream Analysis Demo¶
For object classification, detection and segmentation we use CV (Computer Vision) models that take visual data on the input and return predictions like classification results, bounding boxes parameters etc. By visual data we often mean video stream generated in real time by different kinds of cameras.
In this demo you’ll see how to analyze RTSP (Real Time Streaming Protocol) stream using OpenVINO Model Server for inference.
The stream analysis app is started with
real_time_stream_analysis.py script. It reads frames from the provided stream URL, runs pre and post processing and requests inference on specified model served by OVMS.
Specific use case actions are defined in use case implementation - some notifications could be sent to another service if object of interest has been detected etc.
As part of postprocessing, inference results can be visualized. The demo can optionally start Flask server that will host inference preview as defined for the use case.
In order to make this demo work you need to:
use Python 3.6+
have access to live RTSP stream
have access to OpenVINO Model Server with your model of choice deployed
have a use case implementation
The stream analysis app needs to have access to RTSP stream to read from and OVMS to run inference on. Apart from that you need use case implementation that defines pre and postprocessing. Some examplary use cases are available in use cases catalog.
Start the real time stream analysis¶
Select use case¶
To select pre and post processing routines that will be run on video stream you need to make a small modification in real_time_stream_analysis.py script.
While creating IOProcessor instance, you have to provide a class that implements use case interface and runs routines specific to your use case. Simply change the first argument to the use case you want in the IOProcessor constructor in this one line :
io_processor = IOProcessor(<use_case_class>, visualizer_frames_queue)
The use case class must first be imported. Examplary use cases can be imported from
use_cases module like:
from use_cases import PersonVehicleBikeDetection
You can use your custom use cases the same way. Implement use case interface and add it to use_cases catalog. For simpler imports from the main script consider adding you class in use_cases package init file
Run the app¶
Clone the repository and enter real_time_stream_analysis directory
git clone https://github.com/openvinotoolkit/model_server.git
Once you adjusted the main script to your use case, got RTSP source and OVMS up and running, you can launch the stream analysis app.
python3 real_time_stream_analysis.py -h
usage: real_time_stream_analysis.py [-h] --stream_url STREAM_URL --ovms_url
OVMS_URL --model_name MODEL_NAME
Client for real time video stream analysis
-h, --help show this help message and exit
URL to the video stream. Example:
--ovms_url OVMS_URL Address of OVMS gRPC endpoint. Example: localhost:9000
Name of the target model
Version of the taget model. Default: latest available
Port of the inferece results visualizer webservice. If
not specified, visualizer will not be launched
--binary_input Convert frames to binary format before sending them to
OVMS. Default: False
Number of inference executor threads. Default: 4
Number of frames the have been received from the
stream and are awaiting inference or visualization.
When buffer is full, frames are dropped. Default: 100
Logging level - available values: INFO, DEBUG, ERROR.
python3 real_time_stream_analysis.py --stream_url rtsp://localhost:8554/mystream --ovms_url localhost:9000 --model_name person-vehicle-detection
Running with visualizer¶
Streaming analysis app can run with live browser preview that helps to quickly estimate model accuracy. It’s done by starting tiny Flask server that displays visualized inference results on HTTP endpoint. To be able to use visualizer, use case must
return True on supports_visualization call and implement visualize method.
Enabling visualizer is done by providing
--visualizer_port parameter that specifies on which port the HTTP Server will display inference results. If not provided or set to 0, visualizer will not start.
python3 real_time_stream_analysis.py --stream_url rtsp://localhost:8554/mystream --ovms_url localhost:9000 --model_name person-vehicle-detection --visualizer_port 5000
Visualizer does not use effecient streaming techniques to display results. It’s main goal is to help during development and provide simple solution for quick health/accuracy checks.
In this section let’s review three parameters:
--buffer_size. They are relevant in terms of performance and resource usage.
This parameter specifies the amount of threads that will separately interact with OpenVINO Model Server. Single inference executor handles one video frame at a time as it sends request to the model server and awaits response, therefore it’s blocked until prediction results are ready. This way OpenVINO Model Server will process only one frame at a time.
When using for example multiple CPU streams, OpenVINO can effectively infer multiple frames in parallel. To make use of that while using OpenVINO Model Server you need to send multiple requests in parallel to feed the model server with enough amount of frames to gain best performance.
Streaming analysis app can create multiple threads that send requests with video frames in parallel. Number of such threads is specified by
inference_executors parameter. Make sure to set this number to the value that will make application capable of sending as many frames per second as model can process. Especially if you use multiple streams in your model configuration.
OpenVINO Model Server can receive video frames either in an array format or in binary encoded format. When data in array format is received, model server can read and pass it as-is to the OpenVINO engine. When data in binary encoded format is received, model server needs to convert it to array format first.
Using binary input feature puts conversion burden on the model server, but the amount of data exchanged between the client and server is smaller. This can have a particularly big impact for models with large input.
Depending on the location of stream analysis app and the model server in your deployment, you may consider setting
binary_input flag and make the application send binary encoded data to OpenVINO Model Server. Consider that when your model has large input, the stream speed is really fast or the network between stream analysis app and model server is a bottleneck.
This parameter does not have a direct impact on inference performance, but is important in terms of the resources used by the whole application. Stream analysis app uses internal queues to buffer input frames as well as receiving the results and displaying them via visualizer.
All this data is kept in application memory. The
buffer_size parameter allows you to set the maximum amount of frames kept in application internal buffers. Depending on the available memory and a single frame size you may want to adjust this parameter not to consume too much, or even run out of memory.
Deploy with Helm¶
In deploy catalog you can find set of helm charts that simplify application deployment. Steps for deployment are following:
Build Docker image with the application using provided Dockerfile
docker build . -t real-time-stream-analysis:latest
Put the image in a repository accessible from your cluster
Deploy with helm
helm install rt-stream-analyzer ./deploy --set stream_url=rtsp://localhost:8554/mystream,ovms_url=localhost:9000,model_name=person-vehicle-detection,visualizer_service.port=5000
To learn about all available parameters check out values.yaml