Aaron Boxer
May 12, 2025
Creating powerful video analytics pipelines is easy if you have the right tools. In this post, we will show you how to effortlessly build a broad range of machine learning (ML) enabled video pipelines using just two components, GStreamer and Python. We will focus on simplicity and functionality, deferring performance tuning to a future deep dive.
The core of our pipeline is GStreamer, everyone's favorite multimedia framework. Over the past few years, Collabora has contributed extensive ML capabilities to upstream GStreamer, adding support for ONNX and LiteRT inference and introducing a fine-grained, extensible metadata framework to persist model outputs.
We now take the next step by unleashing gst-python-ml: a pure Python framework that can easily build powerful ML-enabled GStreamer pipelines using standard Python packages. With just a few lines of Python, or a single gst-launch-1.0 command, you can now run complex models across multiple streams, complete with tracking, captioning, speech and text processing, and much more.
The framework is composed of a set of base classes that can be easily extended to create new ML elements, and a set of tested, fully functional elements that support the following features and models:
For a taste of the ease and simplicity of gst-python-ml, we present a few sports analytics sample pipelines.
1. Here are all the steps needed to run a YOLO tracking pipeline on Ubuntu:
apt install -y gstreamer1.0-plugins-base gstreamer1.0-plugins-base-apps \
    gstreamer1.0-plugins-good gstreamer1.0-plugins-bad \
    gir1.2-gst-plugins-bad-1.0 python3-gst-1.0 gstreamer1.0-python3-plugin-loader

pip install pygobject pycairo torch torchvision transformers numpy ultralytics gst-python-ml

gst-launch-1.0 filesrc location=/path/to/video ! decodebin ! videoconvert ! \
    pyml_yolo model-name=yolo11m track=True ! pyml_overlay ! videoconvert ! autovideosink
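The same pipeline can also be driven from a few lines of Python. What follows is a minimal sketch rather than code from the gst-python-ml sources: it assumes the packages above are installed and reuses the element and property names from the gst-launch-1.0 command. The helper names `yolo_pipeline_description` and `run` are hypothetical; `Gst.parse_launch`, `get_by_name` and the bus calls come from the standard GStreamer Python bindings.

```python
import sys


def yolo_pipeline_description(model_name="yolo11m", track=True):
    """Build the gst-launch style description used in the command above.

    Element and property names (pyml_yolo, model-name, track, pyml_overlay)
    are copied from the post; the helper itself is hypothetical.
    """
    elements = [
        "filesrc name=src",  # the location is set later, from Python
        "decodebin",
        "videoconvert",
        f"pyml_yolo model-name={model_name} track={track}",
        "pyml_overlay",
        "videoconvert",
        "autovideosink",
    ]
    return " ! ".join(elements)


def run(video_path):
    """Play one file through the pipeline until EOS or the first error."""
    # Imported lazily so the description helper works without GStreamer.
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)
    pipeline = Gst.parse_launch(yolo_pipeline_description())
    # Setting the property directly avoids quoting issues with odd paths.
    pipeline.get_by_name("src").set_property("location", video_path)
    pipeline.set_state(Gst.State.PLAYING)

    # Block until the stream ends or an element reports an error.
    bus = pipeline.get_bus()
    msg = bus.timed_pop_filtered(
        Gst.CLOCK_TIME_NONE,
        Gst.MessageType.ERROR | Gst.MessageType.EOS,
    )
    if msg is not None and msg.type == Gst.MessageType.ERROR:
        err, _debug = msg.parse_error()
        print(f"Pipeline error: {err.message}", file=sys.stderr)
    pipeline.set_state(Gst.State.NULL)


if __name__ == "__main__":
    if len(sys.argv) == 2:
        run(sys.argv[1])
    else:
        print("usage: python yolo_track.py /path/to/video", file=sys.stderr)
```

Keeping the description in its own function makes it easy to swap the model name or turn tracking off without touching the playback code.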
2. Here is a soccer match processed with this pipeline:
3. Multiple video sources are also supported.
4. Another supported sports analytics feature is the creation of a bird's eye view of a game, to show a quick overview of the field:
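This post does not describe how gst-python-ml computes the bird's eye view. As an illustration only, one common approach is a planar homography: a 3x3 matrix that maps image pixels on the pitch to coordinates on a top-down field diagram. The sketch below assumes such a matrix has already been estimated (for example from known field markings); the function name `project_to_field` and the example matrix are hypothetical.

```python
def project_to_field(h, x, y):
    """Map an image point (x, y) to field coordinates with homography h.

    h is a 3x3 matrix given as nested lists. This is a generic planar
    homography, not necessarily what gst-python-ml does internally.
    """
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


# Hypothetical matrix for illustration: halve the scale, then shift.
EXAMPLE_H = [
    [0.5, 0.0, 10.0],
    [0.0, 0.5, 20.0],
    [0.0, 0.0, 1.0],
]

if __name__ == "__main__":
    # A tracked player's foot position in pixels, mapped onto the diagram.
    print(project_to_field(EXAMPLE_H, 100, 200))
```

In practice each tracked object's bounding box would contribute one point, typically the bottom centre of the box, since that is where the object touches the ground plane.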
5. gst-python-ml shows its true power when using hybrid vision + language models to enable features that are simply not available in any other GStreamer-based analytics framework, whether open source or commercial. For example, video captioning is supported using the Phi3.5 Vision model. Each video frame can be automatically captioned, and these captions can be further processed to automatically summarize a game or to detect significant events such as goals.
These are just a few of the features we have built with gst-python-ml - the possibilities are endless.
gst-python-ml is distributed as a PyPI package. All elements are first-class GStreamer elements that can be added to any GStreamer pipeline, and they work with any Linux distribution's GStreamer packages from GStreamer 1.24 onward.
Development takes place on our GitHub repository — we welcome contributions, feedback and new ideas.
As we continue building gst-python-ml we are actively looking for collaborators and partners. Our goal is to make ML workflows in GStreamer powerful and accessible — whether for real-time media analysis, content generation, or for intelligent pipelines in production environments.
If you would like to know more about Collabora's work on GStreamer ML, please contact us.