
A framework to share analytics data in GStreamer

Daniel Morin
February 13, 2024

GStreamer has long been the best framework to build pipelines to handle video streams, and in particular, live ones. It's no coincidence that it has been adopted widely by engineers wishing to build video analytics pipelines.

What do we mean by analytics?

Within computers, we represent media data as a series of discrete samples over time and, in the case of images, over space. We generally don't care about the meaning of those samples, as the goal is to display them back to humans. This data is unstructured. Sometimes, we instead want to structure the content of this data to extract a meaning. For example, instead of just reddish pixels, we want to know that it's a strawberry. There are a number of different types of algorithms to do this, from traditional computer vision to the latest trends in deep learning. But they all have in common that they produce some structured data describing the content of the input.

A typical example of object detection and classification using strawberries and leaves. More examples available here: https://col.la/gstanalyticsexamplesmodels.

GStreamer is a natural choice to handle this kind of metadata describing the underlying media data. It has a flexible system to attach arbitrary bits of data to a media buffer. Many companies have built their machine learning analysis framework around GStreamer, but no one had made the effort to contribute upstream, until now.

Introducing the Analytics Metadata Framework

Our goal was to create an analytics framework for GStreamer that decouples analysis steps from each other, leverages platform-specific acceleration where available, defines generic elements that function across platforms, and scales to large amounts of data and detections.

GStreamer has a feature called GstMeta, which is a way to attach an arbitrary structure to a buffer (such as a video frame). In particular, there is also a region of interest meta that allows defining a rectangle in the image and attaching some data to it. Our first idea was to extend this, but we realized that it couldn't scale. For example, in a wide shot of a crowd, you could detect hundreds of people. We also wanted to make it easier to do the analysis in multiple steps, for example by having one step that detects objects, then further steps that find more information about specific objects.

We defined a new GstAnalyticsRelationMeta that stores an array of metadata structures along with a graph of relations between them. This enables us to have an object at a specific location, then define a class of objects and have a "this object belongs to this class" type of relationship. For example, we can have a "car" class and a "tire" class, and relate object 1 to the "car" class and object 2 to the "tire" class. Furthermore, we can include a relationship between the objects themselves, such as object 2 being part of object 1: the tire is part of the car.

In this example, there are two types of metadata: classification and object detection. The classification further describes the objects.
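To make the "array of metadata plus a graph of relations" idea concrete, here is a minimal sketch in plain Python. It is a toy model, not the GStreamer API: the `RelationMeta` class, its methods, the relation names ("relates-to", "is-part-of", "contains") and the use of an adjacency matrix are all assumptions chosen for illustration, and the boxes and confidences are placeholder values.

```python
from dataclasses import dataclass, field


@dataclass
class RelationMeta:
    """Toy stand-in for a per-buffer relation meta (hypothetical, not GStreamer)."""
    entries: list = field(default_factory=list)    # array of metadata structures
    adjacency: list = field(default_factory=list)  # adjacency[i][j]: relations from i to j

    def add(self, kind, **data):
        """Append one metadata entry and return its index in the array."""
        for row in self.adjacency:
            row.append(set())                      # new column for the new entry
        self.entries.append({"kind": kind, **data})
        self.adjacency.append([set() for _ in self.entries])
        return len(self.entries) - 1

    def relate(self, src, dst, rel):
        """Record a directed relation of type `rel` from entry src to entry dst."""
        self.adjacency[src][dst].add(rel)

    def related(self, src, rel):
        """Indices of all entries that src points to with relation `rel`."""
        return [j for j, rels in enumerate(self.adjacency[src]) if rel in rels]


meta = RelationMeta()
car = meta.add("object", box=(40, 60, 320, 180))      # placeholder coordinates
tire = meta.add("object", box=(70, 190, 60, 60))
car_cls = meta.add("class", label="car", confidence=0.92)
tire_cls = meta.add("class", label="tire", confidence=0.81)

meta.relate(car, car_cls, "relates-to")    # object 1 belongs to the "car" class
meta.relate(tire, tire_cls, "relates-to")  # object 2 belongs to the "tire" class
meta.relate(tire, car, "is-part-of")       # the tire is part of the car
meta.relate(car, tire, "contains")

for whole in meta.related(tire, "is-part-of"):
    labels = [meta.entries[c]["label"] for c in meta.related(whole, "relates-to")]
    print(f"entry {tire} is part of entry {whole}, labelled {labels}")
```

Because every entry lives in one array and relations are stored separately, hundreds of detections only grow the array and the matrix; no entry needs to embed the data of the entries it is related to.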

We've also defined some base classes of metadata: objects, classification and tracking. But more classes can be defined in the future, and plugins can even define their own.
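The following sketch illustrates how analysis steps can stay decoupled when they communicate only through shared metadata. It is plain Python rather than GStreamer code: `FrameMeta`, the step functions, the string kind names and the size-based "classifier" are hypothetical stand-ins, and the boxes are placeholder values rather than real detector output.

```python
from dataclasses import dataclass, field


@dataclass
class FrameMeta:
    """Hypothetical per-frame container: entries plus (src, dst) relation pairs."""
    entries: list = field(default_factory=list)    # (kind, payload) tuples
    relations: list = field(default_factory=list)  # (src, dst): src describes dst

    def add(self, kind, payload, about=None):
        """Append an entry, optionally relating it to an existing one."""
        self.entries.append((kind, payload))
        idx = len(self.entries) - 1
        if about is not None:
            self.relations.append((idx, about))
        return idx

    def of_kind(self, kind):
        """Indices of all entries of the given kind."""
        return [i for i, (k, _) in enumerate(self.entries) if k == kind]

    def describing(self, target, kind):
        """Payloads of entries of `kind` that are related to `target`."""
        return [self.entries[s][1] for s, d in self.relations
                if d == target and self.entries[s][0] == kind]


def detect(meta):
    # Step 1: placeholder boxes standing in for a real detector's output.
    for box in [(10, 10, 50, 50), (80, 20, 40, 40)]:
        meta.add("object", box)


def classify(meta):
    # Step 2: only reads the objects that the detection step left behind.
    for obj in meta.of_kind("object"):
        _, _, w, h = meta.entries[obj][1]
        label = "large" if w * h >= 2000 else "small"  # stand-in for a model
        meta.add("classification", label, about=obj)


def track(meta, next_id=1):
    # Step 3: attaches an identifier to each object, independent of step 2.
    for obj in meta.of_kind("object"):
        meta.add("tracking", next_id, about=obj)
        next_id += 1
    return next_id


meta = FrameMeta()
for step in (detect, classify, track):
    step(meta)

for obj in meta.of_kind("object"):
    print(meta.entries[obj][1],
          meta.describing(obj, "classification"),
          meta.describing(obj, "tracking"))
```

In this toy model a kind is just a string, so a new kind of metadata can be added by a later step without touching the earlier ones, which is the property the paragraph above describes for plugin-defined classes.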

We hope that this will be a first step to foster more collaboration between everyone using GStreamer as a common language for video analysis. Please don't hesitate to contact us if you want to discuss your GStreamer projects, or want help building media analytics into your products.



Collabora Ltd © 2005-2024. All rights reserved.