Posted on 22/03/2018 by Olivier Crête
After a particularly long cycle of over 10 months, the GStreamer community has accumulated a lot of improvements that are now widely available in the 1.14 release. The release notes give a good explanation of everything the community has produced, but I'd like to highlight some of the contributions from Collabora's engineers that we're particularly proud of.
One of our key focus areas has been improving the life of embedded developers by making it easier to create multimedia processing pipelines that fully utilize the platform's capabilities without manual tweaking or configuration. That means not only improving the components that interact with the hardware APIs, such as gst-omx and v4l2, but also improving the GStreamer framework itself so that it negotiates the right formats and memory allocation constraints between the different hardware blocks used by the pipeline.
Our biggest contribution in this cycle is to the GStreamer OpenMAX IL layer. Guillaume has been integrating all of the features expected of a codec in 2018, targeting the VCU (Video Codec Unit) inside the newest Xilinx Zynq® UltraScale+™ MPSoC. This work required over 80 patches just for the gst-omx component. It first meant adding a number of features that were lacking. For example, support for H.265 decoders and encoders was added following the almost-standard Android extension (the latest OpenMAX IL standard pre-dates H.265 and has not been updated since). A number of standard properties were also added to configure the encoder, as well as support for 10-bit color formats. A new target specific to the Zynq UltraScale+ was also added to the gst-omx project; when enabled, it adds a number of new features. Among those, we can note support for exchanging DMABuf buffers with other elements, exporting the latency of the codecs to the pipeline, and exposing a number of codec-specific properties.
To make sure that the encoder can catch up if there is a glitch, the "QoS" property was added to the encoder. It drops late frames before encoding them in order to maintain low latency in live pipelines. This also required updating the encoder base class to provide the necessary infrastructure. Guillaume also implemented the section of the now abandoned OpenMAX 1.2 draft that allows buffers to be allocated after the component has been started, which makes it easier to interact with the dynamic nature of a GStreamer pipeline. He also recently added support for "Region of Interest", allowing the encoder to apply lower (or higher) compression to a specific part of each frame, for example to keep a low bitrate while keeping the subtitles readable, or to pair it with face recognition to keep people recognizable. Another interesting feature is that the H.264 and H.265 parsers now produce enough information in their caps to allow the following decoder to allocate the required output buffers before the first frame is received. This enables fast start on systems where memory allocation is a slow operation.
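The idea behind dropping late frames before encoding can be sketched in a few lines of plain Python. This is only an illustration of the principle, not the gst-omx or encoder base class implementation: the `Frame` type, the `drop_late_frames` helper and the millisecond deadlines are all hypothetical, and it assumes each frame carries a deadline on the pipeline clock.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Frame:
    """Hypothetical stand-in for a raw video frame waiting to be encoded."""
    index: int
    deadline_ms: int  # latest clock time at which encoding may still start


def drop_late_frames(frames: Iterable[Frame], now_ms: int) -> Tuple[List[Frame], List[Frame]]:
    """Split frames into (to_encode, dropped) by comparing deadlines to the clock."""
    to_encode: List[Frame] = []
    dropped: List[Frame] = []
    for frame in frames:
        if frame.deadline_ms < now_ms:
            # The deadline has already passed: encoding this frame would only
            # add latency, so it is skipped to let the encoder catch up.
            dropped.append(frame)
        else:
            to_encode.append(frame)
    return to_encode, dropped


# Four frames at roughly 30 fps; the encoder stalled until t = 70 ms.
frames = [Frame(0, 33), Frame(1, 66), Frame(2, 100), Frame(3, 133)]
to_encode, dropped = drop_late_frames(frames, now_ms=70)
print("encode:", [f.index for f in to_encode])
print("drop:", [f.index for f in dropped])
```

After a stall, the frames that are already late are skipped, so the encoder resumes with the first frame that can still be delivered on time.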
OpenMAX has been popular, but over the last decade the Linux kernel has also grown an API for hardware codecs in the Video4Linux2 (v4l2) framework. GStreamer has supported v4l2 decoders for years, and support for encoders was merged by Nicolas in this release. The v4l2 plugin supports encoders for the VP8, VP9, MPEG-4 Part 2 Video, H.263 and H.264 codecs. The decoders have been updated to support dynamic resolution changes, which should make them much better at handling DASH and HLS streams. The device provider, the API that allows enumerating cameras and other devices on the system, has been improved to be much faster at probing v4l2 devices. The plugin also no longer uses libv4l2 by default, which fixes a number of problems with modern APIs like DMABuf.
One major blocker to non-trivial hardware-accelerated pipelines had, up to now, been the fact that the "tee" element blocked allocation queries. These queries carry important information, such as whether downstream elements accept the video "meta", which is used to specify non-default strides. Nicolas has added support for these queries in "tee". As we had been using the "tee" element as a cheap way to drop allocation queries (and force the fallback codepath for testing), Nicolas also added a "drop-allocation" property to the identity element to make it possible to drop the allocation query on purpose. A new fake sink element for video, called "fakevideosink", was also created. The main addition is that it claims to support video-specific things such as GstVideoMeta, GstVideoCropMeta and GstVideoOverlayCompositionMeta, which means that video buffers reach the sink as if it were a fully hardware-accelerated sink, making it easier to test zero-copy pipelines without having a real sink connected.
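As a rough sketch of how these two testing aids could be combined, the snippet below builds a pipeline description that can be passed to gst-launch-1.0 or gst_parse_launch(). The helper name `zero_copy_test_pipeline` and the choice of videotestsrc with 100 buffers are assumptions made for illustration; only "drop-allocation" and "fakevideosink" come from the text above.

```python
def zero_copy_test_pipeline(drop_allocation: bool) -> str:
    """Return a gst-launch style description for testing allocation paths.

    With drop_allocation=True, the identity element drops the allocation
    query so the upstream element has to use its fallback codepath.
    """
    identity = "identity drop-allocation=true" if drop_allocation else "identity"
    elements = [
        "videotestsrc num-buffers=100",  # assumption: any video source would do
        identity,
        "fakevideosink",  # claims support for the video metas, renders nothing
    ]
    return " ! ".join(elements)


# Normal path: the allocation query reaches the sink.
print(zero_copy_test_pipeline(drop_allocation=False))
# Fallback path: the allocation query is dropped on purpose.
print(zero_copy_test_pipeline(drop_allocation=True))
```

Running both variants under the same debug settings is one way to compare what an upstream element does with and without an answered allocation query.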
Speaking of sinks, Nicolas has also been working hard on the KMS sink, which allows displaying directly through the Linux kernel's KMS infrastructure without using a windowing system like Wayland or X. He's added support for the VC4 driver on the Raspberry Pi as well as the Xilinx DRM driver. He also added properties that can be used from gst-launch to put the video in a specific location on the screen instead of always being fullscreen. Support for the crop meta has also been added, which allows the sink to "crop" the video by itself without the element that wants to crop having to do a copy. He also updated the videocrop element to attach a GstVideoCropMeta instead of doing the cropping itself if the downstream element supports it, giving us zero-cost cropping with the right sink. Another useful update to the KMS sink is that it can now export "dumb" buffers, which are the lowest level of buffer compatibility, to elements upstream in the pipeline. This is useful if the upstream elements can't allocate DMABuf buffers on their own.
Collabora helped Haivision open source its Secure Reliable Transport (SRT) protocol, which enables low-latency transfer of professional-grade video over regular Internet connections. The SRT project also released libsrt, which is the reference implementation of SRT, and Collabora's Justin Kim created a set of sink and source elements that speak this exciting new protocol. For more details, see my blog post on SRT in GStreamer. We've already demonstrated them at the IBC tradeshow and we will also be demonstrating them at NAB in Las Vegas in two weeks. Speaking of Justin, he also spearheaded an effort to finally fix the situation with changing SPSes in RTP streams. GStreamer now behaves correctly: it forwards the updates down the pipeline, and elements that create formats that don't support changes in the SPS, such as MP4 and Matroska, now cleanly error out instead of creating invalid files. To support those formats, one should instead use the splitmuxsink element, which allows splitting into a new file if the format changes.
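To show where splitmuxsink would sit when recording such an RTP stream, here is a minimal sketch that builds a receive-and-record pipeline description. The UDP port, the RTP payload type, the file name pattern and the helper name `rtp_h264_recorder` are all assumptions chosen for the example, and the sketch only illustrates the pipeline layout, not how splitmuxsink decides when to start a new file.

```python
def rtp_h264_recorder(port: int, location_pattern: str) -> str:
    """Return a gst-launch style description that records RTP H.264 to MP4 files.

    splitmuxsink replaces a plain "mp4mux ! filesink" so that recording can
    continue in a new file instead of erroring out.
    """
    caps = (
        "application/x-rtp,media=video,clock-rate=90000,"
        "encoding-name=H264,payload=96"  # assumption: dynamic payload type 96
    )
    elements = [
        f'udpsrc port={port} caps="{caps}"',
        "rtpjitterbuffer",
        "rtph264depay",
        "h264parse",
        # The location takes a printf-style pattern for the fragment index.
        f"splitmuxsink location={location_pattern}",
    ]
    return " ! ".join(elements)


print(rtp_h264_recorder(5000, "recording%05d.mp4"))
```

The printed description can be pasted after gst-launch-1.0 (adding -e so the last file is finalized cleanly on Ctrl-C).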
George also finished and merged the ipcpipeline plugins. They allow an application developer to split the multimedia pipeline into two separate processes: a master process, which handles the first part of the pipeline, and a slave, which handles the second part. The behaviour of the slave is totally controlled by the master, and the master pipeline looks like a regular GStreamer pipeline to the application. This is useful to segregate network access in the first part from access to the hardware decoder in the second part. See all the details in George's blog post, ipcpipeline: Splitting a GStreamer pipeline into multiple processes.
In a common effort of the GStreamer community, the GstAggregator and GstAudioAggregator base classes were finally moved to the GStreamer core and now have fixed APIs. I spent quite some time fixing all kinds of little corner cases to make them rock solid. The main feature they bring is the ability of elements that aggregate data, such as multiplexers, compositors and mixers, to work with live sources that may have gaps in their input streams. Vincent and I ported the flvmux element to this base class. The FLV (Flash Video) format is used to send live video to YouTube and Facebook, meaning that good support for live sources was essential for these use-cases.
We also made a bunch of smaller improvements. Nicolas added a fast-start mode to the RTP jitterbuffer, which makes the assumption that there will not be packet re-ordering in the first couple of milliseconds of an RTP stream, allowing for faster starts. Guillaume and Nicolas worked to improve the latency tracer to give more accurate values. And the whole team also did a significant number of bug fixes throughout the GStreamer framework.
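As a sketch of how the jitterbuffer's fast-start mode might be enabled from a pipeline description, the helper below formats an rtpjitterbuffer element with its properties. It assumes the mode is controlled by a property named "faststart-min-packets" (worth confirming with gst-inspect-1.0 rtpjitterbuffer on your installation); the helper name `jitterbuffer_element` and the values used are hypothetical.

```python
def jitterbuffer_element(latency_ms: int, faststart_min_packets: int = 0) -> str:
    """Return an rtpjitterbuffer element description for a pipeline string.

    faststart_min_packets is assumed to map to the "faststart-min-packets"
    property; 0 leaves the property unset so the default behaviour applies.
    """
    if latency_ms < 0 or faststart_min_packets < 0:
        raise ValueError("latency and packet count must not be negative")
    parts = ["rtpjitterbuffer", f"latency={latency_ms}"]
    if faststart_min_packets > 0:
        parts.append(f"faststart-min-packets={faststart_min_packets}")
    return " ".join(parts)


# Default behaviour with a 200 ms jitterbuffer.
print(jitterbuffer_element(200))
# Assumed fast-start configuration: start after a few consecutive packets.
print(jitterbuffer_element(200, faststart_min_packets=3))
```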
Looking ahead to the next cycle, 1.16, we can expect more improvements. In particular, we're looking at how to pass slices of video frames between elements, so that an element can start processing a frame before the previous element has finished producing it, giving the lowest possible latency in the pipeline.