March 20, 2018
The latest enhancements to the DRM subsystem have made mainline Linux much more attractive: drivers are easier to write, applications are portable, and the community is much more friendly and collaborative than we've ever had.
Over the past couple of years, Linux's low-level graphics infrastructure has undergone a quiet revolution. Since experimental core support for the atomic modesetting framework landed a couple of years ago, the DRM subsystem in the kernel has seen roughly 300,000 lines of code changed and 300,000 new lines added, when the new AMD driver (~2.5m lines) is excluded. Lately Weston has undergone the same revolution, albeit on a much smaller scale.
Daniel Vetter's excellent two-part series on LWN covers the details quite well, but in short atomic has two headline features. The first is better display control: by grouping all configuration changes together, it is possible to change display modes more quickly and more reliably, especially if you have multiple monitors. The second is that it allows userspace to finally use overlay planes in the display controller for composition, bypassing the GPU.
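The all-or-nothing idea behind grouping configuration changes can be sketched in a few lines of Python. This is a toy model rather than the kernel API: the `Display` class, its `check` rule, and the connector and mode names are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Display:
    """Toy stand-in for a display controller; not a real KMS interface."""
    state: dict = field(default_factory=dict)
    valid_modes: tuple = ("1920x1080", "1280x720")  # hypothetical mode list

    def check(self, proposed: dict) -> bool:
        # Hypothetical rule: every output must use a supported mode.
        return all(mode in self.valid_modes for mode in proposed.values())

    def atomic_commit(self, changes: dict) -> bool:
        # Build the complete proposed state, validate it as a whole,
        # and only then apply it. A rejected request changes nothing.
        proposed = {**self.state, **changes}
        if not self.check(proposed):
            return False
        self.state = proposed
        return True


display = Display()
ok = display.atomic_commit({"HDMI-1": "1920x1080", "DP-1": "1280x720"})
# The second request has one valid and one invalid mode, so none of it applies.
bad = display.atomic_commit({"HDMI-1": "1280x720", "DP-1": "640x480"})
print(ok, bad, display.state)
```

Because the second request is rejected as a unit, the first monitor keeps its old mode instead of being left half-reconfigured, which is the property that makes multi-monitor changes more reliable.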
A third, less heralded, feature is that the atomic core standardises user-visible behaviour. Before atomic, drivers had very wide latitude to implement whatever user-facing behaviour they liked. As a result, each chipset had its own kernel driver and its own X11 driver as well. With the rewrite of the core, backed up by a comprehensive test suite, we no longer need hardware-specific drivers to take full advantage of hardware features. With the substantial rework of Weston's DRM backend, we can now take full advantage of these. Using atomic gives us a smoother user experience, with better performance and using less power, whilst still being completely hardware-agnostic.
This has made mainline Linux much more attractive: the exact same generic codebases of GNOME and Weston that I'm using to write this blog post on an Intel laptop run equally well on AMD workstations, low-power NXP boards destined for in-flight entertainment, and high-end Renesas SoCs which might well be in your car. Now that the drivers are easy to write, and applications are portable, we've seen over ten new DRM drivers merged to the upstream kernel since atomic modesetting was merged. These drivers are landing in a much more friendly and collaborative community than we've ever had.
One of the headline features of atomic is the ability to use hardware planes for composition. To show windows to the user, display servers like Weston need to composite the content of multiple windows together into a single image, which is why they are also known as 'compositors'. With the exception of mouse cursors and fullscreen windows, the entire content of each output is a single flat image, created by using OpenGL ES, Pixman, or similar, to combine all the client images together.
Using the GPU for composition isn't exactly as complex as rendering scenes from a game, but there is a real cost. If your GPU is already straining at its limits - say you are playing the latest game in a window, or running ShaderToy in your browser - then adding more load to the GPU with composition is the last thing you want to do. Conversely, if you aren't using the GPU at all, then GPU composition will stop your GPU from switching off entirely, adding a surprising amount of power consumption.
Display controllers can display more than just one image and a mouse cursor, though. Most hardware has a couple of overlay planes, logically positioned between the primary image and the mouse cursor. These planes are also known as 'sprites', which will ring a bell for those familiar with 1980s games. These games did exactly the same as we want to: display one static background image, with smaller images shown on top of it, without having to redraw the whole scene every time.
Using these overlay planes not only frees up the GPU for more client work (or allows it to power down), but in many cases gives us better image quality. Hardware vendors put a lot of work into their display controllers, with better quality for image scaling, YUV to RGB colourspace conversion, as well as general quality optimisations. This is especially true on dedicated hardware like set-top boxes.
Weston has had limited support for overlay planes since the very early days, but these were disabled as the legacy KMS API made them nearly unusable. Long before the atomic KMS API was marked stable, we began work on an atomic DRM patchset for Weston to help push this work forward. Quite some time later, after a rewrite of most of Weston's KMS backend, we have finally managed to land this support in time for Weston 4.0. This support enables us to use the core atomic API with solid and reliable state tracking.
We need such solid state tracking in order to brute-force a configuration. In spite (or because) of their capability, overlay planes have a number of surprising limits on how they can be used. These include per-plane scaling limits ('no more than 8x, or no less than 1/4x'), global bandwidth limits, shared scaler and compression units, even down to limits on the number of overlay planes which can be active on a single scanline (row) of the display. Android often has per-platform HWComposer modules, which analyse a scene and produce the optimal result for that specific hardware.
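To make two of these limits concrete, here is a rough Python sketch of the kind of check a driver might perform. The `Plane` type, the 8x and 1/4x factors (taken from the example above), and the per-scanline limit of two are illustrative assumptions, not the rules of any real display controller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plane:
    """Hypothetical plane placement: source size and destination rectangle rows."""
    src_w: int
    src_h: int
    dst_y: int
    dst_w: int
    dst_h: int


MAX_SCALE = 8.0              # 'no more than 8x'
MIN_SCALE = 0.25             # 'no less than 1/4x'
MAX_PLANES_PER_SCANLINE = 2  # made-up hardware limit


def scale_ok(plane: Plane) -> bool:
    # Check the horizontal and vertical scaling factors independently.
    sx = plane.dst_w / plane.src_w
    sy = plane.dst_h / plane.src_h
    return MIN_SCALE <= sx <= MAX_SCALE and MIN_SCALE <= sy <= MAX_SCALE


def scanline_ok(planes) -> bool:
    # Sweep down the screen, counting planes active on each row.
    # A plane covers rows [dst_y, dst_y + dst_h), so at the same row
    # an end event (-1) sorts before a start event (+1).
    events = []
    for p in planes:
        events.append((p.dst_y, 1))
        events.append((p.dst_y + p.dst_h, -1))
    active = 0
    for _, delta in sorted(events):
        active += delta
        if active > MAX_PLANES_PER_SCANLINE:
            return False
    return True


upscaled = Plane(src_w=640, src_h=360, dst_y=0, dst_w=1280, dst_h=720)
thumbnail = Plane(src_w=1920, src_h=1080, dst_y=0, dst_w=192, dst_h=108)
stacked = [Plane(100, 100, y, 100, h) for y, h in [(0, 100), (50, 100), (90, 110)]]
print(scale_ok(upscaled), scale_ok(thumbnail), scanline_ok(stacked))
```

Real hardware combines many such rules, and they interact with each other, which is why a generic compositor cannot reasonably hard-code them.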
Without this hard-coded platform-specific knowledge, the best we can do is try, try again. Atomic gives us the ability to compute and test a configuration, to see ahead of time if it will work. Weston uses this to come up with the best configuration it can, by repeatedly testing all the different possibilities.
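The try-and-test loop might look roughly like the following Python sketch. It is not Weston's actual algorithm: `driver_accepts` is a hypothetical stand-in for a test-only atomic commit (in libdrm, `drmModeAtomicCommit` with the `DRM_MODE_ATOMIC_TEST_ONLY` flag), and its limits and the view names are invented for the example.

```python
from itertools import combinations


def driver_accepts(views_on_planes) -> bool:
    """Hypothetical stand-in for a test-only atomic commit."""
    # Made-up limits: two overlay planes, and one view the hardware cannot scan out.
    return len(views_on_planes) <= 2 and "browser" not in views_on_planes


def pick_configuration(views):
    # Try the assignments that put the most views on planes first,
    # and keep the first one the driver says it can display.
    for count in range(len(views), -1, -1):
        for on_planes in combinations(views, count):
            if driver_accepts(on_planes):
                on_gpu = [v for v in views if v not in on_planes]
                return list(on_planes), on_gpu
    # Nothing passed the test: composite everything with the GPU.
    return [], list(views)


planes, gpu = pick_configuration(["video", "browser", "terminal"])
print("planes:", planes, "gpu:", gpu)
```

Since a test-only commit never changes what is on screen, the compositor can probe as many candidates as it likes before committing the one it keeps. An exhaustive search like this grows quickly with the number of views, so a real compositor would want to bound or prune it.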
The patches on top of our core state-tracking and atomic API work are still in review, though with largely positive comments. So Weston 4.0 will use atomic where possible, and exercise our new state-tracking paths. On top of this, for the next release of Weston, we will add the code to use overlay planes where we can, finally delivering on the promise of using display hardware to the fullest extent possible. We expect that release will also include support for DRM leases, allowing time- or safety-critical clients such as VR or automotive information to directly drive the display hardware without intervention from the compositor.