Adding mainline Arm Frame Buffer Compression support for Rockchip

Andrzej Pietrasiewicz
April 08, 2020

Rockchip SoCs, notably the RK3399, are popular in devices such as Chromebooks and single-board computers. They bring some interesting features, one of them being Arm Frame Buffer Compression (AFBC).

Why AFBC?

To understand that, let's have a look at a typical display pipeline. There is usually a GPU for 3D rendering, and its output must ultimately be sent to the actual display; to get there, it must be formatted suitably by a Display Processor. Nowadays each frame weighs at least a few megabytes, and sometimes multiple frames are accessed at once. This amount of memory cannot be provided internally by the SoC, so we need to access memory which is external to it, and we need to do this a lot! This, in turn, directly translates into memory bandwidth (of which there is a limit: you can only transfer so much in a unit of time), battery power usage and heat generation. These reasons alone justify looking for a way to reduce memory bandwidth usage in a display pipeline.
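To put rough numbers on "a few megabytes", here is a small back-of-the-envelope sketch. The resolution, refresh rate and 4 bytes per pixel are assumptions chosen for illustration, and the function names are hypothetical; it only counts a single uncompressed scanout read per refresh.

```python
def frame_bytes(width, height, bytes_per_pixel=4):
    """Size of one uncompressed (linear) frame in bytes."""
    return width * height * bytes_per_pixel


def scanout_bandwidth(width, height, refresh_hz, bytes_per_pixel=4):
    """Bytes per second read from memory just to display one plane."""
    return frame_bytes(width, height, bytes_per_pixel) * refresh_hz


if __name__ == "__main__":
    # Assumed example: a 1920 x 1080 display at 60 Hz, 4 bytes per pixel.
    size = frame_bytes(1920, 1080)
    rate = scanout_bandwidth(1920, 1080, 60)
    print(f"one frame: {size / 2**20:.1f} MiB")
    print(f"scanout:   {rate / 2**20:.1f} MiB/s")
```

This counts only the display read; the GPU writing the frame and any composition passes add to the total.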

Arm created AFBC exactly to mitigate the problems mentioned in the previous paragraph. The trick is that the image is split into blocks (of e.g. 16 x 16 pixels) and each block's data is compressed. Each block also has a fixed length header associated with it to store compression metadata. All the headers are sent first, followed by blocks.

AFBC frame

So, theoretically more memory is used by a compressed frame, but read on to understand how the reduction of memory bandwidth becomes possible.

AFBC will allocate slightly more memory to hold each block's data (headers plus the usual amount for the frame) than a normal uncompressed frame would. However, if a block's data compresses well (and you can compress by up to 50%), much of that allocation won't actually be used. So, depending on the dynamic compression ratio, you can save up to 50% of memory bandwidth, improving both efficiency and performance.
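To make "slightly more memory" concrete, here is a minimal sketch of the allocation arithmetic. It assumes 16 x 16 blocks, a 16-byte header per block (the size observed for the test frame later in this post) and 4 bytes per pixel. The function names are hypothetical, and real drivers may add alignment padding that this sketch ignores.

```python
BLOCK_W = 16        # block width in pixels (assumed 16 x 16 blocks)
BLOCK_H = 16        # block height in pixels
HEADER_BYTES = 16   # assumed fixed header size per block


def linear_allocation(width, height, bytes_per_pixel=4):
    """Bytes needed for a plain uncompressed frame."""
    return width * height * bytes_per_pixel


def afbc_allocation(width, height, bytes_per_pixel=4):
    """Worst-case bytes reserved for an AFBC frame: all headers, then all blocks."""
    blocks_x = (width + BLOCK_W - 1) // BLOCK_W    # round up to whole blocks
    blocks_y = (height + BLOCK_H - 1) // BLOCK_H
    n_blocks = blocks_x * blocks_y
    header_bytes = n_blocks * HEADER_BYTES
    body_bytes = n_blocks * BLOCK_W * BLOCK_H * bytes_per_pixel
    return {
        "blocks": n_blocks,
        "header_bytes": header_bytes,
        "body_bytes": body_bytes,
        "total_bytes": header_bytes + body_bytes,
    }


if __name__ == "__main__":
    afbc = afbc_allocation(1920, 1080)
    linear = linear_allocation(1920, 1080)
    print(afbc)
    print("linear:", linear)
    print(f"overhead: {100 * (afbc['total_bytes'] / linear - 1):.1f}%")
```

Under these assumptions a 1920 x 1080 frame reserves 8,486,400 bytes instead of 8,294,400, roughly 2% more. Part of that comes from the headers and part from rounding the height up to a whole number of blocks.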

The compression is lossless, which wouldn't be that important if frames were to be only displayed to human users, but compression being lossless is crucial if decompressed frames serve as reference frames, because there are no inaccuracies which would otherwise keep accumulating. And despite the compression, the scheme even allows randomly accessing image data down to 4px x 4px! Great? Wait...

A spoonful of tar in a barrel of honey

The said compression scheme is proprietary, so unfortunately we don't know how it works. The AFBC-related parts of hardware in the GPU and Display Processor are total black boxes, thus limiting the potential of the Open Source community to write great software utilizing AFBC. Nevertheless, AFBC-aware IP blocks should be able to exchange AFBC buffers even if we don't fully understand the buffer contents, so it still is an interesting option for the moment. It does prove useful because of the memory bandwidth reduction, but it has trade-offs, including a complete lack of knowledge of its internals. You should know that other vendors have competing implementations of similar schemes, too, so here's to a proper, open standard for frame buffer compression in the future.

So, what can be done?

We can still use AFBC in our display pipelines without knowing "what's inside": if the components of our pipeline understand AFBC, we can make them use it and enjoy the benefits. To do so we need to control the involved components and that can be done even without understanding what they do under the hood.

Some insight

That being said, we can try understanding what's inside an AFBC-compressed frame. You can have a look here, branch afbc-test to see how an all-red frame can be generated. It doesn't take a rocket scientist to notice that for each 16x16 block there is a fixed-length header of 16 bytes. All the headers go first and then all the blocks follow. In the header the first 4 bytes specify where in the buffer its corresponding block data is, and the fifth byte is 0x07:

AFBC header

Inside the block the first 6 bytes contain specific values and the rest is filled with zeros. So, for an all-red frame, we use only 6 bytes out of 1024 (16 * 16 pixels * 4 bytes per pixel)!

AFBC block

Of course the all-red frame is kind of an extreme case, but you get the idea of how we are limiting memory accesses with AFBC.
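The layout observed above can be sketched in a few lines. This is an illustration only, not the real encoder: it assumes the 4-byte offset is little-endian and counted from the start of the buffer, that block data starts right after the headers with no padding, and that the remaining header bytes are zero. None of that is confirmed by the observations, and the helper names are hypothetical.

```python
import struct

HEADER_BYTES = 16                 # fixed-length header per block
BLOCK_BYTES = 16 * 16 * 4         # 1024 bytes reserved per 16 x 16 block
USED_BYTES_PER_BLOCK = 6          # bytes actually used in the all-red case


def build_headers(n_blocks):
    """Build the header area: 4-byte offset, then 0x07, zero-padded to 16 bytes."""
    body_start = n_blocks * HEADER_BYTES   # assumption: blocks follow headers directly
    headers = bytearray()
    for i in range(n_blocks):
        offset = body_start + i * BLOCK_BYTES
        # "<IB": little-endian u32 offset (assumed byte order) plus one marker byte
        headers += struct.pack("<IB", offset, 0x07).ljust(HEADER_BYTES, b"\x00")
    return bytes(headers)


def parse_header(headers, index):
    """Return (block offset, fifth byte) for the header at the given index."""
    return struct.unpack_from("<IB", headers, index * HEADER_BYTES)


def bytes_touched(n_blocks):
    """Bytes that carry information for an all-red frame: headers plus 6 per block."""
    return n_blocks * (HEADER_BYTES + USED_BYTES_PER_BLOCK)


if __name__ == "__main__":
    n_blocks = 8160   # 1920 x 1080 rounded up to whole 16 x 16 blocks
    headers = build_headers(n_blocks)
    print(parse_header(headers, 0), parse_header(headers, 1))
    print("meaningful bytes:", bytes_touched(n_blocks), "of", n_blocks * BLOCK_BYTES)
```

With these numbers, only 22 bytes per block carry information, against 1024 bytes per block for the uncompressed pixels.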

Upstreaming AFBC for RK3399

Since late 2018, the Mali-DP display drivers, malidp and komeda, have had the ability to use AFBC, and now support for Rockchip (RK3399) is also on its way. While the initial work was done by Rockchip in 2014, it unfortunately wasn't upstreamed. Efforts to provide AFBC support for Rockchip in mainline have recently concluded and the feature is available in the drm-misc-next tree. You can find the actual patches here [1][2][3][4][5].

Some design considerations in the Linux kernel DRM

The DRM subsystem follows the "library instead of midlayer" approach, which is nicely described in this LWN article from 2009. Consequently, there is the DRM core and there are DRM drivers. The latter are free to use so-called DRM helpers, but there is nothing preventing them from opting out. During the work on AFBC, the idea crystallized on the mailing list in November 2019 to put AFBC handling in helpers. This proves to be a good design decision, because thanks to it the core does not deal with a very specific (and proprietary!) extension and, in fact, the use of helpers is purely optional. The patch series first lays the foundation for drivers to allocate struct drm_afbc_framebuffer explicitly in order to be able to do special AFBC-related checks, and now the drivers can opt in to use the new helpers.

The future

We expect AFBC support for Rockchip to land in the next Linux kernel release. It is worth mentioning that existing AFBC users can also benefit from the newly added helper functions (patch [2]). If you have any questions on how to use these new functions, or if you would like to try to bring AFBC support to another AFBC-enabled SoC, please contact us! While we obviously can't create AFBC per se, we can help control the hardware so that it starts using it.

Comments (2)

Andy:
Mar 12, 2022 at 11:28 AM

Thanks for your great work. Would you please share some information about how to enable AFBC on Weston or Mesa?

1. Daniel Stone:
  Mar 14, 2022 at 04:21 PM
  
  There's no enablement required: if you have a new enough version of the kernel for the KMS driver to declare support for the AFBC modifiers, and a new enough version of Mesa for it to use the AFBC modifiers, then it should just work out of the box.