December 20, 2019
Panfrost, Lima, and Arm developers together in Montreal during XDC 2019. From left to right: Lyude Paul (Panfrost), Ryan Houdek (Panfrost), Tomeu Vizoso (Panfrost, Collabora), Alyssa Rosenzweig (Panfrost, Collabora), John Einar Reitan (Arm), Rohan Garg (Panfrost, Collabora), Boris Brezillon (Panfrost, Collabora), Erico Nunes (Lima), Connor Abbott (Lima, Panfrost), Rob Herring (Panfrost)
If you have a device with a Mali T720 or T820 GPU, you’re in luck – your device is now supported in upstream Mesa at feature parity with other GPUs. Get out your Allwinner H6 or Amlogic S912 board, grab the latest Mesa, and enjoy a match of SuperTuxKart with fully free and open source mainline drivers!
When Panfrost began, we focused on the highest performance Mali GPUs found in Chromebooks. By contrast, Mali GPUs like T720 are designed for simplicity, where minimizing size is more important than maximizing performance.
Simplicity for the hardware, that is. For us, those changes mean new complexity – but we’re up to the challenge. Over the past month, Collaboran Tomeu Vizoso and I reverse-engineered the Mali T720 and adapted Panfrost for the new devices.
Much of our work focused on the tiler. As I blogged about over the summer, Mali GPUs are “tiling” architectures, meaning they divide the screen into many small “tiles” or “bins” and operate on those smaller sections of the screen to save memory bandwidth and improve power efficiency. The fastest Mali architectures use “hierarchical tiling”, where many different sizes of tiles are used at once. But this tiler is simplified, with no support for hierarchical tiling. Instead, the driver selects a single tile size used for the entire screen, and this different model requires corresponding changes in the driver. Fortunately, after my work on hierarchical tiling over the summer, we were able to figure out the non-hierarchical tiler and then implement our findings in Panfrost with ease.
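To make the single-tile-size idea concrete, here is a minimal sketch in Python of how a screen can be divided into bins of one fixed size, and how a primitive's bounding box maps onto those bins. The function names and the 16-pixel tile size are assumptions chosen for illustration; they are not the values or code Panfrost or the hardware actually uses.

```python
def tile_grid(width, height, tile_size):
    """Return (columns, rows) of square tiles needed to cover the screen.

    Uses ceiling division so partially covered edge tiles are counted.
    """
    cols = -(-width // tile_size)
    rows = -(-height // tile_size)
    return cols, rows


def tiles_for_bbox(x0, y0, x1, y1, tile_size):
    """List the (column, row) bins touched by an inclusive pixel bounding box."""
    return [
        (tx, ty)
        for ty in range(y0 // tile_size, y1 // tile_size + 1)
        for tx in range(x0 // tile_size, x1 // tile_size + 1)
    ]


# Illustrative only: a 1920x1080 screen with a hypothetical 16x16 tile size.
grid = tile_grid(1920, 1080, 16)
bins = tiles_for_bbox(10, 10, 40, 20, 16)
```

With one tile size, every primitive is binned against the same grid. A hierarchical tiler would instead choose among several grids of different granularity at once.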
On the compiler side, these GPUs feature another simplification. Most instruction sets, including Midgard, are based on “registers”, where data can be written and read for computation. On Midgard, there are three types of registers: work registers, load/store registers, and texture registers. Work registers are general purpose, used for arithmetic. Load/store registers and texture registers, however, are special, used with load/store and texture instructions respectively. On most Mali chips, each of the three types has its own separate set of registers. But the simplified GPUs are a bit special: the texture registers are folded into the work and load/store register spaces, a surprising and rather confusing discovery at first. Nevertheless, once we understood this phenomenon, known as “interpipe register aliasing”, we were able to modify our compiler accordingly, fixing assorted issues relating to textures.
One final focus area was Mali’s framebuffer descriptors. OpenGL features “multiple render target” support, allowing an app to render to different render targets (surfaces) at once, useful for effects like deferred shading. Mali GPUs have supported this feature in hardware since Mali T760, via the “multiple framebuffer descriptor”. Earlier Mali GPUs, however, do not support this feature in hardware, so support is emulated in software instead. These GPUs use the simplified “single framebuffer descriptor”. We improved support for handling these simplified descriptors, reverse-engineering and integrating features like transaction elimination. As a bonus, this work also benefits anyone with T6xx GPUs.
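As a rough sketch of the distinction described above, the choice of descriptor can be thought of as a per-model lookup. The function names and the way models are matched below are hypothetical, written in Python only for illustration. The grouping follows this post (T6xx, T720, and T820 use the single framebuffer descriptor; T760 and later use the multiple framebuffer descriptor) and is not taken from the driver source.

```python
# Hypothetical model grouping, based only on the description in this post.
SINGLE_FBD_MODELS = {"T720", "T820"}


def uses_single_fbd(model):
    """Return True if the given Mali model uses the single framebuffer descriptor."""
    model = model.upper()
    return model.startswith("T6") or model in SINGLE_FBD_MODELS


def descriptor_for(model):
    """Pick the descriptor kind ("single" or "multiple") for a Mali model name."""
    return "single" if uses_single_fbd(model) else "multiple"


for name in ("T720", "T820", "T760", "T860"):
    print(name, descriptor_for(name))
```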
With these improvements and many other minor features and bug fixes, we brought T720 and T820 up to feature parity with our existing boards, and added these into our continuous integration infrastructure to ensure Panfrost continues to work beautifully. Panfrost is now ready for daily use on Mali GPUs from T720 to T860. All of the source code is upstream… so happy hacking :-)