Daniel Almeida
November 19, 2025
Reading time:
A few months ago, we introduced Tyr, a Rust driver for Arm Mali GPUs that continues to see active development upstream and downstream. As the upstream code awaits broader ecosystem readiness, we have focused on a downstream prototype that will serve as a baseline for community benchmarking and help guide our upstreaming efforts.
Today, we are excited to share that the Tyr prototype has progressed from basic GPU job execution to running GNOME, Weston, and full-screen 3D games like SuperTuxKart, demonstrating a functional, high-performance Rust driver that matches C-driver performance and paves the way for eventual upstream integration!

I previously discussed the relationship between user-mode drivers (UMDs) and kernel-mode drivers (KMDs) in one of my posts about how GPUs work. Here's a quick recap to help get you up to speed:
One thing to be understood from the previous section is that the majority of the complexity tends to reside at the UMD level. This component is in charge of translating the higher-level API commands into lower-level commands that the GPU can understand. Nevertheless the KMD is responsible for providing key operations such that its user-mode driver is actually implementable, and it must do so in a way that fairly shares the underlying GPU hardware among multiple tasks in the system.
While the UMD will take care of translating from APIs like Vulkan or OpenGL into GPU-specific commands, the KMD must bring the GPU hardware to a state where it can accept requests before it can share the device fairly among the UMDs in the system. This covers power management, parsing and loading the firmware, as well as giving the UMD a way to allocate GPU memory while ensuring isolation between different GPU contexts for security.
This was our initial focus for quite a few months while working on Tyr, and testing was mainly done through the IGT framework. These tests would mainly consist of performing simple ioctls() against the driver and subsequently checking whether the results made sense.
By the way, those willing to further understand the relationship between UMDs and KMDs on Linux should watch a talk given at Kernel Recipes by my colleague Boris Brezillon on the topic!
Once the GPU is ready to accept requests and userspace can allocate GPU memory as needed, the UMD can place all the resources required by a given workload in GPU buffers. These can be further referenced by the command buffers containing the instructions to be executed, as we explain in the excerpt below:
With the data describing the model and the machine code describing the shaders, the UMD must ask the KMD to place this in GPU memory prior to execution. It must also tell the GPU that it wants to carry out a draw call and set any state needed to make this happen, which it does by means of building VkCommandBuffers, which are structures containing instructions to be carried out by the GPU in order to make the workload happen. It also needs to set up a way to be notified when the workload is done and then allocate the memory to place the results in.
In this sense, the KMD is the last link between the UMD and the GPU hardware, providing the necessary APIs for job submission and synchronization. It ensures that all the drawing operations built at the userspace level can actually reach the GPU for execution. It is the KMD's responsibility to ensure that jobs only get scheduled once its dependencies have finished executing. It also has to notify (in other words, signal to) the UMD when jobs are done, or the UMD won't really know when the results are valid.
Additionally, before Tyr can execute a complex workload consisting of a vast amount of simultaneous jobs, it must be able to execute a simple one correctly, or debugging will be an unfruitful nightmare. For this matter, we devised the simplest job we could think of: one that merely places a single integer in a given memory location using a MOV instruction on the GPU. Our IGT test then blocks until the KMD signals that the work was carried out.
Reading that memory location and ensuring that its contents match the constant we were expecting shows that the test was executed successfully. In other words, it shows that we were able to place the instructions in one of the GPU's ring buffers and have the hardware iterator pick it up and execute correctly, paving the way for more complex tests that can actually try to draw something.
The test source code for this dummy job is here.
With job submission and signalling working, it was time to attempt to render a scene. We chose kmscube, which draws a single rotating cube on the screen, as the next milestone.
It was a good candidate owing to its simple geometry and the fact that it is completely self-contained. In other words, no compositor is needed and rendering takes place in a buffer that's directly handed to the display (KMS) driver.
Getting kmscube to run would also prove that we were really enforcing the job dependencies that were set by the UMD or we would get visual glitches. To do so, we relied on a slightly updated version of the Rust abstractions for the DRM scheduler posted by Asahi Lina a few years ago. The result was a rotating cube that was rendered at the display's refresh rate.

Using offscreen rendering lets us go even faster, jumping from 30 or 60fps to more than 500 frames per second, matching the performance of the C driver. That's a lot of frames being drawn!
The natural progression would be to launch Weston or GNOME. As there is quite a lot going on when a DE like GNOME is running; we were almost expecting it not to work at first, so it came as a huge surprise when GNOME's login page was rendered.
In fact, you can log in to GNOME, open Firefox, and...watch a YouTube video:

Running vkcube under weston also just works!

The last 3D milestone is running a game or another 3D-intensive application. Not only would that put the GPU through a demanding workload, but it would also allow us to gauge the KMD's performance more accurately. Again, the game is rendered correctly and is completely playable, without any noticeable hiccups or other performance issues, so long as it is run on full screen. Unfortunately, windowed mode still has some glitches: it is a prototype, after all.

It's important to clarify what this means and how this plays into the long-term vision for the project.
In fact, it's easier to start by what we are not claiming with this post: Tyr is not ready to be used as a daily-driver, and it will still take time to replicate this upstream, although it is now clear that we will surely get there. And as a mere prototype, it has a lot of shortcuts that we would not have in an upstream version, even though it can run on top of an unmodified (i.e., upstream) version of Mesa.
That said, this prototype can serve as an experimental driver and as a testbed for all the Rust abstraction work taking place upstream. It will let us experiment with different design decisions and gather data on what truly contributes to the project's objective. It is a testament that Rust GPU KMDs can work, and not only that, but they can perform on par with their C counterparts.
Needless to say, we cannot make any assumptions about stability on an experimental driver, it might very well lock up and lose your work after some time, so be aware.
Finally, this was tested on a Rock 5B board, which is fitted with a Rockchip RK3588 system-on-chip and it will probably not work for any other device at the moment. Those with this hardware at hand should feel free to test our branch and provide feedback. The source code can be found here. Make sure to enable CONFIG_TYR_DRM_DEPS and CONFIG_DRM_TYR. Feel free to contribute to Tyr by checking out our issue board!
Below is a video showcasing the Tyr prototype in action. Enjoy!
19/11/2025
The Tyr prototype has now progressed from basic GPU job execution to running GNOME, Weston, and full-screen 3D games like SuperTuxKart,…
05/11/2025
As a trusted partner of industry leaders like CLAAS, Ag Leader, and CCI, we are delighted to exhibit for the first time at one of the world’s…
16/10/2025
Collabora and MediaTek are advancing upstream Linux support for the latest Genio IoT boards and Chromebook Plus laptops, enabling full hardware…
Comments (0)
Add a Comment