Daniel Almeida
July 07, 2025
The last year has seen substantial progress on the DRM infrastructure required to write GPU drivers in Rust. While a great part of it was fueled by the development of Nova (the new driver for GSP-based NVIDIA GPUs), and by AGX (the driver for the GPUs on Apple's M-series chips, which preceded Nova), a few components were being worked on to cater to a then undisclosed driver that was being prototyped behind the scenes. A driver that we now introduce to the community at large in a very early stage.
Tyr is a new Rust-based DRM driver for CSF-based Arm Mali GPUs, making Collabora the first consultancy to formally join the Rust-for-Linux initiative, a testament to our commitment to advancing Rust development within the kernel community. It is a port of Panthor - which is a mature driver written in C for the same hardware - and a joint effort between Collabora, Arm, and Google.
In this sense, Tyr aims to eventually implement the same userspace API offered by Panthor for compatibility reasons, so that it can be used as a drop-in replacement in our Vulkan driver, called PanVK. We foresee Panthor being used - and of course supported - for a relatively long time, as it is a mature driver with wide adoption in the ecosystem. It will probably take a couple of years for Tyr to fully catch up.
Over the course of the next few weeks, we will be releasing a series of blog posts that explain in detail the inner workings of Tyr and its components. This will be a walk-through for GPU drivers on Linux, so stay tuned! For now, we will address why we are submitting a patch to add an initial version of Tyr upstream and what exactly is contained in that submission.
This question will be easier to answer once we discuss how GPU drivers work. As we uncover the components that make up a kernel-mode driver and how they interact with the rest of the kernel, it will become clear that we simply cannot write a Rust GPU driver with the infrastructure that is currently available upstream. We will discuss, for example, the central role of the Micro Controller Unit (or MCU) within the Arm Mali architecture and then explain why we cannot make it boot without landing some pre-requisite code first, as well as discuss various other components that also cannot be written before landing even more code.
The issues outlined above led us to develop Tyr on a downstream branch. Doing so lets us prototype much faster, which is the reason why we are much further ahead in our downstream code. In fact, we are able to submit small parcels of work to the GPU already.
We will also address this later, but I believe that we will achieve a working driver in our downstream code soon, for a given definition of "working" anyway. In any case, our downstream branch will be invaluable to test the surrounding, work-in-progress code that is needed to support not only Tyr, but also other Rust DRM drivers.
Unfortunately, reconciling the current development model with upstream has been somewhat challenging recently, so a change in the overall strategy that we have been using to develop Tyr was needed.
By submitting small parts of the driver upstream iteratively, we hope to evolve alongside Nova and rvkms, which should help us avoid regressions from upstream changes that might otherwise break our code simply because we weren't part of the upstream conversation yet. This approach also lets us prove that work-in-progress abstractions actually function correctly on real hardware with a real driver, rather than just in theory.
Perhaps most importantly, having Tyr as a concrete user of these abstractions gives the community a compelling reason to work on and review them. After all, it's much easier to motivate development when there's an actual driver that depends on the code. Lastly, developing code in the open and with the upstream community also aligns with our philosophy here at Collabora overall.
The current submission can power up the GPU and probe the device on an RK3588 system-on-chip. This lets us read a few sections of ROM in the GPU, which in turn lets us provide this information to userspace by means of a DRM_IOCTL_PANTHOR_DEV_QUERY call.
This is all that can be done for now, at least until the MCU can be made to work. More on that will be discussed during the subsequent posts.
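To give a rough idea of what such a query looks like from the userspace side, here is a minimal Rust sketch that builds the query argument and computes the ioctl request number without talking to the kernel. The struct layout and the numeric constants are assumptions based on the Panthor uAPI header (include/uapi/drm/panthor_drm.h) and should be checked against it. The Rust names (DrmPanthorDevQuery, dev_query_request, gpu_info_query) are hypothetical and do not come from Tyr or Panthor.

```rust
use std::mem::size_of;

/// Assumed mirror of `struct drm_panthor_dev_query` from the Panthor uAPI:
/// a query type, the size of the destination buffer, and a user pointer.
#[repr(C)]
#[derive(Debug, Default, Clone, Copy)]
struct DrmPanthorDevQuery {
    query_type: u32,
    size: u32,
    pointer: u64,
}

// DRM ioctls use 'd' as their type byte; driver-private ioctls start at 0x40.
const DRM_IOCTL_BASE: u32 = b'd' as u32;
const DRM_COMMAND_BASE: u32 = 0x40;

// Assumed values: DEV_QUERY is the first Panthor ioctl, GPU_INFO the first query type.
const DRM_PANTHOR_DEV_QUERY: u32 = 0;
const DRM_PANTHOR_DEV_QUERY_GPU_INFO: u32 = 0;

// Generic Linux ioctl encoding (asm-generic/ioctl.h), as used on arm64 and x86:
// direction (2 bits) | size (14 bits) | type (8 bits) | number (8 bits).
const IOC_WRITE: u32 = 1;
const IOC_READ: u32 = 2;

const fn iowr(ty: u32, nr: u32, size: u32) -> u32 {
    ((IOC_READ | IOC_WRITE) << 30) | (size << 16) | (ty << 8) | nr
}

/// Request number a program would pass to ioctl(2) for the device query.
fn dev_query_request() -> u32 {
    iowr(
        DRM_IOCTL_BASE,
        DRM_COMMAND_BASE + DRM_PANTHOR_DEV_QUERY,
        size_of::<DrmPanthorDevQuery>() as u32,
    )
}

/// Builds a GPU-info query that asks the kernel to fill `buf`.
fn gpu_info_query(buf: &mut [u8]) -> DrmPanthorDevQuery {
    DrmPanthorDevQuery {
        query_type: DRM_PANTHOR_DEV_QUERY_GPU_INFO,
        size: buf.len() as u32,
        pointer: buf.as_mut_ptr() as usize as u64,
    }
}
```

Under these assumptions, dev_query_request() evaluates to 0xC0106440. A real program would open the DRM render node and pass that request number, together with a pointer to the query struct, to ioctl(2). That step is left out here because it needs actual hardware.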
This is just the first installment in a series that will dive much further into GPU driver development. We will begin by discussing the role of GPU drivers in general by exploring a simple yet very instructive application known as VkCube, which is available in the Vulkan-Tools repository, as we need to build this background before we can segue into the specifics of Mali's CSF architecture. Feel free to compile and explore the code beforehand if you would like to follow along.
Comments (3)
q4a:
Jul 08, 2025 at 05:52 PM
Hi. I've been following PanVK development in Mesa and in kernel for quite some time, so I'd like to ask: what is the key advantage of rewriting mature PanVK kernel driver in Rust?
I understand that sometimes you can prototype faster in Rust or if the project is completely new and (maybe?) for some developers it will be easier to write a new driver in Rust.
I can see 2 big advantages of the current C driver:
1) It already exists and is quite reliable/mature
2) It does not require any additional bindings/infrastructure that have yet to be added for Rust.
I can see several potential advantages of the Rust driver, but they do not seem very important to me yet (please point out where I'm wrong or if I forgot something):
1) The new driver can be clearer, simpler, more concise thanks to Rust (in theory, it is possible to improve the C driver).
2) The Rust driver can be safer when using memory, etc. (in theory, you can add static code analyzers to check the driver in C).
3) There is a group of developers who really don't like C, are ready to rewrite PanVK (and all the necessary infrastructure) in Rust, but do not want to use C. So far in current upstream submission I have found 5 people who have worked on the Tyr: Daniel Almeida, Alice Ryhl, Beata Michalska, Carsten Haitzler, Rob Herring
4) It will be possible to split: leave all the old HW (before CSF/before v10 Mali arch) for the PanVK kernel driver and the new HW (CSF-based/v10+ Mali arch) for Tyr? And then there will be no need to support legacy HW in Tyr and it will have a better design?
I am writing a lot to show that I want to understand what the advantage of Tyr will be and what is the motivation for rewriting PanVK in Rust?
In any case, I wish you good luck and hope that this will attract new developers to the kernel driver and allow others (like Boris Brezillon) to focus on the Mesa part.
Daniel Almeida:
Jul 08, 2025 at 07:16 PM
Hi,
> key advantage of rewriting mature PanVK kernel driver in Rust?
Let's clear this up a bit, Panthor is one of the kernel-mode drivers
that can be used by panvk, which is a user-mode driver that implements
Vulkan support. For your next question, I will assume you're referring
to Panthor, since that is what is being ported to Rust.
At this point, I think the benefits of using Rust are quite clear in
terms of safety against undefined behavior. This is one of the core
selling points of the language, and a _lot_ of problematic programs
don't even compile in Rust, so a lot of errors are caught at
compile-time.
In general, this is what the Rust-for-Linux initiative is trying to
address: the endless number of vulnerabilities that are usually present
in programs written in unsafe languages like C. Vulnerabilities which
tend to be exploited to the detriment of users.
You have mentioned static analyzers. Frankly, the kernel is full of
those, and yet it is clear that they are not capable of eliminating a
whole class of bugs. In this sense, adopting Rust is about attacking the
problem from multiple fronts instead of relying only on static analyzers
or any other tools for that matter.
But most importantly, what I want to convey to you is that this is not a
mere rewrite. We are not taking a finished product and merely
transcribing it into another programming language, because Panthor
itself, like most other software, will never be "done".
You have to maintain the code and address the issues or the software
rots. When new hardware variants come out, you have to support that too
or the driver stagnates in time. This means that Panthor is also an
evolving platform, and you can see it for yourself by looking at all the
code that we keep writing in order to improve it.
By understanding the evolving nature of our current C drivers, you can
think of Tyr as the next step in that process. We are bringing the
platform to Rust so we can iterate on it from the much stronger set of
guarantees offered by the language.
There will naturally be a transition period, but this is a fact of life
and not something that should constrain us per se, so long as we believe
that what emerges from that will benefit the community, and we do.
So this is not about "a group of developers that really don't like C",
but rather about transitioning our efforts into what we believe will
constitute a better platform for the future, which is a trend that is
being adopted by more and more players in our industry.
Besides, note that a modern GPU driver depends on a set of distinct
components, and some of those will probably be redesigned from scratch
in Rust in order to address issues that are hard to fix in C.
A lot of those issues revolve around tracking the lifetime of different
entities in the code at runtime, or maybe completely parting ways with
designs that just do not fit our current goals anymore. In this sense,
we get to benefit from these new Rust components by using them from our
Rust driver as they become available, so this is good in terms of
future-proofing too.
I strongly recommend that you keep following this series of blog posts.
It will go into more detail on what I said above.
> It will be possible to split: leave all the old HW (before CSF/before
> v10 Mali arch) for the PanVK kernel driver and the new HW
> (CSF-based/v10+ Mali arch) for Tyr? And then there will be no need
> to support legacy HW in Tyr and it will have a better design?
As I said, our current plan is to support CSF hardware in Tyr. We may
reassess this in the future, but nothing else has been decided for now.
> I am writing a lot to show that I want to understand what the
> advantage of Tyr will be and what is the motivation for rewriting
> PanVK in Rust?
Hopefully what I said so far addresses the first part of your question,
but again, note that PanVK itself (i.e. the Vulkan UMD) is _not_ getting
rewritten in Rust.
-- Daniel
q4a:
Jul 09, 2025 at 03:44 AM
Sorry, my mistake. It was a long day, so I said "PanVK kernel driver" instead of Panthor and then continued to make mistakes in the name. I understand the difference between PanVK (user-mode driver) and Panthor (kernel-mode driver) and I was talking about Panthor.
I completely understand that software like a KMD for Mali GPUs will never be "done", because there is the v10-v13+ Mali arch (CSF-based) and there will be more.
For me (and probably others) it is quite difficult to understand the problems of further use of C or the advantages of using Rust (about tracking the lifetime of entities and similar), so I will follow the next posts.
Thanks.