Detlev Casanova
February 25, 2026
Support for the VDPU381 and VDPU383 Rockchip video decoders has been merged into the upstream Linux kernel. These decoders are found on the RK3588 and RK3576 SoCs, respectively, and bring improved hardware decoding capabilities for H.264 and HEVC to mainline Linux.
This post highlights what we added, how we fixed a subtle IOMMU reset issue, a deliberate design choice in how we program hardware registers, and the introduction of new V4L2 UAPI controls required specifically for this hardware.
One of the more subtle issues encountered during development was related to IOMMU state restoration, and it stems from how VDPU381/383 integrates the decoder's IOMMU core.
On these IP cores, the IOMMU core is embedded inside the decoder itself. As a result, when the decoder is reset, typically to recover from a decoding error, the internal IOMMU is also reset. This reset clears all address mappings that had previously been programmed by the driver.
From the kernel’s point of view, those mappings were still valid and cached. In reality, the hardware had lost them entirely, leading to failed memory accesses or stalled decoding after error recovery.
The fix was to explicitly restore cached IOMMU mappings after a decoder reset by programming another empty IOMMU domain, then reprogramming the default IOMMU domain, avoiding any changes in the IOMMU driver:
[PATCH] media: rkvdec: Restore iommu addresses on errors
This change ensures reliable recovery from decoding errors and avoids subtle failures that only show up after a reset. It has also been applied to other affected Rockchip IP cores, such as RGA, Rockchip's Raster Graphics Accelerator.
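The recovery sequence can be illustrated with a small userspace model. This is only a sketch, not the code from the patch: every `model_*` name is hypothetical, and the model assumes that attaching a domain that is already attached is skipped, which is why a plain re-attach after a reset would not reprogram the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: none of these names come from the kernel or the patch. */
#define MODEL_MAX_MAPPINGS 4

struct model_domain {
	unsigned long iova[MODEL_MAX_MAPPINGS];
	size_t count;
};

struct model_iommu {
	const struct model_domain *attached;        /* what software believes is attached */
	unsigned long hw_table[MODEL_MAX_MAPPINGS]; /* what the hardware actually holds */
	size_t hw_count;
};

/* Assumption of this model: attaching the already-attached domain is a no-op. */
static void model_attach(struct model_iommu *mmu, const struct model_domain *dom)
{
	assert(dom->count <= MODEL_MAX_MAPPINGS);
	if (mmu->attached == dom)
		return;
	mmu->attached = dom;
	mmu->hw_count = dom->count;
	for (size_t i = 0; i < dom->count; i++)
		mmu->hw_table[i] = dom->iova[i];
}

/* A decoder reset wipes the embedded IOMMU; the software view is untouched. */
static void model_decoder_reset(struct model_iommu *mmu)
{
	mmu->hw_count = 0;
}

/* Swap to an empty domain, then back, so the second attach is a real change. */
static void model_restore(struct model_iommu *mmu,
			  const struct model_domain *empty,
			  const struct model_domain *def)
{
	model_attach(mmu, empty);
	model_attach(mmu, def);
}

/* True when the hardware table holds exactly the domain's mappings. */
static bool model_hw_matches(const struct model_iommu *mmu,
			     const struct model_domain *dom)
{
	if (mmu->hw_count != dom->count)
		return false;
	for (size_t i = 0; i < dom->count; i++)
		if (mmu->hw_table[i] != dom->iova[i])
			return false;
	return true;
}
```

In this model, calling `model_attach()` with the default domain right after `model_decoder_reset()` changes nothing, while `model_restore()` brings the hardware table back in line with the cached mappings.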
Supporting VDPU381/383 also required extending the V4L2 stateless HEVC UAPI with two new controls. These decoders rely on explicit Reference Picture Set (RPS) programming, split into short-term (ST) and long-term (LT) reference sets.
Most HEVC decoders can manage decoding frames based on the Sequence Parameter Set (SPS) data without the LT and ST reference sets, but the Rockchip ones do not. As opposed to the VeriSilicon decoders, the Rockchip decoders also don't implement a skip method to ignore those. As a result, new UAPI controls were introduced to allow userspace to pass fully described short-term and long-term RPS tables to the kernel.
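To give an idea of what a short-term RPS entry describes, here is a minimal sketch that expands one entry into absolute picture order count (POC) values. The struct is hypothetical: its field names follow the H.265 syntax elements, not the layout of the new V4L2 controls, and inter-RPS prediction is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_REFS 16

/* Hypothetical layout: named after H.265 syntax elements, not the V4L2 UAPI. */
struct sketch_st_rps {
	uint8_t num_negative_pics;
	uint8_t num_positive_pics;
	uint16_t delta_poc_s0_minus1[SKETCH_MAX_REFS];
	uint16_t delta_poc_s1_minus1[SKETCH_MAX_REFS];
};

/*
 * Expand a short-term RPS entry into the POCs of its reference pictures,
 * relative to the current picture. Deltas accumulate: each entry is coded
 * relative to the previous one. Returns the number of POCs written.
 */
static int sketch_st_rps_to_pocs(const struct sketch_st_rps *rps,
				 int32_t cur_poc, int32_t *out, int out_len)
{
	int n = 0;
	int32_t delta = 0;

	assert(rps->num_negative_pics <= SKETCH_MAX_REFS);
	assert(rps->num_positive_pics <= SKETCH_MAX_REFS);

	/* Pictures that precede the current one in output order. */
	for (int i = 0; i < rps->num_negative_pics && n < out_len; i++) {
		delta -= rps->delta_poc_s0_minus1[i] + 1;
		out[n++] = cur_poc + delta;
	}

	/* Pictures that follow the current one in output order. */
	delta = 0;
	for (int i = 0; i < rps->num_positive_pics && n < out_len; i++) {
		delta += rps->delta_poc_s1_minus1[i] + 1;
		out[n++] = cur_poc + delta;
	}

	return n;
}
```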
Additionally, we added support in the visl driver for ftrace output covering all the parameters of the new controls, which is useful when working on a userspace implementation.
We added support for the new controls in GStreamer 1.28 and preliminary work has been done for FFmpeg.
A key design goal for these controls was compatibility with the Vulkan Video Decode API.
The data structures closely mirror Vulkan’s HEVC reference picture descriptions, which means:
This alignment helps ensure that Linux media APIs evolve in a consistent direction and reduces friction for projects supporting multiple decode stacks.
struct for register programming
The VDPU381/383 driver uses C structs to represent the full register layout, instead of relying on ad-hoc writel() calls or regmap. This decision was driven by specific hardware requirements rather than style preferences.
For these decoders, it is safer to write all registers, even those that match their documented default values. Skipping a write because its default value is correct can leave the hardware in an inconsistent internal state and cause decoding to fail.
Using a struct makes it straightforward to define a complete register image and guarantees that every register is programmed explicitly using a memcpy() flavor.
Even when all registers are written, the order of writes is significant. Writing the correct values in the wrong sequence can still break the decoder. This is mainly because Rockchip uses its own multimedia library, mpp, to test the hardware. That library writes all registers in order, so the hardware has not been exercised with, and is less robust against, out-of-order register access.
Struct-based programming enforces a deterministic and reviewable ordering of register writes. In contrast, scattered writel() calls provide no structural guarantees and make it easy to accidentally reorder writes during refactoring.
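As a minimal sketch of the idea, assuming a made-up five-register layout (the field names below are hypothetical and do not come from the driver or the datasheet), a complete register image can be filled in memory and then written out word by word in ascending offset order:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register layout: field order is the write order. */
struct sketch_regs {
	uint32_t ctrl;
	uint32_t pic_width;
	uint32_t pic_height;
	uint32_t stream_addr;
	uint32_t start; /* last field, so it is written last */
};

/* The image must map one-to-one onto 32-bit registers, without padding. */
static_assert(sizeof(struct sketch_regs) == 5 * sizeof(uint32_t),
	      "register image must have no padding");

/* Write every register, defaults included, in ascending offset order. */
static void sketch_write_regs(volatile uint32_t *mmio,
			      const struct sketch_regs *regs)
{
	uint32_t words[sizeof(*regs) / sizeof(uint32_t)];

	memcpy(words, regs, sizeof(words));
	for (size_t i = 0; i < sizeof(words) / sizeof(words[0]); i++)
		mmio[i] = words[i];
}
```

Because the write order is derived from the struct definition, reordering writes requires a visible change to the layout rather than moving a call around.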
One of the commits in the series explicitly documents these constraints and explains why ordering and default writes are required.
More details can be found in the struct-switching patch.
Why not regmap?
While regmap is often a good fit for register-heavy drivers, we want to keep flexibility for later multi-core support, where a register set could be prepared while the hardware is still working on the previous frame. For that, the registers must not be tied to a specific core's address, so that they can be written to the first available core, which is not possible with regmap.
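The following sketch shows what a core-independent register image makes possible. It is an illustration under stated assumptions, not the driver's scheduler: all `sketch_*` names are hypothetical, and each core's register window is modeled as a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_REGS 4

/* Register image held in plain memory, not tied to any core's MMIO base. */
struct sketch_job {
	uint32_t regs[SKETCH_NUM_REGS];
};

struct sketch_core {
	uint32_t mmio[SKETCH_NUM_REGS]; /* stands in for the core's register window */
	bool busy;
};

/*
 * Hand a prepared job to the first idle core.
 * Returns the index of the core that took it, or -1 if all cores are busy.
 */
static int sketch_dispatch(struct sketch_core *cores, size_t num_cores,
			   const struct sketch_job *job)
{
	assert(job != NULL);

	for (size_t c = 0; c < num_cores; c++) {
		if (cores[c].busy)
			continue;
		/* Only at this point is the image bound to a specific core. */
		for (size_t i = 0; i < SKETCH_NUM_REGS; i++)
			cores[c].mmio[i] = job->regs[i];
		cores[c].busy = true;
		return (int)c;
	}

	return -1;
}
```

The job can be filled in while every core is still busy, since nothing in `struct sketch_job` refers to a core until `sketch_dispatch()` picks one.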
The RK3588 features two VDPU381 cores, enabling parallel decoding in hardware. However, multi-core support is not yet enabled upstream.
Supporting multi-core decoding correctly requires:
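The struct-based register approach was chosen in part to prepare for this future work. With it, the driver can prepare a register set while the hardware is still decoding the previous frame, then write it to whichever core becomes available first.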
Multi-core support for RK3588 is actively being worked on and is one of the next steps for the driver.
The main difficulty with multi-core decoding is that the frames of an H.264 or HEVC video stream usually depend on previous frames having been fully decoded. That means scheduling jobs from a single stream across multiple decoders is quite hard and may not yield a significant performance boost.
Our current implementation, which has not been upstreamed yet, doesn't do that.
Instead, it parallelizes the decoding of multiple streams, so that frames do not depend on each other. One of the main complications was the management of the IOMMU cores, but more on that later.
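The upstreaming of VDPU381 and VDPU383 support required more than just enabling new hardware: it also meant fixing the IOMMU reset behavior, programming registers through a complete, ordered struct image, and extending the V4L2 stateless HEVC UAPI with new RPS controls.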
The result is a more robust and maintainable driver that aligns with upstream expectations while accommodating the realities of modern Rockchip media hardware.
The decoders support other codecs, such as:
Christian Hewitt added support for the VDPU346 on RK356X SoCs, which is a variant of VDPU381.
As mentioned earlier, multi-core support is on its way for RK3588.