Advocating a better Kernel Integration for all

Gustavo Padovan
December 01, 2023

How can we help developers and maintainers integrate code more efficiently? How can we help the Linux-based tech industry continuously integrate the latest Linux kernel in their system faster, and with less hassle? How can we identify, report, and fix kernel regressions as fast as possible? How can we mitigate maintainer burnout?

We recently returned from Linux Plumbers Conference 2023 in Richmond, Virginia, where we explored these questions with the community, identified common problems, and envisioned how existing solutions can evolve to complement each other to solve bigger problems for the Linux Kernel Ecosystem.

As many know, there are many useful tools and services that help developers and maintainers integrate code into the mainline kernel. 0-day, syzbot, kunit, kselftests, regzbot, KernelCI, smatch, kdevops, kvm-xfstests, BPF-CI, drm/ci, and Linux Test Project (LTP) can all add value to the community in one way or another.

However, having such a validation infrastructure, along with all the data it can generate, does not necessarily mean that we can make the most of it. There simply is not enough coordination between the tools and services, making it hard for maintainers to benefit from all the available infrastructure.

With maintainers already burnt out, receiving data from many different sources may make things worse: checking each source takes extra time, and it becomes hard to tell where the highest priorities are. We must address this situation so that this infrastructure reduces maintainers' workload rather than amplifying their burnout.

Of course, there is not a one-size-fits-all solution for this situation. Step by step, we must evolve and grow the entire testing and validation infrastructure we have in the community, and improve the overall quality to bring more benefits to the entire ecosystem.

In this article and the follow-ups in our upcoming Kernel Integration series, we will discuss not only the work Collabora is doing, but also how to connect distinct pieces of work happening across the community to design solutions.

Where we began, where we are, and where we are going

Collabora has been contributing to KernelCI for a few years already. We also created Mesa CI and drm/ci. In order to run the tests, we assembled a test laboratory in one of our offices that runs hundreds of thousands of automated tests per month on real hardware.

However, once we started looking at the amount of test data produced, we realized that it was still quite hard to organize and evaluate all the kernel test results available. We lacked the proper tools or knowledge to efficiently do so, so we began a research project.

Test systems for the kernel, as is the case for KernelCI today, do not have the concept of tracking a test regression across different kernel and hardware configurations. They also do not know how to track a regression over time to understand whether it has been reported, or whether a fix has already been proposed.

At the time, we began experimenting with how to match different test regressions to a unique kernel regression, how to identify flakiness in the tests (or the hardware), and more. We will discuss this topic in more detail in future articles of the Kernel Integration series. If you want to learn more, check out the LPC discussion from Gustavo Padovan and Ricardo Cañuelo, Unifying and improving test regression reporting and tracking (video).
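To make the idea concrete, here is a minimal sketch of what matching test regressions and spotting flakiness could look like. It is not how KernelCI or our experiments actually work: the TestResult record, the function names, the grouping rule (same test, same first failing kernel) and the flip threshold are all assumptions made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TestResult:
    """Hypothetical record of one test run (not a real KernelCI schema)."""
    test: str       # test name
    kernel: str     # kernel revision the test ran on
    platform: str   # hardware or configuration identifier
    passed: bool


def find_regressions(results):
    """Return (test, platform, kernel) for each pass followed by a fail.

    Assumes `results` is ordered from oldest to newest kernel.
    """
    last = {}
    regressions = []
    for r in results:
        key = (r.test, r.platform)
        if last.get(key) is True and not r.passed:
            regressions.append((r.test, r.platform, r.kernel))
        last[key] = r.passed
    return regressions


def group_regressions(regressions):
    """Group regressions that share a test and a first failing kernel.

    Assumption: such regressions likely stem from one kernel regression.
    """
    groups = defaultdict(list)
    for test, platform, kernel in regressions:
        groups[(test, kernel)].append(platform)
    return dict(groups)


def is_flaky(outcomes, max_flips=1):
    """Flag a pass/fail history that flips more than `max_flips` times."""
    flips = sum(1 for a, b in zip(outcomes, outcomes[1:]) if a != b)
    return flips > max_flips


# Made-up data: the same test starts failing on two boards at v6.6.
results = [
    TestResult("probe", "v6.5", "board-a", True),
    TestResult("probe", "v6.5", "board-b", True),
    TestResult("probe", "v6.6", "board-a", False),
    TestResult("probe", "v6.6", "board-b", False),
]
grouped = group_regressions(find_regressions(results))
print(grouped)
```

With this sample data, the two per-board failures collapse into a single entry, which is the kind of deduplication that would let a maintainer receive one report instead of two. A real system would need much richer matching, for example comparing logs or bisection results.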

One of the things that we learned in these experiments was that the quality of the tests is not always great. Take device driver probing, for example. The existing test to verify that a device had successfully probed relied on unstable kernel interfaces and would often break between kernel versions, producing flaky results.

To address that issue, we started working together with the kernel community to develop tests that can give us finer-grained insights into the potential location of a failure. We merged a kselftest upstream for device probe on device-tree-based hardware and are also working on a similar upstream kselftest for ACPI-based devices. There is also an effort around USB and PCI. We will have a dedicated blog post about it quite soon. If you would like a head start, take a look at Nicolas Prado and Laura Nao’s LPC talk, Detecting failed device probes (video).

To conclude for today: for Collabora, beyond the technical achievements, there is a mindset shift that we need to carry out. Kernel Integration is a large and expensive problem that affects the entire Linux-based tech industry. It is not just maintainers and developers who suffer from a lack of proper support to do their jobs more efficiently. It is also an entire industry that faces huge challenges trying to keep up with upstream to deliver stability, security, and new features to their products and services.

In the coming weeks and months Collabora will share more articles in the Kernel Integration series as we make progress and touch new areas of work. Stay tuned!
