Automatic regression handling and reporting for the Linux Kernel

Ricardo Cañuelo Navarro
March 14, 2024

Continuing our series on Kernel Integration (check out part 1, part 2, and part 3), this post goes into more detail about how regression detection, processing, and tracking can be improved to provide a better service to developers and maintainers.

Traditionally, CI systems detect regressions automatically by running the same test cases on different versions of the software under test (in this case, the Linux kernel) and checking whether a test that used to pass starts failing after a specific kernel commit. In the ideal, straightforward case, this is enough to point to the commit that introduced the bug. The CI system can then generate a regression report and send it to a mailing list or to the appropriate maintainers and developers, if they can be deduced from the suspicious commit.
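
To make that detection step more concrete, here is a minimal sketch, assuming we have the results of one test case ordered by tested kernel revision. The names (`TestResult`, `find_regressions`) are hypothetical and not taken from any particular CI system.

```python
# Flag every pass-to-fail transition of one test case across the kernel
# revisions that were actually tested. Names here are illustrative only.
from dataclasses import dataclass

@dataclass
class TestResult:
    kernel_commit: str   # commit (or tag) the tested kernel was built from
    passed: bool         # outcome of this run

def find_regressions(history: list[TestResult]) -> list[tuple[str, str]]:
    """Given results ordered oldest to newest, return
    (last_passing_commit, first_failing_commit) pairs."""
    return [(prev.kernel_commit, curr.kernel_commit)
            for prev, curr in zip(history, history[1:])
            if prev.passed and not curr.passed]
```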

In practice, though, this ideal scenario is rare. Several circumstances make the process much harder in different ways:

  • Normally there's no guarantee that there will be a test run for every commit in the repository, so most of the time a reported regression points to a range of suspect commits rather than a single one (a sketch of how that range is obtained follows this list).
  • Tests that involve booting and running a machine are significantly more complicated than tests that simply run a software process in isolation. The more moving parts in a setup, the more things can go wrong.
  • As a result, not all test failures are caused by bugs introduced in the kernel.
  • Test code isn't infallible either; it can contain bugs of its own that may surface for multiple reasons.
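
As a rough illustration of the first point above: when only some revisions get tested, a detected regression initially maps to every commit between the last tested good build and the first tested bad one. The `git rev-list` invocation below is standard Git; the surrounding function is a hypothetical example.

```python
# List every commit between the last tested good build and the first tested
# bad one -- initially, all of them are suspects. Illustrative sketch only.
import subprocess

def suspect_commits(last_good: str, first_bad: str, repo: str = ".") -> list[str]:
    out = subprocess.run(
        ["git", "-C", repo, "rev-list", f"{last_good}..{first_bad}"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.split()
```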

As a consequence, a certain amount of human intervention is always needed when reporting a regression to the community. Normally this means doing some initial filtering of results, triaging them by importance and feasibility, narrowing down the possible causes, and providing additional information that isn't always evident from the data provided by the CI system.

There are obvious downsides to this process, the most important being that it doesn't scale: as the test space grows, more people are needed to keep up. Automating this process as much as possible is crucial to grow the kernel test ecosystem from a useful tool into an integral and prevalent part of the development workflow.

How can the appropriate tools help us with this task? Here are some ideas:

Post-processing of regression data

The information provided by a CI system about a regression is, most of the time, a snapshot of what happened with that test when it failed. However, further processing of that result and of other neighboring data over time can reveal information that's usually hidden from the naked eye. For instance:

  • Detection of unstable tests: when a test fails intermittently over different kernel versions, it's more likely that the test is unstable due to a bug in the test code, a timing issue, race conditions, or other external circumstances than that every pass-to-fail transition was caused by a commit introducing a bug. Smart filters and heuristics can help detect this type of scenario (a minimal one is sketched after this list).
  • Detection of configuration-specific, target-specific, or test-setup issues: collecting information about similar tests, about the same test on different kernel configurations, or about the same test on different target platforms may show that it failed only in a specific scenario, which helps a human inspector either filter out possible causes or narrow down the bug investigation.
  • Detection of known patterns in the test output: there's a myriad of post-processing options that can be applied to a test output log to categorize and detect specific issues. These range from simple text parsing that looks for known messages to automatically diagnose a failure (for example, a failure to boot because of a problem mounting the rootfs, or a timeout while waiting for a DHCP request), up to advanced ML-based analysis that profiles a bug from a console log so it can be matched against other known instances of the same (or a similar) bug in other regressions. The sketch after this list includes a simple pattern matcher of the first kind.
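
As a sketch of how the first and last ideas could look in practice, here are a naive flakiness heuristic and a signature-based log classifier. The threshold, the function names, and the signature list are assumptions made for the example rather than the behavior of any existing CI system (although "VFS: Unable to mount root fs" and "Kernel panic - not syncing" are real kernel messages).

```python
import re

def looks_flaky(outcomes: list[bool], max_flips: int = 2) -> bool:
    """A test whose verdict flips back and forth more than `max_flips` times
    over its recent history is more likely unstable than truly regressed."""
    flips = sum(1 for a, b in zip(outcomes, outcomes[1:]) if a != b)
    return flips > max_flips

# Known failure signatures (regex -> category); extend as new patterns appear.
KNOWN_PATTERNS = {
    r"VFS: Unable to mount root fs": "rootfs mount failure",
    r"Kernel panic - not syncing": "kernel panic",
    r"(?i)timed out waiting for .*dhcp": "DHCP timeout (likely lab/network issue)",
}

def classify_log(console_log: str) -> list[str]:
    """Return the category of every known failure signature found in the log."""
    return [label for pattern, label in KNOWN_PATTERNS.items()
            if re.search(pattern, console_log)]
```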

Tracking the regression's life cycle

Even if the data provided by the CI systems included all of these improvements, there's still the issue of following up on the status of a reported regression.

Regressions are not static entities; they have a well-defined life cycle: they're detected, reported, and investigated, and then they're either dismissed as a non-issue (false positive, intended behavior, etc.) or fixed. The fixing process involves submitting a patch, reviewing it, testing it, merging it and, finally, checking that the regression has cleared up after the merge.
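
One possible way to make that life cycle explicit is to model it as a small state machine that a CI system could track and expose. The states and transitions below simply mirror the paragraph above; the code itself is a hypothetical sketch.

```python
from enum import Enum, auto

class RegressionState(Enum):
    DETECTED = auto()
    REPORTED = auto()
    INVESTIGATING = auto()
    NON_ISSUE = auto()        # false positive, intended behavior, ...
    PATCH_SUBMITTED = auto()
    PATCH_MERGED = auto()
    VERIFIED_FIXED = auto()   # regression confirmed gone after the merge

ALLOWED = {
    RegressionState.DETECTED:        {RegressionState.REPORTED},
    RegressionState.REPORTED:        {RegressionState.INVESTIGATING},
    RegressionState.INVESTIGATING:   {RegressionState.NON_ISSUE,
                                      RegressionState.PATCH_SUBMITTED},
    RegressionState.PATCH_SUBMITTED: {RegressionState.PATCH_MERGED},
    RegressionState.PATCH_MERGED:    {RegressionState.VERIFIED_FIXED},
}

def advance(current: RegressionState, new: RegressionState) -> RegressionState:
    """Move a regression to `new` only if the transition is allowed."""
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"invalid transition {current.name} -> {new.name}")
    return new
```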

All of this happens with almost no visibility into who's working on what and at which stage of the process a regression is. Thorsten Leemhuis created regzbot to help with this: it keeps track of the status of reported regressions by checking mailing lists and repos automatically. A way to vastly improve on this would be to integrate these features into the CI systems themselves, so that anyone could get the current status of any discovered regression and update it as needed (a toy data model for this is sketched after the list below), answering common user questions like:

  • "Has anyone claimed and started to work on this regression?"
  • "Does this regression have an associated patch submitted already?"
  • "When was this fixed? where can I find a link to the patch review?"

Better integration of bisection processes with regressions

Bisections are already an important part of many CI systems and they provide an automatic way of pointing to the commit that caused a regression, assuming that the repo history is linear and that the test is stable.
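
Under those two assumptions (linear history, stable test), the core of a bisection is just a binary search over the suspect commit range. The sketch below is a bare-bones illustration; `build_and_test` stands in for whatever the CI system actually does to build a kernel at a given commit and run the failing test.

```python
from typing import Callable

def bisect(commits: list[str], build_and_test: Callable[[str], bool]) -> str:
    """`commits` is ordered oldest to newest; the first entry is known to
    pass and the last is known to fail. Return the first failing commit."""
    lo, hi = 0, len(commits) - 1           # commits[lo] passes, commits[hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if build_and_test(commits[mid]):   # still passes: culprit is later
            lo = mid
        else:                              # already fails: culprit is here or earlier
            hi = mid
    return commits[hi]
```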

In some cases, however, bisections are triggered and managed as a process separate from testing. Making sure they are fully integrated into the test generation and report infrastructure would make it easier to match a regression with its related bisection process and vice versa. Anyone receiving a regression report could then check right away whether the regression has already been bisected and whether there's a good candidate commit to investigate. In the best case, if the test is stable and the bisection process is trustworthy, the result can be reported automatically to the commit author.

Conclusion

As we continue working on kernel regressions we keep finding ideas for improvements and new features. A big part of the effort is to bring these topics to the community, find a way of providing these features that's useful for all of us, and align the different projects in the ecosystem toward the same goal. Hopefully we'll get to a point where regression checking is seamlessly integrated into every kernel developer's workflow.
