Improving the reliability of file system monitoring tools

Improving the reliability of file system monitoring tools

Gabriel Krisman Bertazi
March 14, 2022

Share this post:

Reading time:

A fact of life, one that almost every computer user has to face at some point, is that file systems fail. Whether it is for an unknown reason, usually explained to managers as Alpha particles flying around the data center, or a more mundane (and way more likely) reason - a software bug - users don't usually enjoy losing their data for no reason. This is why file system developers put a huge effort in not only testing their code, but also in developing tools to recover volumes when they fail. In fact, all persistent file systems deployed in production are accompanied by check and repair tools, usually exposed through the fsck front-end. Some even go a step further with online repair tools.

fsck, the file system check and repair tool, is usually run by an administrator when they suspect the volume to be corrupted, sometimes following a mount command that failed. It is also run at boot-time on every few boots in almost every distro, through the systemd-fsck service, or equivalent logic.

Indeed, fsck is quite efficient in recovering from errors of several file systems, but it sometimes requires placing the file system offline and either walking through the disk to check for errors, or poking the super block for an error status. It is not the right tool to monitor the health of a file system in real-time, raising alarms and sirens when a problem is detected.

This kind of real-time monitoring is quite important to ensure data consistency and availability in data centers. In fact, it is essential that administrators or recovery daemons be notified as soon as an error occurs, such that they can start emergency recovery procedures, like kickstarting a backup, rebuilding a RAID, replacing a disk or maybe just running fsck. And, once one needs to watch over a large quantity of machines, like in a cloud provider with hundreds of machines, a reliable monitoring tool is essential.

The problem is that Linux didn't really expose a good interface to notify applications when a file system error happened. There wasn't much going on other than the error code returned to the application that executed the failed operation, which doesn't tell much about the cause of the error, nor is useful for a health monitoring application. Therefore, the approach taken by the existing monitoring tools was to either watch the kernel log, which is a risky business, since it might be wrapped by newer messages, or to query file system specific sysfs files, which register the last error. Both approaches are polling mechanisms, subject to missing messages that would cause the notification to be lost.

This is why we worked on a new mechanism for closely monitoring volumes and notifying recovery tools and sysadmins in real-time that an error occurred. The feature, merged in kernel 5.16, won't prevent failures from happening, but will help reduce the effects of such errors by guaranteeing any listener application receives the message. A monitoring application can then reliably report it to system administrators and forward the detailed error information to whomever is unlucky enough to be tasked with fixing it.

The new mechanism leverages the fanotify interface by adding a new FAN_FS_ERROR event type, which is issued by the file systems code itself, whenever an error is detected. By leveraging fanotify, the event is now tracked on an dedicated event queue to the listener, and it won't get overwritten by further errors. We also made sure that there is always enough memory to report it, even on low memory conditions.

The kernel documentation explains how to receive and interpret a FAN_FS_ERROR event . There is also an example tracer implementation in the kernel tree.

The feature, which is already on the upstream Linux kernel, will soon pop up in distribution kernels, and be taken up by distros around the globe. Soon enough, we will have better file system error monitoring tools on data centers, and also on our Linux desktops.

Kernel 5.16: A new release for a new year

Landing a new syscall: What is futex?

Asymmetric Multi Processing with Linux & Zephyr on the STM32MP1

Kernel 5.16: A new release for a new year

Landing a new syscall: What is futex?

Asymmetric Multi Processing with Linux & Zephyr on the STM32MP1

Comments (0)

Add a Comment

Search the newsroom

Latest Blog Posts

PipeWire workshop 2025: Updates on video transport, Rust efforts, TSN networking, and Bluetooth support

03/07/2025

As part of the activities Embedded Recipes in Nice, France, Collabora hosted a PipeWire workshop/hackfest, an opportunity for attendees…

Coccinelle for Rust progress report

25/06/2025

In collaboration with Inria, the French Institute for Research in Computer Science and Automation, Tathagata Roy shares the progress made…

Linux Media Summit 2025 recap

23/06/2025

Last month in Nice, active media developers came together for the annual Linux Media Summit to exchange insights and tackle ongoing challenges…

Constructor acquires, destructor releases

09/06/2025

In this final article based on Matt Godbolt's talk on making APIs easy to use and hard to misuse, I will discuss locking, an area where…

What if C++ had decades to learn?

21/05/2025

In this second article of a three-part series, I look at how Matt Godbolt uses modern C++ features to try to protect against misusing an…

Unleashing gst-python-ml: Python-powered ML analytics for GStreamer pipelines

12/05/2025

Powerful video analytics pipelines are easy to make when you're well-equipped. Combining GStreamer and Machine Learning frameworks are the…

About Collabora

Whether writing a line of code or shaping a longer-term strategic software development plan, we'll help you navigate the ever-evolving world of Open Source.

한국의 국기 한국어 버전의 Collabora.com 보기