Gabriel Krisman Bertazi
August 27, 2020
Linux 5.2 was released over one year ago and with it, a new feature was added to support optimized case-insensitive file name lookups in the Ext4 filesystem - the first of native Linux filesystems to do it. Now, one year after this quite controversial feature was made available, Collabora and others keep building on top of it to make it more and more useful for system developers and end users. Therefore, this seems like a good time as any to take a look on why this was merged, and how to put it to work.
More recently, f2fs has started to support this feature as well, following the Ext4 implementation and framework, thanks to an effort led by Google. Most, if not all, of the information described here also applies to f2fs, with small changes on the commands used to configure the superblock.
A file name is a text string used to uniquely identify a file (in this context, "directory" is the same as a file) at a specific level of the directory hierarchy. While, from the operating system point of view, it doesn't matter what the file name is, as long as it is unique, meaningful file names are essential for the end user, since it is the main key to locate and retrieve data. In other words, a meaningful file name is what people rely upon to find their valuable documents, pictures and spreadsheets.
Traditionally, Linux (and Unix) filesystems have always considered file names as an opaque byte sequence without any special meaning, requiring users to submit the exact match of the file to find it in the filesystem. But that is not how humans operate. When people write titles, "important report.ods" and "IMPORTANT REPORT.ods" usually mean the same piece of data, and you don't care how it was written when creating it. We care about the content and the semantics of the words IMPORTANT and REPORT.
In English, the only situation where different spelling of a word mean the same thing is when dealing with uppercase and lowercase, but for other languages, that is not the case. Some languages have different scripts to represent the same information and it makes sense, for a user, to not care about which different writing system the file was titled originally, when retrieving the data later.
Most of these linguistic differences have been solved by userspace applications in the past, but bringing this knowledge into the kernel allow us to resolve important bottlenecks for applications being ported from other operating systems, like windows Games, who cannot be simply recompiled to understand it is running on Linux and that the filesystem is now case-sensitive. In fact, making the kernel understand the process of language normalization and casefolding allow us to optimize our disk storage, such that the system can quickly retrieve the information requested. The end result is clear: a much more user-friendly Linux experience for end-users and a much better platform to run beloved Windows games with Steam on Linux.
⚠ This is very important.
Before enabling it, make sure your kernel supports case-insensitive Ext4, and that the encoding version you plan to use is supported.
The kernel supports case-insensitive Ext4 if it was built with CONFIG_UNICODE=y. If you are not sure, you can verify it on a booted kernel by reading the sysfs file below. If it doesn't exist, case-insensitive was not compiled into your kernel.
$ cat /sys/fs/ext4/features/casefold
Currently, the kernel supports UTF-8 up to version 12.1. mkfs will always choose the latest version, but attempting to run a filesystem with a more recent UTF-8 version than the kernel supports is risky, and to preserve your data, the kernel will refuse to mount such filesystem. To solve this issue, a kernel update is required or mkfs can be configured to use an older version.
A patch is queued for next release for the kernel to report on sysfs the latest supported revision of unicode. Notice that the following file might not be available in your system, even if CONFIG_UNICODE exists.
$ cat /sys/fs/unicode/version
First of all, make sure you've read the section "Before enabling". Failing to follow those instructions may render your filesystem unmountable in your current kernel.
To enable the feature, it takes two steps: one is to enable the filesystem-wide casefold feature on the volume's superblock. This doesn't immediately make any directories case-insensitive, so don't worry, but it prepares the disk to support casefolded directories. It also configures what encoding will be used.
The second step is to configure a specific directory to be case-insensitive. But first, let's see how to create a disk supporting case-insensitive.
When creating a filesystem, you need to set the casefold feature in mkfs:
$ mkfs -t ext4 -O casefold /dev/vda
After that, when mounting the filesystem, you can verify that the filesystem correctly has the feature:
$ dumpe2fs -h /dev/vda | grep 'Filesystem features' dumpe2fs 1.45.6 (20-Mar-2020) Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent 64bit flex_bg casefold sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
The feature is enabled in the filesystem in /dev/vda if the line above includes the feature 'casefold'.
Alternatively, you can mount the filesystem and check dmesg for the mount line:
$ mount /dev/vda /mnt $ dmesg | tail EXT4-fs (vda): Using encoding defined by superblock: utf8-12.1.0 with flags 0x0 EXT4-fs (vda): mounted filesystem with ordered data mode. Opts: (null)
From the output above, vda was mounted with case-insensitive enabled and utf8-12.1.0.
Historically, any byte, other than the trailing slash ('/') and the null byte ('\0'), is a valid part of filenames. This is because Unix filesystems see filenames as a sequence of slash-separated components that are just opaque byte sequences, without any meaning assigned to them. Higher level userspace software give them meaning by seeing them as characters for rendering. When talking about case-insensitive, nevertheless, the kernel needs to inspect and understand what a character really is and what the rules are for case-folding. That is the reason we adopt an encoding in the kernel, like we did with UTF-8. But, for any encoding one may choose, the requirements of what a valid name is, is much more strict. In fact, there are several sequences that are simply invalid text in UTF-8. When a program asks the kernel to create a file with those names, the kernel needs to decide whether pretend the name is valid somehow or to throw an error to the application.
The vast majority of applications don't care about case-insensitiveness, and expect a filename to just be accepted, as long as it is a valid Unix name. These applications will fail if the kernel throws an error on what they expect is a valid name, so by default, if an application tries to use an invalid name on a case-insensitive directory, the kernel will just let it happen, and treat that single file as an opaque byte sequence. This is fine, but case-insensitive will not work for that file only.
There are cases, on the other hand, were we want to be strict on what is accepted by the filesystem. Having bad filenames mixed with good ones is confusing, and open space for programs to misbehave. For those users, though, ext4 has an strict mode, which causes any attempt to create or rename a file with a bad name to fail and return an error to the application.
To build an Ext4 filesystem with strict mode enabled, use:
$ mkfs -t ext4 -O casefold -E encoding_flags=strict /dev/vda
If everything went fine and tune2fs returned without any errors, next time you mount this filesystem your kernel log will show something like the line below:
$ mount /dev/sda1 /mnt $ dmesg | tail EXT4-fs (sda1): Using encoding defined by superblock: utf8-12.1.0 with flags 0x0
It has two important pieces of information. The first, is the encoding used which, in the example above, is UTF-8 supporting the version 12.1.0 of the Unicode specification. The second piece information is the flags argument, in this case 0x0, which modifies the behavior of the filesystem when dealing with casefolding directories.
At the time of this writing, the only flag supported is the Strict mode, in which case the flag mask would be 0x1.
After mounting a case-insensitive enabled filesystem, it is now possible to flip the 'Casefold' inode attribute ('+F') in empty directories to make the lookup of files inside them case-insensitive:
$ mkdir CI_dir $ chattr +F CI_dir
With that setting enabled, the following should succeed, instead of the last command returning "No such file or directory."
$ touch CI_dir/hello_world $ ls CI_dir/HELLO_WORLD
The directory case-sensitiveness can be verified using lsattr. For instance, in the example below, the F letter indicates that the CI_dir directory is case-insensitive.
$ lsattr . -------------------- ./CS_dir ----------------F--- ./CI_dir
To revert the setting, and make CIdir case-insensitive once again, the directory must be emptied, and then, the Casefold attribute removed:
$ rm CI_dir/* $ chattr -F CI_dir $ lsattr . -------------------- ./CS_dir -------------------- ./CI_dir -------------------- ./lost+found
It is a bit annoying to require the directory to be empty to flip the case-insensitive flag, but that is a technical requirement at the moment and unlikely to change in the future. In fact, to make the data of a case-insensitive directory accessible in a case-sensitive manner, it would be much easier to move it to a new directory:
$ mkdir CS_dir $ mv CI_dir/* CS_dir/ $ rm -r CI_dir
Would have a similar effect, from a simple point of view.
The Casefold flag recurses into nested directories. Therefore:
$ mkdir CI_dir $ chattr +F CI_dir $ mkdir CI_dir/foo $ lsattr CI_dir ----------------F--- CI_dir/foo
It is possible to mix case-insensitive and case-sensitive directories in the same tree:
$ mkdir CI_dir $ chattr +F CI_dir $ mkdir CI_dir/foo $ chattr -F foo $ lsattr . ----------------F--- CI_dir $ lsattr CI_dir -------------------- CI_dir/foo
Remember however, in the examples above, the order of commands matter, since a directory cannot have its Casefold attribute flipped if it is not empty.
Currently, only UTF-8 encoding is supported, and I am not aware of plans to expand it to more encodings. While different encodings make a lot of sense for Eastern languages speakers for encoding compression reasons, I'm not aware of anyone currently working on it for Linux.
With that said, the Linux implementation performs the Canonical Decomposition normalization process before comparing strings. That means that canonically equivalent characters can be correctly searched using a different normalized name. For instance, in some languages like German, the upper-case version of the letter ß (Eszett), is SS (or U+1E9E ẞ LATIN CAPITAL LETTER SHARP S). Thus, it makes sense for a German speaker to look for a file named "floß" (raft, in English), using the string "FLOSS":
$ touch CI_dir/floß $ CI_dir/FLOSS
There are also multiple ways to combine accented characters. Our method ensures, for instance that multiple encodings of the word café (coffee, in portuguese) can be interchangeable on a casefolded lookup.
Let's see something cool. For this to work, you might want to copy-paste the command below, instead of typing it. Let's create some files:
$ touch CI_dir/café CI_dir/café CS_dir/café CS_dir/café
How many files where created? Can you explain it?
The case-insensitive feature as implemented in Ext4 is a non-intrusive mechanism to support this feature for those who need it, while minimizing impact to other applications. Given the per-directory nature, it is safe to enable the feature bit filesystem-wide and let applications enable it on directories as needed. It is simple to use and should yield higher performance for user space applications that previously had to emulate it in userspace.
Hopefully, we will soon see this feature being enabled by default for distro kernels.
In the embedded world, many modern SoCs such as the ST Microelectronics STM32MP1 now include coprocessor cores which can be used for a wide…
Our recent efforts on the Hantro kernel driver have resulted in the addition of H.264 decoding support and multiple performance improvements.…
Hwangsaeul, or H8L, a remote surveillance streaming solution, utilizes the capability of libsrt to collect statistics from open SRT sockets…
Complex, real-world correctness tests and performance analysis are now possible thanks to gltrim, a new tool recently added to apitrace,…
Earlier this week, WebRTC became an official W3C and IETF standard. GStreamer has a powerful and rapidly maturing WebRTC implementation.…
Last year, from June to September, I worked on the kernel development tool Coccinelle under Collabora. I implemented a performance boosting…