Tathagata Roy
June 25, 2025
Reading time:
In collaboration with Inria (the French Institute for Research in Computer Science and Automation), Tathagata Roy shares the progress made over the past year on the CoccinelleForRust project, co-sponsored by Collabora.
Coccinelle is a tool for automatic program matching and transformation that was originally developed for making large-scale changes to the Linux kernel source code (i.e., C code). Matches and transformations are driven by user-specific transformation rules in the form of abstracted patches, referred to as semantic patches. As the Linux kernel—and systems software more generally—is starting to adopt Rust, we are developing Coccinelle for Rust to make the power of Coccinelle available to Rust codebases.
This diff illustrates a patch in which the type_of function was being called before confirming that the target item’s trait was implemented. A straightforward CfR-based fix is to find every expression of the form
expression.type_of(impl_id)
and replace it with
expression.type_of(impl_id).subst_identity()
There are roughly fifty occurrences of this pattern in the diff, so updating them all by hand would be quite tedious. The accompanying Semantic Patch to perform this transformation automatically is:
@change_ty_of@
expression exp1, impl_id;
@@
-exp1.type_of(impl_id)
+exp1.type_of(impl_id).subst_identity()
While the above example could be achieved with a complicated and unreadable regex pattern, things can quickly become more complex.
The following patch changes a function signature and all the related calls.
@change_sig@
expression x;
identifier fname, sname;
@@
impl sname {
...
pub(crate) fn fname(&self,
- guard: &RevocableGuard<'_>
) -> Result { ... }
...
}
@modify_calls@
expression x, guard;
identifier change_sig.fname;
@@
x.fname(
...
- guard
)
Rule change_sig finds all the occurrences of functions which take a guard of type &RevocableGuard<'_> and removes that parameter. The rule modify_calls updates the calls to that method.
This semantic patch can be used on a whole code-base where once a guard variable is no longer needed, it can be removed. It can also serve as an integration test to check that no such code is present in new pull requests.
RefCells and hash tables..., +, -, disjunctions)....)AU.Note: ... when not yet supported.
Computational Tree Logic (CTL) is the heart of Coccinelle, which takes semantic patches and generalizes them over Rust files. Prior to using this engine, CfR used an ad-hoc method for matching patterns of code. This engine is the same as the one used for Coccinelle for C, with a few minor changes. Most of the changes were idiomatic but to the same effect. More information on the engine and its language (CTL-VW) can be found in the POPL Paper. With a standard engine, each step of the matching process can be logged, allowing us to learn and reuse the same design patterns from Coccinelle for C, including critical test cases.
The expression-dominated nature of Rust makes the matching and transformation process a bit different from that of C. For example, in the following semantic patch:
@@
expression e
@@
-foo(e);
for C, foo(e) would be guaranteed to be present as an immediate child of a block, i.e.:
{ // <- start of a block
foo(e); // <- this statement
}
Blocks in C are present only in specific parts of the Abstract Syntax Tree, like in function definitions, loops, or conditional blocks. However, in Rust, blocks are expressions, which can appear anywhere an expression is allowed. For example:
while { f(&mut a); a > 1 } {
//
}
This makes searching much more computationally intensive. Thus, several optimizations were implemented in CfR to address this problem, including replacing lists in the CTL engine with RefCells and hash tables.
While developing the parser for SmPL, we decided not to reinvent the wheel by writing a parser for the Rust language from scratch. SmPL contains custom syntax such as dots (...), disjunctions, and modifiers (+ and -). In the latest version, we parse only these constructs ourselves and hand off the rest to Rust Analyzer.
A rule refers to a set of changes given an environment. Multiple rules can inherit values from one another to transform code in different parts of a file.
Used in SmPL as ..., ellipses connect two blocks of code:
@@
expression q;
@@
drop_queue(q);
...
pop(q);
This is implemented in CTL using the AU term.
Disjunctions allow for conditional matching:
f1(10);
( // <--- disjunction start
foo(1);
|
bar(10);
) // <--- disjunction end
f2();
Matches either:
f1(10);
foo(1);
f2();
or
f1(10);
bar(10);
f2();
Transforming macros posed a problem because of their non-standard nature. For example, should the following semantic patch match
@@
expression e;
@@
foo!(
- e
+ 2
);
this code?
foo!(a b c)
AND
foo!(a)
To avoid discrepancies, we support only macros which look like function calls. For example, foo!(a, b, c) or foo![a; b; c].
rustfmt and it is then compared with the formatting from the original code. This way only the transformed code is formatted without messing up the original file formatting. Note: Pretty printing is still a work-in-progress for rust macros as they are notoriously hard to deal with and rustfmt thus leaves them alone.Our current aim is to bring Coccinelle For Rust at par with Coccinelle For C in terms of basic functionalities. In the following months we are going on to work on:
If you want to try out CoccinelleForRust it is available on Gitlab. Please feel free to reach out to us at the email addresses on our website CoccinelleForRust, we would be happy to answer your questions!
02/12/2025
As an active member of the freedesktop community, Collabora was busy at XDC 2025. Our graphics team delivered five talks, helped out in…
24/11/2025
LE Audio introduces a modern, low-power, low-latency Bluetooth® audio architecture that overcomes the limitations of classic Bluetooth®…
17/11/2025
Collabora’s long-term leadership in KernelCI has delivered a completely revamped architecture, new tooling, stronger infrastructure, and…
11/11/2025
Collabora extended the AdobeVFR dataset and trained a FasterViT-2 font recognition model on millions of samples. The result is a state-of-the-art…
31/10/2025
Collabora has advanced Monado's accessibility by making the OpenXR runtime supported by Google Cardboard and similar mobile VR viewers so…
27/10/2025
By resolving critical synchronization bugs in Zink’s Vulkan–OpenGL interop, Faith Ekstrand paved the way for Zink+NVK to become the default…
Comments (0)
Add a Comment