
Coccinelle for Rust progress report

Tathagata Roy
June 25, 2025

In collaboration with Inria (the French Institute for Research in Computer Science and Automation), Tathagata Roy shares the progress made over the past year on the CoccinelleForRust project, co-sponsored by Collabora.

Coccinelle is a tool for automatic program matching and transformation that was originally developed for making large-scale changes to the Linux kernel source code (i.e., C code). Matches and transformations are driven by user-specified transformation rules in the form of abstracted patches, referred to as semantic patches. As the Linux kernel, and systems software more generally, is starting to adopt Rust, we are developing Coccinelle for Rust (CfR) to make the power of Coccinelle available to Rust codebases.

Example usage

This diff illustrates a patch in which the type_of function was being called before confirming that the target item’s trait was implemented. A straightforward CfR-based fix is to find every expression of the form

expression.type_of(impl_id)

and replace it with

expression.type_of(impl_id).subst_identity()

There are roughly fifty occurrences of this pattern in the diff, so updating them all by hand would be quite tedious. The semantic patch that performs this transformation automatically is:

@change_ty_of@
expression exp1, impl_id;
@@

-exp1.type_of(impl_id)
+exp1.type_of(impl_id).subst_identity()

While the above example could in principle be handled with a complicated, hard-to-read regular expression, transformations can quickly become more complex.

The following patch changes a function signature and all the related calls.

@change_sig@
expression x;
identifier fname, sname;
@@

impl sname {
    ...
    pub(crate) fn fname(&self,
-       guard: &RevocableGuard<'_>
    ) -> Result { ... }
    ...
}

@modify_calls@
expression x, guard;
identifier change_sig.fname;
@@

x.fname(
    ...
-   guard
)

Rule change_sig finds every method that takes a guard parameter of type &RevocableGuard<'_> and removes that parameter. Rule modify_calls, which inherits fname from change_sig, then updates the calls to those methods accordingly.
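As a rough illustration of the intended effect, the sketch below shows a small hypothetical module before and after applying the two rules. Everything except the name RevocableGuard is invented for this example (the kernel module, the Error type, the Result alias, Device, reset and probe), and the "after" module is the result one would expect from the rules as described, not captured tool output.

```rust
// Hypothetical stand-ins for the real types; only the name `RevocableGuard`
// comes from the semantic patch.
pub mod kernel {
    use std::marker::PhantomData;

    pub struct RevocableGuard<'a>(PhantomData<&'a ()>);

    impl<'a> RevocableGuard<'a> {
        pub fn new() -> Self {
            RevocableGuard(PhantomData)
        }
    }

    #[derive(Debug, PartialEq)]
    pub struct Error;

    pub type Result<T = ()> = core::result::Result<T, Error>;
}

#[allow(unused_variables)] // `guard` is deliberately unused in `reset`
pub mod before {
    use super::kernel::{Result, RevocableGuard};

    pub struct Device;

    impl Device {
        // Matched by `change_sig`: `guard` is the only parameter after `&self`.
        pub(crate) fn reset(&self, guard: &RevocableGuard<'_>) -> Result {
            Ok(())
        }
    }

    pub fn probe(dev: &Device, guard: &RevocableGuard<'_>) -> Result {
        dev.reset(guard) // matched by `modify_calls`
    }
}

#[allow(unused_variables)] // `probe` still receives a guard it no longer uses
pub mod after {
    use super::kernel::{Result, RevocableGuard};

    pub struct Device;

    impl Device {
        // The `guard` parameter has been removed from the signature.
        pub(crate) fn reset(&self) -> Result {
            Ok(())
        }
    }

    // `probe` is a free function, so `change_sig` leaves its signature alone;
    // only the call to `reset` changes.
    pub fn probe(dev: &Device, guard: &RevocableGuard<'_>) -> Result {
        dev.reset()
    }
}
```

Both modules compile and behave the same; the only differences are the signature of reset and the argument list at its call site.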

This semantic patch can be applied to a whole codebase to remove a guard parameter once it is no longer needed. It can also serve as an integration check that no such code is introduced in new pull requests.

Developments (2024–Present)

CTL Engine

  1. Core pattern-matching engine using CTL.
  2. Shared with C version, adapted for Rust’s expression-heavy syntax.
  3. Performance optimized with RefCells and hash tables.

SmPL Parsing

  1. SmPL includes Rust + custom syntax (..., +, -, disjunctions).
  2. Custom parser for SmPL constructs; uses Rust Analyzer for Rust code.

Rules

  1. Define code transformations scoped to an environment.
  2. Support for rule inheritance.
  3. Rule dependencies not yet implemented.

Language Features

Ellipses (...)

  1. Connects code segments within the same control flow.
  2. Uses CTL AU.
  3. Higher complexity due to Rust’s flexible syntax.

Note: the ... when construct is not yet supported.

Disjunctions

  1. Alternative code match options.
  2. Can combine with ellipses for complex patterns.

Developments in detail: 2024–Present

CTL Engine

Computation Tree Logic (CTL) is at the heart of Coccinelle: the engine takes semantic patches and applies them across Rust files. Prior to adopting this engine, CfR used an ad hoc method for matching code patterns. The engine is the same as the one used in Coccinelle for C, with a few minor changes, most of which make the code more idiomatic without changing its effect. More information on the engine and its language (CTL-VW) can be found in the POPL paper. With a standard engine, each step of the matching process can be logged, allowing us to learn from and reuse the design patterns of Coccinelle for C, including critical test cases.

The expression-dominated nature of Rust makes the matching and transformation process a bit different from that of C. For example, in the following semantic patch:

@@
expression e;
@@

-foo(e);

for C, foo(e) would be guaranteed to be present as an immediate child of a block, i.e.:

{           // <- start of a block
    foo(e); // <- this statement
}

Blocks in C are present only in specific parts of the Abstract Syntax Tree, like in function definitions, loops, or conditional blocks. However, in Rust, blocks are expressions, which can appear anywhere an expression is allowed. For example:

while { f(&mut a); a > 1 } {
    //
}

This makes searching much more computationally intensive. Thus, several optimizations were implemented in CfR to address this problem, including replacing lists in the CTL engine with RefCells and hash tables.
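To make the difference concrete, the following self-contained sketch shows the same call statement, f(&mut a);, sitting inside blocks that appear as a loop condition, a let initializer and a function argument. These are all places a matcher has to look in Rust. The function names are made up for illustration and are not part of CfR.

```rust
/// Halves the value in place.
pub fn f(a: &mut i32) {
    *a /= 2;
}

/// A block used as a `while` condition: `f(&mut a);` is a statement inside
/// an expression rather than a direct child of a function or loop body.
pub fn loop_condition_block(mut a: i32) -> u32 {
    let mut iterations = 0;
    while { f(&mut a); a > 1 } {
        iterations += 1;
    }
    iterations
}

/// A block used as a `let` initializer.
pub fn let_initializer_block(mut a: i32) -> i32 {
    let b = { f(&mut a); a + 1 };
    b
}

/// A block used as a function argument.
pub fn argument_block(mut a: i32) -> i32 {
    i32::abs({ f(&mut a); a })
}
```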

Semantic Patch Language (SmPL)

While developing the parser for SmPL, we decided not to reinvent the wheel by writing a parser for the Rust language from scratch. SmPL contains custom syntax such as dots (...), disjunctions, and modifiers (+ and -). In the latest version, we parse only these constructs ourselves and hand off the rest to Rust Analyzer.

Rules

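A rule describes a set of changes to apply within an environment, that is, the metavariable bindings in effect when the rule runs. Rules can inherit bound values from one another, which makes it possible to transform related code in different parts of a file.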
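One way to picture inheritance is as a lookup into the bindings recorded by an earlier rule. The sketch below is only a mental model, not CfR's actual data structures: Bindings, bind, inherit and inherited_fname are hypothetical names, and the bound value read_register is made up. It mirrors the "identifier change_sig.fname;" declaration from the earlier example.

```rust
use std::collections::HashMap;

/// Bindings produced by one rule: metavariable name -> matched source text.
pub type Env = HashMap<String, String>;

/// Bindings of all rules that have already run, keyed by rule name.
#[derive(Default)]
pub struct Bindings {
    by_rule: HashMap<String, Env>,
}

impl Bindings {
    /// Records that `rule` bound `metavar` to `value`.
    pub fn bind(&mut self, rule: &str, metavar: &str, value: &str) {
        self.by_rule
            .entry(rule.to_string())
            .or_default()
            .insert(metavar.to_string(), value.to_string());
    }

    /// Resolves an inherited metavariable written as `rule.metavar`.
    /// Returns `None` if the name is malformed or nothing was bound.
    pub fn inherit(&self, qualified: &str) -> Option<&str> {
        let (rule, metavar) = qualified.split_once('.')?;
        self.by_rule.get(rule)?.get(metavar).map(String::as_str)
    }
}

/// `change_sig` binds `fname`; a later rule looks it up by qualified name.
pub fn inherited_fname() -> Option<String> {
    let mut bindings = Bindings::default();
    bindings.bind("change_sig", "fname", "read_register");
    bindings.inherit("change_sig.fname").map(str::to_string)
}
```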

Language features

Dots / Ellipses

Used in SmPL as ..., ellipses connect two blocks of code:

@@
expression q;
@@

drop_queue(q);
...
pop(q);

This is implemented in CTL using the AU term.
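For intuition, A[p U q] holds at a node when, along every path from that node, p holds until a node satisfying q is reached. The sketch below computes the set of such nodes on a small hand-built control-flow graph using the textbook fixpoint iteration. It is not CfR's engine, which also has to track metavariable bindings; Cfg, check_au, every_path_reaches_pop and the node numbering are illustrative assumptions.

```rust
use std::collections::HashSet;

/// A tiny control-flow graph: `succ[n]` lists the successors of node `n`.
pub struct Cfg {
    pub succ: Vec<Vec<usize>>,
}

/// Returns the nodes satisfying A[p U q]: on every outgoing path, `p` holds
/// at each node until a node in `q` is reached. Computed as a least fixpoint.
/// Assumption: a node without successors satisfies AU only if it is in `q`.
pub fn check_au(cfg: &Cfg, p: &HashSet<usize>, q: &HashSet<usize>) -> HashSet<usize> {
    let mut sat = q.clone();
    loop {
        let mut changed = false;
        for node in 0..cfg.succ.len() {
            if sat.contains(&node) || !p.contains(&node) {
                continue;
            }
            let succs = &cfg.succ[node];
            // A node is added once all of its successors are already known
            // to satisfy the formula.
            if !succs.is_empty() && succs.iter().all(|s| sat.contains(s)) {
                sat.insert(node);
                changed = true;
            }
        }
        if !changed {
            return sat;
        }
    }
}

/// Node 0 stands for `drop_queue(q);` and node 3 for `pop(q);`.
/// Returns true if every path from node 0 reaches node 3.
pub fn every_path_reaches_pop(succ: Vec<Vec<usize>>) -> bool {
    let cfg = Cfg { succ };
    let p: HashSet<usize> = (0..cfg.succ.len()).collect(); // "any statement"
    let q: HashSet<usize> = HashSet::from([3]);
    check_au(&cfg, &p, &q).contains(&0)
}
```

With edges 0 to {1, 2}, 1 to 3 and 2 to 3, the check succeeds for node 0. If node 2 instead jumps past node 3, one path never reaches pop(q) and the check fails.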

Disjunctions

Disjunctions allow a pattern to match any one of several alternatives:

f1(10);
(   // <--- disjunction start
foo(1);
|
bar(10);
)   // <--- disjunction end
f2();

Matches either:

f1(10);
foo(1);
f2();

or

f1(10);
bar(10);
f2();
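The semantics can be mimicked with a deliberately simplified matcher over statement strings. This is only a sketch of the idea: CfR matches syntax trees through the CTL engine rather than comparing text, and Pat, pattern_matches and example_pattern are hypothetical names.

```rust
/// One element of a much simplified pattern over statement strings.
pub enum Pat<'a> {
    /// A statement that must appear literally.
    Stmt(&'a str),
    /// A disjunction: any one of the alternatives may appear.
    Disj(Vec<&'a str>),
}

/// Returns true if `code` matches `pattern` statement by statement.
pub fn pattern_matches(pattern: &[Pat<'_>], code: &[&str]) -> bool {
    pattern.len() == code.len()
        && pattern.iter().zip(code).all(|(pat, stmt)| match pat {
            Pat::Stmt(expected) => expected == stmt,
            Pat::Disj(alternatives) => alternatives.iter().any(|alt| alt == stmt),
        })
}

/// The pattern from the text: f1(10); ( foo(1); | bar(10); ) f2();
pub fn example_pattern() -> Vec<Pat<'static>> {
    vec![
        Pat::Stmt("f1(10);"),
        Pat::Disj(vec!["foo(1);", "bar(10);"]),
        Pat::Stmt("f2();"),
    ]
}
```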

Macros

Transforming macros posed a problem because of their non-standard nature. For example, should the following semantic patch match

@@
expression e;
@@

foo!(
- e
+ 2
);

this code?

foo!(a b c)

and

foo!(a)

To avoid such ambiguities, we support only macros that look like function calls, for example foo!(a, b, c) or foo![a; b; c].

Miscellaneous

  1. Pretty printing has been improved. The transformed code is formatted using rustfmt and then compared with the formatting of the original code, so that only the transformed code is reformatted and the rest of the file keeps its original formatting. Note: pretty printing is still a work in progress for Rust macros, which are notoriously hard to handle and which rustfmt therefore leaves alone.
  2. Better tests have been added.
  3. A more robust UI has been implemented, with various debugging flags.

Conclusion

Our current aim is to bring Coccinelle for Rust on par with Coccinelle for C in terms of basic functionality. In the coming months we plan to work on:

  • Rule dependencies - Let a rule run only if a condition on the rules before it is satisfied.
  • Scripting - Lets the user run arbitrary code for each match, enabling things that are currently out of scope, such as counting instances, matching with regular expressions and performing other arbitrary operations.
  • Performance improvements

If you want to try out CoccinelleForRust, it is available on GitLab. Please feel free to reach out to us at the email addresses listed on the CoccinelleForRust website. We would be happy to answer your questions!
