Gustavo Noronha
May 21, 2025
In my previous article I discussed a number of shortcomings of C++ that are essentially solved by Rust when it comes to making your code easy to use and hard to misuse in very basic things. No memory safety involved, just making sure the user of a function cannot swap arguments for quantity and price by mistake.
What inspired me to do that write-up was a great talk by Matt Godbolt explaining several things you can do to make your C++ interfaces more robust, titled Correct by Construction: APIs That Are Easy to Use and Hard to Misuse. You should watch it!
In that article I said that part of the reason Rust was so much better at helping you is probably because it had decades to learn. After all, C++ was first released in the early 80s, Rust in the early 2010s. If you gave C++ decades to learn, certainly the new designs that came out would also be high quality and hard to misuse.
Right?
Today we are looking at an example Matt gave in his talk for preventing misuse of a class after a destructive state transition. Let me lay out the context. Imagine you have a class for managing the compilation of shaders. You can create the object, add as many shaders as you want, compile them, and then query the object for the compiled shaders:
    #include <iostream>
    #include <string>
    #include <vector>

    class CompiledShader {
    public:
        explicit CompiledShader(std::string name) : m_name(name) {};
        std::string m_name;
    };

    class ShaderRegistry {
    public:
        // can only be called before compiling
        void add(const char *shader);

        // once all shaders are added, compile
        // can only be called once
        void compile();

        // get a compiled shader by name.
        // must be compiled and linked!
        const CompiledShader &get_compiled(const char *name) const;

        std::vector<std::string> m_names;
        std::vector<CompiledShader> m_compiled;
    };
There is a smell coming from the code above, and it's the comments. Matt points out that these "apologetic" comments are a good indication that this API is easy to misuse.
You may imagine such a registry would be holding on to certain resources that are dumped or have their ownership moved after compilation, making the object unfit for a new compilation step. More importantly, as a user you can try to get the compiled shader before it has been compiled. His solution to the latter is to split this class in two:
    class CompiledShaders {
    public:
        explicit CompiledShaders(const std::vector<std::string> &names);
        const CompiledShader &get_compiled(const char *name) const;

        std::vector<CompiledShader> m_compiled;
    };

    class ShaderCompiler {
    public:
        void add(const char *shader);

        // Resources used in compilation are
        // transferred to the CompiledShaders:
        // you cannot call compile() twice!!
        CompiledShaders compile() const;

        std::vector<std::string> m_names;
    };
This is a much nicer API, a lot clearer! You simply cannot call get_compiled() before compiling now, as that method is only available in the object you get from compiling. However, there are still a couple of problems. You still have the ShaderCompiler lying around after calling compile(), so nothing stops you from trying to use it again:
    // Wrong
    ShaderCompiler compiler;
    compiler.add("alice");
    auto registry = compiler.compile();
    compiler.add("bob");
    auto wat = compiler.compile();
    auto shader = registry.get_compiled("bob");
    std::cout << shader.m_name << std::endl;
Nothing the compiler can do, really. It doesn't know how to read your comments and nothing in the class definition enforces the rules that add() cannot be called after compile(), or that compile() cannot be called twice. But we should be able to improve on that! And Matt goes for it.
It's C++'s turn to shine. With decades upon decades of lessons learned in language design and almost 3 whole decades of learning with C++ itself, the committee has a great opportunity to address the problem as described. They worked for many years on a new major version of the standard that became known as C++11, as it was released in 2011 — only a few months before Rust's very first public release.
Something that would be amazing for the case we are discussing is a way to express that an object has "moved" — that's one way to conceptually think about what we are doing, we are moving state out of ShaderCompiler into CompiledShaders.
That should not only make it easier to reason about this, as the compiler would know that from that point on the object state is conceptually gone, but it should also help with performance, as the compiler can use that knowledge to move certain data rather than copying — more control over what happens to memory, a big selling point for system languages.
And guess what C++ added in C++11? A major new feature called move semantics. The first building block is the && declarator, which specifies an rvalue reference: in simple terms, a reference to a value that is about to be destroyed when the current statement ends. On top of that they added std::move(), which, as the name implies, moves the… no, wait, that's not right.
What it actually does is cast whatever you give it to && — an rvalue reference. Phew, that's ok then, since we know rvalue references are a way of saying "this will be immediately destroyed, so you can move from it". The solution becomes simple:
    class ShaderCompiler {
    public:
        void add(const char *shader) { m_names.push_back(shader); };

        CompiledShaders compile() && {
            CompiledShaders shaders = CompiledShaders(m_names);
            return shaders;
        }

        std::vector<std::string> m_names;
    };

    int main(void) {
        // Correct
        ShaderCompiler compiler;
        compiler.add("alice");
        compiler.add("bob");
        auto registry = std::move(compiler).compile();
        auto shader = registry.get_compiled("bob");
        std::cout << shader.m_name << std::endl;
    }
Matt changes the definition for compile() so that it becomes an "rvalue method", by suffixing it with &&, and we std::move() the object when calling it, which is now required because of that method suffix. Awesome. So now the compiler should protect us from misusing ShaderCompiler like this:
    // Wrong
    ShaderCompiler compiler;
    compiler.add("alice");
    auto registry = std::move(compiler).compile();
    compiler.add("bob");
    auto wat_bang = std::move(compiler).compile();
    auto shader = registry.get_compiled("bob");
    std::cout << shader.m_name << std::endl;
We called compile() after explicitly calling std::move() on the object. Doing it a second time would be conceptually equivalent to a use after free. We are being intentional and explicit; the compiler should reject this code or, at the very least, complain loudly about use after move. Here's the full compiler output:
    kov@couve > clang++ -std=c++23 -Wall -Wextra -Werror -Wpedantic move/move-3.cpp
    kov@couve >
That's right, it compiles without a single warning. If you go digging you'll learn that at most what you get from this whole thing is the potential for the compiler to notice an opportunity to avoid a copy — it's not even guaranteed.
It's actually very likely not to happen once you know all the "buts". More likely you'll do a whole lot of work, add a bunch of new constructors, sprinkle && and std::move() everywhere and get no benefit at all from it, while still being able to use objects after they have potentially been emptied by the move optimization, when it does happen. Not only that, there have been cases in which using std::move() will prevent optimization from happening!
Well done, C++, it takes a lot of skill to miss such an obvious opportunity by so much. Might as well stick to the smelly comment, to be quite honest.
Matt points out that a static checker like clang-tidy will complain, but really this should be the compiler's job, especially in such an obvious case.
When I think of the C++ committee, I picture a group of people focused on making sure new features will be hard to use correctly and with as many pitfalls as possible, ensuring CppCon gets a consistent stream of hour-long talks about how things will mostly not work as you expect most of the time. To be clear, I'm not angry at Nicolai, I'm angry for him for having to do that talk!
Can you tell this is my main C++ pet peeve?
Let's be honest, improving on what we just saw is not that hard. But this is where we start getting into the more well-known aspect of Rust: its ownership and borrowing model. Rust makes it explicit in the way you refer to a value what is going to happen with it.
This is very easy to see when looking at methods we attach to a struct:
    impl MyType {
        fn read_something(&self) -> Something { ... }
        fn modify_something(&mut self) { ... }
        fn consume(self) { ... }
    }
Just looking at the definitions you have all the information you need about ownership and type of access. read_something() takes a regular reference, known as a borrow, which means non-exclusive, potentially shared access; ownership is not transferred. It's a bit like putting the const marker at the end of a C++ method declaration.
We also know the Something that is being returned is a value that we, as the caller, own, since there is no & attached to it. It's ours to do whatever we need, including handing its ownership to some other function or data structure. If we leave the scope without doing that, it's freed.
On modify_something() we use a mutable reference to self, also known as a mutable borrow, which means the function has exclusive access — we know there are no other references out there, checked by the borrow checker at compile time. Ownership is also not transferred in this case.
Finally, on consume() we get a naked self. This tells us, like in the Something return value discussed above, that the function now owns the value it's being called on, and that the value will be destroyed when the function's scope ends unless the function passes it on or returns it.
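To make the three receiver kinds concrete, here is a minimal sketch. The Inventory type and its methods are hypothetical names made up for this illustration; they are not from Matt's talk or from any library.

```rust
// Hypothetical type, used only to illustrate the three kinds of receiver.
struct Inventory {
    items: Vec<String>,
}

impl Inventory {
    // Shared borrow: can read the list, ownership stays with the caller.
    fn count(&self) -> usize {
        self.items.len()
    }

    // Exclusive (mutable) borrow: can modify, ownership stays with the caller.
    fn add(&mut self, item: &str) {
        self.items.push(item.to_owned());
    }

    // Takes ownership: the Inventory is consumed and its items are handed
    // to the caller as an owned Vec.
    fn into_items(self) -> Vec<String> {
        self.items
    }
}

fn main() {
    let mut inventory = Inventory { items: Vec::new() };
    inventory.add("alice");
    inventory.add("bob");
    println!("{} items", inventory.count());

    let items = inventory.into_items();
    println!("{:?}", items);

    // `inventory` was moved by into_items(). Uncommenting the next line
    // makes the compiler reject the program (error E0382).
    // inventory.add("carol");
}
```

The caller never has to read documentation to learn which calls leave the value usable: the signature alone says so.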
Note that this is not something special to methods and self. Any naked variable or type is an explicit move with transfer of ownership. These rules make it very straightforward to ensure we get the protection we want:
    struct CompiledShaders { ... }

    impl CompiledShaders {
        fn get(&self, name: &str) -> Option<&CompiledShader> { ... }
    }

    struct ShaderCompiler { ... }

    impl ShaderCompiler {
        fn new() -> Self { ... }
        fn add(&mut self, name: String) { ... }
        fn compile(self) -> CompiledShaders { ... }
    }

    fn main() {
        let mut shader_compiler = ShaderCompiler::new();
        shader_compiler.add("alice".to_owned());
        let registry = shader_compiler.compile();

        // Wrong!
        shader_compiler.add("bob".to_owned());
    }
Since we have a naked self as the argument to compile, we know the value represented by shader_compiler is being moved to that function. In other words, the ownership is being transferred into that scope. Once it ends the value is gone and cannot be used. And as you've come to expect, if we try to use it after that, the compiler will be very clear about it not being allowed:
    error[E0382]: borrow of moved value: `shader_compiler`
      --> move/move-1.rs:57:9
       |
    51 |         let mut shader_compiler = ShaderCompiler::new();
       |             ------------ move occurs because `shader_compiler` has type `ShaderCompiler`, which does not implement the `Copy` trait
    ...
    55 |         let registry = shader_compiler.compile();
       |                                        --------- `shader_compiler` moved due to this method call
    56 |
    57 |         shader_compiler.add("bob".to_owned());
       |         ^^^^^^^^^^^^^^^ value borrowed here after move
       |
    note: `ShaderCompiler::compile` takes ownership of the receiver `self`, which moves `shader_compiler`
      --> move/move-1.rs:30:16
       |
    30 |     fn compile(self) -> CompiledShaders {
       |                ^^^^

    error: aborting due to 1 previous error
Simple, explicit, predictable. How is that for move semantics?
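For readers who want something they can compile, here is one possible way to fill in the elided bodies. Everything inside the braces is an assumption made for illustration: the "compilation" step only wraps each name in a CompiledShader, and storing the results in a Vec simply mirrors the C++ version.

```rust
// Stand-in for a real compiled shader; it only remembers its name.
struct CompiledShader {
    name: String,
}

struct CompiledShaders {
    compiled: Vec<CompiledShader>,
}

impl CompiledShaders {
    // Shared borrow: lookups neither modify nor consume the registry.
    fn get(&self, name: &str) -> Option<&CompiledShader> {
        self.compiled.iter().find(|shader| shader.name == name)
    }
}

struct ShaderCompiler {
    names: Vec<String>,
}

impl ShaderCompiler {
    fn new() -> Self {
        ShaderCompiler { names: Vec::new() }
    }

    fn add(&mut self, name: String) {
        self.names.push(name);
    }

    // Consumes the compiler. Each String is moved out of `names` and into
    // its CompiledShader, so no name is copied along the way.
    fn compile(self) -> CompiledShaders {
        let compiled = self
            .names
            .into_iter()
            .map(|name| CompiledShader { name })
            .collect();
        CompiledShaders { compiled }
    }
}

fn main() {
    let mut shader_compiler = ShaderCompiler::new();
    shader_compiler.add("alice".to_owned());
    shader_compiler.add("bob".to_owned());

    let registry = shader_compiler.compile();
    match registry.get("bob") {
        Some(shader) => println!("{}", shader.name),
        None => println!("no such shader"),
    }
}
```

Returning Option from get() also removes the question of what happens when a name is missing, which the C++ reference-returning version leaves open.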
This is where Rust goes from being good to being great, in my opinion. It is also where you transition, as a beginner, from enjoying the guard rails to the infamous fighting the borrow checker.
As C/C++ developers we are used to just throwing pointers around and making sure constraints are met based on our understanding of the code, which is the main source of memory and concurrency bugs, of course. Modern C++ has plenty of tools to help, but you need to be proactive in using them, and use them correctly, as Matt's talk shows. With C you're essentially on your own. Rust, on the other hand, forces you to be intentional and spell things out not just in a correct way, but in a way that it can prove to be correct.
It's outside the scope of this article to go into all the strategies you need to learn to properly structure your code and turn the fight with the borrow checker into a friendly collaboration. But if you are at this stage of the journey I would highly recommend reading Learning Rust With Entirely Too Many Linked Lists. It worked very nicely for my C programmer-wired brain.
There are many videos by the awesome Jon Gjengset going in depth on many of the concepts. He also has videos implementing various libraries and applications that work very well as more concrete examples of how to reason about your designs. How awesome is it that today we can learn from watching experts work and explain? I’d have loved that when I first started learning C on my own back in 1999. It's definitely a great way to follow up on reading the Rust Book.
In the final article that builds on Matt's talk I will discuss one more case that Rust solves very well but that seems cumbersome for beginners, at least initially. For this next one C++ has some good tools that only fall short due to being optional to use, so it won't be a downer like this one, promise!
Finally, I just want to say: C++ standards committee, come on. Get your act together. You have some good things going, but this? This is just sad.
And while I'm at it, C standards committee, please give us a proper string type and bounds-checked slices already? There are some great ideas out there. Thanks in advance.
A note on regular Rust references: many people would refer to this type of reference as read-only, but that is not quite true and you don't really know for sure there are no changes being made to some internal state; the model is actually shared vs exclusive references.
Rust provides several tools to mutate shared state. One way to do that is what's called interior mutability, which allows you to control with more granularity what can be changed, often moving the exclusive access guarantees to runtime. Jon Gjengset has a video going into a lot of detail about this, if you're curious. Just in case you are wondering, Rust makes sure you are intentional and explicit while doing that, as well.
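As a rough illustration of interior mutability, the sketch below uses the standard library's Cell and RefCell. The NameCache type and its fields are hypothetical, invented for this example only.

```rust
use std::cell::{Cell, RefCell};

// Hypothetical type: record() takes `&self` yet still updates internal state.
struct NameCache {
    lookups: Cell<u32>,         // Copy values: plain get/set, no borrowing
    seen: RefCell<Vec<String>>, // borrow rules are checked at runtime
}

impl NameCache {
    fn new() -> Self {
        NameCache {
            lookups: Cell::new(0),
            seen: RefCell::new(Vec::new()),
        }
    }

    // A shared reference is enough here, which is why calling `&self`
    // "read-only" is not quite accurate.
    fn record(&self, name: &str) {
        self.lookups.set(self.lookups.get() + 1);
        self.seen.borrow_mut().push(name.to_owned());
    }
}

fn main() {
    let cache = NameCache::new(); // note: not declared `mut`
    cache.record("alice");
    cache.record("bob");
    println!("{} lookups, seen {:?}", cache.lookups.get(), cache.seen.borrow());

    // Exclusivity still holds, only now it is enforced at runtime: while a
    // shared borrow is alive, asking for an exclusive one fails.
    let reader = cache.seen.borrow();
    assert!(cache.seen.try_borrow_mut().is_err());
    drop(reader);
    assert!(cache.seen.try_borrow_mut().is_ok());
}
```

The opt-in is visible in the types: anyone reading the struct definition sees Cell and RefCell and knows that shared references to it may change its contents.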