Persian Rug, Part 2 - Other ways to make object soups in Rust

Persian Rug, Part 2 - Other ways to make object soups in Rust

Edmund Smith
September 27, 2023

Share this post:

Reading time:

In the first part of this series, we looked at a basic pattern, where two types of objects refer to one another, and claimed that it is hard to model in Rust. Patterns like these are common when modeling things from the real world. In this part we'll follow up in more detail and discuss a wider range of approaches.

Why is this hard?

We can solve problems like this easily in C++, it seems clear and straightforward how to do so. Let's go back to the C++ example we gave:

struct group;
struct user {
    std::vector<std::shared_ptr> groups;
};
struct group {
    std::optional<std::shared_ptr> leader;
};

Why doesn't a Rust analogue of this exist? Look at this snippet of compilable C++ which uses these types:

auto u = std::make_shared();
auto g1 = std::make_shared();
u->groups.push_back(g1);
auto & first_group = u->groups[0];
auto g2 = std::make_shared();
u->groups.push_back(g2);
first_group->leader = u;

First we take a reference to g1 via the shared_ptr held inside u, and call it first_group. Then we push to the vector of groups inside u, which might reallocate. Now our reference to first_group may dangle, and then using it on the next line will crash (or even worse, corrupt memory). Rust will not allow this to happen: mutable references to data are only ever available when no other references exist, so the call to push_back on line 6, requiring mutable access, would not compile.

But when we interact with an object in a soup, each object is likely to have at least one other handle (something like a reference or a pointer) pointing to it from other objects. That in turn makes it very difficult for Rust to provide a mutable reference to the object, because nothing prevents you from obtaining a second reference to it from one of the other handles.

Approaches

Use shared references

As we showed in the previous part, you could use std::sync::Arc` as the closest correlate of C++'s std::shared_ptr:

use std::sync::Arc;
struct User {
    groups: Vec<Arc<Group>>,
}
struct Group {
    leader: Option<Arc<User>>,
}

Using Arc on its own, we can represent immutable trees of objects, provided we construct the nodes in the right order, because making trees doesn't require previously constructed objects to be edited. With care we can do slightly more, because there's the possibility of mutation of an object when only one Arc pointing to that object exists.

So, to use Arc, we may have to remove some links between objects, like removing the leader field in Group for example. The more links we try to save, the more careful we'll have to be about the order of construction. In the end, we'll still have a largely immutable graph, and no amount of care will allow us to represent everything we might like.

Duplicate your data

Trees of objects are far more amenable to the sort of safety guarantees that Rust provides. There's one clear owner of a value, and you can only access data through a reference to that value. So we could write:

struct User {
    groups: Vec<Group> 
}
struct Group {
    leader: Option<User>
}

Now we have expanded our graph into a tree, by duplicating Users and Groups as needed. Mutation is allowed, but different copies may skew.

Use Mutexes

In Rust, a Mutex permits converting a shared reference to _itself_ into a mutable reference to its _contents_. A Mutex will block the thread rather than return a mutable reference when other references exist. This also explains why a Rust Mutex is not reentrant: if it were, we could create dangling references.

struct User {
    groups: Vec<Arc<Mutex<Group>>>,
}
struct Group {
    user: Option<Arc<Mutex<User>>>
}

This approach gives a fully mutable collection of objects, with arbitrary links between them. There will be no unchecked memory errors. So why not use this?

Let's imagine traversing our collection. To look at each node's contents, even to read it, we need to obtain its mutex. That means during traversal we'll build up a chain of mutexes that we hold. If we visit the same node twice, we'll get deadlock. If we have two threads traversing the collection and they obtain locks in different orders, we'll get deadlock. What's more, in a soup there is no obvious lock order we can try to maintain.

Try interior mutability

The final basic approach to this problem is to use *interior mutability* via `RefCell`. If you think of `Mutex` blocking threads to enforce Rust's safety rules, then `RefCell` crashes your program to achieve the same thing. The code might look like this:


struct User {
    groups: Vec<Rc<RefCell<Group>>>,
}
struct Group {
    user: Option<Rc<RefCell<User>>>
}

Just like with a Mutex, with RefCell we can ask for references to the data wherever we wish, and the code will compile. But if we ask for a mutable reference to an object when there are other references in existence, our program will panic. Like everything well-formed in Rust, RefCell cannot permit a mutable reference to data to co-exist with other references to that data.

Once again, there will be no unchecked errors. But now we are responsible for ensuring that we never try to create a mutable reference to data that is referenced elsewhere. This will not be checked for us by the compiler. We have again committed ourselves to a difficult contract to follow in our code.

In the next part, we'll look in more detail at persian-rug: how it works and the trade-offs it offers at present. After that, we'll talk about ways to improve upon its limitations in the future.

Persian Rug - It really ties the Rust room together

A roadmap for VirtIO Video on Chrome OS, part 1

Linux kernel 6.5: USB4v2 and Wifi7 have arrived

Persian Rug - It really ties the Rust room together

A roadmap for VirtIO Video on Chrome OS, part 1

Linux kernel 6.5: USB4v2 and Wifi7 have arrived

Comments (0)

Add a Comment

Search the newsroom

Latest Blog Posts

Coccinelle for Rust progress report

25/06/2025

In collaboration with Inria, the French Institute for Research in Computer Science and Automation, Tathagata Roy shares the progress made…

Linux Media Summit 2025 recap

23/06/2025

Last month in Nice, active media developers came together for the annual Linux Media Summit to exchange insights and tackle ongoing challenges…

Constructor acquires, destructor releases

09/06/2025

In this final article based on Matt Godbolt's talk on making APIs easy to use and hard to misuse, I will discuss locking, an area where…

What if C++ had decades to learn?

21/05/2025

In this second article of a three-part series, I look at how Matt Godbolt uses modern C++ features to try to protect against misusing an…

Unleashing gst-python-ml: Python-powered ML analytics for GStreamer pipelines

12/05/2025

Powerful video analytics pipelines are easy to make when you're well-equipped. Combining GStreamer and Machine Learning frameworks are the…

Matt Godbolt sold me on Rust (by showing me C++)

06/05/2025

Gustavo Noronha helps break down C++ and shows how that knowledge can open up new possibilities with Rust.

About Collabora

Whether writing a line of code or shaping a longer-term strategic software development plan, we'll help you navigate the ever-evolving world of Open Source.

한국의 국기 한국어 버전의 Collabora.com 보기