We're hiring!
*

Persian Rug, Part 2 - Other ways to make object soups in Rust

Edmund Smith avatar

Edmund Smith
September 27, 2023

Share this post:

Reading time:

In the first part of this series, we looked at a basic pattern, where two types of objects refer to one another, and claimed that it is hard to model in Rust. Patterns like these are common when modeling things from the real world. In this part we'll follow up in more detail and discuss a wider range of approaches.

Why is this hard?

We can solve problems like this easily in C++, it seems clear and straightforward how to do so. Let's go back to the C++ example we gave:

struct group;
struct user {
    std::vector<std::shared_ptr> groups;
};
struct group {
    std::optional<std::shared_ptr> leader;
};

Why doesn't a Rust analogue of this exist? Look at this snippet of compilable C++ which uses these types:

auto u = std::make_shared();
auto g1 = std::make_shared();
u->groups.push_back(g1);
auto & first_group = u->groups[0];
auto g2 = std::make_shared();
u->groups.push_back(g2);
first_group->leader = u;

First we take a reference to g1 via the shared_ptr held inside u, and call it first_group. Then we push to the vector of groups inside u, which might reallocate. Now our reference to first_group may dangle, and then using it on the next line will crash (or even worse, corrupt memory). Rust will not allow this to happen: mutable references to data are only ever available when no other references exist, so the call to push_back on line 6, requiring mutable access, would not compile.

But when we interact with an object in a soup, each object is likely to have at least one other handle (something like a reference or a pointer) pointing to it from other objects. That in turn makes it very difficult for Rust to provide a mutable reference to the object, because nothing prevents you from obtaining a second reference to it from one of the other handles.

Approaches

Use shared references

As we showed in the previous part, you could use std::sync::Arc` as the closest correlate of C++'s std::shared_ptr:

use std::sync::Arc;
struct User {
    groups: Vec<Arc<Group>>,
}
struct Group {
    leader: Option<Arc<User>>,
}

Using Arc on its own, we can represent immutable trees of objects, provided we construct the nodes in the right order, because making trees doesn't require previously constructed objects to be edited. With care we can do slightly more, because there's the possibility of mutation of an object when only one Arc pointing to that object exists.

Decimating a graph into a frozen DAG


So, to use Arc, we may have to remove some links between objects, like removing the leader field in Group for example. The more links we try to save, the more careful we'll have to be about the order of construction. In the end, we'll still have a largely immutable graph, and no amount of care will allow us to represent everything we might like.

Duplicate your data

Trees of objects are far more amenable to the sort of safety guarantees that Rust provides. There's one clear owner of a value, and you can only access data through a reference to that value. So we could write:

struct User {
    groups: Vec<Group> 
}
struct Group {
    leader: Option<User>
}

Now we have expanded our graph into a tree, by duplicating Users and Groups as needed. Mutation is allowed, but different copies may skew.

Use Mutexes

In Rust, a Mutex permits converting a shared reference to _itself_ into a mutable reference to its _contents_. A Mutex will block the thread rather than return a mutable reference when other references exist. This also explains why a Rust Mutex is not reentrant: if it were, we could create dangling references.

struct User {
    groups: Vec<Arc<Mutex<Group>>>,
}
struct Group {
    user: Option<Arc<Mutex<User>>>
}

This approach gives a fully mutable collection of objects, with arbitrary links between them. There will be no unchecked memory errors. So why not use this?

Let's imagine traversing our collection. To look at each node's contents, even to read it, we need to obtain its mutex. That means during traversal we'll build up a chain of mutexes that we hold. If we visit the same node twice, we'll get deadlock. If we have two threads traversing the collection and they obtain locks in different orders, we'll get deadlock. What's more, in a soup there is no obvious lock order we can try to maintain.

Try interior mutability

The final basic approach to this problem is to use *interior mutability* via `RefCell`. If you think of `Mutex` blocking threads to enforce Rust's safety rules, then `RefCell` crashes your program to achieve the same thing. The code might look like this:


struct User {
    groups: Vec<Rc<RefCell<Group>>>,
}
struct Group {
    user: Option<Rc<RefCell<User>>>
}

Just like with a Mutex, with RefCell we can ask for references to the data wherever we wish, and the code will compile. But if we ask for a mutable reference to an object when there are other references in existence, our program will panic. Like everything well-formed in Rust, RefCell cannot permit a mutable reference to data to co-exist with other references to that data.

Once again, there will be no unchecked errors. But now we are responsible for ensuring that we never try to create a mutable reference to data that is referenced elsewhere. This will not be checked for us by the compiler. We have again committed ourselves to a difficult contract to follow in our code.

Next

In the next part, we'll look in more detail at persian-rug: how it works and the trade-offs it offers at present. After that, we'll talk about ways to improve upon its limitations in the future.

Comments (0)


Add a Comment






Allowed tags: <b><i><br>Add a new comment:


Search the newsroom

Latest Blog Posts

Thoughts on PipeWire 1.0 and beyond

06/12/2023

We can now confidently say that PipeWire is here to stay. But of course it is not the end of the journey. There are many new areas to explore…

Persian Rug, Part 3 - The warp and the weft

05/12/2023

Our look at the Rust crate for interconnected objects continues, as we examine how persian-rug really does tie the room together by providing…

Advocating a better Kernel Integration for all

01/12/2023

The testing ecosystem in the Linux kernel has been steadily growing, but are efforts sufficiently coordinated? How can we help developers…

WirePlumber: Exploring Lua scripts with Event Dispatcher

30/10/2023

With the upcoming 0.5 release, WirePlumber's Lua scripts will be transformed with the new Event Dispatcher. More modular and extensible…

A roadmap for VirtIO Video on ChromeOS: part 2

02/10/2023

This second installment explores the Rust libraries Collabora developed to decode video and how these libraries are used within ARCVM to…

Persian Rug, Part 2 - Other ways to make object soups in Rust

27/09/2023

Why is creating object graphs hard in Rust? In part 1, we looked at a basic pattern, where two types of objects refer to one another. In…

Open Since 2005 logo

We use cookies on this website to ensure that you get the best experience. By continuing to use this website you are consenting to the use of these cookies. To find out more please follow this link.

Collabora Ltd © 2005-2023. All rights reserved. Privacy Notice. Sitemap.