Writing Pythonic Rust

Over the past several weeks I have been attempting to reimplement the API of an existing python library as a wrapper for an equivalent library in Rust.

tl;dr: this ended up being much harder than I expected it to be, partly because of important differences in the behaviour of the two languages, and partly because of the (self-imposed) obligation to match an existing (idiomatic) python API.

Motivation

Python is the traditional language of choice for font tools. Popular font editors generally support extensions written in python, and type designers and foundries frequently have extensive collections of scripts and tools for doing font QA, producing proofs, and generating compiled font files.

As we explore writing font tooling in Rust, we would like to be able to continue to support these existing workflows; ideally an existing python script would need only minimal modification in order to continue to work as expected even though the library code it was interacting with would now be written in Rust.

Language differences

The main challenge in this project is working around the fundamental differences between Rust and python, particularly around ownership and mutability. A UFO font object can be thought of as a collection of layers, each of which contains a collection of glyphs, which in turn contain ‘contours’ (Bézier paths) and possibly references to other glyphs. Each of these types (Font, Layer, Glyph, Contour, Point) is an object, with reference semantics. This means you can do the following:

font = Font.open("MyFont.ufo")
glyphA = font.layers.defaultLayer["A"]
point = glyphA.contours[0].points[0]
point.x = 404
assert point.x == glyphA.contours[0].points[0].x

More succinctly: python expects you to share references to things. When you create a binding to a point in a contour, that binding refers to the same data as the original point, and modifying the binding modifies the collection.

This doesn’t really translate to Rust: Rust is much more restrictive about handing out references.

Interior mutability

My initial plan was to just make extensive use of interior mutability, which is a pattern available in Rust for dealing with these sorts of situations. This would require each of the Rust types I would like to expose to python to be behind a shared pointer, with some mechanism for ensuring that access is unique at any given time.

This means converting from something that looks like this,

struct Font {
    layers: Map<String, Layer>,
}

struct Layer {
    glyphs: Map<String, Glyph>,
}

struct Glyph {
    contours: Vec<Contour>,
    components: Vec<Component>,
}

struct Contour {
    points: Vec<Point>,
}

struct Component {
    base_glyph: String,
    transform: AffineTransformation,
}

To something that looks like this:

struct SharedVec<T>(Arc<Mutex<T>>);
struct SharedMap<T>(Arc<Mutex<Map<String, T>>>);

struct Font {
    layers: SharedMap<Layer>,
}

struct Layer {
    glyphs: SharedMap<Glyph>,
}

struct Glyph {
    contours: SharedVec<Contour>,
    components: SharedVec<Component>,
}

// etc.

This was my initial approach, but it started to become pretty verbose, pretty quickly. In hindsight it may have ultimately been simpler than where I did end up, but, well, that’s hindsight.
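To make the idea concrete, here is a minimal std-only sketch of what interior mutability buys us. The `SharedPoint` type and the `demo` function are names I made up for this illustration, not code from the actual wrapper: two handles to the same `Arc<Mutex<..>>` observe each other's writes, which is the reference behaviour the python snippet above relies on.

```rust
use std::sync::{Arc, Mutex};

struct Point {
    x: f64,
    y: f64,
}

// Hypothetical shared handle: cloning it copies the pointer, not the point.
#[derive(Clone)]
struct SharedPoint(Arc<Mutex<Point>>);

impl SharedPoint {
    fn new(x: f64, y: f64) -> Self {
        SharedPoint(Arc::new(Mutex::new(Point { x, y })))
    }

    // Each access takes the lock for just as long as it needs.
    fn position(&self) -> (f64, f64) {
        let point = self.0.lock().unwrap();
        (point.x, point.y)
    }

    // Note `&self`, not `&mut self`: the mutex provides the exclusivity.
    fn set_x(&self, x: f64) {
        self.0.lock().unwrap().x = x;
    }
}

// Mirrors the python example: mutate through one binding, read through both.
fn demo() -> ((f64, f64), (f64, f64)) {
    let original = SharedPoint::new(10.0, 20.0);
    let binding = original.clone(); // second handle to the same data
    binding.set_x(404.0);
    (original.position(), binding.position())
}
```

The cost is that every field of every type ends up wrapped like this, which is where the verbosity comes from.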

Proxy objects

Ultimately, I settled on a different approach. Instead of having actual shared objects that are actually mutated, you have a ‘proxy object’. This is a reference to a single shared object (in this case, the Font object) and then a mechanism (something like a keypath, or something like a lens) to retrieve whatever other object we’d like to reference (such as a layer, glyph, or point) from that object.

In this world, our Layer object looks more like this:

struct FontProxy(Arc<Mutex<Font>>);

struct LayerProxy {
    font: FontProxy,
    layer_name: String,
}

And then we just need some way of retrieving the inner layer object from the font as needed.

Ideally this would look something like this:

impl LayerProxy {
    fn get(&self) -> &Layer {
        self.font.0.lock().unwrap().layers.get(&self.layer_name)
    }

    fn get_mut(&mut self) -> &mut Layer {
        self.font.0.lock().unwrap().layers.get_mut(&self.layer_name)
    }
}

But this doesn’t quite work, for two reasons.

First, because each access of the object represented by the proxy requires acquiring a lock on the underlying font object, we can’t just return a reference. The lock is only held for the scope of the function, so the moment we return we lose the lock, and our reference would be invalid. Instead, we need to do whatever work is required while the lock is held, which we can do easily enough by passing in a closure that takes a &Layer as its argument.

The second problem is that a proxy can become invalid: for instance, if the user is holding on to a proxy object that refers to a layer, but the layer is then deleted, the proxy won’t be able to find the object it refers to. We handle this case by returning an error, which we can convert to an Exception on the python side. Put together, our proxy code becomes:

impl LayerProxy {
    fn with<R>(&self, f: impl FnOnce(&Layer) -> R) -> Result<R, ProxyError> {
        self.font.0.lock().unwrap().layers.get(&self.layer_name)
            .map(f)
            .ok_or_else(|| ProxyError::MissingLayer(self.layer_name.clone()))
    }

    fn with_mut<R>(&mut self, f: impl FnOnce(&mut Layer) -> R) -> Result<R, ProxyError> {
        self.font.0.lock().unwrap().layers.get_mut(&self.layer_name)
            .map(f)
            .ok_or_else(|| ProxyError::MissingLayer(self.layer_name.clone()))
    }
}

With this in place, we can implement our API on top of the proxy object:

impl LayerProxy {
    fn len(&self) -> Result<usize, ProxyError> {
        self.with(|layer| layer.len())
    }

    fn remove_glyph(&mut self, glyph_name: &str) -> Result<(), ProxyError> {
        self.with_mut(|layer| layer.remove_glyph(glyph_name))
    }
}
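For readers who want something they can compile, here is a self-contained sketch of the same pattern using only the standard library. The simplified `Font` and `Layer` types, the `u32` stand-in for glyph data, the layer name, and the `demo` function are all assumptions for this example. It shows both halves of the design: work happens inside the closure while the lock is held, and a proxy whose layer has been deleted reports an error instead of dangling.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};

#[derive(Default)]
struct Layer {
    glyphs: BTreeMap<String, u32>, // stand-in for real glyph data
}

#[derive(Default)]
struct Font {
    layers: BTreeMap<String, Layer>,
}

#[derive(Debug, PartialEq)]
enum ProxyError {
    MissingLayer(String),
}

#[derive(Clone)]
struct FontProxy(Arc<Mutex<Font>>);

struct LayerProxy {
    font: FontProxy,
    layer_name: String,
}

impl LayerProxy {
    // Run `f` against the layer while the font lock is held.
    fn with<R>(&self, f: impl FnOnce(&Layer) -> R) -> Result<R, ProxyError> {
        let font = self.font.0.lock().unwrap();
        font.layers
            .get(&self.layer_name)
            .map(f)
            .ok_or_else(|| ProxyError::MissingLayer(self.layer_name.clone()))
    }

    fn len(&self) -> Result<usize, ProxyError> {
        self.with(|layer| layer.glyphs.len())
    }
}

// Returns the proxy's answer before and after its layer is deleted.
fn demo() -> (Result<usize, ProxyError>, Result<usize, ProxyError>) {
    let mut layer = Layer::default();
    layer.glyphs.insert("A".to_string(), 1);
    let mut font = Font::default();
    font.layers.insert("public.default".to_string(), layer);

    let font = FontProxy(Arc::new(Mutex::new(font)));
    let proxy = LayerProxy {
        font: font.clone(),
        layer_name: "public.default".to_string(),
    };

    let before = proxy.len();
    // Delete the layer behind the proxy's back.
    font.0.lock().unwrap().layers.remove("public.default");
    let after = proxy.len();
    (before, after)
}
```

In this sketch the first call succeeds and the second returns `ProxyError::MissingLayer`, which is the value that would be turned into a python exception.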

A nice property of these proxy objects is that they can be implemented in terms of each other. Just as a glyph is contained by a layer, a GlyphProxy can be represented as a LayerProxy and a glyph name:

struct GlyphProxy {
    layer: LayerProxy,
    glyph_name: String,
}

impl GlyphProxy {
    fn with<R>(&self, f: impl FnOnce(&Glyph) -> R) -> Result<R, ProxyError> {
        self.layer.with(|layer| {
            layer.get(&self.glyph_name)
            .map(f)
            .ok_or_else(|| ProxyError::MissingGlyph(self.glyph_name.clone()))
        })?
    }

    fn with_mut<R>(&mut self, f: impl FnOnce(&mut Glyph) -> R) -> Result<R, ProxyError> {
        self.layer.with_mut(|layer| {
            layer.get_mut(&self.glyph_name)
            .map(f)
            .ok_or_else(|| ProxyError::MissingGlyph(self.glyph_name.clone()))
        })?
    }
}

This… mostly works? but it gets complicated shortly, when we start to deal with lists.

The next object we want to deal with isn’t a single object, but rather the list of contours in a glyph. This is easy enough; we can just reuse the GlyphProxy, wrapping it in a type that represents the field; and then for an individual Contour, we can just store the proxy list, plus the index of a particular contour…

struct GlyphContoursProxy(GlyphProxy);

impl GlyphContoursProxy {
    fn with<R>(&self, f: impl FnOnce(&Vec<Contour>) -> R) -> Result<R, ProxyError> {
        self.0.with(|glyph| f(&glyph.contours))
    }

    fn with_mut<R>(&mut self, f: impl FnOnce(&mut Vec<Contour>) -> R) -> Result<R, ProxyError> {
        self.0.with_mut(|glyph| f(&mut glyph.contours))
    }
}

struct ContourProxy {
    contours: GlyphContoursProxy,
    idx: usize,
}

And… things start to get a bit tricky here.

Consider the following code:

glyph = myFont["a"]
contour = glyph.contours[0]
glyph.contours.insert(0, Contour())

If we’re using indexes to identify our objects, we now have a problem: the contour at index 0 exists, but it is not the contour that we expect.

Headaches

This issue with ‘proxy validity’ was one of several annoying and slightly subtle issues I ran into during this project. They were all more-or-less addressable, but they ended up requiring more code and more bookkeeping than I had expected.

Some of the more interesting complications:

This issue with index validity

In this particular case, the solution is to augment all of our types with an additional identifier; this is just a token that uniquely identifies a particular object. When we create a proxy, we copy over this token, and then when we access the object, we check to make sure that the tokens match and return a ProxyError if not. This works, but it requires an additional level of care, and also introduces an additional layer of complexity: we have to ensure that we assign new identifiers when objects are cloned, as well as ensuring we use the same identifiers for new proxy objects that are supposed to refer to the same underlying object.
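A rough sketch of the token idea, under some simplifying assumptions: the token is a `u64` handed out by a global counter, and the proxy is given the contour list directly instead of going through the locked font. `ContourRef`, `fresh_id`, `StaleContour`, and `demo` are hypothetical names for this example only.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical identity token source: every new contour gets a fresh value.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

fn fresh_id() -> u64 {
    NEXT_ID.fetch_add(1, Ordering::Relaxed)
}

struct Contour {
    id: u64,
    points: Vec<(f64, f64)>,
}

impl Contour {
    fn new(points: Vec<(f64, f64)>) -> Self {
        Contour { id: fresh_id(), points }
    }
}

// A clone is a distinct object, so it must not inherit the original's token.
impl Clone for Contour {
    fn clone(&self) -> Self {
        Contour::new(self.points.clone())
    }
}

#[derive(Debug, PartialEq)]
enum ProxyError {
    StaleContour { idx: usize },
}

// Remembers both where the contour was and which contour it was.
struct ContourRef {
    idx: usize,
    id: u64,
}

impl ContourRef {
    fn new(contours: &[Contour], idx: usize) -> Option<Self> {
        contours.get(idx).map(|c| ContourRef { idx, id: c.id })
    }

    // Only run `f` if the contour at our index is still the one we captured.
    fn with<R>(
        &self,
        contours: &[Contour],
        f: impl FnOnce(&Contour) -> R,
    ) -> Result<R, ProxyError> {
        match contours.get(self.idx) {
            Some(c) if c.id == self.id => Ok(f(c)),
            _ => Err(ProxyError::StaleContour { idx: self.idx }),
        }
    }
}

// Same sequence as the python example: take a proxy, then insert at index 0.
fn demo() -> (Result<usize, ProxyError>, Result<usize, ProxyError>) {
    let mut contours = vec![Contour::new(vec![(0.0, 0.0), (1.0, 1.0)])];
    let proxy = ContourRef::new(&contours, 0).unwrap();
    let before = proxy.with(&contours, |c| c.points.len());
    contours.insert(0, Contour::new(Vec::new()));
    let after = proxy.with(&contours, |c| c.points.len());
    (before, after)
}
```

This version simply reports the proxy as stale after the insert. A friendlier variant could search the list for the matching token and update the index, at the cost of a linear scan.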

Not all objects are proxy objects

This proxy object approach works fine if you’re just loading a font and manipulating it, but what if you’re creating new objects? It is totally reasonable to have python code that looks something like:

glyphA = Glyph("A")
glyphB = Glyph("B")
layer = Layer(glyphs=[glyphA, glyphB])
font.addLayer("extra.layer", layer)

In this code, neither the glyphs nor the layer can be a proxy object when they’re initialized, because what font would they belong to? This was one of the next major complications: most objects need to be either a proxy object or a concrete object.

This means that our code for GlyphProxy starts to look more like this:

enum GlyphInner {
    Layer { layer: LayerProxy, name: String },
    Concrete(Arc<Mutex<Glyph>>),
}

struct GlyphProxy(GlyphInner);

impl GlyphProxy {
    fn with<R>(&self, f: impl FnOnce(&Glyph) -> R) -> Result<R, ProxyError> {
        match &self.0 {
            // Proxy case: go through the layer proxy, then look up the glyph by name.
            GlyphInner::Layer { layer, name } => layer.with(|layer| {
                layer
                    .get(name)
                    .map(f)
                    .ok_or_else(|| ProxyError::MissingGlyph(name.clone()))
            })?,
            // Concrete case: we own the glyph, so just lock it.
            GlyphInner::Concrete(glyph) => Ok(f(&glyph.lock().unwrap())),
        }
    }
    // etc
}

Fortunately, much of this code can be produced by macro, but… it’s still starting to get pretty complicated.
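Here is a compilable, heavily simplified sketch of the proxy-or-concrete split. As an assumption to keep it short, the ‘font’ is reduced to a shared map of glyphs, and `GlyphHandle`, `SharedGlyphs`, `widen`, and `demo` are names invented for the example. The point is only that callers use one `with_mut` method regardless of which kind of glyph they hold.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};

struct Glyph {
    width: f64,
}

#[derive(Debug, PartialEq)]
enum ProxyError {
    MissingGlyph(String),
}

// Simplification: the "font" here is just a shared map of glyphs.
type SharedGlyphs = Arc<Mutex<BTreeMap<String, Glyph>>>;

enum GlyphHandle {
    // Lives inside a font: looked up by name on every access.
    InFont { glyphs: SharedGlyphs, name: String },
    // Free-standing glyph that has not been added to any font yet.
    Concrete(Arc<Mutex<Glyph>>),
}

impl GlyphHandle {
    fn with_mut<R>(&self, f: impl FnOnce(&mut Glyph) -> R) -> Result<R, ProxyError> {
        match self {
            GlyphHandle::InFont { glyphs, name } => {
                let mut glyphs = glyphs.lock().unwrap();
                glyphs
                    .get_mut(name)
                    .map(f)
                    .ok_or_else(|| ProxyError::MissingGlyph(name.clone()))
            }
            GlyphHandle::Concrete(glyph) => Ok(f(&mut *glyph.lock().unwrap())),
        }
    }
}

// An example mutation: widen the glyph and report the new width.
fn widen(glyph: &mut Glyph) -> f64 {
    glyph.width += 100.0;
    glyph.width
}

fn demo() -> Vec<Result<f64, ProxyError>> {
    let free = GlyphHandle::Concrete(Arc::new(Mutex::new(Glyph { width: 500.0 })));

    let mut map = BTreeMap::new();
    map.insert("A".to_string(), Glyph { width: 600.0 });
    let glyphs: SharedGlyphs = Arc::new(Mutex::new(map));
    let in_font = GlyphHandle::InFont { glyphs: glyphs.clone(), name: "A".to_string() };
    let missing = GlyphHandle::InFont { glyphs, name: "B".to_string() };

    vec![
        free.with_mut(widen),
        in_font.with_mut(widen),
        missing.with_mut(widen),
    ]
}
```

Only the proxy variant can fail with a missing-glyph error, but both variants have to share the fallible signature so that the python-facing methods can be written once.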

Being pythonic is tricky

A fundamental goal of this project was matching the existing API, to the point where the main development goal was trying to pass the existing test suite with only minimal modifications (for instance, giving up on tests that required object identity, which doesn’t work with proxy objects).

The existing API makes extensive use of various python idioms: various objects have dictionary semantics, other objects have list semantics; some methods accept either a concrete type or a python dictionary. In the context of the original library, these are entirely reasonable decisions, but each one of them is a hassle in the context of API compatibility.

Collections are hard

Let’s say you have a simple type in rust that contains a Vec<Point>. Using the excellent pyo3 library, you can easily write python bindings that expose a getter and setter for this field:

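use pyo3::prelude::*;

#[pyclass]
#[derive(Clone)] // Clone lets pyo3 extract Point values out of a python list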
struct Point {
    x: i32,
    y: i32,
}

#[pyclass]
struct Thing {
    items: Vec<Point>,
}

#[pymethods]
impl Thing {
    #[new]
    fn new(items: Option<Vec<Point>>) -> Self {
        Thing {
            items: items.unwrap_or_default(),
        }
    }
    #[getter]
    fn get_items(&self) -> Vec<Point> {
        self.items.clone()
    }

    #[setter]
    fn set_items(&mut self, items: Vec<Point>) {
        self.items = items;
    }
}

#[pymethods]
impl Point {
    #[new]
    fn new(x: i32, y: i32) -> Self {
        Point {
            x, y
        }
    }

    #[getter]
    fn get_x(&self) -> i32 {
        self.x
    }

    #[setter]
    fn set_x(&mut self, val: i32) {
        self.x = val;
    }

    #[getter]
    fn get_y(&self) -> i32 {
        self.y
    }

    #[setter]
    fn set_y(&mut self, val: i32) {
        self.y = val;
    }
}

This looks nice: pyo3 handles the conversions for us, so that, from the python side, we can just do:

thing = Thing([Point(42, 5)])
assert thing.items[0].x == 42
thing.items = [Point(0, 0), Point(1, 1)]
assert thing.items[-1].y == 1

The nice thing here is that we pass a list of Point objects on the python side, and it is converted to a Vec<Point> on the rust side.

Unfortunately, this doesn’t really behave like we would expect. In particular, trying to mutate items through the getter won’t work:

thing = Thing([Point(42, 5)])
thing.items[0].x += 5
assert thing.items[0].x == 47 # fails!

The problem is that thing.items is always returning a new list, containing new objects.

Unless I’m missing something, it doesn’t feel like there’s a great solution to this, unless you use proxy objects of some kind to represent the collection.
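The same effect can be reproduced in plain Rust, without pyo3 in the picture at all. This is only an analogy (the `demo` function is made up for the illustration), but it is essentially what the generated getter does on every attribute access: it clones the list, so any mutation lands on the clone.

```rust
#[derive(Clone, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

struct Thing {
    items: Vec<Point>,
}

impl Thing {
    // Same shape as the getter above: hands back a copy of the list.
    fn get_items(&self) -> Vec<Point> {
        self.items.clone()
    }
}

// Returns the point as seen through the mutated copy, then as stored in `thing`.
fn demo() -> (Point, Point) {
    let thing = Thing { items: vec![Point { x: 42, y: 5 }] };

    // Equivalent of `thing.items[0].x += 5` on the python side.
    let mut copy = thing.get_items();
    copy[0].x += 5;

    // The copy changed; the original did not.
    (copy[0].clone(), thing.get_items()[0].clone())
}
```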

Value versus reference semantics, more generally

Collections are an illustration of a bigger issue, which is around value versus reference semantics. I think this is probably the most important thing to consider when designing a python API on top of rust: when do you want things to behave like values (where creating a new binding copies the object) and when do you want things to act like references (where new bindings reference the same underlying object)?

The semantic mismatch between the two languages really encourages value types. Reference types are a pain, but they are important for things like collections. In some cases you can avoid exposing collections altogether, and just provide methods like setItem(idx, item), removeItem(idx) and getItem(idx). This won’t feel quite as pythonic as the alternatives, but it can save a lot of headaches; it becomes clear that mutating some item involves first getting the item, then doing the mutation, and then setting the item again.
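As a sketch of what such a method-based API might look like on the rust side (written in plain Rust here; the snake_case method names, `IndexError`, and `demo` are my own choices for the example, and a real binding would add the pyo3 attributes):

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

#[derive(Debug, PartialEq)]
struct IndexError(usize);

struct Thing {
    items: Vec<Point>,
}

impl Thing {
    // Values go out by copy...
    fn get_item(&self, idx: usize) -> Result<Point, IndexError> {
        self.items.get(idx).copied().ok_or(IndexError(idx))
    }

    // ...and come back in explicitly.
    fn set_item(&mut self, idx: usize, item: Point) -> Result<(), IndexError> {
        let slot = self.items.get_mut(idx).ok_or(IndexError(idx))?;
        *slot = item;
        Ok(())
    }

    fn remove_item(&mut self, idx: usize) -> Result<Point, IndexError> {
        if idx < self.items.len() {
            Ok(self.items.remove(idx))
        } else {
            Err(IndexError(idx))
        }
    }
}

// The get, mutate, set round trip described above.
fn demo() -> Result<Point, IndexError> {
    let mut thing = Thing { items: vec![Point { x: 42, y: 5 }] };
    let mut point = thing.get_item(0)?; // get a copy
    point.x += 5; // mutate the local value
    thing.set_item(0, point)?; // write it back
    thing.get_item(0)
}
```

Nothing here needs a lock or a proxy, because no reference to the inner list ever escapes.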

Learnings

Ultimately, the thing I was trying to achieve (fully reimplement an existing idiomatic python library on top of an existing rust library) is not something very many people should be attempting. Most people who are trying to use rust from python have a more specific goal: speeding up some particular piece of code, for instance.

Getting this working was annoying, and I’m not very happy with the result. I haven’t written much python in the past five years or so, and if I were more comfortable there I think I would probably have made certain better choices, and that might have made things easier; but probably only marginally easier.

My main conclusion is pretty straightforward. If you wish to expose a python API from rust, you should think carefully about the design of that API ahead of time. Some good questions to ask:

How much API do I need to expose? The less API you need to write, the easier your life will be.
How much does my API need to use python collections? python collections really are a challenge, especially if you want to cross the FFI barrier with them; you will either need to use interior mutability on the rust side, or you will need to use some sort of proxy object.
Can I limit the depth of my object graph? If you have an object that contains a list of other objects, and those inner objects also have child objects, then you will need interior mutability or a proxy type at each of those levels. If you have an object with fields and the fields are value types, things are much easier.
What should be in python, and what should be in rust, and what should the contact points be? I have spent ~5 years writing python and I have spent ~5 years writing rust. I like rust a lot! But I do not think it should be controversial to say that python is the better language for writing python. Where possible, you should limit the use of rust to those specific places where it is helpful. Equally important, you should limit the points at which you need to move between the two languages: each time you need to convert a python type to rust, or convert a rust type back to python, you are going to incur both computational overhead as well as cognitive overhead. Ideally you would be able to find one or two places where you needed to move some state into the rust side, and then one or two places where you needed to move that state back into python.

Finally

This was definitely a mixed experience. On the positive side, it is extremely easy and ergonomic to write a python module in Rust. On the downside, it is much harder than I had expected to expose an interface that felt truly at home in python.

Thanks

I found this work frustrating enough that when I finally had (mostly) finished this writeup, I was most eager to just forget about it and move on to something else; perhaps an example of the general phenomenon of publication bias. I did share a draft with a few collaborators, and their encouragement was enough to motivate me to polish it up into this blog post, so thank you to Dave Crossland, Behdad Esfahbod, and Fredrick Brennan for their encouragement, and thank you to Google Fonts for funding this work.