Episode #263: Immutable Enumerators with Tom Stuart

Upgrade to download episode video.

Episode Script

I think the first time Tom Stuart blew my mind it was with his article “programming with nothing”, which showed how to derive an entire programming system—including numbers and control flow operators—using nothing but Ruby procs taking single arguments. He then went on to write the book “Understanding Computation”, which I highly recommend.

Today he's out to blow minds once again, with a fascinating exploration of how we can use Ruby's enumerators to expose and operate on immutable collections. Enjoy!

Immutability is the default in some functional programming languages. But in Ruby most objects are mutable, and that's by design, because mutability is baked into the way most of us think about object-oriented programming.

Sometimes it's useful for values to be immutable. Knowing that an object won't change can make our code easier to reason about, and shared mutable state can be hard to get right in general, even in single-threaded programs.

When our objects are mutable, we always need to be aware of the possibility of them changing. To give one example: when we pass an argument into a method, it can be changed without us realizing. Here the string 'cherry' is being modified in-place by the #print_pie method:

If we want to be sure a value won't get mutated, we can clone it and pass the clone around instead. Now the original value remains unchanged:

Or we can freeze our value, and then we'll get an exception if anything tries to modify it:

In this episode we're going to try a different way of making objects immutable in Ruby — specifically, a way of making immutable collections.

The simplest way to make an immutable collection is to call #freeze on an existing array. As we just saw, this will disable all the methods that allow the array to be changed:

But there's another way. We can create a more naturally immutable collection by using an enumerator, because out of the box an enumerator only supports reading from a collection, not modifying it, so we don't need to disable anything to get the behavior we want.

One option is to expose an existing mutable collection through an enumerator by using the #to_enum method. The resulting enumerator provides an interface that lets us iterate over the underlying collection but doesn't give us a way to modify it:

Alternatively, we can make an enumerator from scratch by generating its contents with a block. Here we yield a series of strings inside the block, and those strings become the contents of the collection:

It's more obvious why this collection must be immutable. There's not even any underlying data structure to modify; its contents are being generated on the fly by an unchanging block of Ruby code.

So we can base an enumerator on an existing array or create it from scratch with a block. Either way, the structure of the collection can't be modified through the enumerator, but we can still implement operations that create a new collection by adding, removing, reordering or otherwise changing the contents of this one.

The Enumerator class includes the Enumerable module, so we can do all the usual Enumerable stuff with an immutable collection, like mapping a block over it, filtering it and so on.

But note that those operations return a new mutable array, not another immutable collection represented as an enumerator:

More generally, we can make a modified copy of an immutable collection by writing a new enumerator that iterates over it and yields different values. Here's how to add an element:

This new enumerator iterates over the old one, yielding every value it finds, and afterwards yields an extra value. As a result, it behaves like the old collection with an extra element appended:

We can use a similar technique to remove an element:

This enumerator iterates over the old one and yields all of its elements except one. So, it behaves like the old collection with an element removed:

Enumerators are lazy, so this technique also works on infinite collections. For example, here's an infinite collection of even numbers:

We can take as many elements as we like from this collection, as long as we don't try to take all of them. The enumerator will generate more elements as they're needed:

By making a new enumerator that wraps this collection and yields different values, we can make a modified copy. Here's a version that includes the unlucky number thirteen at the appropriate position:

Another advantage of using enumerators instead of calling #clone or #freeze is that those methods are shallow: they protect the structure of the collection at the top level, but not any of the objects inside it.

So cloning an array doesn't prevent modification of the string objects it contains, and those objects are shared between the original collection and its clone. Here we can see that the original fruits array ends up containing the string 'pineapple' even though it looks like we're only modifying the clone:

Similarly, freezing the array doesn't make the strings themselves frozen, so there's nothing to stop us modifying them in-place:

But an enumerator backed by code instead of a concrete data structure is regenerated every time we iterate over it, so even if its generated contents get mutated, we can arrange for the block to provide fresh copies next time we look. In this case, the string literals inside the block will create new string objects every time they're evaluated, so even if we pull one out and mutate it, the mutated copy won't show up next time we iterate over the collection:

And that's how to use enumerators to build immutable collections. Thanks for having me as your guest chef today, and happy hacking!