Why immutable collections?
April 27, 2014
People keep asking me why I’m making an immutable vector library for JavaScript (and excited to start on an immutable hashmap after that). What’s so good about a collection you can’t change?
To me, immutable collections provide three main benefits:
Improved caching
Let’s say you have a user profile, represented as a JavaScript object:
var profile = {
name: 'Scott',
user_id: 42,
pic: 'https://scott.mn/me.jpg',
friends: [... array of objects ...],
latest_tweets: [... array of objects ...],
...
};
The profile is a tree of nested hashmaps and vectors (“objects” and “arrays”). When one piece of data, let’s say the profile pic, changes:
profile.pic = 'https://scott.mn/me2.jpg';
We need to redraw our user profile view. To achieve good
performance, we would like to avoid doing unnecessary work. Let’s say
rendering the friends list is a bottleneck. In this case,
profile.friends
hasn’t changed. But how do we take advantage of
that?
// non-working code
var FriendsView = {
render: function(profile) {
if (profile.friends === this._lastFriendsValue) {
return;
}
this._lastFriendsValue = profile.friends;
// ... actually update the DOM ...
}
}
This code illustrates a pattern we maybe would like to use, but, with
mutable collections, can’t. If we change an element in
profile.friends
, it will still be ===
to the old value of
profile.friends
, because we’re comparing object identity, not
value.
We would need to clone the friends array, and iterate over each element to check if they’re the same:
function arrayEqual(a, b) {
if (a.length !== b.length) return false;
for (var i = 0; i < a.length; i++) {
if (a[i] !== b[i]) return false;
}
return true;
}
var FriendsView = {
render: function(profile) {
if (arrayEqual(profile.friends, this_lastFriendsValue)) {
return;
}
// use .slice() to make a copy
this._lastFriendsValue = profile.friends.slice();
// ... actually update the DOM ...
}
}
OK, this just got a lot less efficient. We now have to clone the array, an O(n) operation, so we have it around to compare to next time. And when we compare, that’s also an O(n) operation. And we still haven’t fully solved the problem. Consider:
profile.friends[3].pic = 'http://img.example.com/new_pic.jpg';
We only made a copy (and comparison) one level deep, so with
profile.friends[3]
still being the same object as before, our
algorithm won’t detect that change.
If we use immutable hashmaps and vectors for the profile, instead of mutable objects and arrays, then the first version of the code, labeled “non-working”, works perfectly. Object equality implies deep equality of all the values inside, no matter how far they’re nested.
There’s a caveat. If you put a regular mutable object into an immutable vector, we obviously can’t make the same guarantee about value equality all the way down; the immutability guarantee goes away. In Clojure, immutable data structures are the default and you have to really go out of your way to create a mutable array, so this problem doesn’t come up much. But in JavaScript, it may happen that people do this by accident. In the future, maybe a warning when adding a non-frozen object to an immutable vector would help with this; production builds could remove the warning.
In summary, given some source data, and transformed output (such as a DOM representation), immutability makes it easy to efficiently check whether the source data has changed and save unnecessary work if not. David Nolen’s blog post The Future of JavaScript MVCs shows concrete evidence of performance improvements in JavaScript apps based on this principle.
Undo and redo
A second benefit is easy, almost trivial undo. It works like this:
- Store the entire state of your app in a tree of immutable collections.
- Keep a stack of all the states you’ve ever seen, pushing the new state into the stack when something changes.
- Pop the stack when the user wants to undo.
It’s really that simple. Because immutable collections use structural sharing (at least the ones I’m implementing, based on Clojure’s!), most bits of state that don’t change, still only get stored once even if you have a thousand states. Memory usage is not much greater than storing a diff.
David Nolen has written about this, too, in his Time Travel post, which uses the ClojureScript Om library. My current work will enable similar designs in pure JavaScript, with no dependency on ClojureScript code.
No defensive copying
The last benefit I’ll discuss is that immutable collections remove the need for defensive copying — and in the process eliminate a whole class of bugs that come from forgetting to defensively copy.
An example of where this can bite you is the options hashes many JavaScript APIs take:
var options = {foo: true};
lib.doAThing(options);
lib.doAnotherThing(options);
Innocuous enough, but what if the library is doing something like this:
exports.doAThing = function(options) {
// prepare in some way
whatever();
// then delegate to another method
options.silent = true;
exports.doAnotherThing(options);
};
exports.doAnotherThing = function(options) {
// ... behaves completely differently if options.silent is true ...
};
Your second call sent silent: true
even though you didn’t mean to.
The correct answer here is for doAThing
to clone its options
argument before changing it. But it’s easy to forget, and I’ve seen
this situation several times while debugging Backbone.js apps. A
better answer is to use immutable collections: the problem goes away
entirely.
More reasons?
If you liked this, you may enjoy Reference types: separating identities and values. The first half discusses how built-in JavaScript collections aren’t values, but immutable collections (using Clojure’s as an example) are. That’s what my library will bring to JavaScript.
My immutable vectors
currently (the API may change) have a .equals(otherVec)
method that
recurses into vectors within vectors to provide a deep, value-based
equality check. The Clojure example from that post:
(= [1 2 3] [1 2 3])
becomes
new ImmutableVector(1,2,3).equals(new ImmutableVector(1,2,3))
which is indeed true.