Playing with refs
February 9, 2014
I’ve written before about Clojure’s atoms and how they
work. Atoms provide atomic,
consistent, and (if you only update them with
swap!) isolated access
to a single identity.
Sometimes, though, we need multiple identities we can deal with in a synchronized way. That’s where refs come into play. In this post, I’ll provide a broken example you can copy-paste into a REPL, then show how refs fix it. Finally, I’ll go into a few neat add-on features refs offer.
A classic example of where you would need refs is a bank transfer. If we represent two bank accounts as atoms, and transfer between them, we’re going to have problems. Let’s try it and see.
(def account-a (atom 1000)) (def account-b (atom 1000))
To make it more obvious that this code is buggy, our transfer function will have a built-in delay. But basically its job is to check that the sending account has a high enough balance to cover the transfer, then debit the sender and credit the receiver.
(defn transfer-money [from to amount] (when (>= @from amount) (swap! from - amount) (Thread/sleep (rand-int 15)) ; To make effects more noticeable. (swap! to + amount)))
To test this code we can make a bunch of random transactions:
(defn make-random-transactions [acc1 acc2 n] (dotimes [_ n] (let [[from to] (shuffle [acc1 acc2])] (transfer-money from to (rand-int @from)))))
If we only run the random transactions in one thread, we’re fine:
user=> (make-random-transactions account-a account-b 1000) nil user=> @account-a 1616 user=> @account-b 384 user=> (+ @account-a @account-b) 2000
The individual numbers will vary for you, obviously, but the sum should still be 2000, and neither account should be negative. That’s because we didn’t use multiple threads. (For this reason, ClojureScript, which targets a single-threaded environment, lacks refs. It doesn’t need them.)
Here’s some test code to run random transactions in 10 threads
simultaneously. The meat of it is in the
function, but I have broken out
they can be redefined when we switch to using refs.
(defn reset-accounts  (reset! account-a 1000) (reset! account-b 1000)) (defn sum-accounts [& accts] (reduce + (map deref accts))) (defn test-threaded-transfers  (reset-accounts) (let [futures (doall (for [_ (range 10)] (future (make-random-transactions account-a account-b 1000))))] (while (not (every? future-done? futures)) (println "Sum of accounts:" (sum-accounts account-a account-b)) (Thread/sleep 1000)) (println "Final sum:" (sum-accounts account-a account-b))))
Every second, this will print out the sum of the two accounts. The sum
of the two accounts should always be 2000, but as you’ll see if you
test-threaded-transfers, the sum jumps all over the place:
user=> (test-threaded-transfers) Sum of accounts: 54 Sum of accounts: 35 Sum of accounts: 964 Sum of accounts: 240 Sum of accounts: 1097 Sum of accounts: -683 Sum of accounts: 449 Sum of accounts: 1634 Final sum: 2000 nil
In a second, we’ll fix that using refs.
Refs: setting and altering
Refs are created like atoms:
user=> (def r (ref 400)) #'user/r
They’re modified using
ref-set, which works like
alter, which works like
swap!. But there’s a catch. All
modifications to a ref must be wrapped in the
(dosync ...) macro.
user=> (alter r + 100) IllegalStateException No transaction running clojure.lang.LockingTransaction.getEx (LockingTransaction.java:208) user=> (dosync (alter r + 100)) 500
dosync creates a transaction. Inside a transaction, Clojure keeps
a log of all refs that were read or modified, and if one of them is
altered before the transaction completes, the whole transaction will
be retried. In addition, none of the in-transaction changes are
visible to other threads until the transaction completes (“commits”).
Then all of them are.
Refs are dereferenced using
@, just like atoms. As with atoms,
@r is reader sugar for
(deref r), and
deref can be spelled out
if you’re using a higher-order function (e.g.,
Transferring money with refs
Let’s redefine our accounts to be refs and rewrite
(def account-a (ref 1000)) (def account-b (ref 1000)) (defn transfer-money [from to amount] (dosync (when (>= @from amount) (alter from - amount) (Thread/sleep (rand-int 15)) (alter to + amount))))
We’ll also rewrite
reset-accounts from before.
(defn reset-accounts  (dosync (ref-set account-a 1000) (ref-set account-b 1000)))
sum-accounts, there’s only one minor change, and that is to
wrap the function body from before in a
(dosync ...). This ensures
we dereference all the refs at the same instant in time.
(defn sum-accounts [& accts] (dosync (reduce + (map deref accts))))
There’s no need to change
test-threaded-transfers. If you’re
copy-pasting all this into a REPL, it will still work.
user=> (test-threaded-transfers) Sum of accounts: 2000 Sum of accounts: 2000 Sum of accounts: 2000 [ ... repeated many more times ... ] Final sum: 2000
So, what changed here? By wrapping our modifications in a
we’ve given Clojure enough information to know that the changes to
to are connected, and should happen — from the view of
other threads — at the same instant in time.
However, you’ll notice the transfers now take much longer to complete.
Let’s see if we can speed them up by replacing
Commuting instead of altering
is a special version of
alter for when your update function is
commutative. Suppose you want to alter a thing and also increment a
(dosync (alter thing-a expensive-function-1 ...) (commute counter inc)) ; in another thread: (dosync (alter thing-b expensive-function-2 ...) (commute counter inc))
These transactions alter completely different things.
thing-b for the second. No conflict there. But because
they both increment the same counter, if you used
transaction would “lose” and have to be retried in its entirety. Such
The only thing we really need to do over in case of conflict is increment the counter. The increment is cheap and can be done independently.
And that’s exactly how
commute works. If the commuted ref is changed
by another thread,
commute simply re-runs the update function with
the ref’s new value at commit time. No need to repeat the expensive
We can apply this to our bank transfer example, but we have to be
commute on the addition is fine. Using it on the
subtraction would be a bug. Can you see why?
(defn transfer-money [from to amount] (dosync (when (>= @from amount) (alter from - amount) (Thread/sleep (rand-int 15)) (commute to + amount))))
On my computer, this version actually makes
(test-threaded-transfers) take slightly longer — 44 vs. 41 seconds.
So this may not be the greatest example to try
A bank balance should never be negative. We can express that restriction in code using a validator:
(defn valid-account-balance? [bal] (>= bal 0)) (set-validator! account-a valid-account-balance?) (set-validator! account-b valid-account-balance?)
Now if we try to set an account balance to a negative number, an exception is thrown and the ref’s value is unchanged:
user=> (dosync (ref-set account-a 1000)) 1000 user=> (dosync (ref-set account-a -1)) IllegalStateException Invalid reference state clojure.lang.ARef.validate (ARef.java:33) user=> @account-a 1000
You can use validators with atoms too.
With this validator in place, we should now be equipped to answer the
question I asked above, as to why
commute can’t be used when we
subtract from a balance. Try it:
(defn transfer-money [from to amount] (dosync (when (>= @from amount) (Thread/sleep (rand-int 150)) (commute from - amount) ;; BAD (Thread/sleep (rand-int 15)) (commute to + amount))))
We’ll modify our
make-random-transactions function slightly:
(defn make-random-transactions [acc1 acc2 n] (dotimes [_ n] (let [[from to] (shuffle [acc1 acc2])] (transfer-money from to (rand-int 500)))))
(rand-int @from) to
(rand-int 500). So this code will
attempt to make transfers that are in excess of the sending account’s
balance, and only the
when clause in
transfer-money prevents that
I was not able to get the above code, or any variations I tried, to
actually break. No matter what I did, it kept working. On my computer,
(test-threaded-transfers) at a REPL with the above
definitions still works fine.
But here’s how I hoped it would break, and why I believe it’s broken in theory. One of the accounts would go negative, making validation fail, like so:
- Thread 1 begins a transaction:
(transfer-money account-a account-b 20)
- Thread 1 verifies
(> @account-a 20), and decides to proceed with the transaction
- Thread 2 begins a transaction:
(transfer-money account-a account-b 10)
- Thread 2 verifies
(> @account-a 10), and decides to proceed with the transaction
- Thread 1 subtracts 20 from account A, adds 20 to account B
- Thread 1 commits. Globally,
- Thread 2 subtracts 10 from account A, adds 10 to account B. Inside
of the transaction, thread 2 thinks
@account-ais now 10
- Thread 2 tries to commit. The subtraction from
@account-ais now -10. Validation fails.
Despite many tries, I haven’t gotten this scenario to actually happen. But it might! Maybe with more threads, or on a different computer, or if you ran the test a million times. I promise the above code is bad and you better not write it if you have real money at stake.
Or maybe I goofed and there’s some reason, unknown to me, why my scenario will never happen. I’ll update this post if I find out.
Why no bang?
As a total side note, you might wonder why the functions to modify
refs don’t have a bang at the end.
reset! have the bang,
It turns out the official meaning of a bang in Clojure is “this function is not safe to use in a transaction”. It doesn’t mean the function has a side effect, like in Scheme. It means if you call it in a dosync something bad will happen.
Every function that modifies a ref has to happen in a dosync, by definition, so they don’t have the bang.
Refs implement software transactional memory. This essentially means they provide similar guarantees to databases. Transactions on refs are atomic, consistent, and isolated — the A, C, and I in ACID.
Atoms can provide the same guarantees as long as you store all relevant state in a single atom. But refs generalize atoms, allowing you to express multiple identities in a more natural way, and allowing more concurrency: if you modify different, unrelated bits of state, you don’t get any unnecessary retries.
For further reading on refs, I recommend Neale Swinnerton’s post Clojure STM: What, Why, How?, which goes more into the nitty-gritty of how they’re implemented.
Thanks to Lindsey Kuper for inspiring this post and the previous “Atoms by experiment”. Lindsey urged me to take a more experimental, demonstrative approach after I showed her a very dry, early draft post which was basically my notes from watching a Rich Hickey talk. Not that there is anything wrong with Rich’s talks.
I don’t have any immediate plans to write about agents or vars, so this is the end for now. Thanks for reading!