Playing with refs
February 9, 2014
I’ve written before about Clojure’s atoms and how they
work. Atoms provide atomic,
consistent, and (if you only update them with swap!
) isolated access
to a single identity.
Sometimes, though, we need multiple identities we can deal with in a synchronized way. That’s where refs come into play. In this post, I’ll provide a broken example you can copy-paste into a REPL, then show how refs fix it. Finally, I’ll go into a few neat add-on features refs offer.
Transferring money
A classic example of where you would need refs is a bank transfer. If we represent two bank accounts as atoms, and transfer between them, we’re going to have problems. Let’s try it and see.
(def account-a (atom 1000))
(def account-b (atom 1000))
To make it more obvious that this code is buggy, our transfer function will have a built-in delay. But basically its job is to check that the sending account has a high enough balance to cover the transfer, then debit the sender and credit the receiver.
(defn transfer-money [from to amount]
(when (>= @from amount)
(swap! from - amount)
(Thread/sleep (rand-int 15)) ; To make effects more noticeable.
(swap! to + amount)))
To test this code we can make a bunch of random transactions:
(defn make-random-transactions [acc1 acc2 n]
(dotimes [_ n]
(let [[from to] (shuffle [acc1 acc2])]
(transfer-money from to (rand-int @from)))))
Without threads
If we only run the random transactions in one thread, we’re fine:
user=> (make-random-transactions account-a account-b 1000)
nil
user=> @account-a
1616
user=> @account-b
384
user=> (+ @account-a @account-b)
2000
The individual numbers will vary for you, obviously, but the sum should still be 2000, and neither account should be negative. That’s because we didn’t use multiple threads. (For this reason, ClojureScript, which targets a single-threaded environment, lacks refs. It doesn’t need them.)
With threads
Here’s some test code to run random transactions in 10 threads
simultaneously. The meat of it is in the test-threaded-transfers
function, but I have broken out reset-accounts
and sum-accounts
so
they can be redefined when we switch to using refs.
(defn reset-accounts []
(reset! account-a 1000)
(reset! account-b 1000))
(defn sum-accounts [& accts]
(reduce + (map deref accts)))
(defn test-threaded-transfers []
(reset-accounts)
(let [futures
(doall
(for [_ (range 10)]
(future (make-random-transactions account-a account-b
1000))))]
(while (not (every? future-done? futures))
(println "Sum of accounts:" (sum-accounts account-a account-b))
(Thread/sleep 1000))
(println "Final sum:" (sum-accounts account-a account-b))))
Every second, this will print out the sum of the two accounts. The sum
of the two accounts should always be 2000, but as you’ll see if you
run test-threaded-transfers
, the sum jumps all over the place:
user=> (test-threaded-transfers)
Sum of accounts: 54
Sum of accounts: 35
Sum of accounts: 964
Sum of accounts: 240
Sum of accounts: 1097
Sum of accounts: -683
Sum of accounts: 449
Sum of accounts: 1634
Final sum: 2000
nil
In a second, we’ll fix that using refs.
Refs: setting and altering
Refs are created like atoms:
user=> (def r (ref 400))
#'user/r
They’re modified using ref-set
, which works like reset!
, and
alter
, which works like swap!
. But there’s a catch. All
modifications to a ref must be wrapped in the (dosync ...)
macro.
user=> (alter r + 100)
IllegalStateException No transaction running
clojure.lang.LockingTransaction.getEx
(LockingTransaction.java:208)
user=> (dosync (alter r + 100))
500
dosync
creates a transaction. Inside a transaction, Clojure keeps
a log of all refs that were read or modified, and if one of them is
altered before the transaction completes, the whole transaction will
be retried. In addition, none of the in-transaction changes are
visible to other threads until the transaction completes (“commits”).
Then all of them are.
Refs are dereferenced using @
, just like atoms. As with atoms,
@r
is reader sugar for (deref r)
, and deref
can be spelled out
if you’re using a higher-order function (e.g., (map deref
list-of-refs)
).
Transferring money with refs
Let’s redefine our accounts to be refs and rewrite transfer-money
accordingly:
(def account-a (ref 1000))
(def account-b (ref 1000))
(defn transfer-money [from to amount]
(dosync
(when (>= @from amount)
(alter from - amount)
(Thread/sleep (rand-int 15))
(alter to + amount))))
We’ll also rewrite reset-accounts
from before.
(defn reset-accounts []
(dosync
(ref-set account-a 1000)
(ref-set account-b 1000)))
With sum-accounts
, there’s only one minor change, and that is to
wrap the function body from before in a (dosync ...)
. This ensures
we dereference all the refs at the same instant in time.
(defn sum-accounts [& accts]
(dosync
(reduce + (map deref accts))))
There’s no need to change test-threaded-transfers
. If you’re
copy-pasting all this into a REPL, it will still work.
Now:
user=> (test-threaded-transfers)
Sum of accounts: 2000
Sum of accounts: 2000
Sum of accounts: 2000
[ ... repeated many more times ... ]
Final sum: 2000
Much better!
So, what changed here? By wrapping our modifications in a dosync
,
we’ve given Clojure enough information to know that the changes to
from
and to
are connected, and should happen — from the view of
other threads — at the same instant in time.
However, you’ll notice the transfers now take much longer to complete.
Let’s see if we can speed them up by replacing alter
with commute
.
Commuting instead of altering
commute
is a special version of alter
for when your update function is
commutative. Suppose you want to alter a thing and also increment a
counter:
(dosync
(alter thing-a expensive-function-1 ...)
(commute counter inc))
; in another thread:
(dosync
(alter thing-b expensive-function-2 ...)
(commute counter inc))
These transactions alter completely different things. thing-a
for
the first, thing-b
for the second. No conflict there. But because
they both increment the same counter, if you used alter
one
transaction would “lose” and have to be retried in its entirety. Such
a waste.
The only thing we really need to do over in case of conflict is increment the counter. The increment is cheap and can be done independently.
And that’s exactly how commute
works. If the commuted ref is changed
by another thread, commute
simply re-runs the update function with
the ref’s new value at commit time. No need to repeat the expensive
changes to thing-a
and thing-b
.
We can apply this to our bank transfer example, but we have to be
careful. Using commute
on the addition is fine. Using it on the
subtraction would be a bug. Can you see why?
(defn transfer-money [from to amount]
(dosync
(when (>= @from amount)
(alter from - amount)
(Thread/sleep (rand-int 15))
(commute to + amount))))
On my computer, this version actually makes
(test-threaded-transfers)
take slightly longer — 44 vs. 41 seconds.
So this may not be the greatest example to try commute
on.
Validators
A bank balance should never be negative. We can express that restriction in code using a validator:
(defn valid-account-balance? [bal]
(>= bal 0))
(set-validator! account-a valid-account-balance?)
(set-validator! account-b valid-account-balance?)
Now if we try to set an account balance to a negative number, an exception is thrown and the ref’s value is unchanged:
user=> (dosync (ref-set account-a 1000))
1000
user=> (dosync (ref-set account-a -1))
IllegalStateException Invalid reference state
clojure.lang.ARef.validate (ARef.java:33)
user=> @account-a
1000
You can use validators with atoms too.
Breaking commute?
With this validator in place, we should now be equipped to answer the
question I asked above, as to why commute
can’t be used when we
subtract from a balance. Try it:
(defn transfer-money [from to amount]
(dosync
(when (>= @from amount)
(Thread/sleep (rand-int 150))
(commute from - amount) ;; BAD
(Thread/sleep (rand-int 15))
(commute to + amount))))
We’ll modify our make-random-transactions
function slightly:
(defn make-random-transactions [acc1 acc2 n]
(dotimes [_ n]
(let [[from to] (shuffle [acc1 acc2])]
(transfer-money from to (rand-int 500)))))
I changed (rand-int @from)
to (rand-int 500)
. So this code will
attempt to make transfers that are in excess of the sending account’s
balance, and only the when
clause in transfer-money
prevents that
from happening.
Confession time:
I was not able to get the above code, or any variations I tried, to
actually break. No matter what I did, it kept working. On my computer,
running (test-threaded-transfers)
at a REPL with the above
definitions still works fine.
But here’s how I hoped it would break, and why I believe it’s broken in theory. One of the accounts would go negative, making validation fail, like so:
@account-a
is 20- Thread 1 begins a transaction:
(transfer-money account-a account-b 20)
- Thread 1 verifies
(> @account-a 20)
, and decides to proceed with the transaction - Thread 2 begins a transaction:
(transfer-money account-a account-b 10)
- Thread 2 verifies
(> @account-a 10)
, and decides to proceed with the transaction - Thread 1 subtracts 20 from account A, adds 20 to account B
- Thread 1 commits. Globally,
@account-a
is 0 - Thread 2 subtracts 10 from account A, adds 10 to account B. Inside
of the transaction, thread 2 thinks
@account-a
is now 10 - Thread 2 tries to commit. The subtraction from
account-a
is repeated.@account-a
is now -10. Validation fails.
Despite many tries, I haven’t gotten this scenario to actually happen. But it might! Maybe with more threads, or on a different computer, or if you ran the test a million times. I promise the above code is bad and you better not write it if you have real money at stake.
Or maybe I goofed and there’s some reason, unknown to me, why my scenario will never happen. I’ll update this post if I find out.
Why no bang?
As a total side note, you might wonder why the functions to modify
refs don’t have a bang at the end. swap!
and reset!
have the bang,
alter
and ref-set
don’t.
It turns out the official meaning of a bang in Clojure is “this function is not safe to use in a transaction”. It doesn’t mean the function has a side effect, like in Scheme. It means if you call it in a dosync something bad will happen.
Every function that modifies a ref has to happen in a dosync, by definition, so they don’t have the bang.
Conclusion
Refs implement software transactional memory. This essentially means they provide similar guarantees to databases. Transactions on refs are atomic, consistent, and isolated — the A, C, and I in ACID.
Atoms can provide the same guarantees as long as you store all relevant state in a single atom. But refs generalize atoms, allowing you to express multiple identities in a more natural way, and allowing more concurrency: if you modify different, unrelated bits of state, you don’t get any unnecessary retries.
For further reading on refs, I recommend Neale Swinnerton’s post Clojure STM: What, Why, How?, which goes more into the nitty-gritty of how they’re implemented.
Thanks to Lindsey Kuper for inspiring this post and the previous “Atoms by experiment”. Lindsey urged me to take a more experimental, demonstrative approach after I showed her a very dry, early draft post which was basically my notes from watching a Rich Hickey talk. Not that there is anything wrong with Rich’s talks.
I don’t have any immediate plans to write about agents or vars, so this is the end for now. Thanks for reading!