August 28, 2014
Summary: I go over a real-world example of how atoms and immutable values allow you to compose constructs in ways that are easy to reason about and less prone to error.
The other day I was in the #clojure IRC channel and someone asked a good question. They had code like the following and couldn't understand why they couldn't modify a map.
(def state (atom {}))
(doseq [x [1 2 3]]
  (assoc @state :x x))
(println @state)
What does this print? Well, the asker wanted it to print {:x 3}. But it printed {}. To understand what's happening, let's go step by step.
{} creates an empty map. It's literal syntax for a map constructor. This one happens to be empty.
(atom {}) takes the empty map that was just created and passes it to the function atom, which constructs a new clojure.lang.Atom. An atom is an object, and its current state is the empty map we just passed in.
(def state (atom {})) defines a new var called state in the current namespace.
At this point, we've got a var called state whose value is an atom that holds an empty map.
(doseq [x [1 2 3]] loops over the numbers 1, 2, and 3. x will be bound to each of those numbers, in turn.
@state gets transformed into (deref state), which returns the current value of the atom. :x is a literal keyword, and x is a reference to the x bound inside the loop.
(assoc @state :x x) creates a new map by taking the current value of the atom (which happens to be {}) and associating :x with x (which will be 1, 2, and 3 as the loop runs). assoc never modifies the map it is given; it returns a new map. That new map is then thrown away, since it isn't bound to anything.
Then (println @state) will print the current value of state, which is still {}.
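The immutability that bites the asker here is easy to verify at the REPL. A minimal sketch (the names m and m2 are mine):

```clojure
;; assoc never mutates its argument; it returns a brand-new map
(def m {})
(def m2 (assoc m :x 1))

(println m)  ;; prints {}
(println m2) ;; prints {:x 1}
```

The original map m is untouched; the only way to "keep" the result of assoc is to bind it to something.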
This code shows a common problem that beginners face in Clojure: how do immutable data structures (like maps) and the concurrency primitives (like atom) work together to manage state?
The answer is quite simple (in the Rich Hickeyan sense) and elegant. By separating the ideas of value and state, Clojure has made it easy to express precisely the behavior you want in concurrent systems.
The value is the map. It is immutable. It cannot change. It is a single value, and it will always be the same. That means threads can share the value with no worries that one of them will change it.
The state is the atom. It's a mutable object. And being an object, it has methods that define its interface. In the code above, we saw that you can call deref on an atom to get its current value. deref is basically a getter.
The main way to change the value of an atom is with swap!. swap! takes an atom and a function (plus optional arguments) and calls the function on the current value of the atom. It then sets the value of the atom to the return value of the function. So let's use that to fix the code.
(def state (atom {}))
(doseq [x [1 2 3]]
  (swap! state assoc :x x))
(println @state)
swap! takes the atom (state), a function (assoc), and some arguments (:x x). It calls assoc on the current value of state with those extra arguments and sets the value of the atom to the return value of the function.
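The same shape works with any function that takes the current value as its first argument. A quick sketch with a hypothetical counter atom:

```clojure
(def counter (atom 0))

(swap! counter inc)  ;; 0 -> 1
(swap! counter + 10) ;; extra args go after the current value: (+ 1 10) -> 11

(println @counter) ;; prints 11
```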
The swap! expression is almost (but not quite) the same as this code:
(reset! state (assoc @state :x x)) ;; never do this
reset! changes the state of the atom without regard to the current value. This new code is bad because it's not thread-safe. Use swap! whenever you need the current value to determine the new value.
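In other words, reset! is only appropriate when the new value is independent of the old one. A small sketch of the distinction:

```clojure
(def a (atom 0))

(reset! a 42) ;; fine: 42 does not depend on the current value
(swap! a inc) ;; required: the new value is computed from the current value

(println @a) ;; prints 43
```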
So what does an atom do? What does it represent?
Atoms guarantee one very important thing: each state is calculated from the last state. The swap! operation is atomic. No matter how many threads are trying to change the value, each change is calculated from the previous value and no previous values are lost. That's its contract as an object, and it's one of the important ways that Clojure helps with concurrency.
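That contract is easy to demonstrate: hammer one atom from several threads and no update is lost. A sketch (the thread and iteration counts are arbitrary):

```clojure
(def n (atom 0))

;; 10 threads, each incrementing the atom 1000 times
(let [threads (doall (for [_ (range 10)]
                       (Thread. #(dotimes [_ 1000] (swap! n inc)))))]
  (doseq [t threads] (.start t))
  (doseq [t threads] (.join t)))

;; every swap! was computed from the previous value, so nothing was lost
(println @n) ;; prints 10000
```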
How can a value be lost?
If we have two threads, each trying to change state in the same incorrect way (using reset!), the evaluation happens in several steps:
(deref state) ;; call this value *1
(assoc *1 :x x) ;; call this value *2
(reset! state *2)
Because the threads are running concurrently, the operations have a chance of interleaving their steps in unwanted ways. For instance, threads A and B might interleave like this:
- A: (deref state) ;; call this value *1A
- A: (assoc *1A :x x) ;; call this value *2A
- B: (deref state) ;; call this value *1B
- B: (assoc *1B :x x) ;; call this value *2B
- B: (reset! state *2B)
- A: (reset! state *2A)
What happened? In the final step, A set the value of state to the value it calculated in its second step. So B's work is completely discarded. That's probably not what was intended. What's worse is that this is only one of many possible interleavings, some of which work and some of which don't. Welcome to concurrency!
What you probably wanted was to make sure that no work is discarded. You want the operation to be atomic. That's why it's called an atom. swap! is atomic. A swap! to an atom occurs "all at once", instead of in three steps like the reset! example. If two threads are doing a swap!, there are only two possible interleavings.
- A: (swap! state assoc :x x)
- B: (swap! state assoc :x x)
And
- B: (swap! state assoc :x x)
- A: (swap! state assoc :x x)
Either of these is usually what you want. If only one ordering is acceptable, or neither is, then an atom is not the right construct for you.
So there you go. Atomic mutable state with immutable values gives you a nice, composable concurrency semantics. You could do it with locks but it's harder to ensure you're doing it correctly. It's slightly higher-level than locks yet it provides tremendous value. Atoms are easier to reason about and less prone to errors.
If you'd like to learn the basics of Clojure, I recommend my video course called LispCast Introduction to Clojure. I don't go over concurrency, but you will learn lots of functional programming. Go check out the description to see if it's right for you.
April 04, 2015
Leon Barrett's talk at Clojure/West is about parallelism in Clojure.
Background
Clojure is well known for its parallel programming superpowers. Immutable data structures, concurrency primitives, and a few convenient constructs like future and pmap have been there since the beginning. But what's even cooler is how people have been able to build on the strong foundation Clojure established to create new parallel abstractions. Leon Barrett will talk about some of these. The description mentions reducers, tesser, and claypoole.
Rich Hickey gave a talk about reducers back in 2012, focusing on the ideas and abstractions they are based on. A more practical talk was given by Renzo Borgatti at Strange Loop 2013. Kyle Kingsbury gave a talk about tesser, a library which extends Clojure's parallel abstractions to execute in a distributed manner. And Leon Barrett himself wrote a recent blog post about Claypoole.
This post is one of a series called Pre-West Prep, which is also published by email. It's all about getting ready for the upcoming Clojure/West, organized by Cognitect. Conferences are ongoing conversations and explorations. Speakers discuss trends, best practices, and the future by drawing on the rich context built up in past conferences and other media.
That rich context is what Pre-West Prep is about. I want to enhance everyone's experience at the conference by surfacing that context. With just a little homework, we can be better prepared to understand and enjoy the talks and the hallway conversations.
Clojure/West is a conference organized and hosted by Cognitect. This information is in no way official. It is not sponsored by nor affiliated with Clojure/West or Cognitect. It is simply me (and helpers) curating and organizing public information about the conference.
October 10, 2014
Summary: There are a few conventions in core.async that are not hard to use once you've learned them. But learning them without help can be tedious. This article presents three guidelines that will get you through the learning curve.
Introduction
The more you use core.async, the more you feel like Willy Wonka. He knew how to maximize the effectiveness of the Oompaloompas. core.async comes with a lot of functions built in, and he knew exactly which ones to use at which time.
In this extremely rare glimpse into the functioning of his mysterious factory, we take a look at the guidelines Wonka himself follows when orchestrating the work of the Oompaloompas.
When to use go versus thread?
Background
Each Oompaloompa is a thread. Willy Wonka has a special group of Oompaloompas he calls a thread pool. Their assignment is simple: they manage a group of tasks that Wonka calls go blocks. Whenever Wonka has an appropriate task, he writes a go block and hands it to the Oompaloompas to work on.
As the Oompaloompas work, they take one task and do it until the task parks. When it parks, they put it down and pick up another task that isn't parked. Tasks become unparked when they get new input from the chocolate pipes. Then the Oompaloompas can continue working on them.
At one time, Wonka used to give the thread pool all sorts of tasks. He would give them very long calculation tasks, like weighing each chocolate bean in his chocolate bean mountain. He noticed that when they did this, lots of tasks were left undone, even though they were not parked, because all of the Oompaloompas were busy doing something else.
So he came up with a guideline.
Avoid long calculations and blocking inside go blocks
Does your code do significant I/O, like downloading a file or writing to the network? Are you doing a very long calculation?

Then use a thread. If it will take a long time or block, you want a dedicated thread. It can work as long as it wants, and even block. That way it doesn't slow down the work of the thread pool.

Otherwise, you can use a go block.
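The guideline translates directly into code. A sketch (the 100 ms sleep stands in for real blocking work, and assumes core.async is on the classpath):

```clojure
(require '[clojure.core.async :refer [go thread <!!]])

;; cheap, non-blocking work: a go block, run on the shared thread pool
(def quick (go (+ 1 1)))

;; long or blocking work: thread, which gets its own dedicated thread
(def slow (thread
            (Thread/sleep 100) ; stands in for I/O or a long calculation
            :done))

;; both return a channel that yields the body's result
(def quick-result (<!! quick))
(println quick-result) ;; prints 2

(def slow-result (<!! slow))
(println slow-result) ;; prints :done
```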
When to use single- versus double-bang (!)
Background
Wonka also noticed that he needed to write different instructions for his two types of Oompaloompa. When he wrote a go block, he needed to say "park while you wait for input". But for the other Oompaloompas created with thread (or for his own work), he needed an instruction that said "block while you wait for input".

So he came up with a little notation convention. If you're just parking, so you're in a go block, use one bang. If you're outside of a go block, meaning you need to block, use two bangs.
These were his versions of his basic instructions: >!, <!, and alts! versus >!!, <!!, and alts!!. The convention is easy.
Use single-bang versions in go blocks and double-bang versions outside.
The single-bang versions of these functions are meant to park a go block. Although they are defined as functions, they have special meaning to the go macro. In fact, if you actually call them outside of a go block, they will throw an exception unconditionally, telling you they are meant to be used inside a go block.

The double-bang versions are blocking. That means the thread they are running on will block if the channel is not ready. They can be used outside of a go block (anywhere) or inside a thread block. It's safe to block inside a thread block since it runs on a dedicated thread.
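Both conventions in one sketch (assumes core.async is on the classpath):

```clojure
(require '[clojure.core.async :refer [chan go thread >! >!! <!!]])

(def c (chan))

;; inside go: single-bang ops park the block instead of blocking a thread
(go (>! c :from-go))

;; outside any go block, we must use the blocking, double-bang version
(def v1 (<!! c))
(println v1) ;; prints :from-go

;; inside thread: double-bang is fine, because the thread is dedicated
(thread (>!! c :from-thread))
(def v2 (<!! c))
(println v2) ;; prints :from-thread
```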
put!
Background
Like all factories, Willy Wonka's needs deliveries. When the UPS truck comes, there's plenty of boxes to unload. But Wonka is busy. So he leaves a note outside for the delivery guy.
The note tells the guy where to put everything so the Oompaloompas know where to find it. When he says where to put a box, he spells it put!. That is, it has a bang.

It's unfortunate, because the other functions with a bang mean they park. But put! does not park. Wonka was just angry one day, and the convention stuck.
But the delivery guy knows that Wonka is eccentric, so he doesn't take it personally and does his job. He puts stuff in its places, without blocking.
Use put! to get stuff into your channels from outside.

put! is a way to get values from outside of core.async into core.async without blocking. For instance, if you're using a callback style, which is very common in JavaScript, you will want to make your callback call put! to get the value onto a channel.
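A sketch of that callback pattern (fetch-async is a made-up stand-in for a callback-based API; assumes core.async is on the classpath):

```clojure
(require '[clojure.core.async :refer [chan put! <!!]])

(def results (chan 10))

;; a hypothetical callback-based API: it computes something on another
;; thread and invokes the callback with the result
(defn fetch-async [callback]
  (future (callback :payload)))

;; the callback just put!s the value onto the channel -- no blocking,
;; and no go block needed at the call site
(fetch-async (fn [result] (put! results result)))

(def v (<!! results))
(println v) ;; prints :payload

(shutdown-agents) ;; lets the JVM exit promptly after using a future
```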
Conclusion
That's it! Now to eat some chocolate!
core.async is really cool, but it has a learning curve. Once you learn these conventions, you will begin to feel the power they give you, whether you're making chocolate or building cars. If you'd like to learn core.async and feel like Willy Wonka, I recommend the LispCast Clojure core.async videos. They build up a deep understanding of the fundamental concepts in a fun and gradual way.