Clojure is Imperative

August 22, 2014

Summary: Clojure is an imperative language. Its operations are defined in terms of concrete actions. But those actions are often the same actions available to the programmer at runtime. This makes it easy to bootstrap.

Update: أخلاق الخيميائي pointed out that I was wrong about the size of GHC. Luckily it was not salient to my point so I just removed that part of the article.

Update: After talking with several people, I've decided that my writing was really unclear. I've done some major editing to make it as clear as I can. Thanks to everyone who commented and helped me clarify my thinking and writing.

I was recently on the Cognicast, and I mentioned something really important to me, but I didn't go very deep into it.

Clojure, and Lisps in general, are imperative languages. Yes, they are good for doing functional programming, but their main paradigm is executing lists of commands in order.

On the podcast I mentioned the first imperative example that came to mind: the do form, which executes each expression in its body in order and returns the value of the last one. The only reason to execute an expression and throw away its value is for its side effects.
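Here's a minimal sketch at the REPL (hypothetical output):

```clojure
;; do evaluates each expression in order; every value but the last is
;; thrown away, so the earlier expressions are there only for their
;; side effects.
(do
  (println "logging a side effect")
  (+ 1 2))
;; prints "logging a side effect"
;; => 3
```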

But why is that important to me? It got me thinking about a deeper but related idea.

Clojure is a relatively transparent layer above the JVM. I say "relatively" because languages do get quite a bit more opaque.[1] But it manages to be powerful through well-chosen abstractions.

I should be a little more specific about what I mean by "transparent" and "opaque". This is probably the most controversial part of this post, so I want to get it right. These are not formal definitions. Transparency and opaqueness measure abstractions: opaque abstractions hide more of the underlying machinery, while transparent abstractions show theirs. It's a spectrum.[2]

Clojure's functions are rather opaque. Defining a function (with fn) in Clojure creates a class and instantiates it with the values from its lexical environment. This happens without having to think about classes. You're not thinking about the machinery. The machinery leaks out sometimes, like when you're looking at stack traces. But in general, an illusion is maintained.
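You can watch that machinery leak a little at the REPL. A small sketch (the generated class name will vary):

```clojure
(def adder
  (let [n 10]
    (fn [x] (+ x n))))   ; closes over n from the lexical environment

(class adder)
;; => a compiler-generated class, something like user$fn__123
(adder 5)
;; => 15
```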

But Clojure's def form is pretty transparent. You do have to think about what it's doing, about the current namespace, the order of the defs in a namespace, etc. There is not much of an illusion to maintain.
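A quick sketch of the machinery you have to keep in mind (my.app is just a made-up namespace):

```clojure
(ns my.app)          ; def interns vars into the current namespace, *ns*

(def base 10)

;; Order matters: this def would fail to resolve base if it
;; appeared above the def of base.
(def doubled (* 2 base))

#'my.app/doubled     ; the var that def created and interned
```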

Haskell has a well-defined execution semantics: it's formally specified, and you can step through the execution of a Haskell program by hand if you want. In that sense, it's imperative. But the execution order is obscured by the somewhat opaque abstraction of lazy evaluation. Clojure's execution order is more or less directly the execution order of the JVM it runs on, hence more transparent.

The reason this is important is that Clojure's strategy is to be transparent unless there is significant gain. This is part of what is meant by "embracing the host". Haskell's strategy is orthogonal to the transparency/opaqueness axis. Haskell aims to be formally well-defined. Formal semantics allows deep static analysis and program transformation.

Besides the strategy of being transparent, what I like even more about Clojure is that many of its abstractions are defined in terms of the same abstractions you have available as a programmer.

This is from the docstring of def:

Creates and interns a global var with the name of symbol in the current namespace (*ns*) or locates such a var if it already exists. If init is supplied, it is evaluated, and the root binding of the var is set to the resulting value. If init is not supplied, the root binding of the var is unaffected.

Creating a var? I can do that. Interning it? I can do that, too. Setting the root binding? Easy! The core can be kept minimal because abstractions can build on each other. If you get the abstractions right, the amount of code you have to write in your implementation language is small.
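Here's a rough sketch of doing by hand what def does, using ordinary functions Clojure exposes at runtime:

```clojure
;; Create (or locate) and intern a var named answer in the current
;; namespace, setting its root binding to 42. Roughly (def answer 42).
(intern *ns* 'answer 42)

;; Re-set the root binding later.
(alter-var-root (resolve 'answer) (constantly 43))

@(resolve 'answer)
;; => 43
```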

And this gets to the heart of it: you can write a Lisp yourself. Many people have. You can write a simple Lisp compiler in a weekend and build features on top of it, almost never having to change the original compiler.
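To make that concrete, here's a toy sketch of such a core, written as an evaluator in Clojure. It's hypothetical and handles almost nothing, but it shows how little you need before features can start building on features:

```clojure
;; A toy evaluator core: literals, symbols, if, fn, and application.
;; A sketch, not a real compiler.
(defn tiny-eval [expr env]
  (cond
    (symbol? expr) (get env expr)          ; variable lookup
    (not (seq? expr)) expr                 ; self-evaluating literal
    :else
    (let [[op & args] expr]
      (case op
        if (let [[test then else] args]
             (if (tiny-eval test env)
               (tiny-eval then env)
               (tiny-eval else env)))
        fn (let [[params body] args]       ; build a real closure
             (fn [& vals]
               (tiny-eval body (merge env (zipmap params vals)))))
        ;; default: function application
        (apply (tiny-eval op env)
               (map #(tiny-eval % env) args))))))

(tiny-eval '((fn [x] (if x (+ x 1) 0)) 41) {'+ +})
;; => 42
```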

This is the magic of bootstrapped languages like Lisps. They have a small core that you need to get right, then everything else can be written in that core. It's the ultimate minimal virtual machine.
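Once the core exists, new features are just library code. Clojure's own when, for example, is essentially a macro that expands into if and do. A sketch of a home-grown version:

```clojure
;; A new language feature, built without touching the compiler:
(defmacro my-when [test & body]
  `(if ~test (do ~@body)))

(my-when (pos? 1)
  (println "side effect")
  :result)
;; prints "side effect"
;; => :result
```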

What's the relationship between bootstrapping and transparency? The more opaque the abstractions, the more the language must do to maintain the illusion. Lisps are easy to bootstrap because the abstractions chosen are either transparent and trivial to implement (like def or if) or opaque and powerful (like fn).

I like Lisps (and Clojure) because I feel that I can understand them and build them myself. I don't actually understand everything, but I could if I tried. Somewhere along the way I developed a deep interest in bootstrapping. Bootstrapping is compounded leverage. You build small abstractions on top of the previous ones, and use those to build yet grander ones.

If you like this attitude toward programming languages, you should learn a Lisp. I suggest Clojure, and I recommend the LispCast Introduction to Clojure video series. You'll learn about building up powerful abstractions, one layer at a time, in a small amount of code.

  1. There are more transparent languages as well, but they tend to be obscure.

  2. As an aside to those who read previous versions of this post, what I meant by imperative/declarative was transparent/opaque. I botched it and I'm trying to get this idea right.

Two Kinds of Bootstrapping

August 23, 2014

Summary: I like languages with a small, extensible core. Such languages tend to be weird, and they require less code to bootstrap.

I know of two ways to bootstrap a language.

The first way is probably the more traditional, so I'll call it Type 1. In Type 1, you write a bare-minimum compiler for your language in a host language. So maybe you write a Lua compiler in C. Then you write a Lua compiler in Lua and compile it with the C compiler. Now you have a compiler written in Lua. You can add to it and modify it without ever having to touch the C code again. You get the advantage of writing the features of your language (Lua) in a higher-level language (Lua). And as you add features to your compiler, you can use those features to add more. There's some leverage.

I like the second way better. I'll call it Type 2. In Type 2, you write a small, powerful set of abstractions in the host language. For instance, you write an object system in C, a stack and dictionary in assembler, or lexical closures in Java. Then you write a compiler that targets those abstractions. If the abstractions are chosen correctly, your compiler is done. You can begin building abstraction on top of abstraction without touching the compiler.
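Clojure itself is a good example of Type 2 on the JVM: functions are the powerful abstraction, written once in Java as clojure.lang.IFn, and the compiler targets it. A small sketch:

```clojure
;; Every Clojure function compiles to a JVM class implementing
;; clojure.lang.IFn, an abstraction written once in Java.
(instance? clojure.lang.IFn (fn [x] x))
;; => true

;; Calling a function is just invoking that interface.
(.invoke (fn [x] (* x x)) 7)
;; => 49
```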

There are a few things to note:

  1. Type 2 languages (Lisp, Smalltalk, FORTH) tend to be weird because they were birthed in a different way. The abstractions, though powerful, are often raw.

  2. Type 2 languages can be bootstrapped faster. The core is often much smaller than a full-featured compiler.

  3. Type 2 languages tend to require less code in general. I suspect that's because you're writing most of it in a language with compounding leverage.

  4. Type 2 languages are more easily ported, since all you have to do is rewrite the core. Type 1 languages, depending on how they are built, can require you to re-bootstrap or write a cross-compiler.

In the end, I believe that both Type 1 and Type 2 are viable options for language-building. I prefer Type 2. If Type 2 intrigues you, you should learn Lisp (or FORTH or Smalltalk). I recommend the LispCast Introduction to Clojure video course.
