December 29, 2012
Summary: LESS and Sass (and similar solutions) have saved CSS for three reasons: separation, abstraction, and cascading. While I welcome them, CSS still has other problems which I believe can be solved. I propose some solutions.
Introduction
A lot is said about LESS and Sass, and for good reason. CSS is hell to get right and even harder to maintain. LESS and Sass (and similar tools) make CSS into a much more useful language.
But when people talk about why they are so great, they miss the main point. It is true that your style files are now shorter and more readable. However, there is something deeper going on than mere saving of keystrokes and being able to name things.
In this essay, I will try to put into words (and some pictures) what my intuition tells me as a developer and programming language enthusiast to clarify why CSS is innately unmaintainable, does not satisfy its own design goals, and why LESS and Sass make a bad language more bearable. I also will propose solutions which would raise the bar past the high level where LESS and Sass have taken it.
Zero Degrees of Separation
Way back when, people used HTML tables to style their pages. Documents looked like this:
Those were the days of font tags and tables.
Then CSS came along, and people talked a lot about separation of content from presentation. CSS did help you move styling concerns outside of the HTML file, but that is about it.
Your styles were still tied to the structure of the document they were styling. They had no grouping of their own. If you wanted to repeat a style, you either had to copy and paste or use a selector with a comma. Both were bad solutions.
An even worse solution, which is, unfortunately, the most common, is to build CSS classes which name a style. We see this in the numerous and all equally bad "CSS frameworks" which litter your HTML with style information. Grid systems do this to a fault.
But it is not the fault of the authors of those frameworks, nor of the poor web developers who are in search of some solutions to their problems. No, the blame lies with the authors of CSS itself. With CSS, separation of content from presentation is possible but extremely difficult and time-consuming.
HTML and CSS are separate but not equal. HTML can exist without CSS. But what is CSS without HTML? Nothing.
If you truly want to be able to separate content from presentation, you have to set them both on equal footing, like this:
Content in the HTML, style in the CSS, and you tie them together in some third language.
This is possible in LESS (LESS being the question mark), evidenced by the existence of frameworks such as LESS Elements. People talk about LESS reducing boilerplate and repetition. Or about hiding browser-specific CSS properties. But that is not the essence of the matter. What all of that talk is trying to get at is that they can finally define a style, in the form of a mixin, which exists independently of any HTML structure. It can then be tied into zero, one, or more HTML elements merely by mentioning its name.
With LESS, I can define a mixin called Dorothy:
.dorothy() {
  background-color: green;
  color: yellow;
  border: 1px solid red;
}
Yes, it is probably an ugly style. But it is just a style. It does not depend on any HTML structure for its existence, not even one <p> tag. Now, if I want to use it, I can use it wherever I want by relating, in a separate way, the style with some HTML.
div.main {
  .dorothy;
}
blockquote {
  .dorothy;
}
This is one of the reasons LESS makes styling HTML bearable. In addition to mixins, you can also define variables which contain sizes and colors, which is just another way to name styles (or elements of styles) to be tied to HTML later.
Abstraction
If you take the idea of mixins and variables even further, you will notice that they compose. I can define a mixin and use it in another mixin. I could define dorothy as the composition of three styles, red-border, yellow-text, and green-background. This type of composition suggests that there is some amount of abstraction going on.
This was not possible in HTML + CSS.
Well, I say not possible, but there were ways, they were just terrible. You could copy-paste, which is just not a solution at all, but it would get you your style. Or you could reuse non-semantic class names like in grid frameworks (blech!). Or, finally, you could do what I call "inverted-style", where the styles take precedence and the selectors take a subordinate role. That will take some explaining.
Let us say we want div.main and blockquote to be styled like dorothy. Also, div.main and p should have a top margin.
Normally, we would write this in CSS:
div.main {
  background-color: green;
  color: yellow;
  border: 1px solid red;
  margin-top: 10px;
}
blockquote {
  background-color: green;
  color: yellow;
  border: 1px solid red;
}
p {
  margin-top: 10px;
}
This does not look bad, but there is a lot of repetition and the intent is not clear. We could instead write it in inverted-style.
/* Dorothy style */
div.main, blockquote {
  background-color: green;
  color: yellow;
  border: 1px solid red;
}
/* Top margin */
div.main, p {
  margin-top: 10px;
}
If we discover that div.footer also needs a top margin, we add it to the selector instead of making a new rule. I bet someone else has come up with this style (and probably a better name for it), but I am unaware of it. I also guess that this was one of the original intentions of the CSS authors. In practice, in my experience, this is hard to keep up. Somehow, I do not have the discipline to keep the styles separated into their own rules. CSS properties that are related to div.main and blockquote, but not to dorothy, slip into that first rule, and then all is lost. Maybe a professional could do better, but I have never met one.
I do not have that problem with LESS. It is simple to define a mixin once I identify a consistent set of properties. I can then reuse it wherever I want.
Cascading
LESS and Sass provide a pretty good, but partial, solution to the cascading problem. The cascading problem is basically one of complexity. There are too many places for the value of a particular property for a particular element to be set. And the rules for determining the precedence of all of those places are too complex.
The value of a CSS property is determined by these factors:
- The order of CSS imports
- The number of classes mentioned in the CSS rule
- The order of rules in a particular file
- Any inherit settings
- The tag name of the element
- The class of the element
- The id of the element
- All of the element's ancestors in the HTML tree
- The default styles for the browser
That is just too many factors. Yes, the wisdom is to keep everything clean and simple. That works for small projects, but at some point cascading rules will bite you.
Jason Zimdars shows the solution to cascading they came up with at 37 Signals. He shares a good analysis of the problem and how LESS can alleviate some of the pain.
For the first time we could write CSS that we knew wouldn’t cause problems later on because they couldn’t cascade out of control. This opened us up to create elements with self contained styles that could be dropped onto most any page and they’d just work.
Sounds like the holy grail of separation of presentation from content!
By using nested LESS rules and child selectors, we can avoid much of the pain of cascading rules. Combining with everything else, we define our styles as mixins (and mixins of mixins) and tie the styles to the HTML with nested rules which mimic the structure of the HTML.
Perfect, final solution? Not quite.
Further
There are a few more issues to deal with. LESS and Sass were defined as supersets of CSS. That means that your valid CSS files are automatically LESS files as well, which means you can just start using the LESS compiler.
But it also means that LESS has inherited all of the problems it has no solution for. What I will suggest is that we need a subset of CSS to move further, and I will attempt to choose that subset. I would love to hear your suggestions, as well.
Yes, nested rules help you deal with cascading, but there are other issues with cascading. Mixins cannot really help you with the box model. No amount of variables and arithmetic can make two divs have the same width.
I will go through the problems one by one.
Cascading, again
Let me put it bluntly: cascading was a mistake on the part of the authors of CSS. It has a nice abstract purity to it, but it does not work well in practice.
With hindsight, we see that we really only want one level of cascading. The CSS reset was a beautiful invention which neutralized differences between browsers. The CSS reset cut off cascading from the default browser styles and gave you a fresh base to start with. That is really all of the cascading that you want: cascading to a sane default. Other than that, it turns into a mess of spaghetti.
Sometimes it seems that you want some cascading. For instance, you want to set the font family of the entire document. So you declare body { font-family: 'Comic Sans'; } and call your job done. In such a declaration, you are implicitly relying on the inheritance of the font-family property down through the document tree. In fact, if you want every element to have a certain font, you should just say it: * { font-family: 'Comic Sans'; }
This has the same effect as a CSS reset: set the default styles for everything in one place.
This implies a rule: Reset once, then avoid cascading. We now just have to systematically apply it. Here is what our setup looks like now:
No cascading means we must restrict ourselves to never select the same elements with different rules. I cannot say how we can do this strictly. But we can define some guidelines.
- Only bare (classless + non-nested) selectors may occur in the reset.
- No bare selectors may occur in the LESS rules.
- No selector may be repeated in the LESS rules.
These guidelines will limit the amount of cascading even further when combined with Zimdars' solution.
Common mistakes
I call them mistakes for lack of a better word, but really the blame lies on CSS.
Box model
The box model sucks. But we can avoid some of the errors that are easy to make.
One mistake is what happens when you define the margin-left but not the margin-right. In such a situation, where does the margin-right get determined? Cascading.
And what happens when I set the width to 100%? What if a padding is cascaded in? Oops.
How to deal with this? Do not use individual CSS properties where a compound property exists.
I propose to boycott the following properties:
- margin-left, margin-right, margin-top, margin-bottom; use margin instead
- border-left, border-right, border-top, border-bottom; use border instead
- padding-left, padding-right, padding-top, padding-bottom; use padding instead
- width, height; use the .dimension mixin instead
To get more sane behavior, we define this mixin:
.dimension(@w, @h) {
  width: @w;
  height: @h;
  margin: 0;
  padding: 0;
  border-width: 0;
}
This forces you to set width and height at once, and it resets the margin, padding, and border (which affect actual width, thanks to the box model). You can still override them, you just have to do it explicitly. This does not solve the entire problem of the box model, but it helps cut out a lot of surprising behavior.
My argument for using this mixin is that any time you are setting the dimensions of an element, you should also be explicit about the margin, padding, and border at that point, since they affect the box model.
Font color
Now I will pick some nits.
How many times have you seen this code?
body {
  color: black;
  a:link { color: blue; }
  a:hover { color: red; }
  a:active { color: blue; }
  a:visited { color: purple; }
}
Too much! And I always forget one of them. Time for a mixin.
.font-color(@f, @a: blue, @h: red, @c: blue, @v: purple) {
  color: @f;
  a:link { color: @a; }
  a:hover { color: @h; }
  a:active { color: @c; }
  a:visited { color: @v; }
}
Again, the pattern is clear: what you do not set explicitly gets reset to a default.
Conclusion
Separating style from content was never fully achieved with CSS. LESS (and Sass) finally allowed the separation to occur. And, using LESS, we can begin to round off the sharp edges of CSS. But instead of adopting a superset of CSS, we should be looking to subset CSS and replace problematic CSS properties with mixins. The subset could be enforced with a linter.
These recommendations are a good start, but there is still a long way to go.
Post script
There is one final reflection on CSS cascading that I wanted to mention but could not find a place for above, mainly because it is not a problem so much as an inconvenience. I have often wondered why in CSS, element styles (styles defined in the style attribute of an HTML tag) take precedence over all other styles. Similarly, why do styles defined in the HTML (in a style tag) take precedence over those that are linked to externally? It has always made more sense to me that it should be the exact opposite. An HTML page could define default styles for its elements, which would be carried in the page, and overridden with an external stylesheet.
However, the actual rules dictate that I must edit the HTML file if I want to change the style of an element with an element style. Is this not the exact opposite of the intention of CSS?
April 05, 2014
Summary: Use the OWASP Top Ten Project to minimize security vulnerabilities in your Clojure web application.
Aaron Bedra gave a very damning talk about the security of Clojure web applications. He went so far as to say that Clojure web apps are some of the worst he has seen. You should watch the talk. He has some good recommendations.
One of the jobs of web frameworks is to handle security concerns inherent in the web itself. Because most Clojure programmers build their own web stack, they often fail to look at the security implications of their application. They do not protect their site from even the easiest and most common forms of vulnerabilities. These vulnerabilities are problems with the way the web works, not with the particular server technology, yet it has become the server's responsibility to mitigate the vulnerabilities. Luckily, the vulnerabilities are well-studied and there are known fixes.
The Open Web Application Security Project (OWASP) does a very good job of documenting common web vulnerabilities and providing good fixes for them. They have a project called the Top Ten Project which every web developer should refer to regularly and use to improve the security of their app. You should also run through the Application Security Verification Standard checklists to audit your code. But the Top Ten should get you to understand the basics.
Warning: I am not a security expert. You should do your own research. The code I present here is my own interpretation of the OWASP recommendations. It has not been audited by experts. Do your own research!
Also, security is an ongoing concern. If you have any comments, suggestions, or questions, please bring them up!
Here is the Top Ten 2013 with a small breakdown and a Clojure solution, if applicable.
A1 Injection
If a server accepts input from the outside and then parses and interprets that input as a scripting or query language, it is open to attack. The most common form is SQL Injection, where an input form is posted to the server, the value of that form is concatenated into a string to make a SQL statement, and then the SQL statement is sent to the database to be executed. What happens if a malicious user types in "'; DELETE FROM USERS;"?
My preferred solution to SQL Injection in Clojure is to always use parameterized SQL statements. clojure.java.jdbc supports these directly. The parameters are passed to the database as values rather than spliced into the SQL text, so they cannot be interpreted as SQL.
Another problem is if you want to read in some Clojure data from the client, and you call clojure.core/read-string on it. read-string will execute arbitrary Java constructors. For instance:
#java.io.FileWriter["myfile.txt"]
This will create the file myfile.txt or overwrite it if it already exists. Also, there is a form (called the read-eval form) to execute code at read-time:
#=(println "Hello, vulnerability!")
Read in that string, and it will print. Any code could be in there.
The solution is to never use clojure.core/read-string on input you do not trust. Use clojure.edn/read-string, which reads edn, a well-documented format. It does not run arbitrary constructors. It has no read-eval forms.
Summary: Always use parameterized SQL and use clojure.edn/read-string instead of clojure.core/read-string on edn input.
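Both recommendations can be tried at the REPL. The following is a minimal sketch, not audited code: user-by-name-query is a hypothetical helper, and the users table and name column are assumptions made for illustration. It builds the kind of vector (SQL text followed by parameters) that clojure.java.jdbc functions such as query accept, then shows clojure.edn/read-string reading plain data and rejecting a read-eval form.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical helper: returns a [sql & params] vector. The table and
;; column names are made up for this example.
(defn user-by-name-query [user-name]
  ["SELECT * FROM users WHERE name = ?" user-name])

;; The hostile input stays a separate parameter; it is never
;; concatenated into the SQL text.
(def hostile-query (user-by-name-query "'; DELETE FROM USERS;"))

;; edn reads plain data...
(def parsed (edn/read-string "{:name \"alice\" :roles [:admin]}"))

;; ...and throws on a read-eval form instead of running it.
(def read-eval-result
  (try
    (edn/read-string "#=(println \"Hello, vulnerability!\")")
    (catch Exception _
      :rejected)))
```

With clojure.java.jdbc on the classpath, the vector would be passed along as something like (jdbc/query db hostile-query), where db is your own connection spec.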
A2 Broken Authentication and Session Management
Authentication
This is a big topic and I can't address it all here. Clojure has the Friend library, which is the closest thing we have to a de facto standard. My suggestion is simply to read the entire Friend README and evaluate whether you should use it. This is serious stuff. Read it.
Session Management
Ring provides a session system which is fairly good. It meets many of the OWASP Application Security Verification Standard V3 requirements. But it does not handle all of them automatically. You still need code audits. For instance, if you are logging requests, OWASP recommends against logging the session key. You must ensure that the session key is added after the request is logged.
The ASVS also recommends expiring your sessions after inactivity and also after a fixed period, regardless of activity. Ring sessions do not do this automatically (the builtin mechanism has no notion of expiration) and the default implementations of session stores will store and accept sessions indefinitely. A simple middleware will do the trick of expiring them in both cases:
(defn wrap-expire-sessions [hdlr & [{:keys [inactive-timeout
                                            hard-timeout]
                                     ;; :or takes symbols, not keywords
                                     ;; timeouts are in milliseconds
                                     :or {inactive-timeout (* 1000 60 15)
                                          hard-timeout (* 1000 60 60 2)}}]]
  (fn [req]
    (let [now (System/currentTimeMillis)
          session (:session req)
          session-key (:session/key req)]
      (if session-key ;; there is a session
        (let [{:keys [last-activity session-created]} session]
          (if (and last-activity
                   (< (- now last-activity) inactive-timeout)
                   session-created
                   (< (- now session-created) hard-timeout))
            (let [resp (hdlr req)]
              (if (:session resp)
                (-> resp
                    (assoc-in [:session :last-activity] now)
                    (assoc-in [:session :session-created] session-created))
                resp))
            ;; expired session
            ;; block request and delete session
            {:body "Your session has expired."
             :status 401
             :headers {}
             :session nil}))
        ;; no session, just call the handler
        ;; assume friend or other system will handle it
        (let [resp (hdlr req)]
          (if (:session resp)
            ;; stamp a newly created session so it is not
            ;; treated as expired on the next request
            (-> resp
                (assoc-in [:session :last-activity] now)
                (assoc-in [:session :session-created] now))
            resp))))))
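The expiry rule itself is easy to check in isolation. Here is a minimal sketch that restates the two timeout checks as a pure function; session-expired? and the two constants are hypothetical names introduced for illustration, and all times are assumed to be in milliseconds, as in the middleware.

```clojure
;; Hypothetical helper: true when the session should be rejected.
;; A session with missing timestamps counts as expired.
(defn session-expired?
  [{:keys [last-activity session-created]} now inactive-timeout hard-timeout]
  (not (and last-activity
            (< (- now last-activity) inactive-timeout)
            session-created
            (< (- now session-created) hard-timeout))))

(def fifteen-minutes (* 1000 60 15))
(def two-hours (* 1000 60 60 2))

;; Created 10 minutes ago, last used 1 minute ago: still valid.
(def active-session
  (session-expired? {:session-created 0 :last-activity (* 1000 60 9)}
                    (* 1000 60 10) fifteen-minutes two-hours))

;; Created 3 hours ago but used a second ago: the hard timeout wins.
(def old-session
  (session-expired? {:session-created 0 :last-activity (- (* 1000 60 60 3) 1000)}
                    (* 1000 60 60 3) fifteen-minutes two-hours))
```

Keeping the rule in a pure function like this is one way to unit test the policy without building Ring requests.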
Set the HttpOnly attribute on the session cookie. Very important for preventing stealing of session ids from XSS attacks.
Do not set the Domain attribute, and do set the Path if you want something more restrictive than / (the Ring session default).
Do not set the Expires and Max-Age attributes. Setting them makes the browser store the session id on disk, which simply expands the number of ways an attacker can get ahold of it.
Change the session cookie name to something utterly generic, like "id". You don't want to leak more information than necessary about how your sessions work.
Use HTTPS if you can and set the Secure attribute of the cookie.
Do not use in-cookie sessions. In-memory sessions are good but they can't scale past one machine. carmine has a Redis-based session implementation.
Summary: Here's how I use Ring sessions (with carmine) based on these OWASP recommendations.
(session/wrap-session
 (wrap-expire-sessions
  handler
  {:inactive-timeout 500
   :hard-timeout 3000})
 {:cookie-name "id"
  :store (taoensso.carmine.ring/carmine-store redis-db
                                              {:expiration-secs (* 60 60 15)
                                               :key-prefix ""}) ;; leak nothing!
  ;; Ring spells the HttpOnly attribute :http-only
  :cookie-attrs {:secure true :http-only true}})
A3 Cross-Site Scripting (XSS)
Whenever text from one user is shown to another user, there is the potential for injecting code (HTML, JS, or CSS) that is run in the victim's browser. Imagine if Facebook allowed any HTML in the post submission form. A malicious user could add a <script> tag with some keystroke logging code. Anybody who viewed that post in their feed would also get the key logger installed. That would be bad.
XSS is common because of how easy it is to make an app that stores user input (from a form post) in a database, then constructs the page out of stuff from the database. If you're not extremely careful, you could create a place where people can exploit each other.
The solution is to only use scrubbed or escaped values to build HTML pages. Because HTML pages can include different languages (HTML, CSS, JS), text needs to be scrubbed differently in each context. OWASP has a set of rules to follow which will guarantee XSS prevention.
hiccup.util/escape-html (also aliased as hiccup.core/h) will escape all dangerous HTML characters into HTML entities. JS and CSS still need to be handled, and rules for HTML attributes need to be followed.
If you want to allow some HTML elements, you will need to do a complex scrub. Luckily, Google has a nice Java library that sanitizes HTML. Use it.
Summary: Validate and scrub input from the user and scrub/escape text on output.
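To make the idea of escaping concrete, here is a minimal sketch of HTML-body escaping written with clojure.string/escape. It is an illustration only: escape-html-sketch is a hypothetical name, it is not Hiccup's implementation, and it covers the HTML body context only, not attributes, JS, or CSS.

```clojure
(require '[clojure.string :as str])

;; Replace the characters that are significant in HTML with entities.
;; Each character is replaced independently, so nothing is escaped twice.
(defn escape-html-sketch [s]
  (str/escape s {\& "&amp;"
                 \< "&lt;"
                 \> "&gt;"
                 \" "&quot;"
                 \' "&#39;"}))

(def escaped (escape-html-sketch "<script>alert('hi')</script>"))
```

The browser shows the escaped value as text rather than running it as a script. In a real app, prefer the library function over a hand-rolled one.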
A4 Insecure Direct Object References
This one is a biggie: each handler has to do authorization. Does the particular logged-in user have access to the resources requested? There's no way to automate this with a middleware. But having some system is better than doing it ad hoc each time. Remember: an attacker can construct any URL, including URLs with a database key in it. Don't assume that just because a request contains a key, the user must have the rights to it.
Summary: Always check the authority of the requesting session before performing an action.
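A small sketch of what such a check might look like inside a handler. Everything here is an assumption made for illustration: the session is taken to carry a :user-id, each account record an :owner-id, and find-account is a hypothetical lookup function from id to account map (or nil).

```clojure
;; True only when a logged-in user owns the account.
(defn owns-account? [session account]
  (and (some? (:user-id session))
       (= (:user-id session) (:owner-id account))))

;; Handler sketch: look up the requested account, then check ownership
;; before returning anything about it.
(defn show-account [find-account req]
  (let [account (find-account (get-in req [:params :id]))]
    (cond
      (nil? account)
      {:status 404 :headers {} :body "Not found."}

      (owns-account? (:session req) account)
      {:status 200 :headers {} :body (str "Balance: " (:balance account))}

      :else
      {:status 403 :headers {} :body "Forbidden."})))

;; A plain map stands in for the database in this example.
(def accounts {"1" {:owner-id 7 :balance 100}})

(def owner-response
  (show-account accounts {:params {:id "1"} :session {:user-id 7}}))

(def stranger-response
  (show-account accounts {:params {:id "1"} :session {:user-id 8}}))
```

The point is that the key in the URL only selects the record; the session decides whether the request may see it.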
A5 Security Misconfiguration
This is about keeping your software up to date and making sure the settings of all software make sense.
A6 Sensitive Data Exposure
Having data is risky. Don't let it leak out.
A7 Missing Function Level Access Control
Use an authorization system (Friend) and audit the roles used for access control.
A8 Cross-Site Request Forgery (CSRF)
Let's imagine you have a bank account at Bank of Merica. You just checked your balance and didn't log out. Then you go to some public forum, where someone has posted a cool file. There's a big download button. You click it, and the next thing you know, you're on your bank page and all of your money has been transferred out of your account.
What happened?
The download button said "Download" but it was really a form submit button. The form had hidden fields "to-account" and "amount". The action of the form was "http://www.bankofmerica.com/transfer-money". By clicking that button, the form was posted to the bank, and because you were just logged in, oops, it transferred all your money away.
The solution is that you only want to accept form posts that come directly from your site, which you control. You don't want some random person to convince people to click on other sites to be able to transfer people's money like that.
There are several possible solutions. One approach is to add a secret to the session and also insert that secret into every form. That is the approach taken by the ring-anti-forgery library.
The solution that I like is to do a double-submit. This means you submit a secret token in the cookie (sent with each web request) and in a hidden field in the form. The server confirms that the cookie and the hidden field match. But the hidden field in the form is added by a small Javascript script which reads it from the cookie. Browsers don't allow Javascript to read cookies from other sites, so you guarantee that the form was posted from your site.
There are three parts to the solution.
- Install a secret token as a cookie.
- Install a script to add the hidden field to all forms.
- Check that the field matches the cookie on POSTs.
Here is some code to do 1 and 3.
(defn is-form-post? [req]
  (and (= :post (:request-method req))
       (let [ct (get-in req [:headers "content-type"] "")]
         ;; the header can carry a charset or multipart boundary,
         ;; so compare only the media type prefix
         (or (.startsWith ^String ct "application/x-www-form-urlencoded")
             (.startsWith ^String ct "multipart/form-data")))))
(defn csrf-tokens-match? [req]
  ;; Ring parses request cookies into maps like {:value "..."}
  (let [cookie-token (get-in req [:cookies "csrf" :value])
        post-token (or (get-in req [:form-params "csrf"])
                       (get-in req [:multipart-params "csrf"]))]
    ;; a missing token must never count as a match
    (and (some? cookie-token)
         (= cookie-token post-token))))
(defn wrap-csrf-cookie [hdlr]
  (fn [req]
    (let [cookie (get-in req [:cookies "csrf" :value]
                         (str (java.util.UUID/randomUUID)))]
      (assoc-in (hdlr req) [:cookies "csrf"] cookie))))
(defn wrap-check-csrf [hdlr]
  (fn [req]
    (if (is-form-post? req)
      (if (csrf-tokens-match? req)
        ;; we're safe
        (hdlr req)
        ;; possible attack
        {:body "CSRF tokens don't match."
         :status 400
         :headers {}})
      ;; we don't check other requests
      (hdlr req))))
The Javascript should be something like this:
(def csrf-script "(function() {
  // match the csrf cookie wherever it appears in the cookie string
  var matches = document.cookie.match(/(?:^|; )csrf=([^;]*)/);
  if (!matches) { return; }
  var token = matches[1];
  $('form').each(function(i, el) {
    // each() passes a DOM element, so wrap it before calling attr()
    var form = $(el);
    if ((form.attr('method') || '').toLowerCase() === 'post') {
      var hidden = $('<input />');
      hidden.attr('type', 'hidden');
      hidden.attr('name', 'csrf');
      hidden.attr('value', token);
      form.append(hidden);
    }
  });
}());")
You should add it to all HTML pages. Note that this example script requires jQuery. Put it right before the </body>.
[:script csrf-script]
The nice thing about this solution is that it is strict by default. If you don't include the script, form posts won't work (assuming wrap-check-csrf is in your middleware stack).
Summary: CSRF attacks take advantage of properties of the browser (instead of properties of your server), so their defense can largely be automated.
A9 Using Components with Known Vulnerabilities
Software with known vulnerabilities is easily attacked using scripts. You should ensure that all of your software is up-to-date.
A10 Unvalidated Redirects and Forwards
One common pattern for login workflow is to have a query parameter that contains the url to redirect to. Since it's a user parameter, it's open to the world and could be a doorway for attackers.
For example, let's say someone sends an email to someone asking them to log in to their bank account. In it, there's this link:
http://www.bankofmerica.com/login?redirect=http://attackersite.com
What happens when they click? They see the legitimate site of their bank, which they trust. But it redirects them to the attacker's site, which has been designed to look like the bank site. The user might miss this change of domains and unwittingly reveal private information.
What can you do?
OWASP recommends never performing redirects, which is impractical. The next best thing is to never base the redirect on a user parameter. This would work, but puts a lot of trust in the developers and security auditors to check that the policy is enforced. My preferred solution allows redirects that conform to a whitelist of patterns.
(def redirect-whitelist
  ;; dots are escaped so they match only a literal "."
  [#"https://www\.bankofmerica\.com/" ;; homepage
   #"https://www\.bankofmerica\.com/account" ;; account page
   ;; ...
   ])
(defn wrap-authorized-redirects [hdlr]
  (fn [req]
    (let [resp (hdlr req)
          loc (get-in resp [:headers "Location"])]
      (if loc
        ;; re-matches only succeeds when the whole URL matches a pattern
        (if (some #(re-matches % loc) redirect-whitelist)
          ;; redirect on our whitelist, it's ok!
          resp
          ;; possible attack
          (do
            ;; log it (warning stands for your logging function)
            (warning "Possible redirect attack: " loc)
            ;; change redirect back to home page
            (assoc-in resp [:headers "Location"] "https://www.bankofmerica.com/")))
        resp))))
Summary: Redirect attacks can largely be avoided by checking the redirect URL against a whitelist.
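One detail worth seeing in isolation is why the check uses re-matches rather than re-find. The sketch below is self-contained; allowed-redirect? and the sample URLs are hypothetical names and values chosen for illustration.

```clojure
(def account-page #"https://www\.bankofmerica\.com/account")

;; True only when the entire location matches some whitelist pattern.
(defn allowed-redirect? [whitelist loc]
  (boolean (some #(re-matches % loc) whitelist)))

(def good-url "https://www.bankofmerica.com/account")

;; The bank URL appears inside the attacker's URL as a query parameter.
(def evil-url "http://attackersite.com/?next=https://www.bankofmerica.com/account")

;; re-find looks for the pattern anywhere in the string, so it would
;; wrongly accept the attacker's URL.
(def re-find-accepts-evil? (some? (re-find account-page evil-url)))
```

Because re-matches has to consume the whole string, a whitelist entry cannot be satisfied by a URL that merely contains a trusted address.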
Conclusion
Web security is hard. It takes education and vigilance to keep our servers secure. Luckily, the main security flaws of the web are well-understood and well-documented. However, this is only half of the work. These need to be translated into Clojure either as libraries or simply as "best practices". Further, these libraries and practices need to be discussed and kept top-of-mind.
February 24, 2015
Summary: Hiccup is a Clojure DSL for generating HTML. If you're using it, you might like these tips.
Hiccup is a Clojure Domain Specific Language (DSL) for programmatically generating HTML. It's one of the three tools I recommend in my Clojure Web stack. It's a thin layer over HTML, but oh, how I welcome that layer! The biggest win is that since it's an internal DSL, you can begin abstracting with functions in a way that you will never be able to, even in templating engines.
Hiccup is an example of a great Clojure DSL. It uses literal data structures pretty well, it's as readable as or more readable than what it translates into, and, as a layer of indirection, solves a few sticky problems of HTML. If you don't know it, go check out the Clojure Cookbook recipe for Hiccup. Hiccup might take some getting used to, but once you do, you'll appreciate it.
This article will assume you are familiar with the syntax and want to up your game.
Cross-site Scripting Vulnerability
Ok, this one is a pretty bad problem. Hiccup generates Strings, just plain old Java strings. If you put a String in the body of a Hiccup tag, it will just concatenate it right in there, no questions asked. That's just asking to be exploited, because that String is going to be shipped off to the browser and rendered, scripts and all.
Most templating libraries will default to escaping any String you pass them. The non-default, usually obviously marked alternative is to pass in a String unescaped. Hiccup got this backwards and it just sucks. It means you have to do extra work to be secure, and if you forget just once, your site is vulnerable.
The fix: This is the work you have to do every time you're getting a String that could be from the "outside" (a form submission, API request, etc.). Normally, you'd do this:
[:div content-of-div]
That will work but it's unsafe. You should do this:
[:div (h content-of-div)]
That little h (hiccup.core/h to be exact) there means escape the String. It sucks, but that's how you do it in Hiccup. One day I want to write a secure fork of Hiccup.
Overloading vectors
One downside to Hiccup and any DSL that overloads the meaning of vectors is that vectors are no longer useful as sequences within Hiccup. They now mean "start a new HTML tag". It's not a huge deal, but I've spent a lot of time debugging my code, only to realize that I was using a vector to represent a sequence. I use literal vectors everywhere else (because they're convenient and readable), but in Hiccup land they're wrong.
The fix: You can't use a literal vector, but you can call list to create a list. Not as beautiful, but it is correct. Sometimes I will call seq on a return value to ensure that it's never a vector.
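The vector-versus-sequence distinction can be checked in plain Clojure without Hiccup on the classpath. In this small sketch the names are made up for illustration; it shows which common constructs hand back a vector (which Hiccup would read as a tag) and which hand back a sequence (which is safe to use as children).

```clojure
(def items ["a" "b"])

;; mapv returns a vector, so Hiccup would try to read it as a tag.
(def children-as-vector (mapv (fn [i] [:li i]) items))

;; for (like map) returns a sequence, which is fine as a list of children.
(def children-as-seq (for [i items] [:li i]))

;; seq turns a non-empty vector into a sequence and an empty one into nil.
(def repaired (seq children-as-vector))
```

So a helper that builds its children with mapv or into [] is a likely culprit when a list of tags refuses to render as expected.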
I don't know why this still happens, but it's common, so I'll mention it. Sometimes I'll be looking at the HTML output in a browser and I just can't find an element. It's gone. Reload, still not there. When I look through the code, the hiccup to generate the tag is right there! Why won't it render?
Well, long story short, it's because in Clojure, only the last value of a let or fn is returned. Doh! My missing element was being rendered then discarded.
(defn list-with-header [header items]
  [:h3 header] ;; this header is missing
  [:ul
   (for [i items]
     [:li i])])
The fix: Wrap the two (or more) things in a list (not a vector!).
(defn list-with-header [header items]
  (list ;; wrap in a list, not a vector
   [:h3 header] ;; now it's there
   [:ul
    (for [i items]
      [:li i])]))
Hiccup plays nice with nil
This one is just a little design touch with some perks, not a problem that it's solving. In Hiccup, the empty list renders nothing. This is extended to nil as well. A common thing in HTML is that you want to render a bunch of children in a parent tag, but you don't want the parent tag if the list is empty.
Standard constructs will render the parent tag:
[:ul
(for [i items]
[:li i])]
When items is empty, you still get <ul></ul>. This is a problem with lots of templating libraries.
The fix: The fix in Hiccup is due to it playing nice with nil. Wrap the parent in a when:
(when (seq items)
[:ul
(for [i items]
[:li i])])
It's not beautiful, but then again, you could be using HTML with Moustache.
Use defn for snippet abstraction
HTML has no way to abstract. The crap you see is the crap you get. Many templating libraries have some kind of snippet concept, often referring to different files. Well, because Hiccup is just code inside of Clojure, you've got a better designed way of making reusable pieces of Hiccup.
Let's say I'm repeating something a lot:
[:div
[:ul
(for [i list1]
[:li i])]
[:ul
(for [i list2]
[:li i])]
[:ul
(for [i list3]
[:li i])]]
That ul is basically the same three times.
The fix: Standard functional abstraction. Pull out the repeated bit into a new, named function.
(defn make-list [items]
[:ul
(for [i items]
[:li i])])
Now you can call it from inside of Hiccup:
[:div
(make-list list1)
(make-list list2)
(make-list list3)]
Compiling your snippets
You may know this, but Hiccup is a compiling macro. That means it takes your literal vectors, maps, etc, and at compile time, generates code that will generate your HTML. All this means is that Hiccup is really fast, about as fast as concatenating strings can be.
But, because the Hiccup compiler doesn't do a full examination of your code, it can't compile everything. It inserts run time fallbacks for stuff it can't handle at compile time which will interpret it at run time. So, for instance, if you're calling a function that returns some Hiccup, it can't compile that automatically. It has to wait till the function returns to know what it is. That is, unless . . .
The fix: The way to get Hiccup to compile something is with the hiccup.core/html macro. That's the macro that does the compilation and it will do it anywhere. So if you've got code like this:
(defn make-list [items]
[:ul
(for [i items]
[:li i])])
(defn my-three-lists []
[:div
(make-list list1)
(make-list list2)
(make-list list3)])
You should wrap the Hiccup form in its own compiler, like this:
(defn make-list [items]
(html
[:ul
(for [i items]
[:li i])]))
For this little example, it probably won't make a noticeable difference. But it can be significant for larger documents.
Just to note: the Hiccup compiler can understand if and for forms, so there's no need to wrap them in the compiler. No harm if you do, either.
Autoterminating
Just a good thing to know about HTML.
Did you know that this is not legal HTML?
<script />
It's true. A script tag can't be self-closing.
There are all sorts of silly rules in HTML like this. And then there's XML mode versus HTML mode. We are lucky: Hiccup does all of this for you, so you don't have to wade through the HTML spec(s!) looking for why something won't render in IE7.
The fix: hiccup.core/html takes a map of options as the first argument (it's optional). If you pass in a :mode option, it will render in that mode. Unfortunately, the modes are incompatible with each other, so you have to pick one. There are three modes, :xml, :html, and :xhtml. The default is :xhtml.
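As a small sketch of the :mode option, assuming Hiccup 1.x on the classpath, here is the same kind of element rendered in different modes. The outputs in the comments are what I'd expect from Hiccup 1.x; check them against the version you run.

```clojure
(require '[hiccup.core :refer [html]])

;; br is a void element, so it never gets a closing tag.
(html {:mode :html} [:br])
;; => "<br>"
(html {:mode :xhtml} [:br])
;; => "<br />"

;; script is not a void element, so the HTML modes always close it.
(html {:mode :html} [:script])
;; => "<script></script>"

;; XML mode knows nothing about HTML's rules and self-closes empty tags.
(html {:mode :xml} [:script])
;; => "<script />"
```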
Id/class DSL
Hiccup is a DSL. And it has its own sub DSL for HTML ids and classes. It's similar to CSS selectors.
Let's say you have a div like this:
[:div
{:id "main-content"
:class "blue-background green-border"}
(h "Here's some content.")]
Well, Hiccup lets you make that shorter and easier to read.
The fix: Use the id/class DSL:
[:div#main-content.blue-background.green-border
(h "Here's some content")]
Here's how it works. Every element has an optional id and zero or more classes. After the tag name (div here), you can put the id after a #. Then list your classes starting with a . each. Omit the # if there's no id. Ditto for the . if there's no class. Oh, and the id must come first. That will get converted to the id and class attributes you want. Also, these will conflict with attributes in the map, so choose one or the other, not both.
Generating Hiccup from HTML
Last thing, I promise!! Sometimes you have some HTML that someone gave you and you want to Hiccupify it so you can just stick it into your code. Manually doing that is kind of tedious. Luckily, there's a solution out there for you.
The fix: This page lists three options for outputting Hiccup from HTML. I have personally used Hiccup-bridge. It does a good job (and it even goes both ways). You call it like this:
lein hicv 2clj hello.html
That will output hicv/hello.clj in hiccup format. Pretty rocking when you need it.
Conclusion
Well, there you go. My Hiccup tips. Hiccup is pretty nice for templating. I recommend using it (or my future secure fork) in your web projects. If you'd like to learn more Hiccup and how to build web apps in Clojure, please check out LispCast Web Development in Clojure. It's a video course teaching just that. All you need to know is Clojure and HTML.
July 10, 2014
Summary: Clojure is well-suited for processing JSON, but there are some decisions you have to make to suit your application. The major decisions are actually easy, though they took me a while to figure out.
I tend to use JSON instead of edn for an API serialization format, if only because JSON is more readily readable from other languages. I could do both, but it's good to eat your own dogfood and test those JSON code paths.
edn is better, but JSON is generally pretty good. However, JSON's expressibility is decidedly a subset of the Clojure data structures, so there is considerable loss of information when going from Clojure to JSON. That information is not recovered in a round-trip, at least not automatically. There are lots of decisions that have to go into how to, at least partially, recover this.
One bit of information that is lost is the type of keys to a map. JSON only allows strings as keys. Clojure allows anything. But most of the time, I find myself using keywords for keys. I say most, but really, it's the vast majority. Maps are bundles of named values pretty much all the time. So the optimal decision, after trying lots of combinations, is to convert keywords to strings (the default in JSON libraries I've seen) when emitting JSON; and to convert map keys (always strings in JSON) to keywords (also known as keywordize-keys) when parsing JSON. That covers nearly all cases, and pinpointed special cases can cover the rest.
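A minimal sketch of that round trip, assuming cheshire is on the classpath. The account map is made-up data.

```clojure
(require '[cheshire.core :as json])

;; Hypothetical data for illustration.
(def account {:name "Ada" :balance 10})

;; Keywords become strings on the way out.
(def wire (json/generate-string account))

;; Without the second argument, keys come back as strings.
(json/parse-string wire)
;; => {"name" "Ada", "balance" 10}

;; Passing true keywordizes the keys, which restores the original map.
(= account (json/parse-string wire true))
;; => true
```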
But that's not the end of the keyword/string story. What about namespaces? Surprisingly, the two major JSON libraries, clojure.data.json and cheshire, handle things differently. How do you parse a JSON key that has a slash in it, indicating a namespace? If we're keywordizing (as I suggest above), they both give namespaced keywords (keyword will parse around the /). But when emitting JSON, they act differently. clojure.data.json will emit the name of the keyword (and ignore the namespace) while cheshire emits a string with "namespace/name".
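To make the cheshire side concrete, here is a short sketch, assuming cheshire is on the classpath. The :user/id key is just an example.

```clojure
(require '[cheshire.core :as json])

;; cheshire keeps the namespace when emitting a keyword key.
(json/generate-string {:user/id 7})
;; => "{\"user/id\":7}"

;; Keywordizing on the way back in splits around the slash again.
(def parsed (json/parse-string "{\"user/id\":7}" true))

(namespace (first (keys parsed)))
;; => "user"
(name (first (keys parsed)))
;; => "id"
```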
I like to keep the namespace, or, put another way, I like to drop as little information as possible. So I prefer the namespace approach. I'm not sure how to get clojure.data.json to do that. I just use cheshire. The other gotcha for namespaces is that ClojureScript's clj->js and js->clj are similarly asymmetrical.
Keywords in other places besides the keys of maps will just get turned into strings, but they don't get converted back to keywords. That sucks, but it's minor. You'll just have to convert them back some other way. At work, we use Prismatic Schema's coercions. They do the trick nicely, in a declarative way.
So, back to other JSON issues. The other issue is other data types. Dates, URIs, and UUIDs are in our data as well. Dates, well, it's up to you how to encode them. I've always been a fan of the Unix timestamp. It's not exactly human readable, but it's universally parseable. There's also the ISO datetime format, which is probably a better bet--it's human readable and agreed upon among API designers. You could emit that as a string and coerce back to a Date object later.
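The ISO string approach can be sketched with plain Java interop, assuming Java 8 or later for java.time. The function names date->iso and iso->date are hypothetical helpers, not part of any library.

```clojure
(import '(java.time Instant)
        '(java.util Date))

(defn date->iso
  "Formats a java.util.Date as an ISO-8601 string in UTC."
  [^Date d]
  (str (.toInstant d)))

(defn iso->date
  "Parses an ISO-8601 instant string back into a java.util.Date."
  [^String s]
  (Date/from (Instant/parse s)))

(date->iso (Date. 0))
;; => "1970-01-01T00:00:00Z"

;; The round trip gives back an equal Date.
(= (Date. 0) (iso->date (date->iso (Date. 0))))
;; => true
```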
URIs and UUIDs are by definition strings, so that's easy. How do you set up cheshire to handle the encoders? It's pretty simple, really.
(cheshire.generate/add-encoder java.net.URI cheshire.generate/encode-str)
That means add the encoder for the java.net.URI type to be encoded as a JSON string. str will be called on the value. You can figure out the other types you need. There are some JSON emission settings built-in, including Date (the ISO string format) and UUID. Weirdly URI is not in there, so you have to add it.
What's next? Oh, pretty-printing. Yeah, I pretty-print my JSON to go over the wire. It's nice for debugging. I mean, who wants to curl one long, 1000-character line of JSON? Put some whitespace, please! How to do that?
(cheshire.core/generate-string mp {:pretty true})
That's right, it's basically built in, but you have to specify it. But, oh man, that's long. I don't want to type that, especially because my lazy fingers are going to not do it one time, then I'm going to look at the JSON in my browser and see a one-line JSON mess. So, what do I do? I put all my JSON stuff for each project in json.clj. It's got all my add-encoder stuff, and it's got two extra functions, just for laziness:
(defn parse [s]
(cheshire.core/parse-string s true))
(defn gen [o]
(cheshire.core/generate-string o {:pretty true}))
Or of course whatever options you want to pass those functions. This one is my choice--you make your choice. But these two functions are all I use most of the time. Parsing strings and generating strings. Woohoo! Much shorter and less to keep in my little head.
Well, that just about wraps up my JSON API story. There's the slight detail of outputting JSON from liberator, which is its own blog post. And there's a bit of generative testing I do to really challenge my assumptions about how I set up the round-tripping. But that, too, is another blog post for another day. Oh, and what about all that JSON middleware? Again, another post.
If you like peanut butter and you like jelly, you will probably like peanut butter and jelly sandwiches. If you like web and you like Clojure, you will most definitely like Web Development in Clojure, which is a gentle, soothing, visually rich video course ushering in the fundamentals of Clojure web development through your eyes and ears and down through your fingertips and into your very own Heroku-hosted web server. At least watch the preview!
February 18, 2015
Summary: Ring, the Clojure Web library, defines three main concepts that you use to construct web applications.
Ring is the center of the Clojure web universe. It's not the only option in town, but the other options refer to Ring to say how they are different. Understanding Ring will help you learn any other web system in Clojure.
Ring has three main concepts that work together to build a web application.
- Adapters
- Handlers
- Middleware
Adapters
I like to think of Ring Adapters as plug adapters. When you go to a different continent, you often have to adapt the holes in the electrical outlet to fit your cords. Ring Adapters let you convert different web server implementations into a standard Ring Request/Response format. That way, all of your code is standardized to the Ring format. Your code can travel into any kind of server as long as an adapter exists.
There are Ring Adapters for many existing servers.
And more.
Handlers
Handlers do the work of your application. They are like the computer. They are just Clojure functions. HTTP is basically a request/response protocol that maps well to functions, which are just a protocol from argument to return value. Handlers take a Ring Request and return a Ring Response. They should do whatever logic is necessary for your application.
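A minimal sketch of a Handler, using nothing but plain Clojure. The name hello-handler and the response text are made up.

```clojure
(defn hello-handler
  "A Ring Handler: takes a request map and returns a response map."
  [request]
  {:status  200
   :headers {"Content-Type" "text/plain"}
   :body    (str "You asked for " (:uri request))})

;; Since it is just a function, it can be called without any server.
(hello-handler {:request-method :get :uri "/about"})
;; => {:status 200,
;;     :headers {"Content-Type" "text/plain"},
;;     :body "You asked for /about"}
```

Being able to call the handler directly like this is what makes handlers easy to test.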
Middleware
Middleware are the voltage converters. Here in North America, wall sockets run at 120 volts, which is different from almost everywhere. In order to run an appliance from elsewhere, you not only need to adapt the socket, you also need to transform the current to a compatible voltage. Middleware are often used to convert the incoming request in some standard way. For instance, there is middleware to parse a JSON body into a Clojure map and store it away in the request.
The transformer also "cleans up" the current. Voltage spikes are evened out so they never get to the computer. Middleware can similarly protect a handler by making sure the browser is logged in.
The analogy kind of breaks down, because middleware can do work (like the computer). Middleware are the hardest part of the Ring idea. They're not hard because the concept is hard. They're hard because they require design decisions. If all you had were Adapters and Handlers, you wouldn't have to think about where to put your logic. It would all go in the Handlers.
But there would be a lot of duplicated logic in your handlers. Authentication, routing, content-type switching, all of these things are done the same way over and over. It's the perfect problem for a little higher order programming. That's essentially what Middleware is.
Ring Middleware are functions that take a Handler and return a new Handler. Since Handlers are functions, Middleware are higher-order functions. The transformer on your computer's power cord takes a machine that requires a certain current and turns it into a machine that takes a different current. Middleware are used to do all sorts of things.
So, for instance, there's a Middleware called Prone that captures exceptions in the Handler and displays them in a nice format. Prone is a function that takes a Handler and returns a new Handler that catches exceptions and returns a different Ring Response in that case. Or you have Middleware that handle session cookies. The Middleware take a Handler and return a new Handler that understands sessions.
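Here's a rough sketch of the "protect the handler" idea in plain Clojure. Everything in it is hypothetical: wrap-require-login is not a real library function, and it assumes some earlier middleware has put a :user key on the request.

```clojure
(defn wrap-require-login
  "Middleware: takes a Handler and returns a new Handler that only
  calls the original when the request carries a :user (an assumed key)."
  [handler]
  (fn [request]
    (if (:user request)
      (handler request)
      {:status 401 :headers {} :body "Please log in"})))

(defn secret-handler
  "The Handler being protected."
  [request]
  {:status 200 :headers {} :body "The secret"})

;; Wrapping a Handler gives back another Handler.
(def app (wrap-require-login secret-handler))

(:status (app {:uri "/secret"}))
;; => 401
(:status (app {:uri "/secret" :user "eric"}))
;; => 200
```

The handler itself never has to mention logins, which is the duplicated logic the middleware is absorbing.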
My recommendation for what to put in Middleware versus what to put in Handlers is simplest to explain with a graph.
Along the x-axis, we have logic that ranges from HTTP logic (handling header values, query params, etc.) to business logic (which bank account to withdraw from). Along the y-axis, we have how unique the logic is, ranging from highly duplicated to custom. These are the two axes I use to figure out whether it should be in the Handler or the Middleware.
The clear cases are easy. In the upper right corner (red dot), where it's custom business logic, it's definitely in the Handler. In the lower left (blue dot), where it's duplicated HTTP logic, I prefer Middleware. The hard part is in the middle. Somewhere between those two, there's a fine line where a case-by-case decision is required.
Conclusions
Ring is great because it requires so few concepts to capture so much of HTTP. But it's not everything. Standard Ring does not support WebSockets, for instance. Small adaptations are necessary. In general, I think this is a great abstraction. And Ring is so central to Clojure on the Web, it's important to know.
If you want to learn more about Ring and how to construct web applications in Clojure, you should check out LispCast Web Development in Clojure. It's a video course designed to guide you through the major libraries used for web development. You'll learn how to build Clojure web applications from existing and custom parts. You build Middleware to make your application adapt to browser limitations. And if you sign up below, you'll get a handy Ring Spec reference sheet, which specifies the Request and Response formats.
July 22, 2014
Summary: Ring is great because it closely models the HTTP message format using native Clojure data structures. It strictly defines a message format that any software can use and rely on. With Ring 1.3, the specification has gotten even closer to the HTTP spec.
A couple of months ago, Ring 1.3 was released without much fanfare. It included a few improvements and updates, but in general, not much had changed.
One change, though, is very significant: the specification is shorter. It's simpler. Three keys were deprecated in the Ring request map (:content-type, :content-length, and :character-encoding). These keys were unnecessary because their values were in the headers, which are also in the Ring request. Equivalent utility functions have been added for pulling the data out of the headers.
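To see why the keys were redundant, here is a sketch in plain Clojure that reads the same values straight from the headers. The request map is made up, and the helper names are hypothetical. Ring's own utilities for this live in ring.util.request.

```clojure
;; A made-up request. Ring header names are lower-cased strings.
(def request
  {:request-method :post
   :uri            "/notes"
   :headers        {"content-type"   "application/json; charset=utf-8"
                    "content-length" "17"}})

(defn header-content-type
  "Returns the raw Content-Type header value, or nil if absent."
  [request]
  (get-in request [:headers "content-type"]))

(defn header-content-length
  "Returns the Content-Length header as a number, or nil if absent."
  [request]
  (some-> (get-in request [:headers "content-length"])
          Long/parseLong))

(header-content-type request)
;; => "application/json; charset=utf-8"
(header-content-length request)
;; => 17
```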
Why is this important? While many libraries get more complex and overburdened, it is refreshing to see a library going in the correct direction of shedding complexity. It does not significantly impact application development. Nor does it reduce the already low barrier to entry. Still, I welcome this kind of change.
Ring is the central specification that ties most of the Clojure web ecosystem together. The spec should be minimal. And a mark of good software is that it models the problem very closely without unnecessary abstraction. Ring merely defines a common format (using Clojure data structures) that mirrors the text-based HTTP message format. That's why Ring has worked so well thus far and why it is appreciated in Clojure.
Because I was so happy about the change, I decided to update my Ring Spec to Hang on the Wall PDF. The newly deprecated keys are gone. It used to be two pages long. The Ring Request took up an entire page, and the Response took up about half of one. But now, with three keys removed and a little tweaking of the font sizes, everything fits on one page.
One page in big, readable fonts, with just the information you need for quick reference. I like it. I'm printing one out right now to tack on the wall. You can get a free copy for yourself by getting on the PurelyFunctional.tv mailing list here.
March 28, 2014
Summary: The Ring SPEC is the core of the Clojure web ecosystem. The standard is small and a reference is handy.
If you program the web in Clojure, you probably use Ring. Even if you don't, your server is likely Ring compatible.
Ring has a small SPEC. It's centered around defining the keys one can expect in the request and response maps. And the exact names for keywords are easy to forget.
I don't want to forget. I use Ring often enough that I want a quick reference. A while ago, I printed out a quick summary of the keys for the request and response maps and hung it on the wall behind my monitor. I refer to it frequently.
If you program the web in Clojure, you might appreciate this printout. If you're learning, it could be an invaluable reference. Please download the PDF.
And if you like it, you might also like my Web Development in Clojure video series.
April 17, 2014
Summary: One reason to separate style from content is to reuse HTML or CSS. Ultimately, we would like a solution where we can reuse both.
Reusable Content
There is an economic reason to separate presentation from content. Publishers have thousands of pages of HTML on their site, yet they want to enhance the style of their pages over time. It would cost a lot of money to change every single page to match their new style. So they invest a little more time writing each page so that the HTML markup does not refer to styles but to the semantics of the content (referred to as semantic HTML). Then they hire a designer to write CSS to make their existing content look new. The HTML is permanent and reusable, and the CSS is temporary and not-reusable. The separation is only one way: the HTML doesn't know the CSS, but the CSS does know the HTML.
Examples: CSS Zen Garden, newspaper websites, blogs
Characteristics: Semantic markup, CSS tailored to classes/structure of HTML
Reusable Styles
Yet another economic reason is a relatively newer phenomenon. It has become very easy to create a new web site/application. Writing (or generating) lots of HTML is cheap, and it changes often during iterative development. What is relatively expensive is to design each of those pages each time the pages change. CSS is not good at adapting to page structure changes. So people have built CSS frameworks where the CSS is (relatively) permanent and the HTML is temporary. In these cases, the HTML knows the CSS, but the CSS doesn't know the HTML. The separation is again one way--this time the other way.
Examples: Open Source CSS, Bootstrap, Foundation, Pure
Characteristics: HTML tailored to classes/structure of CSS, Reusable CSS
Reusable Content and Styles
What if a newspaper site, with millions of existing HTML pages, could cheaply take advantage of the reusable styles of frameworks like Bootstrap? That is the Holy Grail of separation of concerns. What would be required to do that?
What we really want is a two-way separation. We want HTML written in total isolation and CSS written in total isolation. We want permanent HTML and permanent CSS. How can the style and content, each developed separately, finally be brought together? The answer is simple: a third document to relate the two.
We have already seen that CSS is not good at abstraction. CSS cannot name a style to use it later. However, LESS does have powerful forms of abstraction. LESS has the ability to define reusable styles and apply them to HTML that did not have those styles in mind. If you put the definition of reusable styles in one document and the application of those styles in another document, you achieve true separation. And it is already happening a little bit. You can do it in your own code.
It is a bit like a software library. We put the reusable bits in the library, and their specific use in the app.
Examples: Compass, Semantic Grid System
Characteristics: Semantic markup, Reusable Styles, Tie-in document to relate Style to Content
Conclusion
CSS preprocessors, which began as convenience tools, are actually powerful enough to solve fundamental problems with HTML and CSS. While it is still early, LESS and other CSS preprocessors, if harnessed correctly, could dramatically transform how we build and design web sites. Typography, grids and layout, and other design concerns can be used as pluggable libraries. And other languages that are specifically designed to do that may emerge. What would a systematic, analytical approach to such a design look like?
For more inspiration, history, interviews, and trends of interest to Clojure programmers, get the free Clojure Gazette.
Learn More
Clojure pulls in ideas from many different languages and paradigms, and also from the broader world, including music and philosophy. The Clojure Gazette shares that vision and weaves a rich tapestry of ideas from the daily flow of library releases to the deep historical roots of computer science.
March 23, 2014
Summary: There are a number of web frameworks in Clojure, but beginners should roll their own server stack themselves to tap into the Ring ecosystem.
One question that I am asked a lot by beginners in Clojure is "What web framework should I use?" This is a good question. In Python, there's Django. In PHP, Drupal. And of course in Ruby, there's the king of all web frameworks, Ruby on Rails.
What framework should you use in Clojure? The question is actually kind of hard to answer. There are a number of web frameworks out there. Some people call Compojure a framework, though it is really a library. lib-noir does a lot of work for you. Then there's your true frameworks, like Pedestal or Hoplon, which provide infrastructure and abstractions for tackling web development. All of these projects are great, but for a beginner, I have to recommend building your own web stack starting with Ring.
Compojure is really just a routing library, not a framework. You can use it for your routing needs, though there are alternatives, such as playnice, bidi, Route One, and gudu. If you don't want to decide, use Compojure. It's widely used and works great. If you want to go in depth, read the docs for the others. They are each good for different cases.
lib-noir comes from Noir, which was a web framework (now deprecated). It was easy and provided a bit of plumbing already built for you, so you could just start a project with a lot of the infrastructure built in. lib-noir is that infrastructure in library form. I haven't used it, but a lot of people like it. However, when I look at it, I see that most of what it provides I either won't use or it is trivial to add myself. That would normally be ok if there was huge adoption for it (like with Rails) so you get an ecosystem effect, but there isn't. lib-noir is used but certainly not dominant.
Pedestal has a lot of backing. It aims to tackle single-page apps by providing a sane front-end environment using ClojureScript in the form of a message queue. If you're into "real-time apps", this may be for you. Though, I would caution you that it's not for a Clojure beginner. Pedestal introduces a lot of new concepts that even experienced Clojure programmers have to learn. The tutorial is long and arduous. You will have problems learning Pedestal without knowing Clojure.
Update: Pedestal has changed dramatically since I last looked at it. It is no longer a frontend framework. It is now a high-performance backend server for high performance asynchronous processing. It's worth looking at if you need that. Otherwise, I stick with my recommendation to use basic Ring.
Hoplon is also designed for web apps. It gives you a DOM written in ClojureScript (including custom components), dataflow programming (like a spreadsheet), and client-server communication. It's a bold step, but again, requires you to buy into programming models that will take a long time to understand. If you are not already familiar with Clojure, you are asking for trouble.
There are other frameworks out there. But I recommend rolling your own stack. If you're learning Clojure, the best way to grasp how web apps work is to get a Ring Jetty adapter set up with some basic handlers. Add existing middleware as you need it. Write some middleware of your own. Use Compojure to route. Use Hiccup to generate HTML. That setup will get you a long way.
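A minimal sketch of that setup, assuming Ring (with the Jetty adapter), Compojure, and Hiccup are all on the classpath. The route, the wrap-served-by middleware, the header name, and port 3000 are made up for illustration.

```clojure
(require '[compojure.core :refer [defroutes GET]]
         '[compojure.route :as route]
         '[hiccup.core :refer [html h]]
         '[ring.adapter.jetty :refer [run-jetty]])

(defn wrap-served-by
  "A tiny home-made middleware: adds a header to every response."
  [handler]
  (fn [request]
    (assoc-in (handler request) [:headers "X-Served-By"] "my-stack")))

;; Compojure does the routing, Hiccup generates the HTML.
(defroutes app-routes
  (GET "/" [] (html [:h1 (h "Hello from Ring")]))
  (route/not-found "Not found"))

;; The whole app is a Handler wrapped in Middleware.
(def app (wrap-served-by app-routes))

(defn -main []
  ;; The Jetty adapter turns the Handler into a running server.
  (run-jetty app {:port 3000 :join? false}))
```

Because app is an ordinary function, you can try it at the REPL with a request map such as {:request-method :get :uri "/"} before ever starting Jetty.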
Ring is just functions. With a few basic concepts and a copy of the Ring SPEC handy, you can build a web server very quickly that does exactly what you want and you can understand every aspect of it. The experience of building one yourself can teach you a lot about how the other frameworks are put together.
What's more, Ring is dominant. Most people write functionality (in the form of middleware and handlers) assuming Ring and no more. So by staying close to the metal, you are tapping into a huge reservoir of pre-written libraries that are all compatible with each other. Ring is the locus for the Clojure web ecosystem.
Wiring up your own middleware stack is not that daunting. If you want guidance, my Web Development in Clojure video series is now for sale. It starts with a brand new Clojure project and ends with a fully functional app, backed by a database, and hosted on Heroku (for free!). In one hour, it explains all the concepts and shows you lots of examples. There's lots of exercises to get your brain whirring.
Recommended stack:
Then keep adding and customizing.
November 24, 2013
Clay Shirky nails it with nice, narrative style.