Databases in Clojure: Lobos, java.jdbc and environ

December 30, 2013

Today I wanted to learn about Lobos, the schema migration library for Clojure.

While I ultimately had a different project in mind, I picked Fjord, my “blogging engine”, as a guinea pig. Fjord had no database and simply stored posts in memory [1]. Perfect candidate to add a database to. The other elements (routing, serving pages, rendering templates) were already there.

A migration in Lobos looks like this:

(defmigration add-posts-table
  (up [] (create
           (tbl :posts
             (varchar :title 200)
             (text :content))))
  (down [] (drop (table :posts))))

This migration initializes the blog’s database, adding the posts table. Migrations can also be rolled back, so we have a down part that undoes the up part.

Note that you start with a completely empty database (created using an admin interface, such as the Postgres psql command line). I’m not sure if it is common to think of the creation of your initial tables from nothing as a “migration”, but Lobos seems to encourage that use case, and I see no good reason why not.

Since this will be a single-user blog, the schema is simple. A post is a title, content, and date of publication. tbl, a macro provided in the Lobos README, wraps table to add a few nifty things to the table we’re creating. In particular, it adds a created_on column, defaulting to the current time at insertion. We’ll use that as the publication date.
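For reference, the helper looks roughly like this (reproduced from memory, so treat it as a sketch and check the Lobos README for the exact definition):

```clojure
;; Approximate helpers from the Lobos README: surrogate-key adds an
;; auto-incrementing :id primary key, timestamps adds audit columns,
;; and tbl threads a table definition through both.
(defn surrogate-key [table]
  (integer table :id :auto-inc :primary-key))

(defn timestamps [table]
  (-> table
      (timestamp :updated_on)
      (timestamp :created_on (default (now)))))

(defmacro tbl [name & elements]
  `(-> (table ~name)
       (timestamps)
       ~@(reverse elements)
       (surrogate-key)))
```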

The migration goes in a namespace called lobos.migrations (note, you don’t use the name of your project), and in lobos.config you provide a DB spec. Here’s my lobos.config:

(ns lobos.config
  (:use lobos.connectivity))

(def db
  {:classname "org.postgresql.Driver"
   :subprotocol "postgresql"
   :user "dbuser"
   :subname "//localhost:5432/fjord"})

(open-global db)

As with all SQL database libraries in Clojure, your project.clj needs to include the appropriate adapter for the database you’re using — in my case, PostgreSQL (I found the latest version by searching Maven):

[org.postgresql/postgresql "9.3-1100-jdbc41"]
[org.clojure/java.jdbc "0.3.0-beta1"]
[lobos "1.0.0-beta1"]

Be careful: the java.jdbc dependency’s version has to stay in sync with Lobos [2]. Then, open up a REPL and (migrate) away:

user=> (require '[lobos.core :refer [migrate rollback]])
nil
user=> (migrate)
add-posts-table
nil

Voilà, there’s a posts table! If something went wrong, run (rollback) to undo the most recent migration.
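For completeness, rolling back at the REPL looks just like migrating; the migration’s name is echoed as it’s undone:

```clojure
user=> (rollback)
add-posts-table
nil
```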

OK, so I have a posts table. That’s cool, but not very satisfying if I’m not doing anything with it.

Querying and inserting with java.jdbc

Probably most Clojure programmers who need SQL queries go with Korma, a high-level DSL that lets you generate and manipulate queries in Clojure. A wildly different approach is offered by YesQL, where you write SQL queries separately from your Clojure code, save them as “.sql” files, and load them as needed.

Both are based on java.jdbc, a low-level wrapper around Java’s database abstraction. For now, I want to use java.jdbc directly. Maybe in the future I’ll reach for Korma when I do “real work”, but then, maybe not — there’s no way to appreciate its advantages and tradeoffs unless I try without it.

Coincidentally, java.jdbc’s interface has been completely redone with the recent 0.3.0 release. I have no idea what it was like to use it before, but one of the most surprising parts is that there’s no manual connection management. I thought I would have to create a database connection when my server starts up, save the connection globally in a reference type somewhere, and pass that to all database functions. I was wrong.

Remember the database spec from before?

(def db
  {:classname "org.postgresql.Driver"
   :subprotocol "postgresql"
   :user "dbuser"
   :subname "//localhost:5432/fjord"})

This isn’t a database connection. It’s just a hashmap with info about how to connect to the database. And yet, this is what I pass as the db parameter to all the java.jdbc methods, and they do the right thing!

Apparently, functions like insert! and query check to see whether a connection to the described database is already open, and if not, they open one. This isn’t made entirely clear in the documentation, but from a quick read of those functions’ source, it appears to be what’s going on.
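If you’d rather manage the connection yourself, for instance to reuse one connection across several calls, java.jdbc 0.3.0 also provides a with-db-connection macro. A quick sketch (using my posts table):

```clojure
(require '[clojure.java.jdbc :as jdbc])

;; with-db-connection opens a single connection from the spec and binds
;; a connected version of it; both calls below share that connection.
(jdbc/with-db-connection [conn db]
  (jdbc/insert! conn :posts {:title "Hello" :content "First post"})
  (jdbc/query conn ["SELECT id, title FROM posts"]))
```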

My old model layer was a sham, saving to and loading from nothing but an atom:

(defonce ^:private posts (atom ()))

(defn get-by-id [id]
  (first (filter #(= (:id %) id) @posts)))

(defn retrieve-latest
  "Retrieves latest posts, skipping `offset` (if provided) and limiting
   to `limit` results (if provided, else 10)."
  [& {:keys [offset limit]
      :or {offset 0, limit 10}}]
  (take limit (drop offset @posts)))

(defn add!
  "Adds a post, assigning it a unique ID and returning that ID."
  [post]
  (let [last-id (-> (retrieve-latest :limit 1) first (get :id 0))
        new-id (inc last-id)
        new-post (assoc post :id new-id)]
    (swap! posts conj new-post)
    new-id))

It was the quickest way to get a prototype running — we just maintain an atom holding a list of posts, each post a hashmap.

The new one:

(defn get-by-id [id]
  (first (query db ["SELECT id, title, content
                       FROM posts WHERE id = ?" id])))

(defn retrieve-latest
  "Retrieves latest posts, skipping `offset` (if provided) and limiting
   to `limit` results (if provided, else 10)."
  [& {:keys [offset limit]
      :or {offset 0, limit 10}}]
  (query db ["SELECT id, title, content
                FROM posts
                ORDER BY created_on DESC OFFSET ? LIMIT ?"
             offset
             limit]))

(defn add!
  "Adds a post, assigning it a unique ID and returning that ID."
  [post]
  (:id (first (insert! db :posts post))))

Simple and obvious. In add! you can see me taking advantage of another pleasant surprise: when you call insert!, it returns a map of all the autogenerated columns. I’m pretty stoked I don’t have to do that manually, since SQL itself doesn’t offer a standard way to return autogenerated columns during an insert.
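To make that concrete, here’s roughly what the call looks like at the REPL (the returned values are illustrative, not captured output):

```clojure
user=> (insert! db :posts {:title "Hello" :content "First post"})
({:id 1, :created_on #inst "2013-12-30T12:00:00.000-00:00"})
```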

Saving the config nicely with environ

You’ll notice that, before, I plopped all my database configuration directly into the lobos.config namespace. This is not ideal for a couple reasons: first, the application proper needs the same spec, and duplicating it in two places invites drift; second, the values are hardcoded, so they can’t differ between development and production.

The first issue is easy — I moved the DB spec into fjord.config and made lobos.config require that. Solved.

The second issue I’ve dealt with in other languages before, but not yet in Clojure. It turns out that environ provides a handy solution.

In my project.clj, I add environ as a dependency and lein-environ as a plugin:

:dependencies [...
               [environ "0.4.0"]
               ...]
:plugins [...
          [lein-environ "0.4.0"]]

and I add an :env key to my development profile (note, the Ring stuff was already there):

:profiles
{:dev {:dependencies [[ring-mock "0.1.5"]
                      [ring/ring-devel "1.2.1"]]
       :env {:fjord-db-user "dbuser"
             :fjord-db-table "fjord"}}})

Then plug this into the database spec, with appropriate defaults:

(ns fjord.config
  (:require [environ.core :refer [env]]))

(def dbspec
  {:classname "org.postgresql.Driver"
   :subprotocol "postgresql"
   :user (get env :fjord-db-user "dbuser")
   :password (get env :fjord-db-pass "")
   :subname
   (str "//" (get env :fjord-db-host "localhost")
        ":" (get env :fjord-db-port 5432)
        "/" (get env :fjord-db-table "fjord"))})

I can now use environment variables like FJORD_DB_TABLE, FJORD_DB_HOST and FJORD_DB_PORT to override these values in production, while relying on the defaults and the stuff in project.clj during development.

Testing

When I converted the blog over to a database backend, I made my one and only test — testing the model layer — fail. It turned out I had made two mistakes. One, I forgot the (first ...) in add!, so it wasn’t returning the new post’s ID. Two, retrieve-latest was ordering the posts in ascending order by date — so it was returning the oldest posts. That was because I forgot to put “DESC” in the SQL query.

While the test did its job, it now has the ungainly side effect of adding a fake post to my actual blog each time it’s run. I had used the handy with-redefs macro to mock the atom where I was storing posts, but now the atom isn’t being used and that has no effect. For my next trick, I’ll need to figure out how to initialize a mock database, make that work with Lobos and java.jdbc, and use it in all the tests.
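One direction I’m considering, sketched below. Everything here is hypothetical: the fjord_test database and the namespace names are made up, and the scratch database would still need its Lobos migrations run before any test touches it.

```clojure
(ns fjord.test.models
  (:require [clojure.test :refer [use-fixtures]]
            [fjord.config :as config]))

;; Point the spec at a scratch database for the duration of each test,
;; so the real blog is never touched.
(def test-db
  (assoc config/dbspec :subname "//localhost:5432/fjord_test"))

(use-fixtures :each
  (fn [run]
    ;; Works as long as the model looks the spec var up on every call.
    (with-redefs [config/dbspec test-db]
      (run))))
```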

Here’s the code for this project as of this post’s writing.

  1. Which means, yes, all the posts were lost when you restarted the server.

  2. If not, Lobos fails to load: I was requiring it at the REPL and getting an error until I tracked this down. I could just omit the java.jdbc dependency, but I also plan to use java.jdbc directly, as you’ll see in a minute.
