SQL Doldrums: transactions

Are you using an SQL database to build your news app? Two things are slowing you down: object-relational mapping (e.g., ActiveRecord) and transactions.

Today, let’s talk transactions.

What’s a transaction?

Convince lots of users to INSERT, UPDATE and DELETE rows in your database. Make them keep doing it. Then ponder what’s in your database right now. And then now. And then now.

Each instant, your database is valid. Unplug your computer and reboot it. It’s still valid.

Your database appears to jump from transaction to transaction, as though only one person is writing at a time. That’s called ACID compliance.

Isn’t that a good thing?

You certainly want your database to spin back up after you unplug your computer.

Aside from that, transactions don’t always behave as you’d want.

Let’s say two users visit a page, and you run this SQL for each:

UPDATE pages SET n_comments = n_comments + 1 WHERE id = 2;

You’re guaranteed there’s only one write at a time. But what about the read?

Oddly, “n_comments + 1” is a read. If both users run this query at the same time, they’ll both read the same value for “n_comments + 1.” Then your page will show the wrong comment count.

In short: transactions are error-prone. They don’t solve the problems you’d expect them to solve. There are no quick fixes in general.

(Not that it relates to my essay, but there is a quick fix for this specific problem: SET n_comments = SELECT COUNT(*) FROM comments WHERE page_id = 2;)

Okay, but how are transactions a bad thing?

First off, transactions make you comfortable when you shouldn’t be. That’s bad.

Moreover, transactions can be slow.

Let’s say you’re importing data. You have one large transaction with tons of INSERTs and UPDATEs and DELETEs. While you’re transacting, the database system maintains all those rows separately from the rest of the data, in case you don’t COMMIT in the end. There’s a whole lot of bookkeeping that you can’t control. And before you know it, you can hit a bizarre outcome: your large transaction can take too long to COMMIT … and too long to ROLLBACK. You’re stuck with an unusable database.

Now consider tiny transactions. If you have lots of users and you do things when they click, this applies to you. To ensure your data is saved, your database system may fsync() often. That’s extremely slow — far slower than the actual process of writing things.

So I shouldn’t use transactions?

If you’re using SQL, every query is in a transaction. You can’t avoid transactions.

If your database is slow, you have two options:

  1. Configure your database system specifically for your news app. For instance, turn off automatic fsync. (In practice, this can be very hard — especially if you run two apps that need different things, or if you share your project with other developers.)
  2. Don’t use SQL. (In practice, this is often easy.)

What’s your beef with SQL?

It’s a solution for some problems. But it might not be the best solution for your problem.

Think of your next news app. You probably expect your users to do a few things (let’s say five). You expect X clicks, and you want each response to come in Y seconds. You’ll seed your database with Z megabytes of data….

Keep pondering your news app. Think of how you’ll use it. Think of how your users will use it. Think of where you’ll get your data and how you’ll keep it current. Think and think and think, for hours….

… and you’ll never think: “I need transactions and ACID compliance and b-tree indexes and associative tables and a mature data manipulation language.”

Those are SQL servers’ main features. For most news apps, they’re all useless.

Okay, now I’m rambling. Back to my main point.

Transactions have advantages, but they may not apply for your news app. Transactions have downsides, and they probably do apply to your news app.

And if you use SQL, you must use transactions.

Journalist, ex software engineer

Journalist, ex software engineer