Last week I presented on “Building Distributed Applications That Don’t Suck” twice at the ADUG conferences in Melbourne and Canberra. As the name may suggest, it’s a heavily opinionated session about the common mistakes people make when building distributed systems, and some techniques you can use to avoid the mistakes.
One of the great things about opinionated sessions is that they spark arguments, and I have thoroughly enjoyed the discussions that started during the breaks between sessions and that have continued on in email since then. One of those discussions has been around distributed transactions, and my advice to actively look for opportunities to avoid them. In the session I discussed how, where the business rules allowed, you could use things like Idempotency and even ordering of inserts to avoid the cost of a distributed transaction. However, it was the topic of Eventual Consistency that caused the most angst.
In a single RDBMS, we’re very often taught to wrap changes in a transaction, so that they succeed or fail as a single unit. It’s also through transactions that we can control the visibility of changes. This is all about ensuring that outside the bounds of that transaction, our data is consistent. All hail ACID!
However in a distributed environment, this strong pursuit of consistency comes at the cost of either availability or partition tolerance (see the CAP Theorem). Hence the idea that where you see an opportunity to tolerate an inconsistent state for awhile, you may benefit from taking advantage of it.
This can be hard to swallow initially, as a number of people made clear. I struggled with it when I was first introduced to the idea. It doesn’t apply everywhere, there are some business processes where you absolutely need immediate consistency, however over time I’ve been surprised at how often I can either relax the consistency for a particular process, or at least minimize the scope of the data that needed to be immediately consistent.
The discussion of Eventual Consistency has had a revival of late with the rise of NoSQL databases, and one of the better discussions I’ve heard is on an episode of Software Engineering Radio with Dwight Merriman from 10Gen, the folks behind MongoDB. If this is new territory for you, and you don’t mind having your assumptions challenged, I recommend you have a listen.