I’m no big fan of SQL. When I was a young, naive programmer, I really enjoyed ‘doing good architecture’ and normalizing my database. After maintaining my fourth or fifth major ORM layer, however, I started to suspect the emperor’s clothes were perhaps not very practical for a blizzard.
But that’s not why SQL is doomed. SQL is doomed because it simply won’t fit into the new world of distributed computing – no matter how much leverage we give the shoehorn.
Key Problem
The interaction between two trends is moving the programming field to distributed databases. One, most new software is written for the Internet, and is therefore usually designed with scalability in mind. Two, we started reaching some fundamental limits on CPU speed. Scalability has increasingly meant working with multiple CPUs and computers, not just bigger machines. Gone are the days when you could spend 100x to get a computer that was 100x the speed of your desktop. Now the ‘big iron’ just has better interconnects, and it’s a much tougher decision between that and buying ten desktops.
As we started implementing distributed databases on lots of machines, some smart folks formalized the CAP theorem. There are better descriptions elsewhere, but basically you just have to realize that when you are running on a lot of unreliable hardware, network partitions will happen, and during a partition you need to give up either Consistency or Availability. Since most of us are working on the Internet, and believe in constant uptime… we’re left with only Consistency to give up.
Painful, but necessary.
If Consistency Goes, So Goes SQL
This is the crux of my argument. Working with a modern document-oriented database, it’s just not that big of a deal to give up some consistency. There are a number of different strategies. Some aren’t that great, like: Don’t Worry About It and Hope. Another strategy is built into CouchDB – Conflicts Happen, Just Detect Them and let the programmer decide which record to keep.
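To make the "detect the conflict and let the programmer decide" strategy concrete, here is a minimal sketch. It is not CouchDB's API: the `merge` function, the `resolve` callback, and the sample documents are all hypothetical names made up for illustration, and it assumes each replica still knows the last version both sides agreed on.

```python
from typing import Callable

Doc = dict


def merge(base: Doc, mine: Doc, theirs: Doc,
          resolve: Callable[[Doc, Doc], Doc]) -> Doc:
    """Three-way merge of one whole document.

    `base` is the last version both replicas agreed on (an assumption
    of this sketch, not something a real database hands you for free).
    """
    if mine == theirs:      # same edit on both sides, nothing to decide
        return mine
    if mine == base:        # only the other replica changed the document
        return theirs
    if theirs == base:      # only this replica changed the document
        return mine
    # Both sides changed the document differently: a real conflict.
    # Hand both versions to application code and let it pick.
    return resolve(mine, theirs)


def keep_longer_body(a: Doc, b: Doc) -> Doc:
    """Hypothetical application rule: keep the version with more text."""
    return a if len(a["body"]) >= len(b["body"]) else b


base = {"title": "Post", "body": "draft"}
mine = {"title": "Post", "body": "draft, edited on my laptop"}
theirs = {"title": "Post", "body": "draft v2"}

winner = merge(base, mine, theirs, keep_longer_body)
```

The point of the sketch is that the decision is made once, for the whole document, by a single small function.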
Which strategy you choose depends somewhat on what kind of records you are keeping and what you need out of consistency for them. But the great salvation of document-oriented databases is that you’re choosing this strategy for an entire document. It’s a bit scary when you have to start thinking about giving up consistency, but for a lot of documents it’s really pretty easy: keep the version with the latest timestamp.
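The "keep the last timestamp" rule is small enough to sketch in a few lines. The `updated_at` field and the function name are assumptions for illustration, and the sketch assumes the replicas' clocks are close enough to be compared.

```python
def last_write_wins(a: dict, b: dict) -> dict:
    """Keep whichever whole document was written most recently.

    Assumes every document carries a comparable `updated_at` value
    (a hypothetical field name). Ties go to `a`.
    """
    return a if a["updated_at"] >= b["updated_at"] else b


on_replica_1 = {"_id": "profile:42", "nickname": "sam", "updated_at": 1700000100}
on_replica_2 = {"_id": "profile:42", "nickname": "sammy", "updated_at": 1700000250}

kept = last_write_wins(on_replica_1, on_replica_2)
```

Because the rule applies to the document as a unit, the result is always a version that some replica actually held.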
Serializing Objects is the Soft Underbelly of SQL
But if we start trying to deal with eventual consistency issues on a SQL database, we hit a brick wall. The relational model hasn’t mapped well to objects in a long time. Everybody ends up using some sort of ORM (Object Relational Mapper) to deal with the problem, but the mismatch is fundamental to relational databases.
What are you going to do with row-level consistency problems? Push only big transactions that affect every important row at once? If you try this, you run up against the CAP problem and can’t partition your database well.
Do you make some heinously complicated additions to your ORM to determine what to do with each row and fix up consistency as needed? ORM layers are already at the limit of reasonable complexity. Making it worse increases the cost of using SQL too much.
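A small sketch shows why row-level resolution is awkward. Here one object (an order) is spread over two rows, and the same "latest timestamp wins" rule is applied per row. The table names, the `ts` column, and `merge_rows` are all hypothetical, made up for this illustration.

```python
def merge_rows(replica_a: dict, replica_b: dict) -> dict:
    """Per-row last-write-wins: for each row key, keep the newer row."""
    merged = {}
    for key in replica_a.keys() | replica_b.keys():
        a, b = replica_a.get(key), replica_b.get(key)
        if a is None or (b is not None and b["ts"] > a["ts"]):
            merged[key] = b
        else:
            merged[key] = a
    return merged


# One order, stored relationally as an order row plus a line-item row.
# On each replica the total matches the line item's price.
replica_a = {
    ("orders", 1): {"total": 30, "ts": 5},
    ("items", 1): {"order_id": 1, "price": 30, "ts": 2},
}
replica_b = {
    ("orders", 1): {"total": 10, "ts": 3},
    ("items", 1): {"order_id": 1, "price": 10, "ts": 4},
}

merged = merge_rows(replica_a, replica_b)
# The order row comes from replica A and the item row from replica B,
# so the merged order is one that neither replica ever contained.
```

With a document store the order and its items would be a single document, so the same rule would pick one replica's order in full.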
I don’t think there is a good answer. I may be wrong, but I don’t think SQL is strong enough for the challenge. NoSQL databases are not made by maverick programmers who don’t want good architecture. They dispense with SQL because relational databases aren’t up to the challenge of eventual consistency in an object-oriented world.