RethinkDB is “the database for solid state drives”. Or, it was. Slava Akhmechet, the brains behind the system, now talks about a system that adapts to the storage medium, and indeed says that the product is primarily targeted at supporting high-speed blob retrieval for web page generation. Focus, focus, focus. The company sounds more market focused. Smart.
But it raises the question: Consolidation or fragmentation? The old orthodoxy was what Mike Stonebraker parodied as “one size fits all” (OSFA); the new orthodoxy is specialized databases, or “workload optimized” systems. A specialized database can give you 2 orders of magnitude (100x) performance benefit.
The Vertica column store can beat a general purpose rowstore by gigantic margins on big analytic queries, but sacrifices insert performance and forgoes ACID transactions. StreamBase can perform queries at 1M messages per second, per node, by processing incoming events before they are stored. VoltDB is blindingly fast for OLTP workloads if you can live with single-partition transactions, limited SQL support, RAM-based durability and pre-canned queries. Ditto for document databases, XML databases, blobstores, arraystores, etc. Specialized can be special-fast.
If specializing gave you 20% performance gains it would be a yawn. 10x would be interesting but 100x is compelling. And “workload-optimized” is still far from fully exploited. Both StreamBase and VoltDB are compiler-driven: Their executable images are substantially code-generated from application specifications. It’s a safe bet that there is a lot of performance leverage in optimizing those compilers over time.
So is that it? No more general purpose databases? Or are we missing something?
All we’re missing is the Big Picture. When it comes to essential goods, from bicycles to shoes, shops to kitchen equipment, consumers favor general solutions until they prove inadequate. Cell phones are killing the camera market. Wedding photographers, photojournalists and sports photographers will always want specialized cameras, but for most people the consolidated solution is Good Enough. Markets evolve through specialization and then consolidate back to general solutions where possible. The pendulum swings between costly specializations that break new ground and the economics of commoditization that drive generalized solutions to cover much of the new niche space.
And for databases the negative impact of specialized systems goes well beyond the incremental cost of more software licenses, or the inconvenience of carrying both a camera and a phone. There are very expensive data integration issues, data integrity issues, skills issues, security issues, vendor management issues and much more. People will go to a lot of trouble to use a single database product if they can. If that’s right then someone is going to build a better general purpose database, one that learns the lessons of the specialist systems and competes with all of them on an 80/20 basis. The system will be good enough for the vast majority of what people need, and specialized systems will continue to push the limits in their own dimensions.
What would such a system look like, and where would it come from? Functionally it would need to sound a lot like a traditional OSFA database (SQL, transactions, elastic scalability, BLOBs, etc.). But it would a) scale out elastically on commodity hardware, b) allow specialized nodes within the architecture, and c) embrace concurrent execution of hybrid workloads. The company that delivers that will have delivered the 21st century database.
Where it would come from is simple: The specialized systems are all pressured to broaden their capabilities. You see NoSQL stores trying to add SQL. You see systems with no disk-based durability trying to add durability. And you see incumbent SQL vendors trying to add horizontal scale-out. In other words, the next-generation general-purpose database will be an evolution of something with a far more specialized initial market focus. Some have started with the right architecture to get there, and others have not.
So the thought about RethinkDB is that Slava is doing a smart thing: Pick a place you can win and be the best at it. Specialize and exploit your architectural advantage. But don’t assume unlimited fragmentation of the database market. The pendulum will swing. Extreme workloads will be the domain of specialized engines; for everything else there will be general-purpose systems.
NuoDB (fka NimbusDB) is built on a radical new architectural foundation that delivers elastically scalable transactional SQL solutions today (NewSQL). The technology is peerless in that domain, and we’re very excited about what we’re hearing from our early users. We’re very busy and very focused.
But on occasion we step back to talk about One Size Fits Most (OSFM). We know we’re going to be there. The debate is about who else can step up.