Thoughts from GigaOM Structure 2011
Structure 2011 has been about Cloud. Earlier this year GigaOM ran a Big Data conference in NYC, but this week in San Francisco the top three topics were cloud, cloud and cloud. And one of the big themes in the cloud is data, including data storage and data management.
The most interesting panels have been 1) the Guru Panel with representation from Facebook, Salesforce, Netflix, Comcast and LinkedIn, and 2) the Cloud Databases panel with representation from NuoDB (fka NimbusDB), Xeround, ParAccel and Cloudant.
Guru Panel folks (Sid Anand, Claus Moldt, Jacob Rosenberg, Kevin Scott) tackled the big issues of how you run a large web-facing service. Sid Anand talked about what it took to move Netflix to the Amazon cloud. Much of it is covered here. The interesting comment was that they intend to move to masterless database instances. The requirement is that anything can run even if everything else fails. They want to go beyond database fail-over and have peer database instances. In fact they want to turn the Chaos Monkeys loose at the database tier. Cool idea.
But it is also one of those “told you so” moments. Netflix could let their Chaos Monkeys loose on a NuoDB (fka NimbusDB) installation and find that a) the data is always safe, and b) as long as there are two nodes running they would have a live and accessible database, obviously with some performance limits. What Anand was highlighting was that even with a highly cloud-targeted architecture such as the one Netflix is running, the database tier is behind the curve in terms of business continuity. Ping Li of Accel Partners reinforced this in a later panel when he noted that delivering elastic cloud solutions cannot be done by simply repurposing traditional inelastic database technologies. This all sounds like an ad for peer-to-peer database architecture. You heard it here first.
The other interesting thing from the panel was the discussion of solid state disks (SSDs). There was consensus from a panel of people who should know that this is the next big infrastructure revolution. Several SSD vendors are at the conference and the stories are very credible. A big immediate market for SSD vendors is database acceleration. Replacing your spindle storage with SSD storage for, e.g., a big Oracle installation gives you a machine that has many characteristics of an Oracle Exadata appliance without making an Exadata commitment. The Guru Panel clearly felt that SSDs are set to substantially change their architectural thinking. For too long the performance characteristics of the spinning disk have determined how we build large systems, and at a time when data volumes and management requirements vastly exceed anything we have seen before, this is a Big Deal.
The Cloud Databases Panel (Mike Miller, Razi Sharir, Jim Starkey, Barry Zane) represented NewSQL (NuoDB/NimbusDB, Xeround), NoSQL (Cloudant) and BI/Analytics (ParAccel). It also represented different delivery models (software license, DBaaS and mixed). Topics included common cloud challenges as well as elements unique to the different categories. The common requirements were elasticity and multi-tenancy. It was interesting that for technologies targeted at such different markets and usage patterns these base requirements are the same. Jim Starkey summarized the view with his comment that “Existing RDBMS just flat don’t scale. Big ramifications.”
But there are big differences in emphasis between the panelists too, highlighted by the audience question on killer apps and target database size. Answers:
| Company | Killer App | DB Size |
|---|---|---|
| NuoDB/NimbusDB | Web applications | Any size. For OLTP, working-set size is what matters. |
| Xeround | SQL access to unstructured data | 2-50 GB |
| Cloudant | All of the above, plus sync to mobile | 1 GB-1 TB |
The hardest question was from an audience member who wanted to know how NuoDB (fka NimbusDB) can be a 100% ACID SQL database and yet scale elastically. Jim Starkey was given about 143 seconds to respond to the question, and made the only useful comment you can make in that time. He pointed out that if you define consistency in a distributed database as a requirement for a known, global state at all points in the system at all times, then you have defined a synchronous distributed system. Obviously a synchronous distributed system is not going to scale.
It is the ability in NuoDB/NimbusDB to deliver a consistent view of the database to any observer on any node at any time, without actually enforcing a known, global state of the database at every moment, which allows the system to be profoundly asynchronous and thereby scalable. It also allows the system to operate as a set of collaborative peer-to-peer processes that make voluntary contributions to overall system performance.
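To make the distinction concrete, here is a minimal sketch of the general idea of a consistent view without an enforced global state, using a toy multi-version store. This is an illustration only: the panel did not describe NuoDB/NimbusDB internals, so the technique shown (versioned records read through a snapshot timestamp) and every name in it (`VersionedStore`, `Snapshot`, `commit`, `get`) are assumptions, and the sketch runs in a single process rather than across peers.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    value: int
    commit_ts: int  # logical timestamp of the transaction that wrote it


@dataclass
class VersionedStore:
    """Toy multi-version store (hypothetical): writers append, readers never block."""
    clock: int = 0
    versions: dict = field(default_factory=dict)  # key -> list of Version

    def commit(self, writes: dict) -> int:
        """Publish a set of writes together under one new timestamp."""
        self.clock += 1
        for key, value in writes.items():
            self.versions.setdefault(key, []).append(Version(value, self.clock))
        return self.clock

    def snapshot(self) -> "Snapshot":
        """Give an observer a view fixed at the current timestamp."""
        return Snapshot(self, self.clock)


@dataclass
class Snapshot:
    store: VersionedStore
    read_ts: int

    def get(self, key):
        """Return the newest value committed at or before read_ts, else None."""
        for version in reversed(self.store.versions.get(key, [])):
            if version.commit_ts <= self.read_ts:
                return version.value
        return None


store = VersionedStore()
store.commit({"alice": 100, "bob": 0})
reader = store.snapshot()                # observer starts here
store.commit({"alice": 50, "bob": 50})   # a transfer commits afterwards
# The earlier observer still sees a consistent pre-transfer state.
print(reader.get("alice"), reader.get("bob"))
# A later observer sees the post-transfer state.
later = store.snapshot()
print(later.get("alice"), later.get("bob"))
```

Each observer sees a state in which the two balances always add up to 100, even though no step ever stops the world to agree on a single current state. Old versions simply stay readable for whoever still needs them.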
It was a good point on which to end the panel. The big challenges in cloud databases are elasticity and multi-tenancy. It’s great to hear one of the fathers of relational databases confirm that the hardest of the database elasticity problems (the OLTP problem) can be solved if you think about it differently.