Evaluating Apache Cassandra as a Cloud Database

You hear the phrase “Cloud Database” a fair bit, but usually without a definition.  What makes something a Cloud Database, and what sorts of things disqualify a system from being one?

Sometimes the absence of a definition means that everyone already understands the term.  I have never heard a careful definition of a bus, but we all generally agree on whether something is a bus, is not a bus, or is a borderline bus.  Maybe Cloud Databases are things that we all just know when we see them, but I am not so sure.  In talking to a lot of smart people who generally understand technology quite well, it is clear that the term has a range of interpretations.  So it is great to see the DataStax guys define what they mean, with reference to Cassandra.

They talk about eight dimensions of the cloud, and the corresponding characteristics of a Cloud Database:

  • Transparent Elasticity
  • Transparent Scalability
  • High Availability
  • Easy Data Distribution
  • Redundancy
  • Support for all Datatypes
  • Easier Manageability
  • Lower Cost

Well, it’s fantastic that they are a) trying to define the characteristics of a Cloud Database and b) doing a good job of it.  Our own definition is similar:

  • Elasticity
  • Virtualization
  • Extreme Availability
  • Geo-distribution
  • “Zero” DBA

That’s a pretty close match.  We would add that SQL, ACID transactions and a data model that is application-independent are pretty high on the list of things you would ideally want.  You could argue that this is more about the Database part than the Cloud part, and it is the latter that is generally not well defined. Fair enough.

Congratulations to DataStax for stepping up and trying to define this.  We’d welcome a refinement of the definitions – there is no doubt that customers need this kind of clarity.
