It's all about hybrid workloads
Part of being in a category such as NewSQL, alongside competing technologies such as VoltDB, is understanding where your solutions overlap with, complement, or diverge from one another. This week VoltDB announced their Enterprise Export-to-Hadoop Client, which clearly shows us where they see their solution within the NewSQL market.
“Big Data applications come with a complex combination of operational and analytical challenges,” said VoltDB CEO Scott Jarr.
Put slightly differently, big data requires a combination of OLTP and OLAP database techniques. Even more succinctly, it’s all about hybrid workloads. We here at NuoDB agree with this. When we talk to customers we also hear the request for near real-time analytics and low-latency, high-throughput transaction processing within the same database. The driver is simple: companies want to be more agile, and to do that they need good BI on their transactional data.
From their press release we see that, “VoltDB’s Integration for Hadoop allows customers to rapidly move high velocity data from VoltDB to CDH for long term storage and analysis.” Imagine a situation where a discovery during the analytics stage has implications for decision making that occurs within the transaction processing stage. In a VoltDB/Hadoop world, processing hybrid OLTP and OLAP workloads means that you’ll be shuffling data back and forth between the database and HDFS via Sqoop, a SQL-to-Hadoop database import and export tool (a process more commonly called extract-transform-load, or ETL), so that you can use Hive to coordinate your analytical map/reduce job. Then, after your map/reduce results are ready, you’ll need to trigger a process to move a portion of that data back into your VoltDB cluster so that it can have an effect on your transactional processing. This sounds like a process measured in minutes or more, involving many complex pieces of software that all have to be administered, secured, and coordinated.
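To make the "minutes or more" point concrete, here is a minimal sketch of the round trip as three sequential stages. The stage durations are hypothetical placeholders chosen only to show the arithmetic, not measurements of VoltDB, Sqoop, or Hive, and the names `Stage`, `PIPELINE`, and `round_trip_seconds` are invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    seconds: float  # hypothetical placeholder, not a measurement


def round_trip_seconds(stages):
    """Stages run one after another, so the end-to-end delay is their sum."""
    return sum(stage.seconds for stage in stages)


# Hypothetical durations, assumed purely for illustration.
PIPELINE = [
    Stage("export from VoltDB into HDFS", 60.0),
    Stage("Hive-coordinated map/reduce job", 300.0),
    Stage("load results back into VoltDB", 60.0),
]

if __name__ == "__main__":
    total = round_trip_seconds(PIPELINE)
    print(f"insight reaches the transactional side after {total / 60:.0f} minutes")
```

Because the stages are strictly sequential, the delay before an analytical discovery can influence transactions can never be shorter than the slowest stage, whatever the real numbers turn out to be.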
Isn’t there a better way? Why can’t a scalable database also be flexible? Certainly there will be a category of OLAP/BI queries that are highly complex and require some amount of time to process, and for that category a Hadoop cluster might be the right call.
We believe that a hybrid workload is manageable on a NuoDB cluster simply because of our natural ability to dynamically respond to changes in load and run in an asynchronous, non-blocking manner.