The Scalability Paradox

When considering database performance, it is easy to focus on the size of the database in terms of storage. In practical terms, however, the scalability of the database application is every bit as important, if not more so. For those of us who worked as database administrators in the distant past, planning enough capacity to cope with peak transaction workloads was a tricky affair. Not so many years ago, processing capacity was limited by the speed of your server's processor(s) and the memory it could use. With so-called symmetric multi-processing (SMP), all CPUs share the same memory. This leads to bottlenecks when high throughput is needed, as the CPUs contend with one another for access to that shared memory.

In recent years more and more high-throughput systems have used massively parallel processing (MPP), in which many CPUs run in parallel to execute a single task, each with its own allocated memory. This is much harder to program, as the different processors need some way to communicate with one another, but it avoids the bottlenecks of SMP. With an MPP approach a large database request may be split up and sent to each processor in the cluster to execute in parallel. An analogy is queuing at a bank. If there is only one bank teller, then the queue of customers has to wait for each customer in turn to be dealt with. If more tellers are available, then the queue will move more quickly, provided that each customer’s business can actually be handled independently by the various tellers.
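The split-and-combine idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular MPP database works: the function names (`partition`, `partial_sum`, `parallel_total`) and the sample `sales` data are hypothetical, and threads stand in for the separate nodes of a real cluster, so the sketch shows the structure of the approach rather than a genuine speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_workers):
    """Split the rows into n_workers slices, one per (hypothetical) node."""
    return [rows[i::n_workers] for i in range(n_workers)]


def partial_sum(rows):
    """Work done by one node, using only its own slice of the data."""
    return sum(amount for _, amount in rows)


def parallel_total(rows, n_workers=4):
    """Send the request to every worker, then combine the partial results."""
    slices = partition(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, slices))
    return sum(partials)


# Hypothetical data: (order_id, amount) pairs.
sales = [(order_id, order_id % 7 + 1) for order_id in range(1000)]
total = parallel_total(sales)
print(total)
```

The sketch works because a sum can be computed independently on each slice and then combined, which is the equivalent of each teller serving a customer without needing the others. Requests whose parts depend on one another do not split this cleanly.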

One key issue is how variable the demand is: in the bank analogy, do customers come through the door at a steady pace throughout the day, or are there peaks and troughs of demand? In some businesses customer demand varies a great deal. In a retail business there will be much more demand for processing sales transactions around Christmas or other peak buying periods. Florists will experience strong peaks around Valentine’s Day, Mother’s Day and so on. This causes headaches for capacity planners, who do not necessarily want to buy enough servers to cope with the year’s busiest day, since all that extra capacity will be idle throughout much of the rest of the year. Moreover, the problem of peak demand seems to be worsening. In the words of one logistics consultant: “the peaks are getting even ‘peakier’”.
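To make the idle-capacity problem concrete, here is a rough sketch of the arithmetic. Every figure in it is invented for illustration (the daily transaction volumes, the capacity of one server, the number of peak days), and the helper names `servers_needed` and `fixed_vs_elastic` are hypothetical.

```python
def servers_needed(demand, per_server_capacity):
    """Smallest whole number of servers that can carry the demand."""
    return -(-demand // per_server_capacity)  # ceiling division on integers


def fixed_vs_elastic(daily_demand, per_server_capacity):
    """Compare server-days for a peak-sized fleet with a day-by-day fleet."""
    per_day = [servers_needed(d, per_server_capacity) for d in daily_demand]
    fixed = max(per_day) * len(per_day)  # sized for the busiest day, all year
    elastic = sum(per_day)               # sized for each day's actual demand
    return fixed, elastic


# Assumed year: 360 ordinary days and 5 peak days (all figures made up).
demand = [10_000] * 360 + [80_000] * 5
fixed, elastic = fixed_vs_elastic(demand, per_server_capacity=5_000)
print(f"peak-sized fleet: {fixed} server-days")
print(f"demand-sized fleet: {elastic} server-days")
print(f"utilization of the peak-sized fleet: {elastic / fixed:.0%}")
```

With these assumed numbers the peak-sized fleet is busy for only a small fraction of the server-days it provides. The "peakier" the demand, the lower that fraction falls.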

This is where the cloud can help. Instead of running your business transactions on the servers in your data center, the cloud offers the possibility of a more flexible, scalable solution. As demand increases you would ideally just rent more capacity, adding more servers and memory as required in order to deal with extra workload. When things calm down you can hand back that extra capacity to the supplier rather than having to pay for that peak capacity forever. In order for such a scenario to work you do need a database that is actually capable of handling such variations in capacity. Ideally you want a database that has linear or near-linear scalability: in other words, if you double capacity, a task should run twice as fast. In practice it is very difficult to attain such linear scalability, so customers should carry out detailed value testing rather than trusting vendor claims of scalability. Follow up customer references and make sure that you test things on your own data: what may work in a different environment may not work so well in your own situation. With these caveats, there is little doubt that scale-out, cloud-based database approaches offer many businesses a significant opportunity to seamlessly add capacity over time and handle peaks and troughs of demand in a way that was not practical just a few years ago.
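When testing scalability on your own data, one simple yardstick is to compare the speed-up actually measured with the capacity that was added. The sketch below assumes you have timed the same task on different cluster sizes. The timings shown are hypothetical placeholders, not measurements of any product, and the function names are illustrative.

```python
def speedup(baseline_seconds, scaled_seconds):
    """How many times faster the task ran after capacity was added."""
    return baseline_seconds / scaled_seconds


def scaling_efficiency(baseline_seconds, scaled_seconds, capacity_factor):
    """1.0 means linear scaling; lower values mean diminishing returns."""
    return speedup(baseline_seconds, scaled_seconds) / capacity_factor


# Hypothetical timings for one task: number of nodes -> elapsed seconds.
runs = {1: 120.0, 2: 66.0, 4: 40.0}

for nodes, seconds in runs.items():
    efficiency = scaling_efficiency(runs[1], seconds, nodes)
    print(f"{nodes} node(s): {seconds:.0f}s, efficiency {efficiency:.2f}")
```

In this made-up example, doubling capacity gives somewhat less than twice the speed, and quadrupling it gives three times the speed. That kind of shortfall is what a test on your own workload is meant to expose before you rely on a vendor's figures.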

Andy Hayler is a data warehousing professional and food critic. He led the creation of the dynamic data warehousing architecture within Royal Dutch Shell that he later commercialized as the KALIDO Active Information Management software product of Kalido, the Shell subsidiary founded by Hayler in 2001. In 2002, Hayler was selected for the “Top 10 Innovators” list by the editors of Red Herring, the magazine of business technology innovation and entrepreneurism.
