Traditional Database Scalability Methods
Let’s turn our attention to traditional methods for achieving scalability in database systems. Databases scalability is often implemented by clustering. With clustering, multiple servers are used to serve database requests.
There are two predominant architectures for implementing database clustering: shared-disk and shared-nothing. At a high level, these names do a reasonable job of describing the nature of the architecture, but let’s take a more in-depth look into each. Both are forms of horizontal scaling. Later in this paper, we’ll explore a new architecture that combines many of the benefits of both of these architectures.
Figure 1. The shared-disk architecture.
With a shared-disk environment all of the connected systems share the same disk devices. Refer to Figure 1. Each processor still has its own private memory, but all the processors can directly address all the disks. This means that there is no need to break apart data into separate partitions because all of the data is shared in shared-disk implementations.
But it is important to understand that only the disks are shared. Main memory is not shared; each processor has exclusive access to its memory. Because any processor can cache the same data from disk, a cache coherency mechanism is necessary to ensure consistency when multiple nodes modify the data. A distributed lock management capability is also required to manage the consistency of the data as it is being requested and modified by multiple network nodes.
A shared-disk implementation offers several benefits including potentially lower cost, extensibility, availability, load balancing, and relatively simple migration from a centralized system. However, shared-disk benefits from potentially costly Storage Area Networks (SANs), which can drive up the cost. Typically, shared-disk clustering tends not to scale as well as shared-nothing for smaller machines. But with some optimization techniques, shared-disk is well-suited to larger enterprise processing, as such is done in the mainframe environment.
The specialized technology and software of the Parallel Sysplex capability of IBM’s mainframe family makes shared-disk clustering viable for DB2 (and IMS) databases. In particular, the coupling facility and DB2’s robust optimization technology helps to enable efficient shared-disk clustering. Mainframes are already very large processors capable of processing enormous volumes of work. Great benefits can be obtained with only a few clustered mainframes – whereas many workstation processors would need to be clustered to achieve similar benefits.
Shared-disk is usually viable for applications and services requiring modest shared access to data, as well as applications or workloads that are very difficult to partition.
Examples of database systems that implement a shared-disk approach for clustering include DB2 for z/OS (with Data Sharing), DB2 for LUW (with PureScale), and Oracle RAC.
Figure 2. The shared-nothing architecture.
In a shared-nothing environment, each system has its own private memory and one or more disks. Refer to Figure 2. The clustered processors communicate by passing messages through a network that interconnects the computers. In addition, requests from clients are automatically routed to the system that owns the resource. Only one of the clustered systems can “own” and access a particular resource at a time. Of course, in the event of a failure, resource ownership may be dynamically transferred to another system in the cluster.
The main advantage of a shared-nothing architecture is improved scalability.
The main advantage of a shared-nothing architecture is improved scalability. In theory, a shared-nothing multiprocessor can scale up to thousands of processors because they do not interfere with one another – nothing is shared. However, in practice, shared-nothing scaling of database systems is implemented on far fewer nodes. The scalability of shared-nothing clustering makes it ideal for read-intensive analytical processing typical of data warehouses.
A disadvantage of shared-nothing is that a partitioning scheme must be designed to apportion the data across the nodes of the database. Data is usually partitioned horizontally by row. This requires identifying a column (or set of columns) to be used to split a table into multiple tables, each with a different subset of the rows that were in the initial table. The scheme may be simple, as when the data matches a segment of the business. For example, partition 1 stores data about the Western region and partition 2 stored the Eastern region data. Or the scheme may be more complicated (such as based on a hash key), especially when there is no easy way to separate the data in a business-relevant manner. Automatic partitioning is known as sharding (and it is discussed in the Database Sharding section of this paper).
Remember, the data is not shared, so it must reside on (at least) a single node and the DBMS must know how to partition and access the data based on the partitioning scheme. Vertical partitioning, or splitting a table into subsets of columns is also a possibility.
The shared-nothing approach is based on chopping up the data into smaller subsets because larger single-image databases that are not partitioned can be more difficult and costly to administer and query. Additionally, creating and maintaining a very large database in one place can require high-end, costly computers, whereas partitioning can be accommodated using multiple, cheaper distributed commodity servers.
But partitioning almost always involves trade-offs. One partitioning scheme may work well for certain applications, but another scheme works better for others. There is no universal way to partition data that optimizes all application usage. As usage patterns change and evolve and data volume grows, you may need to re-address the partitioning scheme to better accommodate the data in your database. Repartitioning is a non-trivial exercise that is not conducive for 24/7 processing because it requires DBA and programmer effort as well as database downtime causing application outages.
A disadvantage of shared-nothing is that a partitioning scheme must be designed to apportion the data across the nodes of the database.
Another challenge arises whenever data must be accessed or modified across multiple partitions. Shared-nothing works well when access and modification is performed only to a single partition, but whenever data from more than one partition is required, complexity arises and ACID compliance can break down.
Depending upon the capabilities of the DBMS being used, the partitioning scheme may allocate data redundantly to more than one node (for failover and availability requirements).
Examples of database systems that implement a shared-nothing approach for clustering include Teradata, MySQL, and many NoSQL and NewSQL offerings. Shared-nothing clustering can be particularly effective for NoSQL databases. This is so for several reasons. For example, many NoSQL database systems do not support ACID, instead relying on eventual consistency, which is easier to implement, but can result in applications and users reading outdated data. Additionally, NoSQL systems typically work on commodity hardware with no built-in high availability features, like a SAN, thereby enabling quorum replication to be used for replicating the data to all pertinent nodes.
Sharding is often used with a shared-nothing approach to automate partitioning and management. The word “shard” means a small part of a whole. So database sharding is a technique for partitioning databases that separates large amounts of data into smaller, more easily managed parts called shards.
Instead of scaling up, sharding breaks apart data to allow scaling out. Sharding is similar to horizontal partitioning in that it splits tables by row, but the data is partitioned across multiple instances of the schema. The primary benefit of sharding is that the processing workload against a large partitioned table can be spread across multiple servers.
There are drawbacks to sharding. For example, after sharding, instead of having a single database to manage, there are now multiple databases each with its own server, CPU, and memory requirements. Additionally, sharding can negatively impact fault tolerance. When one shard goes down, the data on that shard is not accessible. That is why sharding is also often accompanied by replication to have a duplicate data set ready for usage in case of failure (Replication).
A preordained method for determining how to shard is required; that is, in which shard each specific row should be placed based on the data and an algorithm that apportions the data to a specific shard, or partition.
Once the data is sharded, each shard lives in a totally separate logical schema instance. This can be across physical database servers, multiple data centers, or even across multiple continents. There is no ongoing need to retain shared access (from between shards) to the other unpartitioned tables in other shards.
Examples of database systems that implement automatic sharding include Apache HBase, Couchbase, and Informix.
In many cases, database systems that support shared-nothing with sharding also support redundant replicas of data to bolster fault tolerance. Replication involves setting up a separate copy of the data on a different node.
Of course, if this is all that you do then the data will quickly become stale as the original data is processed. To remove this problem, the database system provides a replication engine. When data is changed on the master copy of the data the replication engine ensures that the changes are replicated to other copies. The exact method for replication will differ from system to system, and may require reading the database logs and retrying modifications until they success across all replicas.
Replication across multiple servers can be easy to setup but on-going administration and management will be required. Of course, replication requires additional storage (for each replica), as well as additional I/O and CPU usage to support the data replication process.