NuoDB 2.6, Part 3: Scale-Out with Table Partitioning and Storage Groups
This week we’re returning to our “What’s New in NuoDB 2.6” series with a discussion of the table partitioning and storage group capabilities in this latest release.
Table partitions can be used to improve data management and query performance. Storage Groups enable scale-out and improved input/output performance at the storage layer. Together, these features allow the application to perform actions such as data aging or archiving more efficiently, assign specific workloads to slower or higher-performance disks, take advantage of parallel computing power, and handle higher read and write throughput.
Let’s dig into this a bit. First we’ll go over what table partitioning is, why it’s useful, and what we support in NuoDB 2.6. Then, we’ll explain Storage Groups, which are unique to NuoDB, and why they are an important part of the NuoDB architecture. Lastly, we’ll explain how NuoDB uses these two features together to boost performance and obtain effective storage layer scale-out.
SQL-Standard Table Partitioning
One benefit of table partitioning is partition pruning. A partitioned table is divided into sections, so a query can skip the partitions that cannot contain matching rows and quickly process only the relevant parts of the table. One example where you might want to do this is an inventory table.
Let’s say you run a large e-commerce website and have a large number of items for sale. When customers visit, explore, and shop on the site, they constantly perform actions that query the inventory table. Depending on customer traffic, the table could easily run into performance issues when trying to handle all the transaction requests. Table partitioning allows you to separate the inventory data to optimize for access, processing, and performance.
Table partitioning, previously offered in technical preview, is fully supported in NuoDB 2.6. You can use standard SQL commands to create a partitioned table, specify partition by range or by list, and alter a partitioned database. Once you create your partitioned table, you can map the partitions to individual storage groups.
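To make list partitioning and partition pruning concrete, here is a minimal Python model of the idea. It is not NuoDB code and does not show NuoDB's SQL syntax; the product types, the partition names, and the helper functions are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical list-partitioning rule: each product type belongs to one partition.
PARTITION_FOR_TYPE = {
    "fasteners": "TP1",
    "hand_tools": "TP2",
    "power_tools": "TP3",
}

def partition_rows(rows):
    """Group rows into partitions based on their product_type column."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[PARTITION_FOR_TYPE[row["product_type"]]].append(row)
    return partitions

def query_by_type(partitions, product_type):
    """Scan only the partition that can contain product_type (pruning)."""
    target = PARTITION_FOR_TYPE[product_type]
    scanned = len(partitions[target])
    matches = [r for r in partitions[target] if r["product_type"] == product_type]
    return matches, scanned

inventory = [
    {"sku": 1, "product_type": "fasteners"},
    {"sku": 2, "product_type": "hand_tools"},
    {"sku": 3, "product_type": "power_tools"},
    {"sku": 4, "product_type": "fasteners"},
]
partitions = partition_rows(inventory)
matches, scanned = query_by_type(partitions, "fasteners")
print(f"matched {len(matches)} rows after scanning {scanned} of {len(inventory)}")
```

In this toy model, the query for fasteners touches only the rows stored in its own partition rather than the whole inventory, which is the effect pruning aims for on a much larger table.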
A Symbolic Storage Unit: The NuoDB Storage Group
In a traditional partitioning approach, the table partitions are typically mapped directly to physical storage on a single server. With NuoDB, each table partition is assigned to exactly one (and not more than one!) symbolic Storage Group. A Storage Group includes one or more table partitions and symbolizes a unit of storage. It is identified by its own unique name.
Let’s use a graphical example to dig into how this works:
In the example shown above, you can see that the table is partitioned into TP1, TP2, TP3, and so on:
- TP1 and TP2 are assigned to a Storage Group named SG1
- TP3 and TP4 are assigned to Storage Group SG2
- TP5 is assigned to Storage Group SG3
Storage Groups are then mapped to different Storage Managers, each of which runs on its own server, effectively allowing your table’s data to span multiple servers. This approach provides flexibility for scale-out and performance optimization beyond what you typically achieve in traditional systems.
What’s a Storage Manager? For those new to this blog and NuoDB’s distributed architecture, NuoDB appears as a single, logical database to the application. Under the hood, it has a peer-to-peer, two-layer architecture that can be deployed within or across data centers. It includes an in-memory layer of Transaction Engines (TEs) and a storage layer of Storage Managers (SMs). The in-memory layer allows the application to naturally build up its own caches of frequently accessed data, and the storage layer provides ACID guarantees, data redundancy, and data persistence. Each layer has the ability to elastically scale out (and back), simply by adding and removing TEs and SMs.
So, a Storage Manager is the process node that provides durability for your data. It both writes data to disk and manages data on disk. By assigning a Storage Group to one or more Storage Managers, you control where your data is physically located, and you control how many copies of that data are stored for redundancy, continuous availability, and separate processing purposes. The diagram below shows one approach for accomplishing this:
We previously divided up our five table partitions among three Storage Groups. Now we’ve assigned each of the Storage Groups (symbolic storage units) to two Storage Managers. This particular configuration provides full data redundancy for each partition at the storage level in our NuoDB system.
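The two-level mapping can be sketched as a pair of lookups in Python. The partition-to-Storage-Group assignment below follows the example in the text; the Storage Manager names (SM1, SM2, SM3) and their pairing with Storage Groups are assumptions, since the text says only that each Storage Group is assigned to two Storage Managers. This is an illustration of the concept, not NuoDB configuration.

```python
# Partition-to-Storage-Group assignment, as described in the text.
STORAGE_GROUP_FOR_PARTITION = {
    "TP1": "SG1", "TP2": "SG1",
    "TP3": "SG2", "TP4": "SG2",
    "TP5": "SG3",
}

# Hypothetical Storage Manager names and pairings (assumed for illustration).
STORAGE_MANAGERS_FOR_GROUP = {
    "SG1": {"SM1", "SM2"},
    "SG2": {"SM2", "SM3"},
    "SG3": {"SM1", "SM3"},
}

def storage_managers_for(partition):
    """Resolve a partition to the Storage Managers that hold its data."""
    group = STORAGE_GROUP_FOR_PARTITION[partition]
    return STORAGE_MANAGERS_FOR_GROUP[group]

for partition in sorted(STORAGE_GROUP_FOR_PARTITION):
    group = STORAGE_GROUP_FOR_PARTITION[partition]
    print(partition, "->", group, "->", sorted(storage_managers_for(partition)))
```

Because a partition never points at a server directly, changing where data lives means editing only the second mapping; the partition-to-Storage-Group assignment stays untouched.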
Read more about table partitioning and storage groups by visiting the NuoDB Documentation.
Separation of States: Table Partitioning, Storage Groups, and Storage Managers
One of NuoDB’s key advantages is its peer-to-peer, distributed architecture. The technology presents itself as a single, logical database to the application. However, invisible to the application, the database architecture provides a great deal of flexibility to optimize for performance and availability requirements. Both within and across data centers, you can dynamically scale out and back in to optimize for database performance and operational characteristics such as data redundancy, availability, specialized processing, read/write throughput, overall storage capacity, and more. You can do this without interrupting service to the application and without adjusting application code so the application is “aware” of the underlying database changes.
Our implementation of table partitioning and Storage Groups continues to take advantage of this unique architecture. Table partitioning occurs at the application level, while Storage Groups are handled by the database operator separate from - and invisible to - the application. As logical storage units, Storage Groups are themselves independent of the physical storage location. This “separation of states” gives the database operator a great deal of flexibility and control over where data is stored, how granular data redundancy is (for high availability), and how hardware capabilities are aligned with data processing needs.
Example Use Cases
There are many use cases where table partitioning and storage groups can come in handy.
Let’s first take an e-commerce example. Below, we’ve partitioned our inventory table by product type into five different partitions:
The divisions fulfilling orders are Fasteners, Hand Tools, and Power Tools. Each of these divisions has an application responsible for processing transactions and fulfilling orders, but each division is only interested in fulfilling requests for specific types of products. Instead of querying the entire table each time it needs to fulfill a request, with table partitioning, each division’s application will query only the partitions that include the types of products it cares about.
In addition, since each Storage Group is available on two Storage Managers, if one Storage Manager becomes unavailable, the second Storage Manager will be available to provide uninterrupted data service to the application.
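The availability argument can be checked with a few lines of Python. This is a minimal sketch under assumptions: the layout and the Storage Manager names are hypothetical, and the functions model only whether a Storage Group still has a running Storage Manager, not how NuoDB itself handles a failure.

```python
# Hypothetical layout: each Storage Group is served by two Storage Managers.
layout = {
    "SG1": {"SM1", "SM2"},
    "SG2": {"SM2", "SM3"},
    "SG3": {"SM1", "SM3"},
}

def available_groups(layout, failed):
    """Return the Storage Groups that still have at least one running SM."""
    return {group for group, sms in layout.items() if sms - failed}

def survives_any_single_failure(layout):
    """Check that losing any one Storage Manager leaves every group served."""
    all_sms = set().union(*layout.values())
    return all(available_groups(layout, {sm}) == set(layout) for sm in all_sms)

print("survives any single SM failure:", survives_any_single_failure(layout))
print("groups left if SM1 and SM2 both fail:",
      sorted(available_groups(layout, {"SM1", "SM2"})))
```

With two Storage Managers per Storage Group, any single failure leaves every group reachable. In this assumed layout, losing both SM1 and SM2 would take SG1 offline, which is why the number of copies per Storage Group is a choice the operator makes deliberately.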
Here’s another example to think about. Two weeks ago, we explained how NuoDB can be deployed as a single logical database across multiple data centers, providing active-active benefits with none of the complexities or costs typically associated with this type of capability. When you use Storage Groups in a multi-data center situation, you can precisely control levels of redundancy and availability for your data without needing a storage area network (SAN) setup or other expensive technologies. Let’s say the application in Availability Zone #1 reads and writes to TP1 and TP2 (assigned to SG1), while the application in Availability Zone #2 reads and writes to TP3 and TP4 (assigned to SG2):
You can set up your Storage Groups so that you maximize availability for SG1 in Availability Zone #1, while storing SG2 as backup, and maximize availability for SG2 in Availability Zone #2, while storing SG1 as backup.
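One possible reading of this setup, sketched in Python: each zone keeps two copies of the Storage Group its local application uses and one copy of the other zone's Storage Group as a backup. The zone labels, Storage Manager names, and copy counts are all assumptions for illustration; the text does not specify them.

```python
from collections import Counter

# Hypothetical placement: zone -> Storage Manager -> Storage Groups it serves.
PLACEMENT = {
    "AZ1": {"SM1": {"SG1"}, "SM2": {"SG1", "SG2"}},
    "AZ2": {"SM3": {"SG2"}, "SM4": {"SG1", "SG2"}},
}

def copies_per_zone(placement):
    """Count how many Storage Managers in each zone hold each Storage Group."""
    return {
        zone: Counter(group for groups in sms.values() for group in groups)
        for zone, sms in placement.items()
    }

copies = copies_per_zone(PLACEMENT)
for zone in sorted(copies):
    print(zone, dict(sorted(copies[zone].items())))
```

Under these assumptions, SG1 has more local copies in the first zone and SG2 has more in the second, while each zone still holds one copy of the other's data.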
There are plenty of other ways you can optimize performance and availability using table partitioning and storage groups! The main point, however, is that when you combine NuoDB’s architectural approach with this type of functionality, you receive a great amount of flexibility in what you implement. The choice of what to optimize for is yours to make, according to the needs of your organization.
This is Part 3 in a multi-part series getting in-depth with NuoDB 2.6! Check out Part 1, Introduction to NuoDB 2.6, to learn everything we released in NuoDB 2.6 and Part 2, Continuous Availability with AWS Active-Active. Look out for our last blog in this series about SQL Enhancements coming soon.