CTO Seth Proctor walks through the ease of replication with NuoDB. Find out the difference between full data and on-demand data replication and why the distinction is important.
(Seth Proctor): Hey, everyone. I’m Seth Proctor, CTO of NuoDB. There it is. Last time, we talked about sharding; we talked about why sharding is difficult, and what we give you to free you from having to shard. Today we’re going to talk about a related topic and its replication. Replication can mean one of many things, and in practice, you may have come across one of the few kinds of replicated schemes. If you’ve used something like Cassandra, you’ve run across what would be called a ksave scheme, where there’s an explicit attempt to maintain multiple copies of an object to protect against failure. It’s very cool stuff. It’s not what we’re talking about today. We’re going to talk about two other things that replication could mean. One of them is kind of explicit, full-database replication, and one of them is on-demand replication. And we’re going to talk about how using one versus the other can introduce problems, or can really help you solve problems in a distributed database architecture.
OK. So, here we have what may look like a somewhat familiar architecture. This is kind of a very, very, condensed form of what you might see if you were looking, say, at Facebook. We have a caching tier, so that we can work in memory, and then we have a sharded database. And last time we talked about explicit shards for various reasons... In this case, I’ve drawn it with shards where each shard is, in fact, itself, replicated. So we have these three shards here.
So, there are two major reasons why you might explicitly replicate your database. And one of them has to do more with high-availability, and one of them has to do more with scale-out. When we’re talking about replicating your database, we’re typically talking about a model where you have an active version of your database, and at least one passive version.
And by active and passive, what I mean is, you have a database where you’re actually making changes, where you’re doing inserts, where you’re doing updates, where you’re doing deletes... And then you have a passive copy of your database, where you’re replicating. You’re pushing out those changes. And so, you have kind of a read-only version of the database. And so, you might be pushing it from one active copy of your database to a single, passive copy. Or, you might be pushing from a single, active copy to multiple, passive copies. Going from a single, active copy to a single, passive copy is a technique for giving you higher availability. It’s a way of making sure that you have a copy of your database that’s running somewhere else. So, it’s both an extra-durability point, and it’s also insurance if you’re active host goes down, you can fail over to your passive host. Replicating from a single, active database to multiple, passive databases is something we talked about last time. It’s a way of trying to scale-out effectively, to give you multiple databases and multiple hosts where you can be [weaving?] your data. But that’s important.
There’re mostly only hosts where you could be reading your data. You could only be doing your update back in the active copy. And that’s part of the problem with this kind of scheme, is that no matter how much you want to scale-out, all of your updates still have to go through this one point. And then, there’s going to be a replication delay between the time you’re doing an update in this active copy, and it gets pushed out to all of your passive copies. So, for example, if I do an insert on my active database, there’s going to be some amount of time before I can do a select at one of my read-only databases. And that breaks a lot of the rules of ACID, and that’s a problem. You get that same kind of replication delay when you’re trying to maintain high-availability, or when you’re just trying to keep an extra, durable copy of your data. And that’s also a real problem because it means that, if you’re in the middle of doing updates, but they haven’t been replicated -- they haven’t been pushed out to your passive copy yet -- and your active copy fails, you’ve now failed without all of the updates that were made on your active copy. And so, you’ve been left in an impossibly inconsistent state.
There are other challenges. If you’re a MySQL user, you know there are choices about how you do replication. Are you doing statement replication? Are you doing road replication? Each of these has their own implications in terms of what side effects are observed or lost, how quickly data is replicated and how that data is replicated. So, there’s a lot of complexity, and there’s a lot of complexity to the programmer, because the programmer needs to understand something about the delays that are involved, needs to understand something about the way the data might or might not be correctly replicated, and so the programmer really needs to understand something about this deployment model. And, to the operations person involved, they’re maintaining multiple databases, and they have to maintain the connections between these databases, the monitoring of multiple databases, and they have to think about all of the different issues, especially when you look at this picture. Here, when you think about multiple passive copies, how many hosts are involved? How many different replication points are involved? And how do you make sure to maintain the model correctly when your active host dies and you have to fall over to one of your passive hosts? So, that’s kind of a quick view of explicit replication of the database.
So, in NuoDB’s architecture, we take a slightly different view of what we should be replicating, and why. And we use what’s called an on-demand replication scheme. And, the exciting bit, for you, is that this picture I drew here, essentially is the same thing that happens in our database from the point of view of you get a single view of the database with a caching layer, and with a scale out architecture. But you don’t have to explicitly configure any of this. You get an always consistent view of the world, and you get an always consistent replicated view of the world that scales out even when you’re running across multiple data centers.
Let’s talk about how that works. So, here’s a picture that may look familiar if you saw our first video, were we talked about the basic architecture of NuoDB consisting of transaction engines and storage managers. I have two transaction engines, and a single storage manager running. So, this might be scaled across, say, three different machines. And, in it’s current state, there’s an object that’s sitting in my durable store, but no transaction’s currently using that object. And as a result, it’s not sitting in any cache. We’re not trying to maintain it on multiple machines; we’re simply keeping it in our durable store.
Now, let’s say a client connects to this transaction engine, and starts a transaction that needs to reference this object? At this point, what this transaction will do is, on-demand, it will pull in the object from the durable store into it’s in-memory cache. So now what we have, is we have one copy of this object. We have it sitting on this transaction engine. And as long as no one else wants to interact with this object, this is the only place it lives. We’re not doing any work to explicitly replicate it anywhere, but we’re also not worrying about failure [modes?]. If this transaction engine fails, that’s fine, because the object is being maintained in the durable layer, and this transaction engine is just in memory. So, it can fail, it can drop this object -- not a problem. As soon as a second transaction starts on a second transaction engine, and we need that same object, what will happen is, that object is now replicated into the cache at the second transaction engine. The difference is that this time, it comes from the cache at the first transaction engine, because that’s a much faster place to get it than down at the durable layer. So, now what we’ve done is we’ve replicated this object in the sense that it now exists in two places. But again, it can be dropped from either of those caches, either of those transaction engines can fail, and our system continues to maintain its consistency and its correctness. So, what’s nice about this model is that the database programmer’s point of view is always to a version-consistent view of this object. There’s never a concern about, because I’m on one transaction engine versus the other, I’m getting an inconsistent view, because of any update times. That’s how our default isolation model works.
It’s also the case that I can now bring online, say, a third transaction engine somewhere else in the system, and it doesn’t have to hear about any explicit replication messages until it is actually doing something that interacts with this object. And that’s part of why we can scale across multiple regions very effectively. Because if you have reasonable cache locality, then you’ll populate caches within a region, but typically there won’t be a lot of communication between regions, because this is an on-demand cache, and not something where we’re explicitly trying to replicate every object in every location. And again, all of this is happening for free in our system.
You don’t have to think about any explicit replication scheme, you don’t ever have to think about active versus passive because we’re in an always active, always consistent scheme. But, because we use this on-demand replication scheme, instead of some kind of explicit replication, you know that you’re always getting a consistent view of the world. You know that there are never any side effects that you’re going to lose based on which kind of replication scheme you’re using. And you know that we’ll be able to scale-out, whether it’s in a single data center, or multiple data centers.
If you’d like to learn more about this model I was just showing you, with the transaction engines and the storage managers, you should go check out an earlier video we did, where I walked through this basic architecture, and how you use NuoDB to scale out.
I encourage you to tune in next time, where we’ll talk more about kind of pieces of this architecture, and where your pain points may be, some of the problems we think we help you address. Again, my name is Seth Proctor, I’m the CTO here at NuoDB. I encourage you to go learn more about our product in our tech blog. It’s at NuoDB.com/techblog. You can follow me @TechnicallySeth. And, until next time, have a great day!