NuoDB team members Dai Clegg, Senior Director of Product Marketing, and Steve Cellini, VP of Product Management, discuss what to consider when delivering your newest cloud application.
(Dai Clegg): Hi, my name’s Dai Clegg and I’m the Senior Marketing Director for NuoDB, and what I want to talk about today are some of the decisions, some of the factors, some of the complications that will face you as you select a database for cloud applications.
Now, some people might say, well why do I have to select another database? I already have my application running on-premises. Why don’t I just lift and shift the existing database to the cloud? And that’s kind of not quite as simple as it might seem, particularly if you have your existing application running on a traditional relational database, there are some issues you’re going to face when you go to the cloud which I want to go into in this presentation, and some compromises that you’re going to have to make.
If you take the alternative approach, which is to migrate away from a SQL database onto a NoSQL database in the cloud, then there are other issues that you will face. There will be a skills shortage; there will be a change of architecture. There will be a rewrite of your applications. There will be the loss of transactional processing in your applications, and you then have to decide whether that’s important and what you’re going to do about it. And let me be really clear, there are plenty of use cases where you don’t need transactions. If you are loading a data warehouse with a bunch of big, overnight ETL jobs, you don’t care about transactional consistency because you’re only running one job at a time. There is no possibility of an update conflict.
If you’re running some Internet of Things applications which are all about streaming data in from thousands, tens of thousands, or hundreds of thousands of devices, each of which is saying, “Hi, I’m Device X, at timestamp Y, and for value Z, here’s my reading.” There is no conflict there; that’s uniquely X, Y, and Z, uniquely identify that thing as an event in time that will never be updated.
Now you might in that particular instance want to do some transactional processing on that data once you’ve gathered it, but for the use case I’ve just described, of collecting effectively append-only insert data, you don’t need transactional consistency. So let me be clear: I’m not saying that transactions are the only way to go and NoSQL’s no good, and I’m not saying that you can throw away transactions and go NoSQL. There are issues both ways you go, and needless to say, those are the issues that NuoDB was founded to address. And so, my intention today is to talk about how NuoDB will help fix some of those problems, and incidentally, rather than just take my word for it, I will tell you a few stories about our customers where they faced such issues, and what we’ve done to help them to resolve those. And if I’ve got a little time at the end, we’ll take a peek under the hood, and see exactly what it is about the NuoDB architecture which makes it possible to address some of these issues.
So, the big cloud database problem that I talk about. Look at this picture; it’s a simplified architecture of a typical traditional application: users connect to an application server, which connects to a database, which has storage at the back. You want to scale it up, no problem at all. Add some more users, add more application servers, put the database on a bigger box with more storage on the back of it, and all is sweetness and light, and it doesn’t really matter whether you’re running this on the cloud or on-premises; that approach to increasing scale will work either way.
The problem will arise if you are running a traditional relational database, which is where most people have most of their applications, and want to migrate them to the cloud, is that whereas everything else you scale out, with the database, you scale up. You install the database on a bigger machine, with more storage, more memory, more CPU power, and by the way, your relational database vendor of choice will be happy to direct you to bigger and bigger boxes. There are some huge appliances designed just to run relational databases, and they scale up fantastically well. If you’ve got the depth of pocket to deal with these expensive appliances, and of course, with the licenses to go with that.
But if you take that kind of approach on the cloud, you’re going to run into a problem, because when you want to scale up on the cloud, you will reach a point, much sooner than you would with relational database scale-up on-prem, where you are already deploying your database on the biggest box, the biggest virtual box, that you can get from your cloud provider. That database is going to start bulging at the seams, if all you can do is scale up. That is the problem of traditional architectures moving onto the cloud. You know, the client layer, the application server layer, even the storage layers are cloud-friendly; they scale out. Traditional relational databases only scale up to bigger and bigger machines, which has its limits anyway, but the ceiling on those limits is lower if you’re running on the cloud. It’s the biggest box you can provision.
What you really want to do is to do what the cloud is built for, which is to scale out your database on lots of commodity boxes, particularly elastically, so you don’t have to provision equipment and licenses for your peak, which you only hit for a few days a year. You provision equipment and software for when you need it, so you can then take advantage of what the cloud gives us, which is really a new business model: pay for usage, rather than pay for the maximum amount of usage you might make at some point during the year, which is the traditional enterprise database licensing model. And now, if we had this cloud-friendly scaling, as I describe it here, wouldn’t we be in a better place?
OK, if that’s what the problem is, let me just step back from that and talk about some of the thinking that went into making NuoDB the way it is, and some of the thinking that I use when looking at applications and how one might migrate those to the cloud. And the first aspect of it is transactional. Do you need in your application the ability to have multiple users update the same records, many users, many records, with update conflicts, but without getting into a mess, without the database becoming inconsistent? That is vitally important in some kinds of transactions: many financial transactions, billing, metering, subscriptions, account management, particularly in a world where you have mobile users. If I’ve got my phone, my tablet and my laptop, and I have three views into my account with my local betting provider, and I want to place a $50 bet on each of a horse race, a football game, and a baseball game, and I’ve only got $50 in my account, I wonder, if I press the button on all three at once, whether they’ll all go through. Well, I might just try that, and my provider had better have transactional consistency so that they can make sure that two of those transactions will roll back, and only one of them will go through. But equally, there are lots of use cases that don’t need transactions, and I’ve already described a couple of those.
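To make the betting example concrete, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in database. The table names, the `place_bet` helper and the sequential submission of the three bets are all assumptions made for illustration; a real provider would face genuinely concurrent sessions, but the principle is the same: recording the bet and debiting the account happen in one transaction, and if the funds aren’t there, the whole thing rolls back.

```python
import sqlite3

# In-memory stand-in database; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("CREATE TABLE bets (account_id INTEGER, event TEXT, amount INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 50)")  # only $50 in the account
conn.commit()


def place_bet(conn, account_id, event, amount):
    """Record a bet and debit the account atomically; return True if it went through."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO bets VALUES (?, ?, ?)", (account_id, event, amount)
            )
            cur = conn.execute(
                "UPDATE accounts SET balance = balance - ? "
                "WHERE id = ? AND balance >= ?",
                (amount, account_id, amount),
            )
            if cur.rowcount != 1:
                # Not enough money: abort, which also undoes the bet insert.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


# Three devices each try to place a $50 bet against the same $50 balance.
results = [
    place_bet(conn, 1, event, 50)
    for event in ("horse race", "football game", "baseball game")
]
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
bets_recorded = conn.execute("SELECT COUNT(*) FROM bets").fetchone()[0]
print(results, balance, bets_recorded)
```

Only the first bet is accepted; the other two leave no trace in the bets table, which is the behavior the provider needs.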
So let’s go onto the language of implementation. As Winston Churchill so famously didn’t say, “SQL is the worst data language, except for all the others that have been tried.” SQL’s been around for decades. It has its flaws, and over the years, it has chipped away at, and diminished those flaws, and it has met new challenges and expanded its capabilities, and the situation we are in today is SQL is a very good data processing language, and what’s more, there is an enormous ecosystem built up around it of tooling for accessing SQL databases, for designing SQL databases, for migrating SQL databases, for programming anything you like on a SQL database, and there is a huge pool of talent and skills, probably in your own organizations, and out there in the marketplace that can do those things.
So, I have been unable, though I tried for this webinar, to find an example of a problem that can’t be solved without SQL. You can do pretty much what you like with data without SQL, but it’s so much easier, in the world that we live in, to do it with SQL. So throwing away SQL has a big cost. It has a cost in time-to-value: how long it will take you to acquire new skills and to rebuild your systems from the ground up, including re-architecting all of the software. That will bring an increase in risk, because obviously you’re using new technologies. They may not be mature, and you can be very sure that your skills are not as mature in them as they would be if you kept with your traditional tools. And there is the hidden cost of acquiring, training, and keeping the talent in some of these comparatively rare NoSQL new technologies. So SQL has its advantages, not just intrinsically in the language itself, but in the ecosystem that’s grown up around it.
And of course, if we’re going to the cloud, pretty much the reason we’re going to the cloud is that we want to embrace the new business model I talked about: we want elasticity, and we want to be able to remove any constraints that will stop us from doing that. So I think scale-out is going to be important for pretty much any database that you’re selecting for the cloud.
And the fourth dimension that I look at is what I call “geo-distributed,” which means multi-datacenter. At the very least, you’re going to need to support two data centers, because it is the nature of the cloud and how it works that there will be breaks in the network, and indeed, we all know stories of major cloud providers losing whole regions temporarily. So you need to have at the very least disaster recovery, and preferably, you need to have the capability to operate symmetrically in both of those data centers and to be able to expand it beyond two, to three, four, as many as you need. Because as we move from shipping software by download, or, to use an old-school term, shipping CDs, to delivering software-as-a-service, either as a product, or as a service internally to your enterprise, to your organization, we need to be able to deliver that, as it were, to the users in their locality, local to them. So we want to be able to take the system, the capability, and that includes the database, to the users, to avoid having global traffic come to one data center and introduce huge latency which is totally unnecessary, and is just for our convenience because we choose to operate a monolithic database. So we want to have global databases; we want to have distributed databases. It’s the way the world is going, and it would be, I think, relatively short-sighted to select a database for the cloud that didn’t support geo-distribution.
So of course, the old SQL players are in the intersection of SQL and transactional; they all do both of those, but they don’t do scale-out and geo-distributed. The NoSQL guys broke away about a decade ago because they absolutely had to have scale-out. They either couldn’t, or weren’t prepared to, pay for the scale-up model of bigger and bigger boxes and more and more licenses, or indeed, in some cases, they absolutely outgrew what was available. Ten years ago, what was available in traditional scale-up databases was a lot less capable than it is now, but even so, there are many use cases where you just can’t deal with that; it’s not a suitable solution. And if you’re going to the cloud, of course, you want to go the scale-out route, so the NoSQL databases were addressing the cloud requirement entirely. But in order to escape scale-up, they kind of threw out the SQL and transactional babies with the scale-up bathwater, and that left them with a situation. For example, there are no NoSQL vendors who don’t claim to have some kind of SQL support, and I was talking to an analyst about a month ago who was telling me about a client he’d been consulting with, who had just chosen a particular, very well-known NoSQL database for his operational database, and then had been appalled to discover that the SQL support he had been assured existed for this database was fine if you wanted to access one table; there were no joins. Well, there are ways you can work around that with massive denormalization. There are maybe some applications that are quite happy to work that way. But it isn’t a general-purpose SQL solution. So you need to be very aware of the actual SQL capabilities boasted of by, for example, NoSQL database players.
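As a rough illustration of what “no joins” costs you, the sketch below uses Python’s standard-library sqlite3 module to answer the same question twice: once with a join across two normalized tables, and once from a single denormalized table of the kind you would need if joins weren’t available. The schema and data are invented for the example and don’t describe any particular NoSQL product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: each customer's name is stored exactly once.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 30), (11, 1, 20), (12, 2, 5);

    -- Denormalized workaround: the name is copied onto every order row.
    CREATE TABLE orders_denorm (
        id INTEGER PRIMARY KEY, customer_id INTEGER, customer_name TEXT, total INTEGER
    );
    INSERT INTO orders_denorm VALUES
        (10, 1, 'Ada', 30), (11, 1, 'Ada', 20), (12, 2, 'Bob', 5);
""")

# With joins: spend per customer, combining two tables.
with_join = conn.execute(
    "SELECT c.name, SUM(o.total) FROM customers c "
    "JOIN orders o ON o.customer_id = c.id "
    "GROUP BY c.name ORDER BY c.name"
).fetchall()

# Without joins: the same answer from the single wide table.
single_table = conn.execute(
    "SELECT customer_name, SUM(total) FROM orders_denorm "
    "GROUP BY customer_name ORDER BY customer_name"
).fetchall()

# The price of denormalization: a rename touches every copied row.
rows_normalized = conn.execute(
    "UPDATE customers SET name = 'Ada L.' WHERE id = 1"
).rowcount
rows_denormalized = conn.execute(
    "UPDATE orders_denorm SET customer_name = 'Ada L.' WHERE customer_id = 1"
).rowcount
print(with_join, single_table, rows_normalized, rows_denormalized)
```

Both queries give the same totals, but the denormalized design has to rewrite one row per order whenever a customer detail changes, and the application has to keep those copies consistent itself.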
The NewSQL guys pretty much took the old SQL model, but sought to break the scale-out constraints, usually by use of shared memory caching and techniques like that, new technologies like that, but they haven’t addressed geo-distribution. And as I said before, NuoDB was founded, architected, designed from the start to address all of that, and the approach we took was: right, you’ve got a distributed system; we know how to build distributed systems. Now let’s put a transactional SQL database on top of it. Rather than the other way around, which is: oh, we’ve got this transactional SQL database, how can we make it distributed? That is a much harder problem to solve, and so far, it’s proved intractable, despite the efforts of some giants of the industry over many decades.
I was really tempted to not use this truth-table slide, because I have seen so many presentations from vendors with these sorts of truth-table slides, where you have a list of features down the left-hand side; you have a row of columns which are competitor 1, competitor 2, competitor 3, and so on along the top, and lo and behold, only one of the columns is all green, all ticks, and it’s always the vendor who’s delivering the spiel at the time. So I kind of feel I ought to apologize for this, but I’m not going to, and I’ll tell you why: because the value of these kinds of truth tables, and they’re widely used by people when they’re evaluating products, is in that list down the left-hand side. If it’s just a bunch of interesting features that Vendor X has got, or that Vendor X’s marketing organization has pulled out in order to differentiate themselves, just for the sake of this slide, then you have a right to be skeptical about it. But those four dimensions that I just talked about, transactional, SQL, scale-out, and geo-distributed, I think they are genuinely not just token features; those are genuine value dimensions for cloud databases, and for that reason, I’m not ashamed to put this up.
However, I would say it’s not necessarily all black and white, or rather all red and green, as I’ve got my crosses and ticks on here. Even if you’ve got scale-out, one of the critical things for scale-out in cloud applications is what I talked about before, the ability to respond to peaky workloads that rise and fall quite dramatically. Therefore elasticity is important if we’re going to achieve that cost-commensurate-to-usage equation that I talked about as being a value proposition for the cloud generally. So be careful about just how elastic your solution is; whether it’s NewSQL, NoSQL, or NuoDB, you know, try it out. Ask vendors the questions about that.
Likewise, in the situations where I’ve said here that there is no geo-distribution, the old-SQL and many of the NewSQL vendors will say to you, “Well yes, but we do synchronous replication,” or, “We use sharding,” and there are lots of situations in which those are adequate. All I would say about sharding is that I read a white paper written by one of the architects of MySQL, a guy at Oracle who wrote a paper on, you know, “Sharding with MySQL: Hints and Tips, Guidelines,” whatever, and the first one was, number one, don’t do it, because it’s horribly complicated. If you can find any way to avoid sharding, then use that. But if you have to use sharding, then here are points two to ten, which were, I guess, really excellent guidance and advice for people who have to do that. But here was a guy in a pretty authoritative position, and his number one rule was, don’t do it unless you really, really have to. So it’s clearly not regarded as being a desirable situation. And I’ll talk, in fact, in the context of one of our customers, about database replication, so I’ll defer that for a little later. So look carefully at what compromises you’re going to have to make.
And then the last thing is, and I’ve already alluded to this, is quality and availability of SQL. If that is important to you, and you are -- but you are looking at a NoSQL vendor and talking to them about their SQL support, or what tools that they can work with to deliver SQL access for you, and as I say, they all have a story they will tell, then you really need to drill down on how much SQL, how strong SQL, how well-baked into their solution is SQL, because there is a lot of variability, and the bar is frankly pretty low if you go that route.
So, I said I would tell you some customer stories. Let’s do that. The first one is an ISV in the US, and they deliver applications through their CSPs, through the telcos, the mobile telcos. They deliver BSS-type solutions for them to their customers, to mobile customers in the US, North America, South America, and also in Europe. They’ve been doing that on their own dedicated equipment, software and hardware, pre-configured, installed in the telcos’ data centers, set up as an active-passive disaster recovery scenario across two data centers, and they came to us with two problems. Number one was that their customers were becoming increasingly resistant to suppliers installing their own kit in the data centers. One of the strongest themes in the telco industry right now is the move towards virtualization of pretty much everything, so the last thing they want is new, uncontrolled tin boxes in the data centers. They want cloud solutions.
And then the second problem they came to us with was the complexity of running this replication between two data centers, even just to maintain active-passive, never mind to be able to update from both of them. It wasn’t that they couldn’t do it with their existing vendor; they could do it, but it was very expensive; it was very complex; it was very brittle, and it made it very difficult to do upgrades to hardware or to software or to infrastructure. And so, in order to scale out their capability as an organization to be able to deliver more product through more customers to more end users, they really needed to make this a lot simpler. So what we gave them was continuous availability, not just disaster recovery, allowing them to connect clients to both the data centers. And indeed, as it says here, one of the benefits is active, active, active: they want to add a third data center, and maybe more beyond that in future, which we can do for them.
And we simplified the operation, because NuoDB running it in two, three, five data centers is actually just one logical database. From a management point of view, it is a single database that happens to be installed on lots of nodes, and those nodes may happen to be in different data centers doing different jobs, but it’s a single database. So we simplified that for them, and because NuoDB is a full SQL database, it was much more simple for them to migrate that across, and get time to value much decreased, risk lowered, and the cost of acquisition of skills negated because pretty much, all they have to do is retrain people from one dialect of SQL to another, which is relatively trivial.
My second case study is a European ISV. They built, kind of a mobile app, sort of an e-Commerce-y kind of thing, they had delivered this into what they called “emerging markets,” i.e. relatively small, but they ran up against the limitation of the throughput they could get on the configuration software and hardware, again, they kind of pre-configured the whole thing, and in order to make money out of this, or to be really successful with it, they need to take it to bigger markets, therefore, they needed to scale the number of users and the amount of processing they could manage, and to do that, they needed a cloud-style scale-out deployment, and they needed, you know, the performance to go with that, so what we could offer them was that ability to give them an elastic scale-out cloud solution, and once again, ease the migration. In fact, in this particular instance, we didn’t know about them; they downloaded and started to migrate their app, and we didn’t know until they had their app up and running in test on NuoDB, and they wanted to talk about licensing. In fact, the performance that they had got as a first-time migration on their own was not quite as good as they could get out of the tuned appliance-like solution that they’d been delivering for a while, with all of the skills in-house honed and developed over years for that particular vendor, so we were pretty impressed that they got that far. In fact, when we got our service people on the job, we were able to tune it further, so we were getting better performance on equivalent hardware, like-for-like than they had been able to get on their appliance-like solution originally, and take it to much bigger markets, of course.
I threw this third case study in, because it’s kind of interesting. This is actually a cloud provider, an organization who is building cloud infrastructure, delivering infrastructure-as-a-service, and at the heart of that, as they were building this out and starting to deliver it, they’re now in global expansion mode, they had a SQL database, traditional relational database, in the center that manages everything, which is all about who the users are, their billing, their metering, what they’ve got provisioned, how much they’re paying for this, how much paying for that. It’s a transactional database; it’s being pinged all the time, a very high rate, every time a user of the cloud does something, they need to update that user’s history in order to manage the account, and of course provide the billing and all the back-end capability out of that.
But as they started to expand, they found two problems. One is the scaling, and the second one is the geo-distribution. As they scale their operation globally, they need to be able to scale the delivery to their customers globally, and so they came to us for exactly those requirements, to have a globally-distributed management database under the cloud, as it were. And in talking to them about that and in delivering that for them, we got to talking about the second opportunity, which is, well hang on, if this does a great job for us, under the cloud, to manage our infrastructure, this is a great offering for us to make as a database-as-a-service on top of that cloud for our customers. Actually, they’re not delivering that yet; it’s still a work in progress. But I kind of thought it was worth throwing into the mix, because it’s like, you know, we did such a good job for them on the infrastructure that they thought, well, we’ll take this and we’ll deliver it to our customers as an OEM-type solution on top of the cloud.
So if I was to summarize what our customers told us, you know, they want to get scale-out, elastic scale-out, and they want geo-distribution, but they’re not prepared to throw away the transactional consistency they have in their existing applications, the ease of developing and delivering with their existing skill sets in SQL, and so you’ve got that combination of the four dimensions I talked about, which is, you know, why we sit in the middle, why we want to sit in the middle.
Now, I’ve been at pains to point out that, you know, not all use cases, not all requirements are, “Oh yes, NuoDB is the database for everything.” There are plenty of use cases where you can deliver a great solution with some other product. We’re not trying to, you know, say that we’re the only people that can do anything. No, we can do this one thing, this combination, and we do it very well, and actually, I think we do it uniquely well in the sense that I don’t know anybody else who can make the same claims. So if these are requirements you have, then hopefully this will have been a useful time spent for you in hearing a little bit more about it.
So, I’ll finish up with a little dip under the hood, as I said: what is NuoDB? And I’ll start with the things it delivers. A single data center is just not enough anymore, and although not all use cases need ACID transactional consistency, a great many of them do, and they are often very high value, literally very high value, because they are, for example, financial transactions. And because SQL’s been around for so long, there are the skills and the infrastructure in place. And so, although NuoDB is a database engineered and architected for the cloud from the beginning, what we’ve done is we have not thrown away the transactional and the SQL babies with the scale-up bathwater in order to get to scale-out and geo-distributed.
So, how does it work? Multi-tier architecture, the management layer of what we call the brokers and the agents which do the handshake to make sure that every node on every data center knows where it is, and where it sits in the environment. Under that, we have first a transaction layer, and the storage manager. The transaction engines, they connect to clients and provide their transactional processing for them, and in turn connect on the back-end to storage managers which worry about the persistence of all of the data, but the data itself resides in this distributed, shared cache that the transaction engines and the storage managers share between them, and deliver up to the clients on demand.
And of course, it’s elastic. If you need more transaction engines, you just kick off a new one, and it’s up-and-running and taking client connections immediately. It starts to pull the data it needs from the cache in other transaction engines and the storage managers in order to deliver for its client connections, and so it’s immediately providing additional processing power. And as you pass the peak and you no longer need it, you can close down the transaction engines, de-provision the virtual machines they’re running on, and you are saving yourself the cost. The same is true of storage managers. As soon as you kick up a new storage manager, it starts to synchronize its data archive with the other storage managers, and so it rapidly comes up to speed. The reason you have multiple storage managers is mostly for resilience. There are some other reasons I’ll talk about in a moment, but resilience is the key one.
And of course, because all of these nodes are all talking to each other, they’re all networked together, there is nothing to stop you putting a wide area network into a connection in that network, or indeed, multiple connections, in the case of multiple data centers. For our last release, we published benchmark results showing how NuoDB scales out, as you add more and more nodes, more and more transaction engines, and more and more storage managers on the back, and we got it up to a million NOTPMs, million New Order Transactions Per Minute on the DBT2 public benchmark. And that was to show that we scaled out pretty linearly.
What we’re doing for our next release, and we’ve already started this, is testing that -- doing that same test, but running it across two, three, four, and up to five data centers so far where we have clients connecting to each of five data centers and executing their transactions across a single logical database. And yes, particularly where we’ve simulated increased latency between the data centers, yes you do see a little bit of a slowdown, but frankly, we’ve been positively surprised at how good our numbers are, and that’s in relatively early days. I’m not going to share any more about those numbers with you right now, but in a couple of months’ time, we will be publishing the results of that, and they’ll show pretty conclusively how well you can run a NuoDB database across multiple data centers, active, active, active everywhere.
And because of, as I talked about, replication for resilience, and because of the ease with which you can bring new nodes into place, a NuoDB database is a pretty hard database to kill. You can lose individual processes; you can lose nodes, machines; you can lose whole data centers, and it can still carry on running.
I’m just going to finish off with a little example of how NuoDB handles update conflicts in order to maintain consistency. So in this instance, I have two clients. They both want to update Record A, Object A. Object A is currently in cache in three transaction engines and one storage manager, and in one of the transaction engines, the middle of the three, you can see there’s a little red ring around that one. That one is what we call “the chairman.” In NuoDB, every single object, record, row that’s in the cache anywhere will probably be in the cache in more than one place, and it will have a chairman. Typically that is the first place where it was either created or retrieved into, but it doesn’t have to be. And that chairman is responsible for managing the multiple versions of that object, whatever it is. So, in the case where we have two transactions that both want to update A, they both connect to their TEs, they both grab their copy of A. They each make an update, and they request the chairman’s permission to commit this update. The chairman will accept one of those, and it will reject the second one. As many people as like can read older versions of A with read consistency, but only one update can be accepted at a time; the second has to wait until the first has been committed. So the first one in will get accepted; the second will be rejected. That one will be rolled back. The first transaction will be committed, and once it’s committed, of course the second client can retry theirs, and add an update to the update. That way, you’ve got this ability to manage concurrent updates, and manage them across multiple data centers, across multiple clients, but without imposing a single point of failure, because there isn’t a lock manager that manages everything; the cache manages each object itself.
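The chairman idea can be sketched in a few lines of Python. To be clear, this is a toy model built only from the description above, not NuoDB’s actual implementation: the `Chairman` class, its method names and the transaction identifiers are all hypothetical, and networking, caching and failure handling are left out entirely. It shows one arbiter per object that keeps the committed versions, lets readers see the latest committed one, and accepts only one pending update at a time.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


class UpdateConflict(Exception):
    """Raised when another transaction already holds the pending update."""


@dataclass
class Chairman:
    """Hypothetical per-object arbiter: one of these exists for each cached object."""

    versions: List[Any] = field(default_factory=list)  # committed values, oldest first
    pending_txn: Optional[str] = None                   # transaction allowed to commit next
    pending_value: Any = None

    def read(self) -> Any:
        # Readers always see the latest committed version, never a pending one.
        return self.versions[-1]

    def request_update(self, txn: str, value: Any) -> None:
        # First requester wins; anyone else is rejected until that one finishes.
        if self.pending_txn is not None and self.pending_txn != txn:
            raise UpdateConflict(f"{txn} rejected: {self.pending_txn} is pending")
        self.pending_txn, self.pending_value = txn, value

    def commit(self, txn: str) -> None:
        if self.pending_txn != txn:
            raise UpdateConflict(f"{txn} has no accepted update to commit")
        self.versions.append(self.pending_value)
        self.pending_txn, self.pending_value = None, None


# Object A starts with a committed value of 100.
chairman_a = Chairman(versions=[100])

chairman_a.request_update("txn-1", 110)       # first one in is accepted
try:
    chairman_a.request_update("txn-2", 120)   # second one is rejected
    second_rejected = False
except UpdateConflict:
    second_rejected = True                    # its client rolls back

seen_while_pending = chairman_a.read()        # readers still see the old version
chairman_a.commit("txn-1")

# The second client retries on top of the newly committed value.
chairman_a.request_update("txn-2", chairman_a.read() + 20)
chairman_a.commit("txn-2")
print(second_rejected, seen_while_pending, chairman_a.versions)
```

Because each object carries its own arbiter, there is no central lock manager in this model, which is the point the example above is making.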
That’s the presentation completed, and I think we have got time for Q&A, so I’m going to hand back to Knox for the Q&A session. Thank you very much, and I hope it was of some value to you.