Watch this episode of The Briefing Room to hear Jim Starkey and veteran analyst Dr. Robin Bloor discuss how a next-generation DBMS, purpose-built for the cloud, can create strategic organizational advantage.
(Eric Cavanaugh): Ladies and gentlemen, hello and welcome back once again to “The Briefing Room.” Yes, indeed. My name is Eric Cavanaugh, I will be your host and moderator for the show that is designed to get right down to the nitty-gritty of what is going on here in the world of enterprise software. And boy, the times they are a-changing. There are some really fascinating developments going on all over the space, quite frankly, all over the data management space. We’re going to talk about that today: “The Crown Jewels: Is Enterprise Data Ready for the Cloud?” Good question. There’s a slide about yours truly. And enough about me. So the mission, as many of you know, here in this particular show, is to reveal the essential characteristics of enterprise software. We do that by taking live analyst briefings. So we’ll bring in analysts; oftentimes it’s our very own Dr. Bloor, as it is today. But we work with many of the different independent analysts all around the industry, and the reason is pretty simple. One, they can give their honest opinions. They don’t work for big analyst firms, so they don’t have any pressure from management to temper their candid opinions about things. And two, because there are so many of them out there who have a lot of good, real-world experience. And that’s what we like to see. So we want to see people who have built solutions, gotten into the weeds, actually worked out some kinks and figured out how to make all this stuff seamlessly integrate into your enterprise. It takes longer than usual sometimes. And sometimes it goes pretty fast. So we’re trying to give you out there the ability to learn from these experts, to figure out what you can do to improve your situation in your company.
So, we talk about different topics each month. This month is the cloud. Big topic obviously; big data is also a very big topic. May is going to be database. We do that because we want to get a good apples-to-apples comparison between key vendors in a particular space. So as many of you know, when you’re purchasing some new enterprise technology, it’s usually not terribly cheap. So you want to make sure you get precisely what you need, and frankly that is the reason we invented this entire show. So, let’s talk about what’s happening in the cloud with respect to database. You know, I’ve been tracking this space for quite a few years now, and really so many of the old database technologies, even though they had great stuff, were really never designed for the cloud. Why? Because the cloud kind of didn’t exist. I mean, you could argue semantics and say it’s been around for 30 or 40 years; it used to be something called the mainframe, obviously. And we’re kind of almost going back to that era in a slightly different way; there’s a lot of stuff going on up there. But there’s a very heterogeneous environment in the enterprise these days. There’s a lot of data that’s on premise, and now we’re all starting to explore the cloud, and doing so in various ways. Really a handful of disruptive forces have come along, like, for example, Salesforce.com. But many, many others too: lots of different applications in the cloud that companies use. And it’s interesting, I was at a conference just a couple days ago, and they talked about shadow IT. And I always thought that name, shadow IT, sounds so exciting. And the cool thing about shadow IT is that it delivers some kind of business value. Almost inevitably what happens is people in a company get tired of dealing with some bottleneck, some issue or hurdle, someone standing in their way, not letting them do what they want to do, and so they go and develop their own system. Well, these days, a lot of that shadow IT is in the cloud.
Because there are all these solutions, whether it be for contact management, or marketing, or sales automation, or whatever the case may be, there are tons and tons of cloud offerings. There are whole conglomerates forming coalitions around provisioning cloud services. So there’s a lot of stuff happening there. The one thing that has really been a bit of a hurdle for cloud deployment has been the database: which database do you use up in the cloud to manage these strange environments, all these APIs, all your data? How much security do you need? How much governance do you need for all that stuff? There are all these different requirements and frankly, they’re very, very complex, and that’s why we need innovation in the database space. And boy, are we lucky today to have none other than Jim Starkey, along with our very own Robin Bloor.
So there’s Robin’s slide. You can see NuoDB is the vendor in “The Briefing Room” today. It’s a NewSQL distributed database solution, architected to scale elastically on the cloud. So this is really interesting stuff. You can see it leverages a peer-to-peer distributed architecture, and it’s ACID compliant. Those of you who went to college in the ’60s love that. And there’s our man, Jim Starkey. So, this guy has been around for a long time; he looks good though, as you can see. And really, ask any question you want, because he can answer probably any one of them. So with that, I’m going to hand it over to you, Jim, and keep in mind that because of your phone connection, you’re a bit soft, so just act like you’re mad and speak loudly, and the floor is yours. Let me hand you the keys, hold on, to the WebEx. Almost forgot that. And the floor is yours. Take it away.
(Jim Starkey): OK, thank you very much. I’m sitting in here for Barry Morris, who was going to be doing this presentation. Barry got called away for a very, very, very important meeting, which happens when you’re a new startup. So, you’re just going to have to put up with me, I guess. Actually, I’m retired from the company. I started the company, I developed the architecture, the software, and yeah, I’ve been around since the ARPANET. And so, it was time for me to go off and do something else. But I think I can represent the technology and the company quite well. NuoDB is, I would like to say, the next-generation distributed database. It’s a database system that is a total break, a 100% break, from what’s been done in the past. It’s relational, with industry-standard interfaces; it does JDBC, which is its primary interface. But the technology under the SQL engine is just completely different. It’s not disc-based; it’s based on what we call atoms, which are distributed (inaudible). There can be any number of instances of an atom, and they replicate to each other on a peer-to-peer basis. And they’re completely dynamic. It’s a radically different architecture from anything that I’ve done in the past, and anything else that exists that I’m aware of on the planet. It’s designed to be completely elastic, which means you can, at any time, add another node, and it goes faster, higher capacity. It’s certainly designed for the cloud, for the unusual characteristics of the cloud. It’s for data centers and everything between.
It -- call it active-active. Let me describe very briefly what the architecture is like. In a single NuoDB database, there are really two different types of nodes that have to be present. One is a transaction manager that pumps SQL, handles user requests, runs transactions. And the other is a storage manager, something that keeps persistent copies of all of the data, so the database can be shut down and brought back up if necessary. A database system can have any number of transaction nodes, whatever is necessary, and any number of storage managers. They all communicate together. They all participate in replication. And depending on what the demands of the application are, you can balance what you need for storage, and where you want the storage to be, as well as how much transaction processing you need. There is great depth in the company and the investors. I have been around literally since the ARPANET; I like to say I was on the internet when there were 47 computers, but it was actually the ARPANET, the internet's precursor, doing distributed data management. I’ve done four or five relational database management systems, depending how you count them. I did the DEC Rdb product; I designed that architecture. Barry Morris, who’s our CEO, was CEO of [StreamBase?], a real-time, oh, I forget exactly what he calls it. It’s a real-time database-type process for high-volume data. But also on our board, we had Mitchell Kertzman, who was at one point CEO of Sybase, and Gary Morgenthaler, who was the founding CEO of the original commercial Ingres company, and also the CEO of Illustra. So, you know, back when I was on the board, of the five members of the NuoDB board, we had four database CEOs. A great deal of experience.
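The split Jim describes between the two node types can be sketched in a toy model: transaction nodes handle requests, while every storage manager keeps a full durable copy through replication, and either kind of node can be added at any time. All class and method names here are illustrative, not NuoDB's actual API.

```python
# Toy model of a NuoDB-style deployment: transaction nodes (run SQL)
# plus storage managers (keep persistent, replicated copies of the data).
# Hypothetical names; a sketch of the architecture, not the real product.

class TransactionNode:
    def __init__(self, name):
        self.name = name

class StorageManager:
    def __init__(self, name):
        self.name = name
        self.durable = {}  # this node's persistent copy of all data

class Database:
    def __init__(self):
        self.transaction_nodes = []
        self.storage_managers = []

    def add_transaction_node(self, node):
        # Elastic scale-out: any number of transaction nodes may join.
        self.transaction_nodes.append(node)

    def add_storage_manager(self, sm):
        # Likewise, any number of storage managers.
        self.storage_managers.append(sm)

    def write(self, key, value):
        # Every storage manager participates in replication,
        # so each ends up with a full durable copy.
        for sm in self.storage_managers:
            sm.durable[key] = value

db = Database()
db.add_transaction_node(TransactionNode("te-1"))
db.add_storage_manager(StorageManager("sm-1"))
db.add_storage_manager(StorageManager("sm-2"))
db.write("account:42", 100)
print(all(sm.durable["account:42"] == 100 for sm in db.storage_managers))  # True
```

The point of the split is the balance Jim mentions: you scale transaction processing and storage independently, by adding nodes of the appropriate kind.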
The company is located in Cambridge; if you listen to Car Talk, you know that’s our fair city. Recently, NuoDB announced a very interesting business deal with Dassault. For those of you who follow the CAD world, Dassault has a system called CATIA; it’s a 3D modeling system. If you’ve been on an airplane in the last month, it was probably designed with CATIA. They kind of own that market. Dassault not only has a business arrangement with NuoDB, they came in as an investor, in an investment round, a major investment. A company makes an investment in a vendor for really one of two reasons. One is they’re going to base their product on a technology and they want to be able to ensure that that is being done properly. And the other is they see a hot market and they want to jump onto that. We hope that in this case, it is absolutely both. It’s a very demanding application, and it’s going to be pushing the technology for both companies. Very excited about that.
Roughly, what NuoDB is about is that, if you look at cloud-style applications, we’re all talking web applications, mobile applications. People have figured out how to have web servers that scale out. You have load balancers; you have as many web servers as you need to handle the demand, and that distributes very nicely. Same with app servers, wherever you need to actually do the computing and feed the web servers. And down on the bottom, you’ve got storage servers; everybody knows how to put lots and lots of storage on computers, network (inaudible) storage or what have you. But the bottleneck has always been the database servers, which historically have not scaled. And the reason that they don’t scale is that generally, they either run on one computer, and the only way you can get more capacity is to get a faster and more expensive computer, or they run in a shared-storage system where you’ve got a bunch of CPUs, a small number of CPUs, essentially emulating a single-processor system, but sharing a common disc.
A cloud database has to do better than that. It has to treat a processor, a CPU, not as the central part but just as a piece of the computing environment. Elasticity is everything. You need to be able to make something faster by plugging something in, rather than taking it down, reconfiguring it, and bringing it back up. So, this picture here, and it’s, you know, real data: you have an application that’s running on a single node. It’s running nicely, producing a fair number of transactions per second. You plug in the second node, there’s a short delay, and suddenly you’ve doubled the capacity. NuoDB actually has this characteristic. That’s me. I have no idea where the quote came from; they use it all the time. I don’t think I ever said that. Pretty good thing to say, it’s probably true, but...
So, let me talk about the capabilities of the system. And essentially everything is about elastic scalability. All of the characteristics that make NuoDB different are direct results of elastic scalability. That is the characteristic of the system to be able to plug in another transaction node, fire it up, and get additional capacity; also to be able to take a node that’s in the database system and drop it out to reduce capacity. If you don’t need it, you can turn it off; you can spin it down if you’re actually running in the public cloud. But the ability to add and drop nodes is kind of magical. Because, you know, one is, obviously, it lets you scale up in terms of your capacity; as the number of users goes up, you can add more capacity. Another is that you can actually move a running database around the world if you need to, or from hardware to hardware. You can start it up on one set of machines, add another, drop out one of the original, and without ever taking the database system down, you could move the physical database, say, from Cambridge to California. You could change hardware. There’s no reason to ever shut the system down. It just continues to run. And because of that, NuoDB is able to implement rolling upgrades. With all other database systems that I’m aware of, when you do a software upgrade, you have to shut down the database, do the upgrade, and then bring it up. With NuoDB, it’s quite different. We have an automatic process that does this, but what goes on under the covers is: start up another node, an extra node; shut down one; upgrade the one that has been shut down; bring it back up; it reenters the cloud; and then it goes on to the next node. When the last node running the old version of the software goes away, suddenly each server will kick over to the new version of the software, and start using enhanced protocols. New message formats.
There’s nothing that ever requires you to take a NuoDB database down.
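The rolling-upgrade sequence Jim walks through can be sketched in a few lines: add a spare node at the new version, upgrade the remaining nodes one at a time, and only once the last old-version node is gone does the cluster switch to the new message formats. This is a hypothetical model of the sequence, not NuoDB's automation.

```python
# Sketch of a rolling upgrade: the database never goes down because there
# is always a full complement of nodes serving while one is being upgraded.

def rolling_upgrade(nodes, new_version):
    """nodes: list of version strings, one per node; mutated in place."""
    steps = []
    nodes.append(new_version)              # start up an extra node first
    steps.append(f"added spare at {new_version}")
    for i, version in enumerate(nodes):
        if version != new_version:
            # Shut down node i, upgrade it, bring it back into the cloud.
            nodes[i] = new_version
            steps.append(f"upgraded node {i}")
    # Last old-version node is gone: enhanced protocols kick in everywhere.
    steps.append("cluster switched to new message formats")
    return steps

cluster = ["2.0", "2.0", "2.0"]
log = rolling_upgrade(cluster, "2.1")
print(cluster)  # ['2.1', '2.1', '2.1', '2.1']
```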
An interesting characteristic, which has been getting NuoDB a lot of attention, is geo-distribution. And that is that you can have a database system that straddles data centers, so you have one in New York and one in Tokyo. And by the nature of the replication within NuoDB, you can make that architecture work, where you don’t need the full bandwidth to move all of the data continuously between New York and Tokyo; a lot of it can stay local. The way this works is that in NuoDB, every server has a series of atoms, which are distributed objects. Each atom knows what other systems have copies of that atom, so they know how to replicate between servers. If they need access to an atom, and they don’t have it, they know who has it through a catalog mechanism. And each node has been pinging everybody else once a second, so they have some idea about the relative responsiveness. They don’t really care whether the other guy’s on a different continent, or has a slow line, or is really busy, or has a slow CPU. It doesn’t make any difference. You know, if somebody’s responsive, they’re good. If they’re not responsive, they’re bad. So when they need a piece of data, when they need some atoms, they will always go to the most responsive node that has it. And if nobody has it in memory, they’ll go to a storage manager. This means that the large data requests are all being done locally. So, if you have, for example, an ATM network, half in New York and half in Tokyo, and most of the customers of the banks and the ATMs are all in New York, that stuff is all going to be local. And ditto in Tokyo. So there’s not a large amount of data going back and forth. There is necessarily replication data, because somebody from New York might show up in Tokyo, so it does have to be shipped there. But most of the communication is happening locally on one side or the other.
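The responsiveness-based routing Jim describes boils down to a simple rule: ask the most responsive peer that holds a copy of the atom, and fall back to a storage manager if nobody has it in memory. Here is a minimal sketch of that rule; the node names, catalog shape, and ping table are all illustrative, not NuoDB internals.

```python
# Pick a source for an atom the way Jim describes: nodes don't care about
# geography, only about measured responsiveness (e.g., ping round-trips).

def pick_source(atom_id, catalog, ping_ms, storage_managers):
    """catalog maps atom_id -> set of node names holding it in memory;
    ping_ms maps node name -> last observed round-trip in milliseconds."""
    holders = catalog.get(atom_id, set())
    if holders:
        # Most responsive peer wins, regardless of which continent it's on.
        return min(holders, key=lambda node: ping_ms[node])
    # Nobody has it in memory: go to the most responsive storage manager.
    return min(storage_managers, key=lambda node: ping_ms[node])

ping_ms = {"ny-1": 2, "ny-2": 3, "tokyo-1": 180, "sm-ny": 5, "sm-tokyo": 190}
catalog = {"atom-7": {"ny-2", "tokyo-1"}}

print(pick_source("atom-7", catalog, ping_ms, ["sm-ny", "sm-tokyo"]))  # ny-2
print(pick_source("atom-9", catalog, ping_ms, ["sm-ny", "sm-tokyo"]))  # sm-ny
```

This is why, in the New York/Tokyo example, New York traffic stays in New York: the local copy is almost always the most responsive one.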
Obviously it’s application-specific, and exactly what it’s going to cost in terms of latency and network traffic is very, very hard to predict. It’s really not application neutral. But it’s a very interesting characteristic.
Continuous availability, I talked about this a little bit earlier. The ability to add nodes if nodes fail, to add additional capacity, and to do rolling upgrades when there’s a software update means that the database system never, never, never has to go down. It can go down if you choose to shut it down, or if there’s an intergalactic power failure you could restart it. But that’s not something that normally happens.
And I’ll skip multi-tenancy. Finally, the administration. I hate system administration, personally. I consider any time spent doing system administration as essentially, you know, detracting from one’s lifespan. So, yeah, I designed this thing to be self-tuning. It’s also so dynamic that a human couldn’t keep up with it anyway. So it keeps track of the responsiveness of nodes and communication counts within it, and does whatever it does to adapt to the changing circumstances. So the tuning, the care and feeding, are basically taken out of the hands of humans, and left in the hands of computers, who have a better idea how computers are supposed to work.
So, essentially, I think, this has to scale. Scale-out is good; elastic scale-out is great. This is what makes it work well in a cloud. Give it the resources it needs to handle the current load, and let it go. If the load picks up, which everybody wants, then you add more resources and the system responds to the load. I talked about availability, geo-distribution, and (inaudible). So that, in a nutshell, without going too deep into the technology, is what NuoDB is about. And I look forward to Robin’s questions.
(Eric Cavanaugh): OK, good. And with that, let me go ahead and hand the keys over to Robin. Couldn’t type fast enough there, Jim, but I’m going to get a transcript, because you have a bunch of good quotes in there. I especially like “any time doing system administration is time lost from your lifespan.”
(Jim Starkey): Oh yeah.
(Eric Cavanaugh): I absolutely agree with that 100%. (laughter) Thank God there are people who have the patience to do that, because I am not one of them.
(Jim Starkey): Yeah, but I want to find those people and give them useful things to do.
(Eric Cavanaugh): Yeah, I agree, I’m with you on that. All right, Robin?
(Robin Bloor): (laughter) OK. This is actually an extraordinary -- I have to say, because I’ve looked at this technology, having been briefed a couple of times already by NuoDB before today. This is the most surprising technology, the most surprising database technology, I’ve encountered in, it must be at least a decade. I mean, there really is a claim here, and I believe it to be true, because there are customers out there, to be able to do things that databases simply could not do before. So, really, what I intend to present here is just the question of, what do we really mean by distributed database? And I’m coming at it possibly from a similar direction to the way that Jim’s coming at it, but in a different way. True distribution, true database distribution, has always been a holy grail. You know? And to a certain extent, database engineers have always wanted, in one way or another, to be able to distribute a database. And the reason for it, above and beyond anything else, is that that means it scales. You know, if you go back to the beginning of the world of IT and the original, I don’t know, databases on the mainframe, there was, in the first -- what I would call the [naïve?] period when databases came into use, the belief that there would, in some way or another, be the possibility of putting all of an organization’s business data into the database. You know, we would have one database, it would kind of replace the whole idea of the file system, and it would just sit there. And whatever business applications you wanted to build, you would just plug into this database. Well, you know, that idea was shot to smithereens very, very quickly. And we ended up with the idea of a database as a kind of local thing, in the sense that you, you know, you build an application.
And you need a place to put the data, and the database is a (inaudible) place compared to sticking it in files, because you’ve got the metadata, the data’s reusable, and so on and so forth. And, you know, the data’s managed for you. So, lots of things you don’t have to do.
Good thing, but you never had the idea that databases would be able to just get larger and larger, in terms not only of the amount of data they had stored, but also the various tables you wanted to drop into them. You know, and what we’re talking about in terms of NuoDB, and I’m sure Jim will correct me if I’m wrong about any of this, is a database that actually seems to be incredibly expansible, and it’s also an OLTP database. You know, there are certain things that you can scale out in terms of large query loads that you simply cannot (inaudible) do with OLTP. So anyway, you know, we start off with: what is a database? Well, a database is software that presides over a heap of data. It implements a data model, manages multiple concurrent requests for data, implements the security model, and is ACID compliant -- I presume everybody listening knows what that means. But there’s a question mark now, because in their efforts to scale out, various databases that have come into existence in the (inaudible) five or six years have actually loosened up the ACID rules. And they don’t promise immediate consistency; they promise eventual consistency. And they’ve done that in order to be able to scale out for particular workloads. But normally, one expected ACID compliance, and we certainly expect ACID compliance from an OLTP database. And also, we expect the database to be resilient. We expect that if anything fails, whether it’s a software failure or hardware failure, we will be able to get our database back up and it will not be damaged in any way. And that’s what we expect of a database.
The problem of distribution, I’ve kind of drawn it out like this, but it’s really a difficult problem to solve. You know, what I’ve drawn out here is two cloud data centers, and two data centers that we control in one way or another. And I’ve spread the eight nodes of a database across all of that. And I’m saying, well, what happens if, you know, suppose it’s a query. And in some way or other, by some piece of magic, I’m saying that it goes straight (inaudible). But the application request will go to some kind of broker or other (inaudible) out to various pieces of the database. And if you look at exactly what would actually have to be done with a query in order to implement something like this, it’s really, I mean, it’s really complex. It’s incredibly complex. Imagine, for instance, the data has been shared out between these four places. Then, in order for the query to be answered, the database actually has to know which part of the query to send to which place. So, let’s say it goes first to data center one, and then that sends out the rest of the query to all the other places, you know. Then, each particular location goes and gets the data that’s wanted. And it resolves at the level of each location. I mean, I’m talking about distribution; it resolves at the level of each location. And then the full answer actually has to be resolved across all locations. Well, with a (inaudible) simple OLTP query, it might actually just resolve at one location, or perhaps two locations, depending on what was linked to what. But, you know, if it’s a larger query than that, then you also have the problem that in order to get the full answer, you’re going to have to master the answer somewhere. And mastering the answer means you’re going to have to throw some data about. And if you choose the wrong place to master the answer, you may throw a lot of data about. And also, bear in mind that we’re over a network here.
So we’ve got the network latency involved in all of this.
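The scatter-gather pattern Robin is describing can be reduced to a small sketch: each location resolves its piece of the query locally, then every partial answer has to be shipped to wherever the final answer is mastered, which is exactly where the data movement and latency costs come from. The shard layout and predicate here are purely illustrative.

```python
# Minimal scatter-gather over four locations: resolve locally, then
# master (merge) the partial answers at one site, counting the rows
# that had to cross the network to get there.

def scatter_gather(query_pred, shards, master):
    partials = []
    for location, rows in shards.items():
        # Each location resolves its part of the query locally...
        local = [r for r in rows if query_pred(r)]
        partials.append((location, local))
    # ...then all partial answers are shipped to the mastering site.
    rows_shipped = sum(len(rows) for loc, rows in partials if loc != master)
    merged = sorted(r for _, rows in partials for r in rows)
    return merged, rows_shipped

shards = {
    "dc1": [1, 5, 9],     # data center one
    "dc2": [2, 6],        # data center two
    "cloud1": [3, 7],     # cloud data center one
    "cloud2": [4, 8],     # cloud data center two
}
result, rows_moved = scatter_gather(lambda r: r > 4, shards, master="dc1")
print(result)      # [5, 6, 7, 8, 9]
print(rows_moved)  # 3 -- rows shipped across the network to dc1
```

Choosing the wrong `master` site is Robin's point: the same query can move far more data, and over a wide area network every shipped row pays the latency cost.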
And, you know, you look at this and you think, I don’t think that geo-distribution is actually possible. I mean, yeah, I don’t think it’s possible, because the amount of latency that we’re going to incur by trying to do all of this is going to actually just make it not work. You know, so the interesting thing about NuoDB is the fact that it has an architecture which actually gets around this. I would say that it kind of achieves this by putting all the data in every location, so that any location, in one way or another, from disc, can actually take over the whole thing if it absolutely had to. And then, it caches the active data in memory, and it kind of goes between the various sites in order to determine which is the best place to respond to anything. But of course, NuoDB hasn’t been built to do large query traffic; it’s been built more for OLTP at this point in time. And that will be a question for Jim later on, which is really, what would happen if you get a big query? But that’s the problem with trying to deal with it, and it looks very intractable if you look at it in the simplistic way that I did.
But, you know, it’s also true that databases absolutely have to distribute their data in some way or another, you know? Usually with a database, you would want to scale up on a single node, so you don’t have any network traffic, because networks are slower than everything on a single node. It’d be best to kind of saturate a single node before you go to another node. So, the first step in scaling out is to actually go to a well-engineered cluster, so that there’s no other traffic going on in the network, and therefore your networking between nodes is going to be as fast as possibly achievable. And that’s what databases did. The original databases, you know, that became, I don’t know, the standard products that people implemented -- the Oracle database, for example -- were scaled out onto a tight cluster, and the various software was running in parallel across that cluster. You know, that’s what it did. And then the next move to scale out is onto a more loosely (inaudible). And that’s where you suddenly run into problems. And you run into problems for lots of reasons. First of all, if you’re on a very large grid of computers and every single node in the grid has to know what’s going on on all of the other nodes, you’ve got a hell of a lot of messaging going on. And that messaging itself has to be managed, above and beyond the fact that you may be gathering data at each node in one way or another; you may have to start throwing data around; you know, you’ve got all sorts of potential latencies occurring, or waits occurring. I mean, you can also have problems arise from locking. I mean, there’s just an immense number of problems. So, at some point, the scale-out sharding approach is going to run into bottlenecks. You know, and that will depend upon what workloads it’s trying to do. But it’s going to occur much sooner with OLTP workloads, because with OLTP workloads, you have the possibility of two different nodes wanting to update something at the same time.
And you actually have to resolve the update, and you can get into all sorts of, you know, Mexican standoffs, depending on how you try and do it. This is going to make it difficult. So, databases have to distribute, but what we’ve seen in terms of database architectures, in the past, you know, we got parallelism in a big way, you know, as soon as Intel started to go multi-processor. We got that in a big way, and there has been a lot of work on scaling out. But, you know, databases simply do not distribute very well. And it’s very hard to make them distribute very well.
And if you look at actual attempts at distribution, geo-distribution isn’t really any different to distributing within the same data center; it’s just that you’ve got much bigger latency issues, because, you know, you’ve got to go through all sorts of protocols to get onto a wide area network and throw things into the network. To the extent that geo-distribution, in the way that Jim was talking about anyway, is something that’s not really been achieved before, or if it has been achieved, it’s been achieved by terrible compromise. And the first thing that you can do, one of the things that’s reasonably easy to do, is simple replication. Master/slave: so you’ve got one of the nodes as a master, and the other, replicating data off it, is a slave. You just don’t let any updates go to the slave, because they’re going to get sent back to the master anyway. This you can get away with, you know, replicating data in this way, to a certain extent. But as soon as you’ve got intensive updates, you’re going to run into problems with this. Unless the slave is simply a replica that has no relationship to the master and everything’s really happening on the master. In which case, you haven’t scaled out at all. You know, all you’ve got is a replica that’s taking some query traffic every now and then. You’ve then got multi-master replication, which is peer replication. And this, if I understand it correctly, is what NuoDB’s doing. But there are other attempts at this out there, and they don’t work in the way that NuoDB works, as far as I understand it. And when you get multi-master replication, you’ve really got the problem that each node needs to know what’s happening on all the other nodes, and working that out -- you know, one cost is message traffic; the other is the natural latency while you wait for what is effectively a (inaudible) commit. You wait for other nodes to tell you what’s happened.
Sorting that particular problem out becomes really, really difficult. And the more nodes that you have in the multi-master setup, the worse the situation gets.
But if I understand it correctly, NuoDB is a kind of multi-master replication, because basically, in one way or another, pretty much any one of the nodes can actually master a given transaction. And it replicates to the rest automatically. It’s just that there isn’t any declaration as to what data must live where. That’s my understanding, and I’m sure Jim is going to correct me. So Jim, that’s enough from me on that particular point. Thinking about it, can we just have some -- one of the questions I wanted to ask was just, you know, for clarity, right? What parameters can the users set? I know you’re talking about (inaudible) and I understand why a lot of this absolutely is zero admin. But it can’t be the case that the user can’t set parameters; presumably, in one way or another, they will want to know when the number of nodes that they have made available at the moment is starting to run out of resource. So, what can the user do in order to monitor the environment? And what do they need to do?
(Jim Starkey): OK, let me get back to that question in a minute. Because the first thing I want to do is comment slightly on what you were talking about before.
(Robin Bloor): OK.
(Jim Starkey): And that is, the essence of NuoDB is really what I would say is a theoretical breakthrough. And it really isn’t a theoretical breakthrough; it’s the discovery of a major blunder in computer science. When the first database systems were invented, and transactions were invented, the only technology that was available for managing transactions was two-phase locking and the lock manager, where you had to lock everything, and if you locked it in the right order, nobody could get at something that was about to be updated until the transaction committed. And the test of whether a system that is based on two-phase locking works or not, whether it’s consistent, is that transactions have to be serializable. Because if they aren’t serializable, that means that one transaction has tripped over another one, and the locking scheme hasn’t worked. Now, from that, there was this implicit assumption that serializability was a necessary condition for consistency. But it isn’t necessary; it’s a sufficient condition, but it’s not necessary. What’s kind of magic about NuoDB is that there is no single definitive global database state. In a serializable database system, at any given time, there’s a single definitive state, and that essentially makes it impossible to distribute it efficiently. NuoDB is purely transactional: you can only view the database through a transaction. Each transaction is itself consistent. But a transaction sees the other transactions that were reported as committed on its node when the transaction started. It’s not global; it’s what was reported. And we detect when one active transaction tries to interfere with another one, and we handle that case. So we can guarantee consistency without being serializable. And that’s what sets NuoDB apart from every other database system on the planet.
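The idea Jim describes, that each transaction sees a consistent view of what was committed when it started, and that interference between concurrent transactions is detected rather than prevented by locks, can be sketched as a toy multi-version model. This is an illustration of the principle, not NuoDB's implementation.

```python
# Toy multi-version store: reads go against the snapshot that existed when
# the transaction began; a write-write conflict with a later committer is
# detected at commit time instead of being locked out in advance.

class Store:
    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.ts = 0         # last committed timestamp

    def begin(self):
        return Txn(self, self.ts)

class Txn:
    def __init__(self, store, start_ts):
        self.store, self.start_ts, self.writes = store, start_ts, {}

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        # Only versions committed at or before this transaction's start.
        for ts, val in reversed(self.store.versions.get(key, [])):
            if ts <= self.start_ts:
                return val
        return None

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Interference check: someone committed a newer version of a key
        # we wrote after we started -- abort rather than lose an update.
        for key in self.writes:
            for ts, _ in self.store.versions.get(key, []):
                if ts > self.start_ts:
                    raise RuntimeError("write-write conflict; retry")
        self.store.ts += 1
        for key, val in self.writes.items():
            self.store.versions.setdefault(key, []).append((self.store.ts, val))

store = Store()
t0 = store.begin(); t0.write("x", 1); t0.commit()
t1 = store.begin()  # t1 and t2 start from the same committed snapshot
t2 = store.begin()
t1.write("x", 2); t1.commit()
print(t2.read("x"))  # 1 -- t2 still sees its own consistent snapshot
```

Each transaction's view is internally consistent even though there is no single global state, which is the "Einsteinian" point: consistency holds relative to each viewer without requiring global serializability.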
And I gave a talk at MIT a long time ago, an IEEE/ACM talk, which was entitled -- I’m saying -- “The Special Theory of Relativity and the Problem of Database Scalability.” If you take the position of Newton, that there’s a center of the universe, and everything (overlapping dialogue; inaudible) it, then you’re stuck. You can’t distribute things efficiently. If you take an Einsteinian view that it’s relative to a viewer’s perspective, then the stuff works. OK. Now back to your question. (laughter)
(Robin Bloor): Wow let me just comment on that. That’s just smart. That’s -- I really do hope that the audience actually gets that. But in order for them to get it, they actually have to have a reasonably good appreciation of relativity. But yes, I get that. That’s very interesting. OK. So let’s get back to it. What can the user actually do in terms of configuration parameters?
(Jim Starkey): Well, OK. First of all, there are three types of people that interact with a NuoDB system. There are system managers, who don’t have anything to do with individual databases. They manage what nodes are running, and when they start up a transaction node or a storage manager, they have to tell it how much physical memory it can use. And this is really a garbage collection parameter, you know: when you get to this point, get rid of this, I only have so much memory on the system and I’m distributing it among a bunch of processes. And that’s just to avoid page faulting. The second parameter that they have to set is a maximum disc bandwidth for a storage manager, so it doesn’t overload the system. It’s well understood that if you take a Linux system and you consistently write data faster than a disc can absorb it, really bad things happen. So, you need to manage the disc bandwidth on a storage manager, easy to manage, (inaudible) the physical memory the database process can use. And that’s pretty much it.
The second class of user is database administrators. You know, those are the guys we know: they set up the data model, they figure out what the records are supposed to look like, they figure out what to index. They have all the classical things: secondary indexes, primary (inaudible), all that (inaudible). And they have almost nothing to do with tuning, other than defining what indexes are there. And I will say in passing that NuoDB is a little bit smarter, because it can use multiple indexes for a single retrieval, which makes it unusual, but good. And then finally, we have the, you know, the actual guy who’s, you know, the application developer, or somebody using an ad hoc, you know, tool doing database stuff. And he doesn’t have anything that he can do to tune it. It’s just going to happen. Mostly, the database system is tuning itself internally. It knows where data is. It knows that it should get it from somebody who’s got it in memory. It knows that if a couple of guys have it in memory, it should ask the guy who’s most responsive, which is usually the nearest, fastest guy on the block. I hate tuning. (laughter) I don’t think humans should tell computers how computers should work. Computers should work. OK.
(Robin Bloor): Yeah, I think that -- I understand that perspective entirely. When you get into some kind of monolithic database that has all of these levers which, you know, the DBA in some way or another is supposed to be able to pull to make performance happen, it takes an awful long time to properly appreciate what an engine like that actually does when you start mucking about with its parameters.
(Jim Starkey): I was at DEC when DEC shipped VMS version 1.0. I had the first layered product on VMS. And VMS had, oh, I like to say, 257 tuning parameters, and when anybody said, you know, this thing performs like a dog, the answer was, well, it’s not tuned right. And they’d say, well, how do I tune it right? And the answer was, well, you’ve got to figure that out. (laughter) I mean, tuning parameters are an excuse.
(Robin Bloor): (laughter) That’s good. OK, well I’ve got two questions in terms of latency. You know, it just has to be the case that the more nodes that you add, I mean I know you’ve done some very clever things, but the more nodes that you add, you will start to build up some kind of potential latencies, or bottlenecks. My understanding is you’ve tested it with 100 server nodes, and it seems to function OK. But, is there any, you know, is there any information you can give us to -- is there a latency buildup that starts to happen once you start to run on that many nodes?
(Jim Starkey): Not really. There’s an additional cost per node. And it mostly comes when it’s time to commit a transaction; it has to send a message to every other node saying this transaction is committed -- assuming the transaction has done an update. If it hasn’t done an update, nobody else has to hear about it. So the more nodes it has to send to, the more packets have to go out over the network. On the other hand, it batches stuff, so if it’s in the process of sending one message to everybody and another three or four guys commit too, they’ll go out in the same packet. So it really isn’t so bad. So the answer is no, there really isn’t much of a latency hit for 100 nodes.
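The batching he mentions can be sketched as a toy: while one network send is in flight, further commit notifications queue up and go out together in the next packet. The class and method names are invented for illustration.

```python
# Toy sketch of commit-notification batching: messages that arrive
# while a send is in flight share the next outgoing packet.
# Illustrative only -- not NuoDB's actual networking layer.

class BatchingSender:
    def __init__(self):
        self.pending = []
        self.packets_sent = 0

    def notify_commit(self, txn_id):
        # Queue the notification; it will ride the next packet.
        self.pending.append(txn_id)

    def flush(self):
        # Everything queued since the last send shares one packet.
        if not self.pending:
            return []
        self.packets_sent += 1
        batch, self.pending = self.pending, []
        return batch

sender = BatchingSender()
for txn in (101, 102, 103):      # three commits arrive while the wire is busy
    sender.notify_commit(txn)
assert sender.flush() == [101, 102, 103]  # one packet carries all three
assert sender.packets_sent == 1
```

The per-node cost therefore grows with node count, but the packet count grows much more slowly under load, which is why he reports little latency buildup at 100 nodes.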
(Robin Bloor): OK. So the next thing is, what is the latency hit for geo-distribution? Because if nothing else, there’s the protocol delay, and there’s the actual speed of light delay between, you know, if I was going from New York to California, let’s say. So have you got any kind of feel for what kind of, you know, latency gets in there?
(Jim Starkey): Can I say no? (laughter) OK.
(Robin Bloor): Yeah, you’ve got to have some idea. I mean, I would presume that, you know, you must certainly be getting into the (inaudible) area once you’ve actually got thousands of miles between data centers.
(Jim Starkey): OK. Well, let me back up and talk a little bit about commit protocols. I can’t see anybody in the audience, but I hope the eyes aren’t going to glaze over before I finish. The question -- it’s kind of a philosophical policy question -- is, what does it mean for a transaction to commit? Well, in NuoDB, minimally, it means that it’s been through all of the updates, everybody agrees that it’s OK, (inaudible) say about it. And it has sent a message to every other node in the database saying that this is committed. So at the lowest level, you can tell the application that it’s committed when you’ve notified everybody else.
(Robin Bloor): Right.
(Jim Starkey): Obviously, there is a point where you send out the messages, and then you tell the application it’s committed, and somebody hits it with an axe, or a mudslide hits it, or some disaster occurs. Trips over the power cord. Well, what happens in that case is that the other nodes say, hey, you know, Fred over there seems to have gone away, and one will say, I’ll manage it: here are all the messages I got from Fred since the last commit. And when he’s done, everybody else says, OK, I’ve got a couple more that I got. So they all do message reconciliation until they get to the same state and decide whether Fred committed or not. If he said he’d committed to one guy, he’s committed. Now, that works, but it’s not classical enough for some people. So a lot of times people say, well, OK, I really want to have the storage manager come back and say that he’s actually received the commit before I go ahead. So one of the options the administrator can set is, how many storage managers do you want to confirm the commit before you report it as committed? Now, if you take that option, you’ve got a probably higher degree of safety. But you also now incur the latency of the round trip to the storage managers and back. And if it’s geo-dispersed, then you’re going to pay the round trip from New York to Tokyo and back. And there’s no way to avoid that, if that’s what your policy is going to be. The next level up -- and there are customers out there who have corporate policies saying, well, we can’t consider it committed unless everything having to do with that transaction is sitting on oxide on rotating memory on at least two different discs. Well, what we do in that case -- in fact, this is always going on -- is that NuoDB storage managers log all incoming replication messages, which includes the commit message. And they’re written using non-buffered I/O, so they go straight to the disc.
And if you take the option that says, you know, I’ve got to have two guys in every region actually write all the messages to disc before I get the commit message back, and I need to hear from all the regions before I can tell the user that it’s committed -- well, then you do have a significant latency. So what I’m saying, basically, is that you can choose your commit mode based on your corporate policy and what your requirements are.
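The trade-off he describes -- requiring more storage-manager acknowledgments buys safety but means waiting on the slowest required ack -- can be sketched as a simple calculation. The function and the latency figures are purely illustrative assumptions, not NuoDB parameters.

```python
# Sketch of the configurable commit policy: report "committed" only
# after N storage managers acknowledge. The reported latency is that
# of the slowest ack you chose to wait for. All numbers invented.

def commit_latency(ack_latencies_ms, required_acks):
    """Latency until the required number of storage-manager acks arrive."""
    if required_acks == 0:
        return 0.0  # lowest level: commit reported after broadcast alone
    acks = sorted(ack_latencies_ms)
    return acks[required_acks - 1]

# Two local storage managers ack in ~2 ms; a Tokyo round trip ~180 ms.
latencies = [2.1, 2.4, 180.0]
assert commit_latency(latencies, 0) == 0.0     # no acks required
assert commit_latency(latencies, 2) == 2.4     # two local acks suffice
assert commit_latency(latencies, 3) == 180.0   # geo-dispersed: pay the trip
```

The point of the sketch is that the geo-distribution penalty is a policy choice, not a fixed cost: you only pay the intercontinental round trip if your commit mode demands it.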
Now, Google has well established that the human attention span is somewhere between 250 and 300 milliseconds before people get bored and do something else. So, can you actually do this round trip and get back before somebody hits the refresh button or clicks on another link? That’s a good question; I think it’s highly application-specific. If you’re standing at an ATM, I don’t think it’s a question.
(Robin Bloor): Right.
(Jim Starkey): So, what I’m doing in many words is ducking the question.
(Robin Bloor): Yeah, yeah. But that’s fine, because you’ve given us at least, if you like, a sense of the kind of time scale you’re talking about. Let’s move on. I’ve got two things that I just wanted to ask about. You never mentioned it, but I think it’s a very interesting thing that you have the capability within NuoDB of putting data in a given location. And because there are various laws in various countries that insist that certain data shouldn’t leave the country, that’s a very useful capability. I just wanted to ask, how does that work? And what’s the extent of it? How do you do that?
(Jim Starkey): They must have done something that they haven’t told me about. I don’t think NuoDB does that.
(Robin Bloor): Oh no, in which case it must be something that they’ve talked about for the future. (laughter) OK, well let’s --
(Jim Starkey): Yeah, it’s -- no, we’re very aware of the legislation in Canada and the United States, and, you know, why all of the public cloud providers have to have subclouds in different countries and different continents, with different rules. And it’s very sad that the world has come to this. But right now, every storage manager has a full copy of the database. And part of this is for safety, part of it is for performance, and part of it is to minimize messaging. NuoDB is looking at ways of relaxing that. But right now, there’s no way that you can say, I want this data only there, and not there. So: nice feature, don’t have that.
(Robin Bloor): OK. Well, the other question, before I hand you over to the audience, is really the query question. You clearly can pose queries to the database that’s distributed in the way that you’ve distributed it. But the question is, you know, does the architecture itself actually create any problems for NuoDB if you start running large queries against the data? Or does it not actually matter too much?
(Jim Starkey): What happens if you throw a large query at NuoDB is that it goes to a transaction engine and starts grinding away. And that transaction engine is very busy, and when somebody else pings it, it takes a while to get around to responding to the ping. And everybody says, I guess that guy’s busy, I need to stop asking him for stuff, and I’ll go someplace else. But we have more elegant ways of handling it. That is, when an application process is attaching to a database, it goes through a broker. There are things I can say about the broker. The application can give it what’s called a connection ID -- a connection key. The broker will hash that, and use that to decide where to direct a query. And this gives you the performance characteristics of sharding, where you have nodes that start specializing in specific atoms. But the same mechanism, you know, can be used with a little creativity for somebody to say, OK, look, I’m going to do something that’s going to cause the lights to go dim. And I want to send it to a transaction node that’s going to execute it, and, you know, don’t have anybody else try and do updates with that guy; he’s going to be busy doing my stuff. Ideally the brokers will handle this, and I’m not sure we have all of the mechanisms in place for doing that. But, you know, logically that’s the way it would work, and it’s just a question of putting the tweaks in the broker to handle that situation. What NuoDB does not do, right now, is execute queries in parallel on different nodes. Execution is all on a single node, which is pretty much necessary with the messaging and transaction model at the moment.
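The broker behavior described here -- hashing a connection key so the same client always lands on the same transaction engine -- can be sketched as follows. The hashing scheme is an assumption made for illustration, not NuoDB's actual algorithm.

```python
# Sketch of broker routing by connection key: hash the key a client
# supplies and deterministically pick a transaction engine, giving
# sharding-like cache affinity. Illustrative scheme, not NuoDB's.

import hashlib

def route(connection_key, engines):
    """Deterministically pick a transaction engine for this key."""
    digest = hashlib.sha256(connection_key.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(engines)
    return engines[index]

engines = ["te-1", "te-2", "te-3"]
first = route("customer-42", engines)
# The same key always lands on the same engine, so the atoms that
# key touches stay warm in that engine's memory.
assert all(route("customer-42", engines) == first for _ in range(10))
```

The same mechanism can isolate a heavy query: give the lights-dimming workload its own key, and the broker will keep steering it to one engine while everyone else avoids that node.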
(Robin Bloor): OK, I understand that. OK Eric, I’m presuming the audience do have some questions.
(Eric Cavanaugh): We have a ton of questions actually, we sure do. So first of all, Jim, let me throw a question out to you from myself, to kind of understand what you’ve designed, and how it fits so well into these cloud environments, which, as we were saying, are very heterogeneous and very demanding. I think one of the keys to keep in mind here with the cloud is that you’ll get these spikes of activity that can be, frankly, overwhelming for older, more traditional systems. And if I understand it correctly, this atomic approach that you’ve taken, where you’ve got all these atoms that can talk to each other and do various things -- and we’ve heard of other companies taking a similar approach, not exactly like this, and not for databases -- I’m guessing that sort of multi-atomic environment or strategy is what lends itself so well to cloud-based computing, right?
(Jim Starkey): Yeah. There are actually, there are a couple of characteristics that make it nice.
(Eric Cavanaugh): Oh, never mind, that’s a backup recording that’s -- speaking of cloud breaking, (overlapping dialogue; inaudible).
(Jim Starkey): OK, all right. That’s fine. A couple things. One is, the real advantage of a public cloud is that you pay for the resources that you use. And if you don’t need a particular resource, you spool down that server, and you don’t have to pay for it. If you own the servers, you pay for them whether they’re working or not, and you’re probably paying for somebody to sit there watching an idle server, and that’s even more painful than owning an idle server. To make this work, you have to be able to spool your database up and down. If you have a US daytime heavy load, and everybody goes to sleep at night, presumably, sooner or later, you can throttle back, drop nodes out of the database, shut down those servers, and not have to pay for them. So it fits that economics. Another little thing about it is that it is very network-aware, because when you’re running in Amazon, internal connections are really cheap and very fast, but if you go outside and come back, it’s slow and really expensive. So it knows about that sort of thing, which is very useful. But the other big thing is the life cycle of an application. It starts out in a development shop, (inaudible) running NuoDB on a, you know, on a cheap box in the corner. Then, you know, as it starts to be rolled out, it can move to a corporate data center. When load builds up, you can move it to a public cloud and expand it out there -- elasticity. And then, you know, if it gets to be very expensive, because you’ve got a large continuous load, you can bring it back in to real live physical machines. And the magic thing about NuoDB is that the database never has to go down during all those transitions. So, you know, there are times when local machines make sense, and there are times when the cloud makes sense.
There are times when that’s going to change, and you want to move the locus of where your database system is, and NuoDB kind of gives you a win/win/win.
(Eric Cavanaugh): Yeah, that’s --
(Jim Starkey): (overlapping dialogue; inaudible) together.
(Eric Cavanaugh): Right, that’s interesting stuff. It kind of reminds me of years ago -- and it is a bit of a metaphor here -- but it kind of reminds me of the difference between Apple and PCs: the way Apple would allow you to upgrade the operating system without having to back up and, you know, essentially decouple your file system manually, whereas, you know, on the PC, you couldn’t just go ahead and put a new operating system over your existing file system; it didn’t like that at all. Interesting. OK, so let me just jump into some of these questions. We have a whole bunch from folks right here. So, you know, here’s one I’ll just pick out of the blue. (Inaudible) asks: I believe you have architected NuoDB to be continuously available, but it runs presumably on things that can fail -- servers, Linux, discs, memory, etc. How do you run in the event of specific failures? You kind of talked about that earlier, but maybe you could just expound a bit.
(Jim Starkey): Yeah, sure. That’s why we support multiple storage managers. If a storage manager goes down, well, that’s fine; the rest of the guys carry the load. When you fix it, you bring it back up, and assuming there’s anything left of the disc, it will resynchronize with the other storage managers and just sit there and fetch stuff from the other guys until it’s completely up to speed, and then it will turn itself on and say, OK, I’m now a storage manager. Alternatively, if you really need to, you can check out a storage manager and copy the database -- unless you’re running CFS, in which case you can do it online -- clone the actual database atoms on disc, then restart that server and start a second server on the clone. So there are a lot of ways to handle it.
(Eric Cavanaugh): OK, good. And what about schema changes? And different data models and how do you deal with all that?
(Jim Starkey): From the beginning, the system was designed to be able to change anything online. So, adding columns to tables, changing data types, adding indexes, dropping indexes, you know, you name it -- it all works online while it’s running hot, multi-user transactions. Just as an aside, it has a characteristic I really like, and that is, it uses an interesting data encoding, where the encoding is based on the actual data rather than the declared data. So, internally it has a concept of string. It doesn’t care how big the string is. It’s just a string. Or it’s a number; a number is a number is a number. So you never have to do things like, well, you know, I’ve got 40 characters for a last name, and somebody who’s a second-generation anthropologist has got four hyphens, so I’ve got to make it 80 characters. That doesn’t happen. Everything is very soft. When you change metadata in NuoDB, it doesn’t really change anything on the disc, or in the atoms. It just starts storing new records in new formats, and keeps track of what the old ones look like.
(Eric Cavanaugh): Interesting.
(Jim Starkey): It’s kind of magic, a lot of mirrors in there.
(Eric Cavanaugh): (laughter) A lot of mirrors in there. Smoke, mirrors, distractions. OK, this is great. Here’s another very specific question, does NuoDB support the XA protocol?
(Jim Starkey): That’s an easy one. Nope.
(Eric Cavanaugh): No?
(Jim Starkey): No. (laughter)
(Eric Cavanaugh): And that’s from X/Open -- that’s like an open standard protocol for two phase commit, is that right?
(Jim Starkey): It’s a two phase commit protocol, yes. And --
(Eric Cavanaugh): OK.
(Jim Starkey): -- NuoDB -- excuse me. I put hooks in there to support it, but no one’s ever asked for it.
(Eric Cavanaugh): Interesting.
(Jim Starkey): That I know of.
(Eric Cavanaugh): Interesting. So you could handle it, but you just haven’t seen that even be asked before?
(Jim Starkey): Oh yeah.
(Eric Cavanaugh): OK, interesting.
(Jim Starkey): OK. For the record, it was (inaudible) that I did back in 1983 that was the first commercial database system to support two phase commit. So I’ve done the first database system to support a two phase commit, and the first database system that has no reason to do one. (laughter)
(Eric Cavanaugh): OK, we can -- you’ve talked about this at length too, but we have some people asking about it, so maybe just a bit more detail. Could you talk about locking again -- what kinds of situations cause locks, and how is lock contention resolved, basically?
(Jim Starkey): Oh, this is great. There is no lock manager; there are no locks. And this is how we got Mitchell Kertzman interested. We were giving a pitch to his VC firm, and he was sitting there dozing off, and somebody asked about the lock manager. I said, we don’t have a lock manager; he sat up straight and said, what? (laughter) So, it’s multi-version concurrency control. When you update a record, it doesn’t replace the existing record; it creates a new version of the record, stamped with the transaction ID of the guy who made it. Every transaction knows which versions it should look at and which versions it should ignore. So, under the covers, it sees all of the versions, and this is both how we give the transaction a consistent view of the database, and how we can detect if two transactions --
(Eric Cavanaugh): That’s fantastic --
(Jim Starkey): -- are trying to update the same record.
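The lock-free conflict detection Starkey has just described can be sketched as follows: an update appends a new stamped version, and a collision is detected when the newest version was written by a transaction the updater cannot see. As before, the names and structures are illustrative, not NuoDB's implementation.

```python
# Sketch of update-conflict detection without locks: a second
# concurrent writer is caught because the newest version was made by
# a transaction outside its snapshot. Illustrative names only.

class UpdateConflict(Exception):
    pass

def update(versions, txn_id, snapshot, new_value):
    """Append a new version, or raise if a concurrent txn got there first."""
    newest = versions[-1]
    if newest["txn"] != txn_id and newest["txn"] not in snapshot:
        # Newest version came from a concurrent, uncommitted transaction.
        raise UpdateConflict(f"txn {txn_id} collides with txn {newest['txn']}")
    versions.append({"txn": txn_id, "value": new_value})

versions = [{"txn": 1, "value": "a"}]             # txn 1 committed
update(versions, 2, snapshot={1}, new_value="b")  # txn 2 updates: fine
try:
    update(versions, 3, snapshot={1}, new_value="c")  # collides with txn 2
    collided = False
except UpdateConflict:
    collided = True
assert collided
```

No lock is ever taken: readers see older versions undisturbed, and only the second concurrent writer pays a cost, by being rejected.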
(Eric Cavanaugh): That’s fantastic. So that’s -- I mean, talk about great for compliance issues, you’ve got your built in audit trail, and your built in list of controls or transactions that had some impact on the data that’s being read. Because this --
(Jim Starkey): Oh, I wish I could say yes, but, you know, one of the future features, which we didn’t get around to implementing -- and it’s not going to be in the next six weeks either -- is the concept of time travel, where you can say, I want to execute this query as of last Thursday. But we don’t have that yet.
(Eric Cavanaugh): Right. Well you could -- OK, I’m just curious to know --
(Jim Starkey): (overlapping dialogue; inaudible) collection going on for efficiency, so it doesn’t keep stuff long-term. It could.
(Eric Cavanaugh): I see. I got you. OK, good. Well that’s good detail. All right, let’s see. We’ve got a few other good questions in here. So, let’s see here. Here’s just a random interesting one. How do you prevent NuoDB from eating all the network quota upload and download, while replicating data between data centers?
(Jim Starkey): Excuse me, did I say we kept it from eating into quota? (laughter) I have no idea. We don’t do anything --
(Eric Cavanaugh): That’s funny.
(Jim Starkey): -- we --
(Eric Cavanaugh): Somebody else’s problem.
(Jim Starkey): -- we try and minimize messaging, obviously, because that directly affects performance. But what we have to do, we have to do.
(Eric Cavanaugh): I understand. Sure, OK, good. So, we’ve got a few more questions. We’re going a little bit long here, folks, but if we don’t get to your questions, we’ll be sure to pass them on to the folks at NuoDB. So here’s another really good one. Let’s see. So the attendee asks, could you speak of NuoDB and integration with -- oh, well, hold on. He’s asking about integration with Django. I’m not sure if that’s something that is worthy of talking too much about, but you can jump on me if it is. But here’s an interesting one. OK, on data types: do strings containing digits get coerced to integers and lose leading zeros -- the infamous Microsoft Excel scenario? Is the client interface defined in terms of strings? Those are good questions.
(Jim Starkey): No. The interface is JDBC. In fact, the SQL engine is JDBC, even though it’s not Java; it has JDBC semantics. So we support all of the Java data types -- you know, all of the classical data types -- even if they map into the same thing, you know, internally. When it’s in memory, we know what data type it is. It’s when we serialize it for transmission over the network, or serialize it for storage on disc, that we switch to a highly compressed format. It’s just a --
(Eric Cavanaugh): Well this is -- yeah, go ahead.
(Jim Starkey): -- the way the data’s stored, it’s not necessarily the way that it’s declared.
(Eric Cavanaugh): I understand. This is really, really fascinating stuff. Robin, I appreciate you getting all excited about it too, because what you guys have done here really is stretch out, in very fundamental ways, the capabilities of database management. And as you suggested, guys, you have a number of interesting things in the road map as well. So folks, we’re going to make sure that all the questions get to the folks at NuoDB; my apologies if a question you asked was not answered. You can see this month is cloud, next month is big data, then database in May. And Jim Starkey, wow, thank you so much for your time today. I really appreciate your candor and your detail, and jumping right in there. So this is fantastic stuff.
(Jim Starkey): I’m very bashful about talking about my baby, as you can see.
(Eric Cavanaugh): (laughter) That is fantastic, folks. All right, we’re going to (overlapping dialogue; inaudible) yeah we’re going to wrap it up, but I’m sure we’ll be hearing from Jim sometime in the future. And with that folks, thank you very much for your time and attention, we will archive this event, we do archive all these events, hop online to Insideanalysis.com to find links to all that. On the right-hand side of that home page, you’ll see a link that says recent episodes. Click on that and you can go to a page with all the episodes of “The Briefing Room” in year number five.