Today, it seems everyone is breaking down their monoliths and building microservices applications. However, one challenge that architects struggle with is how to handle the data.
Hear what NuoDB + Red Hat recommend in this webinar!
BILLY: Hello everybody, and welcome to today’s presentation, What’s the Deal with Data in Microservices Applications, brought to you by NuoDB. I’d like to introduce our presenters. As Senior Product Manager at NuoDB, Joe Leslie helps drive NuoDB product releases and roadmap to ensure NuoDB’s database leadership, delivering elastic SQL database scale-out and continuous availability for hybrid cloud applications. Joe has over twenty years of experience delivering database products and management tools in the transactional and analytical database marketplace. Marius works as Chief Architect at Red Hat, leading the efforts around data streaming and Red Hat Middleware, enabling the development and operation of data streaming solutions for the Red Hat portfolio -- targeting OpenShift. Marius has a long-standing interest in enterprise application integration, event-driven architecture, and data strategies. Before joining Red Hat, he led Spring Cloud Stream as part of the Spring team at Pivotal and contributed to various projects in the Spring portfolio, such as Spring Cloud Data Flow, Spring Integration, and Spring Kafka. He is a co-author of Spring Integration in Action. Now, I’d like to do a quick sound check for Joe. So, Joe, if you would like to say hello to our audience and, audience, please let me know if you can hear Joe by posting a message in the chat panel.
JOE LESLIE: Hello, everyone, and thank you for joining us. Happy Wednesday to you all.
BILLY: Great. Now, I’d like to do a quick sound check for Marius. So, Marius, if you would like to say hello to our audience and, audience, please let me know if you can hear him.
MARIUS BOGOEVICI: Sure. Hello, everyone, and thanks for joining us today. I’m very excited to talk about all these things.
BILLY: Great. And now, I’m going to turn things over to Joe and Marius to begin our presentation.
JOE LESLIE: Great. Thank you, Billy. All right, let’s go ahead and get started. Yeah, we have a lot in store, so we’ll just go ahead and share a bit about what areas we would like to cover with you today. So, first, we’re going to cover some of the challenges around deploying microservices, specifically with SQL-based applications -- it’s going to be part of our focus today -- and breaking down what we’re going to call the database monolith. And then, we’re going to go into some best practice use cases and approaches that we’d like to share on how you can better access and protect SQL databases in a microservices deployment architecture. And then, finally, we’ll land this with some more about NuoDB and how NuoDB can specifically help you deploy these types of data strategies inside of OpenShift that we’re going to be talking about today.
So, as we get started, we’re going to take a view from the CIO office and -- well, I’m not a CIO, but, fortunately, I get to speak with many of them, as do many of us here at NuoDB and at Red Hat. So, we have a pretty good view of what CIOs require these days and the challenges they’re faced with. Many companies today are seeking, sort of, that next frontier of IT efficiencies and, typically, it’s around some of the same areas of saving OpEx and increasing the bottom line. But today, to further it, the IT team must be agile and they must respond more quickly to requests for change. You know, moving our applications to the cloud has really been one where there’s this promise that the apps are always on or always available, with rolling upgrades and all of these capabilities. But that’s a really hard thing to deliver on, and how do we actually deliver on all of those promises?
So, if we look at deploying and winning in today’s cloud microservices and container deployment world, yesterday’s tech no longer delivers. Monolithic software stacks with tightly-integrated components mean that everything must deploy together like a bundle, and that slows the ability to push out changes quickly. And, really, data center architectures haven’t changed much over the years -- the assumption there is that migrations always mean downtime and that you scale up instead of scaling out to increase performance. And high availability has really always been the equation of high availability equals high complexity and high cost. And really, each one of these ends up being a roadblock when trying to deploy always-on applications.
So, let’s take a look at the effort to migrate an existing app to a microservice application. And again, we’re focusing here on SQL-based applications and, really, as the first bullet sort of points out, most of these enterprise applications -- if we generalize, many of them use SQL, right? They attach to a SQL database. So, in moving to the cloud and trying to gain the scale-out capability, switching to a NoSQL database could be one of those options, right? NoSQL offers easy scale-out. But when we look at the risks, a complete rewrite of the data management logic, retraining app developers and DBAs to learn new tools, and migrating the data from a relational format to a key-value store add up to a huge investment. So, really, what if you could keep your SQL applications and still gain the same advantages of running in a container or microservice environment?
So, how to achieve this is really a large part of what we want to share with you today. So, on this next slide, we’re going to cover a little more of what I was calling the monolith -- the database architecture. Traditional SQL-databases, they were architected, probably, near-about three decades ago. And the architecture was optimized for a different infrastructure, right? Symmetric Multi-processing machines, scale-up, not scale-out, memory was very expensive, and this is kind of the world those databases started in. They require a lot of add-ons for replication and DR and back-up and recovery. And of course, they typically have required, expensive, dedicated, highly-configured hardware to increase the performance.
So, if we look to the right -- and this is where I show off my very fancy Google Sheets capabilities -- I have built a stack of blocks. That stack of blocks indicates this monolith, right? And it shows these components that I mentioned that are sort of glued together tightly and they sort of move as one. They’re like LEGOs connected and they move as one. But, unlike LEGOs, they really don’t break apart easily, versus a more containerized, modular approach, which is where we’re able to realize many more efficiencies -- agile deployment methodologies and automation. And this is really the direction that we want to try to go.
So, if we take -- here’s a real example -- a common, traditional, two-decade-old Oracle RAC architecture. Right? Difficult to set up, huge investment, complex -- again, these tightly-coupled components of backup and HA, and the software that moves together with shared-disk architectures and so on to sort of achieve scale-out. But, again, it’s a very expensive proposition and it still really only delivers high availability, not continuous availability.
So, with that as kind of a backdrop and an intro, I’m going to go ahead and turn it over to Marius, and Marius is going to take us through some of the data strategies for HA microservices and some of the best practices that Red Hat has to offer us as they work with their customers in the field.
MARIUS BOGOEVICI: Thanks a lot, Joe. This is a great introduction to the challenges that microservices developers face in this space, especially when working with relational databases. And as Joe has just mentioned, at Red Hat, especially with our OpenShift customers and OpenShift users, we’ve seen a lot of use cases. We’ve gathered a lot of knowledge about the best practices for running microservices and data services on a cloud platform.
So, I think it would be very useful right now to take a look and try to understand how things have come to be this way. What are we -- why are we using microservices? What are the use cases? And basically, what strategies can we employ to manage data in a microservice architecture? Once we have that done, we can take a look at what the OpenShift platform has to offer for running both microservices and data services and how to address concern such as high-availability and disaster recovery. Right? And if you think about the motivations for microservices first, it’s important to understand that, as Joe has mentioned earlier, for decades the traditional development model was focused on monoliths. And monoliths were the norm, both in terms of -- both for applications and both for infrastructure. And that was a good thing because that brought a lot of cost-efficiency. You were concerned how to minimize the cost of running applications, running infrastructure, and everything. And that’s basically because, when you’re running everything on physical machines or you’re running everything on -- at best on virtual machines -- the overhead of provisioning new resources, new machines, is very big. So, there is an incentive to concentrate everything in one place, right? Cost was dominant, right? It was just more efficient, right?
So, from a development process perspective, this favors a very linear approach. The goal was to translate everything we know -- to develop a comprehensive view of the business domain, everything there is -- into a centralized data model, with the database basically at the center of the universe. You end up with these massive databases with hundreds, maybe thousands, of tables encompassing everything in a single place and, then, we started mapping POJOs and started creating monolithic applications with business logic that talked to these database schemas. And that is a pretty fragile model because, on one hand, the monolithic data models around which the entire set of applications is structured are hard to change. So, it’s really hard to intervene and change something -- if you discover new things, if you want to adapt your business, it’s actually kind of hard to do.
On the other hand, the monolithic applications that come out of that are pretty hard to scale. So, how can we do better? If you go to the next slide, we are provided with an alternative. We can think in different terms about the relationship between data and applications. First and foremost, there are architectural methods like domain-driven design that put the focus on better understanding the domain model from the perspective of features and sub-systems and the way they interact with each other.
So, it’s not only about modularization, it’s also about thinking of all the interactions that happen in the domain first and, then, starting to design different data models that fit the different sub-systems that compose the application. Because, really, even the same entities or the same components can be very different from the perspective of one sub-system or another, right? A ticket or a flight can mean one thing when we’re trying to book a flight, but it can be something very different when we try to check in for a flight, right? So, we have an example over there.
The end result of this is that we end up defining more granular, more focused parts of the domain. We basically center everything around bounded contexts, and this builds agility into the process, as we can start isolating change in individual sub-systems without affecting everything else. And now that we have put some modularization in place, we have options. So, even before we talk about microservices, we can think about either deploying these parts together as a more comprehensive application, which is what they call, in this case, a fast monolith, or we can just deploy each of these bounded contexts as microservices -- as independently-deployed units. And the advantage of microservices is that each one can be given resources and scaled independently. I think in both cases, when you look at it, it becomes apparent that isolating functionality is just not enough: in order for this isolation to be properly preserved, you also need to isolate the data models and you need to isolate the way these components interact with databases.
So, we have to think about the strategies that we have for handling data in this kind of scenario. If you go on to the next slide, then, one of the most fundamental and difficult strategies for dealing with this is what is basically called one database per microservice. We can enforce isolation here by using one database per microservice or per fast monolith. And it makes sense because, as each bounded context has its own data model, we also have an isolated data store. And it also gives us a lot of flexibility in choosing our storage strategy. So, we can have a relational database. We can have a NoSQL database. You basically have choices.
Now, one of the things to remember is that databases, generally speaking -- at least, traditional databases -- are not designed to be very lightweight components, right? So, there are operational challenges with managing multiple traditional database instances. You cannot just launch and deploy a hundred of them if you have a hundred microservices, right? So, one of the typical ways to respond to this kind of complication is to maintain a single central database instance or instances and use the multi (inaudible) and isolation features -- things like logical databases or schemas, whatever the individual terminology calls them -- but then still maintain a logical database from each sub-system’s perspective, right? And sometimes, especially in monolith-to-microservice migration scenarios, like for example what we can see in the following slide, you have situations -- especially when you’re extracting functionality and extracting microservices out of a monolith -- where we cannot really do this database refactoring.
So, you still have to maintain a centralized schema. You still have to maintain a common data model -- especially when you have, for example, legacy applications that talk to it. But what you want is to be able to still construct these individual data models on top of a shared database schema. So, one strategy here is, for example, virtualization to deal with this impedance mismatch of having one single centralized database, one single centralized data model, but then having separate data models in each of these microservices or fast monoliths or components, right? So, it’s important to remember that one of the goals is to recreate this effect of single-model isolation that comes with one database per service. And how do we deal with one other consequence, right? Increased latency.
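The idea of separate per-service data models on top of one shared schema can be sketched with plain database views. This is only an illustration, not the virtualization product the speaker has in mind: the table `flight_passenger`, the views `booking_ticket` and `checkin_boarding`, and the sample rows are all hypothetical, and SQLite is used purely so the sketch runs anywhere.

```python
import sqlite3

# Hypothetical shared schema: one central table that legacy applications keep using.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE flight_passenger (
        flight_no TEXT, passenger TEXT, fare REAL, seat TEXT, checked_in INTEGER
    );
    INSERT INTO flight_passenger VALUES ('NB100', 'Ada', 420.0, NULL, 0);
    INSERT INTO flight_passenger VALUES ('NB100', 'Lin', 380.0, '12A', 1);

    -- Booking's model: who bought which flight, and for how much.
    CREATE VIEW booking_ticket AS
        SELECT flight_no, passenger, fare FROM flight_passenger;

    -- Check-in's model: who is seated where; fares are not visible here.
    CREATE VIEW checkin_boarding AS
        SELECT flight_no, passenger, seat FROM flight_passenger
        WHERE checked_in = 1;
""")


def booking_rows():
    """Rows as the (hypothetical) booking service sees them."""
    return conn.execute(
        "SELECT * FROM booking_ticket ORDER BY passenger"
    ).fetchall()


def checkin_rows():
    """Rows as the (hypothetical) check-in service sees them."""
    return conn.execute(
        "SELECT * FROM checkin_boarding ORDER BY passenger"
    ).fetchall()
```

Each service queries only its own view, so the shared table can stay in place for legacy applications while the two models evolve separately.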
So, in the next slides we need to talk a little bit about caching. One of the most common fallacies in distributed computing is that we can make remote calls behave exactly like local calls. And that’s not true, right? I don’t think anyone believes that. We have -- it doesn’t give us that location, but, in the end, remote calls are remote calls: they go over the network, they have latencies, they have failures, all kinds of complications. And one of these is latency.
So, if you have multiple applications talking to each other and calling other applications, which, in turn, go forth and call other applications, you basically have this chain of invocations. Latency can compound very fast. So, if you look at this diagram, right, the booking system kind of goes to check-in, goes through a bunch of other microservices and so on. And as you can see, we can easily have five, six, ten -- it could be any number, depending on how big these architectures end up. You can have double-digit invocations as part of one call.
So, treating all of them as remote calls is just not feasible. You will end up having massive latencies. So, obviously, caching can help here by reducing the number of remote calls and reducing the latency at each individual application level, right? But the problem that you have to solve in this case is to integrate these caches seamlessly with our databases. Just adding another data store -- another system that your applications are talking to -- will actually increase the complexity of these apps. So, this integration needs to happen seamlessly, almost transparently for the user. So, that’s another important thing that has to happen in these microservice architectures and that’s one concern.
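A rough way to see both points, compounding latency and the effect of a cache, is the sketch below. It assumes the simplest possible model (a serial chain where every hop costs the same), and `ReadThroughCache` and `load_seat_from_remote` are hypothetical names standing in for a real caching layer and a real remote call.

```python
from typing import Callable, Dict


def chain_latency_ms(hops: int, per_hop_ms: float) -> float:
    """Total latency when each service waits on the next one in a serial chain."""
    return hops * per_hop_ms


class ReadThroughCache:
    """Serves repeated reads locally and only goes remote on a miss."""

    def __init__(self, load: Callable[[str], str]) -> None:
        self._load = load
        self._entries: Dict[str, str] = {}
        self.remote_calls = 0

    def get(self, key: str) -> str:
        if key not in self._entries:
            self.remote_calls += 1
            self._entries[key] = self._load(key)
        return self._entries[key]


def load_seat_from_remote(key: str) -> str:
    # Stand-in for a remote call to another service or database.
    return f"seat-for-{key}"


cache = ReadThroughCache(load_seat_from_remote)
first = cache.get("NB100/Ada")
second = cache.get("NB100/Ada")  # served locally, no second remote call
```

The sketch never invalidates an entry, which is exactly the consistency problem described above: once the database changes, something has to keep the cache in step with it.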
Now, this gives a bit of an idea of what the challenges are, what the strategies are, how we structure microservices, and what we expect from the databases they’re talking to. It’s also good to take a look at how you run them and what OpenShift, as a platform, has to offer for running these applications. So, if we go on to the next slide, you actually can see a picture, like a very (inaudible) diagram of OpenShift as a platform, right? So, as I said earlier, the high cost of provisioning resources as physical and virtual machines (inaudible) due to cost-reduction reasons. And one of the biggest revolutions in infrastructure management is the rise of the cloud platform that allows you to manage resources dynamically and elastically and actually add virtualization and agility into your process.
So, what we’re going to talk about today -- one of the points today is OpenShift, which is, basically, the leading -- can give you this platform that helps you build, run, and scale both applications and data services in the cloud. So, that’s an ability very (inaudible) because all of what we’ve talked about so far -- microservices, data services -- we want to put them on the same platform. And Joe will provide more about the motivations for that, but what I’m going to talk about next is -- I’ll just give you a bit of a sense of how OpenShift works and how OpenShift helps in this process.
So, OpenShift runs containerized workloads, which allows you to package your applications in a readable structure and isolate them from the underlying runtime. Containers are deployed as pods on nodes. And the whole process is really transparent and managed by OpenShift. From a user’s perspective, the entire set of resources in the data center or a portion of the data center -- whatever you want to allocate for an OpenShift cluster, whether it’s on premises or in the cloud -- is pooled together and available to be allocated dynamically and elastically. Right?
So, that gives you a lot of ease in allocating resources. And that’s basically what we’re expecting when we run a large fleet -- tens or even larger numbers of microservices. You have to think about high availability in this context, right? So, the fact that you can run multiple applications, for example, or the fact that you can run multiple instances of an application -- the fact that load balancing can automatically do that -- is something that the platform offers out of the box. But you have to think about how you design these things. For example, you have to think about the fact that, for applications that run 24/7, it is important that there’s no disruption. But applications crash or they need to be updated, or the underlying infrastructure -- you basically need to upgrade, for example, OpenShift -- and you still don’t want to stop running, even if you do that.
So, the common practice here is to run workload (inaudible), using at least two pods of these things running on separate nodes. Running pods on separate nodes is done automatically by OpenShift. And one important consequence of that is that the workloads themselves need to behave in a way that allows them to take advantage of this feature, right? So, readiness is -- OpenShift can definitely run applications that have not been designed to be cloud native and that’s actually kind of a big feature of it. But that kind of readiness can help you improve your experience, right?
And there are other features -- which are listed on the next slide -- that help you run scaled, highly-available applications. For example, you have liveness and readiness probes that can tell you whether a pod is in a good or a bad state, whether it needs to be taken out of rotation for the load balancer or it needs to be restarted because the application just crashed and cannot serve requests anymore. It allows you to constrain resource usage -- again, a very important feature, especially when you’re running resource-intensive workloads, such as databases. It allows you to control pod placement. Generally speaking, the optimal situation is that your view of the cluster is very homogeneous. But sometimes you want to target a specific node and a specific batch of resources. So, there are node selectors, which are an OpenShift feature that allows you to pin certain applications to certain nodes if you want to. And, very importantly, StatefulSets.
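The two kinds of probe lead to two different reactions, which can be sketched as a small decision function. This is a simplified model of the behavior described above, not the OpenShift or Kubernetes API; `ProbeResult`, `platform_action`, and the action strings are hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    alive: bool  # liveness: is the process still able to make progress?
    ready: bool  # readiness: can it serve requests right now?


def platform_action(result: ProbeResult) -> str:
    """Simplified decision a platform could take for one pod."""
    if not result.alive:
        # A crashed or hung application cannot recover by itself.
        return "restart"
    if not result.ready:
        # Healthy but busy (for example, still warming up): stop sending traffic.
        return "remove-from-load-balancer"
    return "route-traffic"
```

Checking liveness first reflects the priority in the description: a pod that cannot serve requests anymore is restarted, while a pod that is merely not ready is only taken out of rotation.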
So, in the next slide we’ll have a brief view of what that means. Of course, a significant number of the workloads that run on OpenShift will be stateless. They don’t have resources. They’re restarted as blank slates. You can imagine an application that stores all of its state in the database, or stores some data in a data grid, or sends and receives messages. So, those types of applications can, basically, restart and they don’t need to remember anything about what happened before they were restarted. Right. This is perfectly fine. So, a lot of the workloads are like that, but some workloads, such as databases or message brokers and, generally speaking, middleware, require some form of state retention. They need to retain some kind of sense of identity. So, if I restart a database, for example, it’s not very good if it has lost all the data it had before. We just cannot have ephemeral database instances, right?
So, this is where we need to have the ability for certain deployments to retain a sense of identity -- like an ID in a cluster of deployments -- or a persistent file system, right, where they store all of their information. And this is what StatefulSets are for. As I said, they are essentially a Kubernetes feature and they’re available in OpenShift, which is essentially an enterprise platform based on Kubernetes. They are a very powerful concept that combines elastic resource management with the reliability of state retention. Each pod gets an identity and that identity is basically matched with a mounted set of persistent volumes.
So, inside the StatefulSet, for example, when a pod is lost and needs to be restarted, a pod with the same virtual identity is reinstated and the associated storage is reattached. Right? This is very important, as I said, for databases and, generally speaking, for clustered data storage because, A, you don’t want to lose your persistent storage, which is important, and, B, even if the persistent storage is reattached to a new pod, you want it to retain its old identity inside the cluster so that all the replication processes and state replication features of the cluster can continue uninterrupted. Right?
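The pairing of a stable pod identity with its storage can be sketched with the naming convention Kubernetes uses for StatefulSets: pods are named `<statefulset>-<ordinal>` and their claims `<volumeClaimTemplate>-<pod name>`. The rest is a toy model: the `volumes` dictionary stands in for persistent volumes, and the names `sm` and `archive` are hypothetical.

```python
from typing import Dict, List


def pod_name(statefulset: str, ordinal: int) -> str:
    """Kubernetes names StatefulSet pods <statefulset>-<ordinal>."""
    return f"{statefulset}-{ordinal}"


def claim_name(template: str, statefulset: str, ordinal: int) -> str:
    """Claims are named <volumeClaimTemplate>-<pod name>."""
    return f"{template}-{pod_name(statefulset, ordinal)}"


# Toy stand-in for persistent volumes: claim name -> stored records.
volumes: Dict[str, List[str]] = {}


def start_pod(statefulset: str, template: str, ordinal: int) -> List[str]:
    """Start (or restart) a pod and return the storage bound to its identity."""
    return volumes.setdefault(claim_name(template, statefulset, ordinal), [])


storage = start_pod("sm", "archive", 0)
storage.append("row-1")

# The pod is lost; its replacement gets the same name and the same claim.
storage_after_restart = start_pod("sm", "archive", 0)
```

Because the replacement pod resolves to the same claim name, it finds the data written before the restart, which is the property a clustered database relies on to resume replication.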
So, this is one other important building block in the notion of building data services on top of OpenShift. Right? And if we stop for a moment and sort of think about -- this is all about how things happen inside one individual cluster, right? But we also need to think about how to enforce high availability at a larger scale and how to handle disaster recovery. So, if you go onto the next slide, then, there is -- you have a few of the options for handling -- for managing high availability and disaster recovery. Now, the idea is that OpenShift, by itself, ensures resilience and reliability across a specific site, but you also wanted to be shielded from large-scale, systemic failures, such as the entire site goes down, an entire data center goes down.
So, by its nature, OpenShift requires low-latency networking between nodes. So, that basically means that each OpenShift cluster runs in an individual site. You cannot run, for example, an OpenShift cluster across multiple sites or across multiple data centers. In this case, you have different strategies at your disposal. So, first, you can essentially just run in a single site, forgo disaster recovery and, whenever the site goes down, wait until it comes back again. It’s a very simple strategy, but obviously not what you want if you want to run 24/7. You can fail over to a secondary site that remains inactive until required. So, that’s also a possible strategy. Basically, as we’ll see a little bit later, this can come with some data loss, for example. And if you want to maintain a fully cloud-native approach, you can run a multisite application.
So, basically this means that you have multiple clusters running across multiple sites. The clusters replicate data in real time. And this has the advantage not only that all your clusters are active and ready to run and ideally have the most up-to-date data, with some eventual consistency implications, but also that requests can be routed to the site that’s most appropriate for the users -- for example, the one geographically closest to the user, to solve latency problems. And the key here is that running multiple applications across different data centers is not necessarily about the code. The key problem to solve is how you replicate your data.
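Routing a request to the most appropriate active site can be sketched as picking the healthy site with the lowest measured latency. This is a minimal illustration under that single assumption; `pick_site`, the site names, and the latency figures are hypothetical, and a real deployment would typically do this in a global load balancer or DNS layer.

```python
from typing import Dict, Optional


def pick_site(latency_ms: Dict[str, float], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the healthy site with the lowest latency, or None if none is up."""
    candidates = {
        site: ms for site, ms in latency_ms.items() if healthy.get(site, False)
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)
```

With every site active, losing one site only changes which site is picked; no standby has to be brought online first.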
So, you have a few strategies at hand. On the next slide, for example, we talk about infrastructure-based asynchronous replication. So, in this case, the task of replicating the storage data is undertaken by the application -- by the infrastructure itself. This works especially well if the applications are read-only and they only need to stay up-to-date. So, one process updates one storage and then everything else is replicated across the whole cluster.
The other strategy that you have is application-based asynchronous replication, which is on the next slide. And this is the most likely situation that you’ll be in if you want to operate applications using relational databases, which receive continuous updates that need to propagate across sites. And when we say applications here, we just mean that this is not a storage-infrastructure concern. How simple or how complex this process is largely depends on your database’s capabilities. Some databases can and will automate this process and will handle it with a certain degree of transparency.
In some other cases, you need to manually set up the replication infrastructure to propagate application data across different databases, whereas some databases can do this on their own, right? So, this is, I think, one of the key points here: the overall experience depends on the ability of the database to handle this replication automatically, right? And just to make a brief point about the different disaster-recovery approaches, we have, essentially, two.
So, on the next slide, for example, we have basically a couple of variants of standby. In both scenarios you have an active data center and you have an idle, off-line data center that’s ready to take over if the primary data center goes down. Now, this increases the likelihood of data loss because the backups are periodic. So, all the data between two backup events is basically lost.
On the next slide, you have a slight improvement, which is maintaining a hot standby: a standby data center that is only used as a backup or, at best, for read-only operations. And this minimizes the chance of data loss. There is still a lag between updates, but that is minimized by the continuous replication that takes place between the active and the standby data center. And of course, on the last slide is basically the active-active approach, which essentially means that all of the different replicas are active. All of the different replicas receive data from their applications and the replication happens in real time, right?
So, obviously, this is kind of the most -- the one that offers you, kind of, the most resilience in a way, but is also the most complex of the three. Right? And with that, I’m going to hand it back to Joe to talk about how NuoDB handles these strategies.
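The difference between the standby variants comes down to how much recent data can be missing at the moment of failover. A back-of-the-envelope sketch, assuming a constant update rate and a failure at the worst possible moment, follows; the function names and all the figures are hypothetical.

```python
def cold_standby_loss(updates_per_min: float, backup_interval_min: float) -> float:
    """Worst case with periodic backups: everything since the last backup is lost."""
    return updates_per_min * backup_interval_min


def hot_standby_loss(updates_per_min: float, replication_lag_min: float) -> float:
    """Worst case with continuous replication: only updates still in flight are lost."""
    return updates_per_min * replication_lag_min
```

For example, at 100 updates per minute, hourly backups put up to 6,000 updates at risk, while a 30-second replication lag puts about 50 at risk.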
JOE LESLIE: Great. Thank you, Marius. Yeah, that was terrific. Yeah, I would -- you’ve shared a bunch of useful information about OpenShift and deploying in microservice environments. Yeah, I would just like to take probably the next ten minutes. I’m going to talk a little bit about how NuoDB helps deploy these strategies that Marius was referring to and, then, we’ll also open up for some Q and A.
So, I’m going to start by describing some of the pitfalls we might see. And we’ve kind of been touching upon it a little bit, but we want to deploy applications in microservices -- in this example, on OpenShift. And we, of course, want to do that to gain those agility and automation benefits. But when that application is a SQL-based app, and your database doesn’t really run well in containers, based on a lot of what we’ve been talking about today, the natural first thing to do is to sort of run it outside. Right? You poke a hole through the OpenShift containerized network to an external network and you’re able to run the application in that way, but you really haven’t achieved all of the agility and automation that you were hoping for. You create network overlays that are complex, and it introduces security concerns.
And it’s sort of like crossing a bridge halfway, right? Your application is running in the environment you want, but now you’re stuck managing two separate environments because your database is still running the same-old way. Now, we can make a slight improvement, right? We could run a data cache with -- inside our containerized environment. And many of you may be familiar with these third-party, client-member caching tools to improve performance. But they’re not (inaudible) as well either, right? Because it still requires application development and management to ensure that you maintain consistency between your caching layer and your database.
And then, there’s this sort of third one that I’ve kind of pictured in this swollen container-looking environment here where -- well, what if we took the entire monolithic database and ran it in a container? Well, that doesn’t fit the model either, of course. Right? Containers were not designed to run complex, multitasking database systems. So, again, you don’t get the benefits of a microservice deployment environment and the ability to leverage these automations that we’ve been discussing.
So, we’re really looking for a different approach and I would describe each one of these as sort of that square peg in a round hole. But, you know, when we look for that better approach, we want to look for something where we can deploy a single operational environment -- one environment, not two. And then, by doing so, we’re going to lower operational cost, right? Now, we’re maintaining one environment, which is, of course, the goal and the ideal. And by maintaining the one environment -- and this speaks to some of what Marius was describing earlier -- the benefit of running a localized database in an OpenShift container network is that you’re not going to incur the latencies of those remote database calls. Your database is effectively running inside of OpenShift as an embedded database.
And then, deploying NuoDB as a container-native database: because NuoDB leverages a redundant process architecture, it allows the database to scale the transactional and storage layers independently, and we’ll see how some of those benefits can be realized coming up. And then, lastly, increasing developer productivity through the same sort of CI/CD -- continuous integration and continuous deployment -- pipelines that you want to use in your microservice deployment environment.
So, NuoDB is deployment agnostic. You know, if we look at the scale-up nature of the non-container databases, they must deploy on specific systems that were configured in a special way to get the performance that you need. And because of that, they are generally deployed outside of the container network, as I was showing on the previous slide. NuoDB’s processes run wherever you like. They can run in containers, VMs, virtual systems, physical systems. So, they really are deployment agnostic. They are processes that are going to run within a process space. And this allows for much greater agility and deployment automation.
So, here’s a picture of what that would look like. NuoDB is a native SQL database. It’s inherently a distributed SQL database. It offers scale-out by simply adding more nodes -- which means adding more database processes to add more transactional throughput or more storage durability. To the right we see the TEs -- those are the transaction engines. That’s what’s servicing the application transactions. And the SMs are the storage managers. That’s what’s persisting the database to disk using a lot of the technologies that Marius was referring to -- node selectors and StatefulSets and leveraging PVCs, persistent volume claims -- to ensure that the data always persists, that it’s not ephemeral.
So, the last point here, too, is the idea around the built-in data protection. The SMs automatically sync with each other, so it’s all forming continuous availability. There really is no failover event or DR. If you should lose a process, either a TE or an SM, the database continues to run. As long as there is a single TE and a single SM, the database will run. You’ll see less performance, of course, because you have fewer processes servicing applications, but it’s all about continuous availability.
So, I just wanted to also touch upon the in-memory cache -- the TEs I’m mentioning, the transaction engines. They are effectively a distributed in-memory cache. Each one of them is keeping a cache of the database, so the database continues to run at optimal speeds. If the data is not available in the current TE’s cache, it can grab the data element -- or, as we call them, atoms -- from another local TE -- its nearest neighbor, if you will. And this effectively creates this distributed cache and offers very low-latency transactional data access.
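[Editor’s sketch] The cache lookup Joe describes can be pictured with a small Python model. This is only an illustration under stated assumptions, not NuoDB’s implementation: the class names, the `peers` list, and the final fallback to a storage manager are all hypothetical. The talk only says that a TE first checks its own cache and can then fetch an atom from a neighboring TE.

```python
from typing import Dict, List, Tuple


class StorageManager:
    """Hypothetical stand-in for an SM: holds the durable copy of every atom."""

    def __init__(self, atoms: Dict[str, str]) -> None:
        self.atoms = dict(atoms)

    def read(self, atom_id: str) -> str:
        return self.atoms[atom_id]


class TransactionEngine:
    """Hypothetical stand-in for a TE with an in-memory cache of atoms."""

    def __init__(self, name: str, storage: StorageManager) -> None:
        self.name = name
        self.storage = storage
        self.cache: Dict[str, str] = {}
        self.peers: List["TransactionEngine"] = []

    def get(self, atom_id: str) -> Tuple[str, str]:
        """Return (value, source), where source records where the atom was found."""
        # 1. Local cache hit: the cheapest path.
        if atom_id in self.cache:
            return self.cache[atom_id], "local"
        # 2. Ask neighboring TEs before going to storage.
        for peer in self.peers:
            if atom_id in peer.cache:
                value = peer.cache[atom_id]
                self.cache[atom_id] = value
                return value, f"peer:{peer.name}"
        # 3. Assumed fallback: read from the storage manager.
        value = self.storage.read(atom_id)
        self.cache[atom_id] = value
        return value, "storage"
```

In this model, the first TE to touch an atom reads it from storage, a second TE asking for the same atom gets it from its peer, and any repeat request is served from the local cache.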
And speaking more to the, kind of, HA and how the system protects itself, Marius was talking a lot towards the end about active-active and how that can be difficult to deploy, but with NuoDB it’s native to the product. It is an active-active, always-on architecture. So, here we see a zone one and a zone two. These can be availability zones, as we spread the NuoDB processes -- these engine processes -- across the availability zones. We have two in each zone for the transaction engines. Each one of them is running in a pod -- those pods that Marius was describing earlier -- and the storage managers are also running in their own container pods.
So, if we should lose one or two processes or even a complete zone, our database remains available. And perhaps that was what was so important -- the top requirement the CIO had earlier when we were referencing it -- this idea that the application must always be on, always be available. And NuoDB can achieve that in an OpenShift orchestrated-process environment.
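[Editor’s sketch] The availability rule Joe states (the database keeps running as long as one TE and one SM survive) can be checked against the two-zone layout with a few lines of Python. The process names, the `Process` type, and the choice of one SM per zone are assumptions made for illustration; the talk only specifies two TEs per zone.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Process:
    name: str
    kind: str  # "TE" (transaction engine) or "SM" (storage manager)
    zone: str


def database_available(processes: Iterable[Process]) -> bool:
    """True if at least one TE and at least one SM are still running."""
    kinds = {p.kind for p in processes}
    return {"TE", "SM"} <= kinds


def survivors(processes: Iterable[Process], failed_zones: Set[str]) -> List[Process]:
    """Processes that remain after every process in the failed zones is lost."""
    return [p for p in processes if p.zone not in failed_zones]


# Hypothetical layout: two TEs per zone (from the talk), one SM per zone (assumed).
DEPLOYMENT = [
    Process("te-1", "TE", "zone1"),
    Process("te-2", "TE", "zone1"),
    Process("sm-1", "SM", "zone1"),
    Process("te-3", "TE", "zone2"),
    Process("te-4", "TE", "zone2"),
    Process("sm-2", "SM", "zone2"),
]
```

With this layout, losing all of zone one still leaves two TEs and one SM in zone two, so `database_available(survivors(DEPLOYMENT, {"zone1"}))` holds. Throughput would drop because fewer TEs are serving the application.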
This is just a quick example of what I mentioned earlier: if the database can scale the transactional and storage components independently -- well, now your database can serve many different applications uniquely and specifically to their requirements, right? For web and mobile apps we can scale more TEs than SMs. For OLTP, we may want to run them at about the same scale. Right? Logging apps -- we need more storage. We’re going to scale that out so we can log faster. And for HTAP -- hybrid transactional/analytical processing -- we can actually configure those TEs differently, some with more memory, some with less. So, you now have all that control available to you.
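[Editor’s sketch] The four workload shapes on this slide can be written down as data. Every number below is invented purely to show the relative proportions Joe describes (more TEs than SMs for web and mobile, roughly equal for OLTP, more SMs for logging, mixed TE memory sizes for HTAP); none of them are sizing recommendations, and the `Profile` type is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Profile:
    te_memory_gb: List[int]  # one entry per transaction engine
    sm_count: int            # number of storage managers

    @property
    def te_count(self) -> int:
        return len(self.te_memory_gb)


# Illustrative values only; the talk gives proportions, not numbers.
PROFILES: Dict[str, Profile] = {
    "web_mobile": Profile(te_memory_gb=[4, 4, 4, 4], sm_count=2),  # more TEs than SMs
    "oltp": Profile(te_memory_gb=[4, 4], sm_count=2),              # about the same scale
    "logging": Profile(te_memory_gb=[4], sm_count=3),              # more storage
    "htap": Profile(te_memory_gb=[2, 2, 16], sm_count=2),          # TEs sized differently
}


def describe(name: str) -> str:
    """Summarize a profile as a short human-readable line."""
    p = PROFILES[name]
    return f"{name}: {p.te_count} TE(s), {p.sm_count} SM(s), TE memory {p.te_memory_gb} GB"
```

The point of the structure is that the TE list and the SM count vary independently, which is what lets one database engine be shaped for each kind of application.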
I talked a little earlier about the deployment model of continuous integration and deployment. Well, you want those same benefits for your database. We talked about crossing the bridge halfway. Well, we really don’t want to do that. We want to cross the full bridge and take full advantage of the OpenShift microservices deployment environment by also adding the database to your pipelines and allowing for rolling upgrades and the implementation of operators, so that many of your operational-type tasks can now be easily automated. Okay? And that brings us to the idea around automated operators -- again, another one of the Kubernetes features. And this is a feature that we’re currently implementing now -- a Kubernetes operator -- to simplify all of these areas: rolling upgrades, scale in and out, backup, and restore.
I thought I’d just finish with a quick little view of what a NuoDB database looks like inside of OpenShift. I took a little picture here. So, you can see up above is the NuoDB monitoring interface, which is showing that we have two running transaction engines. I’ll point them out here. We have two storage managers. We can see the memory and CPU footprint for all of the running pods on the system. And we happen to see here in the middle the aggregate transaction throughput, where I was scaling up the number of transaction engines, which doubled the throughput of the application from about 50 to right about 100 transactions per second simply by clicking up the number of OpenShift transactional pods.
And with that, I’m going to go ahead and land on the thank-you slide. You have some links here on how you can learn more about Red Hat and NuoDB and what we’re doing in microservices deployment architectures for your applications. And then, I’m going to turn it over to Billy and -- Billy, do we have those polling questions that we would be able to share with our community?
BILLY: Here we go. Here’s the first poll question.
JOE LESLIE: Great. Thank you, Billy. So, we have a few polling questions we thought we’d have a little fun with. The first: are you considering deploying a microservice architecture? Please submit your answers and we’ll share the results with all of you.
BILLY: Close the poll in about ten seconds.
JOE LESLIE: And we just have two other questions that build upon each other to give you all a sense of maybe where you’re at in your own microservice deployments and what you may be considering over the next six months or so. Billy, did they get to see all the questions? Oh, oh, okay.
BILLY: Just one at a time. Here’s the results for the first one.
JOE LESLIE: Awesome. Thank you. Well, overwhelming there. It looks like many of us now are considering and are on their way to deploying microservice architectures. Billy, why don’t we try for the next one?
BILLY: All right. Here we go.
JOE LESLIE: Are you considering deploying the OpenShift platform in the next six months? I don’t know if many of you were out at the Red Hat Summit, but I know I met many of you there, and many were there to learn more about how to deploy OpenShift and enhance their current deployments.
BILLY: I’m going to go ahead and close the poll and share with you all the results.
JOE LESLIE: All right. Pretty good also there to see the next twelve months as well. Let’s go for the next.
BILLY: All righty. Last poll.
JOE LESLIE: It looks like we’ve got about -- thank you, Billy. So, it looks like we have 30 percent of you that are already looking to deploy OpenShift in the next six months. How about, of those -- and this is where we touched upon how a lot of these enterprise applications are SQL-based -- how many are looking to deploy SQL-based applications in OpenShift in the next six months or so? And hopefully, as you’re considering deploying that, you’ve learned some new techniques today on how you can actually run an embedded SQL database right inside of OpenShift to gain all of the agility and automation and performance benefits that NuoDB can provide.
BILLY: About ten more seconds and I’ll close out the poll.
JOE LESLIE: All right. Sounds good, Billy.
BILLY: All right. Here are the results.
JOE LESLIE: All right. So, just about 25 percent. One in four are going down this path of deploying SQL-based applications in OpenShift in the next six months. So, that’s fantastic. And I do imagine those numbers will increase over the next twelve months. But that’s very useful. Thank you, Billy.
BILLY: Of course. I’ll hide the results now. And would you like to do the Q and A portion now?
JOE LESLIE: Yeah, that would be terrific.
BILLY: Awesome. So, we had a few questions come in from you guys. The first one is how can I start using NuoDB on OpenShift.
JOE LESLIE: Well, one of the easiest ways to start is we have a container that’s available along with a YAML file that will allow you to deploy the environment. In fact, you’re seeing an OpenShift environment right now. What I’ve done is I basically did “add project” and it started a database right inside OpenShift. We are currently updating our container to add StatefulSets and persistent volume claims, as well as the YAML file deployment. So, within the next week or so, look for our container in the OpenShift catalog. And of course, you’re always welcome to contact us through our website or through the contact information we’ve provided in today’s presentation. It’s very easy to get up and running with NuoDB inside of OpenShift.
BILLY: Great. The next question is do you have an optimal way to implement event sourcing?
JOE LESLIE: An optimal way to implement event -- sorry, I missed that.
BILLY: Implement event sourcing.
JOE LESLIE: Event sourcing. Yeah, so, I think the question is around when the database has events -- in other words, how we can then propagate those. So, we do have our own set of events we’re sourcing. And these would go into the environment that you see in the upper part of your screen -- the monitoring place. We’re currently building this out, where we’re showing the more OS-related metrics and transactional pieces. But we’re also going to source events and show alerts -- so, what may be happening in the environment and, then, how to root-cause those types of database issues and -- go ahead.
MARIUS BOGOEVICI: I’d like to add, to complement what Joe has just said -- so, especially for event sourcing applications on OpenShift, there is a project that I strongly recommend you take a look at, which is called Debezium, that is used for implementing CDC (change data capture) on top of a number of relational databases. I’m not sure about the plans right now for integrating with NuoDB, but, essentially, what it does is it monitors all your transactions and/or changes to the transactional log and it sends them to Kafka as standardized events. So, that’s one of the things to look at in the OpenShift universe. That’s kind of a big complement -- again, I don’t think it applies necessarily to NuoDB right now, but, Joe, what do you think? I think that could be an interesting addition for the future.
JOE LESLIE: Absolutely. Thank you, Marius. That’s really good information.
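[Editor’s sketch] The change data capture idea Marius outlines (watch the transaction log, emit each change as a standardized event) can be shown in miniature. This is a generic Python illustration, not Debezium’s API or its event format: the log entry fields, the event shape, and the plain list standing in for a Kafka topic are all assumptions.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def change_events(log: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Turn raw transaction-log entries into uniformly shaped change events."""
    for position, entry in enumerate(log):
        yield {
            "position": position,           # offset of the change within the log
            "table": entry["table"],
            "op": entry["op"],              # "insert", "update" or "delete"
            "before": entry.get("before"),  # row state before the change, if any
            "after": entry.get("after"),    # row state after the change, if any
        }


def publish(events: Iterable[Dict[str, Any]], topic: List[str]) -> None:
    """Append each event to the topic as JSON; the list stands in for a broker."""
    for event in events:
        topic.append(json.dumps(event, sort_keys=True))
```

Consumers that read the topic can rebuild state or react to changes without querying the source database directly, which is what makes this pattern a fit for event sourcing.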
BILLY: Awesome. Our next question is, any tool migration from RDBMS to NuoDB?
JOE LESLIE: So, an RDBMS is a traditional relational database. NuoDB works with traditional databases from a data migration standpoint. It will easily migrate the data from that database into NuoDB using our native migration tools. So, we try to make that very simple. NuoDB is an ACID-compliant and fully ANSI-standard SQL database. So, all the SQL that runs against an RDBMS is going to run against NuoDB. And that’s really what we talked about in the earlier part: NoSQL is a tough transition, but when you transition from an already existing SQL database to NuoDB, it’s quite simple.
BILLY: Great. Our next question is, can I copy SQL Server databases into NuoDB?
JOE LESLIE: Yup. That’s very similar to the previous question. Again, we work with all of the main database products. We will read and extract data from SQL Server. One of the important pieces of what NuoDB does is it’s looking at the source data types and how to, then, translate them into the receiving NuoDB data types. So, the data movement and streaming is probably the easier part. What NuoDB does for you is it makes sure that your data is coming over into NuoDB as the proper string or char or date or numeric type, integer or decimal, so that your applications continue to work as expected. We also have a migration webinar on our website if you care to learn more about that. Just go to NuoDB.com and look for the data migration webinar. We actually migrated a MySQL database during the webinar in three easy steps. So, it’s kind of fun. Please go out and enjoy that as well.
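[Editor’s sketch] The type translation step Joe mentions can be illustrated with a lookup table. This is not NuoDB’s migration tool: the source type names are common SQL type names chosen as examples, the targets are just the categories named in the answer (string, char, date, integer, decimal), and the function name is hypothetical.

```python
from typing import Dict, Tuple

# Assumed mapping from source column types to the type categories named in the talk.
TYPE_MAP: Dict[str, str] = {
    "varchar": "string",
    "nvarchar": "string",
    "char": "char",
    "date": "date",
    "int": "integer",
    "bigint": "integer",
    "decimal": "decimal",
    "numeric": "decimal",
}


def translate_column(name: str, source_type: str) -> Tuple[str, str]:
    """Map one source column to a target type category, ignoring length and precision."""
    # "DECIMAL(10,2)" -> "decimal"; "NVARCHAR (50)" -> "nvarchar"
    base = source_type.strip().lower().split("(")[0].strip()
    if base not in TYPE_MAP:
        # Fail loudly rather than guess, so a bad mapping never reaches the application.
        raise ValueError(f"no mapping for column {name!r} of type {source_type!r}")
    return name, TYPE_MAP[base]
```

Raising an error on an unmapped type reflects the goal stated in the answer: the application should keep working as expected, so a silent wrong conversion is worse than a stopped migration.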
BILLY: Thank you all for the questions, and I’d also like to thank Joe and Marius for a great presentation. And I’d also like to thank today’s sponsor, NuoDB, for providing the DZone audience with a great webinar presentation. And I’d also like to thank everyone who joined today. We hope you learned something new today that will help you in your developer career. Have a great day and we’ll see you next time.
JOE LESLIE: Thank you very much. Have a great day.
MARIUS BOGOEVICI: Thank you.