I recently sat down with Jonas Bonér, CTO and co-founder of Lightbend, and Viktor Klang, deputy CTO of Lightbend, to discuss their recent O’Reilly Radar article titled “Reactive Programming vs Reactive Systems”, and the subsequent expanded white paper released on Lightbend.com. The purpose? Demystifying the different aspects of Reactive, from programming in a Reactive style, to deploying and managing resilient, elastic, Reactive systems in production.
Lightbend: [Intro truncated] Jonas and Viktor, how did the conversation around Reactive start for you originally?
Jonas Bonér: Viktor and I both have a long history of building scalable, resilient systems, often highly concurrent and message-driven. That experience resulted in the Akka project, which we started in 2009. Since then, we've had a lot of customers building these types of applications. It's hard to talk about these types of applications without sounding like buzzword bingo. There's just too much to explain, too many moving parts.
So we essentially felt there was a need to try to capture this sort of style of development into a set of principles, distilling the essence of what's needed to build these types of applications that are solving all the hard problems that we're facing today as an industry. We just decided to call it Reactive.
Reactive programming had been around for some time, but we wanted to expand on that idea into what we call Reactive systems. That's also what resulted in the Reactive Manifesto, as a way to capture a vocabulary for these types of applications and distill the essence of the principles.
Lightbend: Did you ever imagine that major enterprise vendors like Red Hat, Pivotal, and Oracle would embrace these concepts to such an extent?
Jonas Bonér: Yes and no. I strongly believe that it's the best and the most sensible way of building systems today and trying to tackle all the hard problems we face as an industry, with cloud computing, with multicore architectures, with Fast Data and all these things. In a way, I'm not surprised that other people are sort of solving it the same way.
That said, writing the Reactive Manifesto does not mean we invented it; it's been used for the last 20 years by a lot of people, but flying under the radar. It goes all the way back to the '80s. But I'd say I'm also surprised, in a way, to have this impact in such a short time, and I'm also very pleased about it. It's a bit of both.
Viktor Klang: From my perspective, I think it's really important that the conversation was lifted to a completely different level, as Jonas said. I mean, having conversations about implementation details of a system and trying to extrapolate the benefits is really hard. I think that most people actually have a much easier time discussing the high-level value propositions first and then starting to talk about technology or implementations.
I think the manifesto itself raised the abstraction level, so that we could talk about the benefits and the quote/unquote "best practices" that we have adopted for how we think systems should be built.
I would probably agree with Jonas. I'm also a bit surprised, but these are really good ideas, so I'm not at all surprised that smart people pick up great ideas.
Lightbend: It started with the Reactive Manifesto, more or less, in a formal way, in 2013. We're coming up on 2017. We've seen Reactive as a term get embraced to a pretty high extent. It's entered the world of buzzword bingo, as Jonas would say. We've actually noticed that the term Reactive is getting a little bit overloaded and confused. One of the reasons I wanted to invite you both to join us today on this podcast is to discuss the recent article that you both authored that was featured on O'Reilly Radar and expanded upon in an extended white paper by Lightbend. On the cover, the subtitle is "How to Land on a Set of Simple Reactive Design Principles in a Sea of Constant Confusion and Overloaded Expectations." That's really colorful language. So what exactly was the point of defining what Reactive means from different perspectives, for example, what Reactive programming is and how it plays a role in building Reactive systems?
Jonas Bonér: As with all successful concepts, it easily becomes overloaded and means different things to different people. I'm not surprised that happened to Reactive, especially since, as I alluded to at the beginning, Reactive programming has actually been around for quite some time.
With Reactive systems, we want to expand and build upon that idea. Of course, people then think that Reactive programming is the only thing that exists and the only thing that's needed, and perhaps choose not to see beyond that, and that's what we're seeing today, I think. I think that's a big loss, as well, because Reactive systems are really about taking a bigger grip on things and trying to address harder problems than the ones that are only partially addressed with Reactive programming.
That said, they build on each other. In many ways, one is a subset of the other, so they are both very important. We saw the need to clear up the confusion there and explain what is what and what it's used for, essentially. What do you think, Viktor?
Viktor Klang: I think the thing with Reactive systems is that it's a much broader concept. It's about solving, as you say, sort of broader problems.
Lightbend: Maybe now's a decent time to kind of dig a little bit deeper. How are we looking at Reactive programming as a subset of Reactive systems? Maybe we could discuss some of the details around that, Viktor.
Viktor Klang: Sure. One of the things with Reactive programming is that it's about programming. So it's about how you write a program or a piece of logic. Reactive systems is about how you design systems. And systems could be composed of multiple programs or multiple applications written with Reactive programming, so it's at different levels of scale, I would say.
One of the core things with Reactive programming is the inherent eventfulness, or using events to communicate. One of the biggest things about Reactive systems is using messages to communicate. It's always possible to sandwich one on top of the other, emulating messages with events or emulating events with messages, but that tends to conflate the two things, so we felt like it was time to make the definitions clearer.
What is an event about, and what is event-based programming about? What's a message, and what's that about? When are they useful concepts, and where shouldn't you try to shoehorn one where the other should be?
Jonas Bonér: Right. As Viktor said, if you go down to the fundamentals, then I think the core difference between these is exactly event-driven versus message-driven, and that Reactive programming is essentially event-driven, while Reactive systems rely on messages.
Messages define communication, while events, in a way, represent facts that have happened in the system. Clearly, for multiple components to communicate effectively in a system, it needs to be based on a message-driven core. Messages have a recipient. They have a clear destination of where to send them, while, as we write in the paper, events are facts just for others to observe.
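To make the distinction Jonas draws here concrete, the following is a minimal single-threaded Java sketch, not code from the interview or the white paper. Every name in it (`EventStream`, `Router`, `OrderPlaced`, `Message`) is hypothetical and chosen only for illustration. It shows an event being emitted to whoever happens to observe it, and a message being delivered to one named recipient.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical illustration only; not thread-safe and not a real messaging library.
public class EventsVsMessages {

    // An event is a fact: it says what happened, not who should care.
    public static final class OrderPlaced {
        public final String orderId;
        public OrderPlaced(String orderId) { this.orderId = orderId; }
    }

    // A message is addressed: it carries the recipient it is meant for.
    public static final class Message {
        public final String recipient;
        public final String payload;
        public Message(String recipient, String payload) {
            this.recipient = recipient;
            this.payload = payload;
        }
    }

    // Event side: the emitter knows nothing about who observes the fact.
    public static final class EventStream {
        private final List<Consumer<OrderPlaced>> observers = new ArrayList<>();
        public void subscribe(Consumer<OrderPlaced> observer) { observers.add(observer); }
        public void emit(OrderPlaced event) { observers.forEach(o -> o.accept(event)); }
    }

    // Message side: each message goes to exactly one named mailbox.
    public static final class Router {
        private final Map<String, List<String>> mailboxes = new HashMap<>();
        public void register(String address) { mailboxes.putIfAbsent(address, new ArrayList<>()); }
        public boolean send(Message m) {
            List<String> box = mailboxes.get(m.recipient);
            if (box == null) return false; // unknown destination: the sender finds out
            box.add(m.payload);
            return true;
        }
        public List<String> mailbox(String address) {
            return mailboxes.getOrDefault(address, Collections.emptyList());
        }
    }

    public static void main(String[] args) {
        EventStream stream = new EventStream();
        stream.subscribe(e -> System.out.println("billing observed " + e.orderId));
        stream.subscribe(e -> System.out.println("shipping observed " + e.orderId));
        stream.emit(new OrderPlaced("o-1")); // both observers see the same fact

        Router router = new Router();
        router.register("inventory");
        System.out.println(router.send(new Message("inventory", "reserve o-1")));
        System.out.println(router.send(new Message("nobody", "reserve o-1")));
    }
}
```

The emitter in this sketch cannot tell whether anyone observed the event, while the sender of a message learns whether its one destination existed. That asymmetry is the part of the distinction the sketch is meant to show; a real message-driven system would deliver asynchronously and across a network.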
I think the important implication is that messages are essentially the essence of distributed systems. I mean, as soon as you communicate across a network, you have to send messages. So having a programming model that fully embraces messages means that you are true to the nature of distributed systems, including all the constraints and problems that can arise. I think this is extremely important as the foundational fabric for building distributed systems, which is actually what Reactive systems are all about.
This sort of decoupling between the sender and the receiver with sort of an asynchronous network boundary is what gives us things that we really value when building systems, like isolation, for example–true isolation, so components can fail in isolation, and they can be rebooted and restarted, operate in isolation, et cetera.
This level of interaction also gives the possibility to achieve what we call location transparency, which means that the components can be migrated while they are being used, being moved around in the clusters. The topology of the system is not fixed, but can actually change. These two are sort of the essence for resilience, meaning isolation and being able to observe failures outside the failed component in a safe context, and then elasticity, meaning being able to scale on demand and also shrink on demand.
This is the reason why Reactive systems expand on Reactive programming, adding concepts around resilience and elasticity, while Reactive programming focuses more within the node, on resource efficiency and performance, et cetera.
Lightbend: A lot of the time, when people are thinking of Reactive systems, microservices architecture ends up coming into play, but as we've learned, Reactive systems encompass a much broader group of different types of applications and use cases, such as fast-data streaming, pipelines, mobile and IoT applications, as well as microservices. Could you give us a quick rundown on what's important for the traditional Java developer, for example, to know about the limitations of Reactive programming when it comes to web apps, microservices, and fast-data apps, and how Reactive system design and the principles behind it address those concerns? Maybe we can start with the traditional web app.
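The two properties discussed above, isolation and location transparency, can be hinted at with a deliberately simplified Java sketch. It is an assumption-laden illustration, not how Akka or any other toolkit implements them: everything runs synchronously in one process, and the names `Registry`, `deploy`, `tell`, and `counterFactory` are hypothetical. Senders hold only a logical address, the failure is observed outside the failed component, and the component is replaced by a fresh instance behind the same address.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical, synchronous, in-process simplification; no network or threads involved.
public class IsolationSketch {

    public static final class Registry {
        private final Map<String, Supplier<Function<String, String>>> factories = new HashMap<>();
        private final Map<String, Function<String, String>> live = new HashMap<>();
        private int restarts = 0;

        // Bind a logical address to a factory; which instance serves it can change later.
        public void deploy(String address, Supplier<Function<String, String>> factory) {
            factories.put(address, factory);
            live.put(address, factory.get());
        }

        // The sender only knows the address. A failure is caught here, outside the
        // failed component, and handled by swapping in a fresh instance.
        public String tell(String address, String msg) {
            Function<String, String> component = live.get(address);
            if (component == null) throw new IllegalArgumentException("unknown address: " + address);
            try {
                return component.apply(msg);
            } catch (RuntimeException failure) {
                live.put(address, factories.get(address).get()); // restart with clean state
                restarts++;
                return "failed: " + failure.getMessage();
            }
        }

        public int restarts() { return restarts; }
    }

    // A stateful component that fails on the message "boom".
    public static Supplier<Function<String, String>> counterFactory() {
        return () -> {
            int[] count = {0};
            return msg -> {
                if (msg.equals("boom")) throw new IllegalStateException("corrupted");
                count[0]++;
                return "handled " + count[0];
            };
        };
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.deploy("counter", counterFactory());
        System.out.println(registry.tell("counter", "a"));
        System.out.println(registry.tell("counter", "boom"));
        System.out.println(registry.tell("counter", "b")); // served by the replacement instance
    }
}
```

Because callers never hold a reference to the instance itself, the registry is free to replace it, which is the same indirection that would let a real system move a component to another node.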
Jonas Bonér: Sure. Do you want to take a stab at that, Viktor?
Viktor Klang: Yeah, sure. I think, first and foremost, right now we are seeing this drive for having developers doing operations like DevOps and site reliability engineering and essentially having developers being responsible for the operational behavior of what they create, and I think that's really important.
So the question is: How do I write a web application? And if I'm responsible for making sure that it's going to be available, it's going to be able to respond within the latencies that the business has set for making sure the users are happy, how do I make sure that it can handle the variable loads? How do I know that it's going to be able to deal with future loads?
How do I handle it whenever something goes wrong? Because things eventually go wrong, especially in a distributed system, which is essentially what every single thing that developers create today is, and we have to account for that.
If all I have is an abstraction that allows me to create a really nice in-process thing, then that doesn't necessarily address any of those. If I, as a developer, am responsible for the operational concerns, I need to create a system. I need to make sure that, if this web app fails, there is going to be another copy of the web app that will continue to be able to serve the requests. If parts of the database go down, are there replicas that can take over the load?
These concerns that Reactive systems address are becoming more and more important for more and more developers as they get more and more operational responsibilities. So it's no longer someone else's problem to sort of scale the application server or scale the database. There's an end-to-end responsibility that developers have today.
One of the most important things is, once you're responsible for the operational concerns of a system, do you need to trade away all your productivity? What does my tooling look like when I'm responsible for all these extra concerns? I think what we've tried to do, from the very first day, is ask whether there needs to be any tradeoff at all between developer productivity and scalability, elasticity, and resilience. And I don't think there needs to be.
They're really not connected, but the technologies that we've had in the past have traded away productivity to gain resilience or elasticity, or vice versa, where you have developer tools that allow you to create applications really quickly, but then the operational aspects are terrible. There's really no solution at all.
I think what we're seeing is both a trend towards full end-to-end responsibility by developers, as well as a really strong drive to drive down the time to market. If you really need to have productivity and you're also responsible for the operational concerns of your system, I don't see how you could get by without Reactive programming and Reactive systems. This is a no-brainer to me.
Jonas Bonér: Yeah. I think it's very important to understand that both of these are essential to building web apps, microservices, and Fast Data applications. Reactive programming is excellent in web development, for example, to fetch resources asynchronously, fan out to multiple back-end systems in a fully asynchronous fashion, and then compose the result and send it back to the client.
All that can be done in a very resource-efficient way, essentially without blocking and wait time. More advanced web development scenarios, like distributed caching, data consistency, and notification across nodes, rely on thinking more holistically about a system. Therefore, you need to think about things in terms of a Reactive system.
The same thing, I think, holds for microservices. Reactive programming is great within a single microservice to do things like internal logic, to talk to other services at the edge of the service, and to manage persistence, while Reactive systems are essential to address the space between microservices.
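The fan-out-and-compose pattern described here can be sketched with the JDK's `CompletableFuture`. This is an illustrative sketch under stated assumptions, not code from the interview: `fetchProfile` and `fetchOrderCount` are hypothetical stand-ins for calls to real back-end systems and simply compute a value on another thread.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical example: the two "back ends" are simulated locally.
public class FanOut {

    // Stand-in for an asynchronous call to a profile service.
    public static CompletableFuture<String> fetchProfile(String userId) {
        return CompletableFuture.supplyAsync(() -> "profile:" + userId);
    }

    // Stand-in for an asynchronous call to an order service.
    public static CompletableFuture<Integer> fetchOrderCount(String userId) {
        return CompletableFuture.supplyAsync(() -> userId.length());
    }

    // Start both calls without waiting on either, then compose the two results
    // once both have completed. No thread blocks inside this method.
    public static CompletableFuture<String> buildPage(String userId) {
        return fetchProfile(userId).thenCombine(
                fetchOrderCount(userId),
                (profile, orders) -> profile + " orders:" + orders);
    }

    public static void main(String[] args) {
        // join() blocks only here, at the edge of the demo, to print the result.
        System.out.println(buildPage("ada").join()); // prints "profile:ada orders:3"
    }
}
```

In a web framework the composed future would typically be handed back to the framework rather than joined, so the request thread is never parked while the back ends respond.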
Actually, that is always where we enter the world of distributed systems and things get really hard, and relying on the solid principles for Reactive systems within the microservices here helps a lot, and makes it possible to build on a solid foundation.
You also mentioned Fast Data streaming, and I think both Reactive programming and Reactive systems can be of service there. Often these types of streaming applications expose Reactive programming and event-driven APIs to the user, because that simplifies a lot, and it's a very natural model that a lot of people find intuitive. Underneath, they use message-driven Reactive systems to bridge these APIs and to maintain their simplicity across machines in a distributed system. So both are needed there as well.
Lightbend: All right, guys. Well, thank you very much for sharing your ideas with us today. Look forward to reading your white paper, which you can find on our website. Thanks, once again, to Jonas Bonér and Viktor Klang. Have a wonderful day!