"Akka In Action": Q/A With Manning Author On Why Akka For All Things Distributed
Akka In Action - A JVM Architect's Guide
Akka is seeing a high level of interest these days as cloud deployment continues to skyrocket. According to Forrester Research, enterprises are looking at the Actor Model and Akka to solve the challenges of concurrency, performance, scalability and, ultimately, resilience in the cloud.
So it was with great interest that I recently had a chat with Raymond Roestenburg, co-author of Akka In Action, to ask him how Akka brings the brilliance of the Actor Model, first developed over 40 years ago, to the mainstream of the JVM, notably Java 8.
If you're too busy, feel free to go directly to our EBooks page for Akka In Action to download sample chapters and get a 40% discount on the book ...
Hi Raymond! When did you first start using Akka, and what was the use case?
Raymond Roestenburg (RR): I first started using Akka together with Rob Bakker, one of the co-authors of the book, in March 2010. The use case was a camera and sensor system with which the Royal Netherlands Marechaussee supports mobile security monitoring near the borders with Germany and Belgium. There is actually a case study about this on the Lightbend website, called Keeping Borders Safe With Akka.
That's a fascinating real-life use case. So what is it about Akka that inspired you to write a book?
RR: We had completed a couple of projects using Akka; the benefits of using Akka were obvious every time.
Akka made me really excited, as it was this new toolkit that was essentially a game-changer for us. Even at the very beginning, it provided an unmatched combination of ease of use, ease of testing, performance and reliability; we had been using it in production since version 0.7. We ran into some issues, but nothing major, which is what you would hope. The source code has always been solid, which is really remarkable as far as most OSS projects go.
Even in that first use case, we built a reliable distributed system of 30+ nodes, which scaled very easily over time. We needed to communicate over unreliable cellphone networks, and we managed to gracefully handle all kinds of intermittent failures.
"We had completed a couple of projects using Akka; the benefits of using Akka were obvious every time."
Set that against previous experiences of mixing low-level Java concurrency tools with business logic and all kinds of bespoke RPC technology, and you really want to spread the word that there is a far, far better way to do this kind of thing.
In those days, there wasn’t a lot of information available online, so I decided to start blogging about it as well as contribute to the Akka project. We started providing case studies to Lightbend (Typesafe at the time) of some of the projects we did and I started explaining Akka at user group meetings and the like. I started to really like this new aspect of my work, which was to try and teach more. The book was the next step in this process.
Do you have any advice for Java developers looking to shift to distributed systems with Akka?
RR: My advice has always been to "keep it simple".
When it comes to distributed systems, keeping it simple is not just advice; it's the only way you will _ever_ succeed.
In many cases you can get away with a less powerful tool from the Akka toolkit: using Akka Streams (editor's note: as found in Lightbend's Lagom Framework) instead of Akka Actors directly, for instance. As an example, in one of our projects we tried to build a guaranteed delivery mechanism, built on the idea of repeating data and an idempotent receiver using remote actors (which, in hindsight, was a bad idea). There were so many weird error conditions that, after quite a while, we _almost_ got it right, which means we didn't get it right at all. There was always some edge case we didn't foresee.
We then figured out we could solve the problem in a far simpler and less generic way by just broadcasting periodically and changing how we encoded commands to the sensors.
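The idempotent-receiver idea mentioned above can be sketched in a few lines of plain Java, independent of Akka. This is only an illustration of the concept, assuming hypothetical names (`IdempotentReceiver`, `receive`) that are not from the actual project: each command carries an id, and a re-delivered command is applied only once.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of an idempotent receiver: duplicate deliveries of the same
// command id are detected and ignored, so the sender can safely repeat.
class IdempotentReceiver {
    private final Set<Long> seen = new HashSet<>();       // ids already applied
    private final List<String> applied = new ArrayList<>();

    /** Returns true if the command was applied, false if it was a duplicate. */
    boolean receive(long id, String payload) {
        if (!seen.add(id)) return false;   // add() is false when id was present
        applied.add(payload);
        return true;
    }

    List<String> state() { return applied; }
}
```

The hard part in practice, as the answer notes, is everything around this core: lost acknowledgements, reordering, and node failures, which is why a simpler, application-specific scheme won out.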
Try to take advantage of application-level knowledge to achieve a higher level of consistency.
Shifting gears, what new projects in the Akka ecosystem are making you excited?
RR: There are four in particular I'd like to focus on here: Alpakka, Akka Streams Kafka, Akka HTTP and Artery.
Alpakka
One of my contributions to Akka was the second version of Akka Camel with Piotr Gabryanczyk, which was originally written by Martin Krasser. We wanted to use Akka 2.0 for projects, and so much had changed that Akka Camel had to be ported.
The name Alpakka had already been coined by Viktor Klang at that time as a name for the Akka Camel integration work (visit this Google Group). Akka refers to a mountain, and an alpakka is a kind of 'domesticated mountain camel' (hey, I'm no biologist). The name never got used back then, but I'm definitely excited that it is going to be used now :-)
More importantly though, I've used akka-camel in many projects where it really worked out well. The concept of Apache Camel is great: pluggable components and enterprise integration patterns (EIPs).
"Alpakka is going to try and provide a modern, streaming version of the best parts of Apache Camel"
It did require a lot of attention to detail, though, when the Apache Camel components used blocking I/O, which can get tricky at times. Another issue has been that streaming messages was always very hard with Akka Camel, specifically in providing a means to prevent overload, which is basically the problem that Akka Streams is so good at solving. Also, a lot of the EIPs focus on a request/response or fire-and-forget style of communication.
The fact that Alpakka is going to try and provide a modern, streaming version of the best parts of Apache Camel (the pluggable endpoint components) I think is really exciting.
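The overload problem mentioned above can be shown in miniature with nothing but a bounded JDK queue; this is a deliberate simplification (Akka Streams actually solves it with demand-driven, pull-based back-pressure), and the names here are purely illustrative.

```java
import java.util.concurrent.ArrayBlockingQueue;

// A bounded buffer of capacity 2 stands in for a slow consumer. offer()
// returns false instead of letting the backlog grow without bound, so a
// fast producer is forced to notice that the downstream is behind --
// the crude ancestor of stream back-pressure.
class OverloadDemo {
    final ArrayBlockingQueue<String> buffer = new ArrayBlockingQueue<>(2);

    boolean produce(String msg) { return buffer.offer(msg); }

    String consumeOne() { return buffer.poll(); }
}
```

With an unbounded mailbox, the producer never learns the consumer is overwhelmed; bounding the buffer is the simplest way to propagate that signal upstream.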
Akka Streams Kafka
I'm really excited about Akka Streams and the API it provides in general. I think it has the right balance between elegance, performance and extensibility. Internally, Akka Streams components have enough structure to make it possible for the core-team guys to performance tune it.
Although the GraphStage API takes some getting used to, being able to code 'on the inside' of the model, responding to pull and push, feels very natural to me, and it is very flexible as well (maybe because it reminds me a bit of the actor model). I'm expecting that many coders will be able to pick this up and write really cool components for Alpakka.
As for Akka Streams Kafka, I have been working with Apache Kafka for about 2 years now, starting with version 0.8. Let's just say that using the Kafka client API has always been quite tricky (especially what used to be called the SimpleConsumer, which required you to handle leader election changes yourself).
Although the Kafka client libraries have slightly improved over time, there are a lot of tricky details, especially for handling error conditions in the cluster. I wrote a streaming HTTP proxy for Kafka for a customer, before Confluent's REST proxy was available. This would be almost trivial now combining Akka HTTP with Akka Streams Kafka.
Not having to deal with a lot of Kafka specifics and just staying in the Akka Streams programming model provides a lot of benefits.
Akka HTTP
I'm also very excited about Akka HTTP, especially now that it has passed the performance numbers of Spray, which I have used in many projects.
Akka HTTP really shines when it comes to streaming over HTTP. Spray already had some support for streaming, but could not handle all overload scenarios. It's also great that akka-http provides an almost identical DSL for routing as Spray used to, and again it has great test support. I'm looking forward to features to come and HTTP/2 support.
Artery
Artery, the new remoting layer for Akka, is another project I'm interested in. Our first use case used the Akka Remoting module directly to minimize overhead. I think Artery will be an interesting option for projects that require very low latency and where it is possible not to use TCP. I think it was a great decision not to try to build a custom message transport but to use Aeron instead, which has been under development for quite some time and has a very good reputation. The Akka team already did some benchmarking for Akka 2.4.11 and achieved a breathtaking 700,000 100-byte messages per second.
You mention back-pressure, resilience, elasticity and minimizing overhead. How does Akka's approach to Reactive system design differ from Reactive projects in Spring and Java EE?
RR: First of all, it’s been quite a while since I have used Spring or Java EE, at least 8 years or so. I think both Spring and Java EE are moving in the right direction again; both are starting to support Reactive Streams, and it is great to see that asynchronous, non-blocking streaming is now on everyone’s agenda.
But streaming is just one of the many aspects you have to deal with.
What sets Akka apart is that everything builds on top of the actor model. Instead of pretending that local and remote invocation are the same, even with the simplest application you basically start with a distributed message passing model. This automatically gives you resilience, elasticity and scalability on a message driven core.
The actor model provides the backbone for both local and distributed applications. Akka Streams, for example, is implemented in terms of tried-and-tested, fast actor messaging, but users don’t necessarily see that. The fact remains that interop between them feels native and just right.
We try to explain in our book how far-reaching this decision is.
In short, it means that you don’t have to learn a lot of different, bespoke technologies to build resilient clusters. You send messages between actors; that’s it. It also means that you can really take advantage of combinations of Akka modules, for instance Akka Cluster and Akka Persistence to shard persistent state across a cluster. It all automatically fits together. So you learn one concurrency / distribution model, and once it “clicks” you see how all the pieces of the puzzle fit together. You don’t have to fiddle around with various models depending on what library you’re suddenly forced to work with.
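The "one model" being described, a mailbox per actor, messages processed strictly one at a time, can be illustrated with a toy single-threaded sketch in plain Java. This is not Akka's API (real Akka actors run on a dispatcher and may be remote); it only shows that `tell` is essentially the entire programming surface.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

// A toy "actor": one mailbox, one behavior, messages handled sequentially.
// Because only drain() touches state, there is no shared-state concurrency
// for the behavior to worry about -- the core idea of the actor model.
class ToyActor {
    private final Queue<String> mailbox = new ArrayDeque<>();
    private final Consumer<String> behavior;

    ToyActor(Consumer<String> behavior) { this.behavior = behavior; }

    void tell(String msg) { mailbox.add(msg); }   // fire and forget

    void drain() {                                // process one at a time
        while (!mailbox.isEmpty()) behavior.accept(mailbox.poll());
    }
}
```

In Akka the same surface scales from a single JVM to a cluster, which is the point of the answer: the model you learn locally is the model you distribute.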
"What sets Akka apart is that everything builds on top of the actor model. Instead of pretending that local and remote invocation are the same, even with the simplest application you basically start with a distributed message passing model. This automatically gives you resilience, elasticity and scalability on a message driven core."
Lagom takes the next step and builds on top of existing Akka modules and provides an opinionated set of tools to build microservices, taking away a lot of complexity; it’s like Akka with batteries included.
Apart from that, Akka has very efficient serialization and communication options. Upgrading to Artery is done by simply changing the scheme in the actor system’s addresses. That does mean that you are not using HTTP between every component in your system; I’m sure some will see that as a drawback. But it is important not to forget how much overhead you are opting into with JSON over HTTP.
REST is obviously useful, and Akka HTTP is a great library for building RESTful services. But when it comes to distributed system patterns like service discovery, group membership, and distribution of state, it is quite obvious when you look around our industry that no one is building those purely out of their own REST services.
What you do find is that systems like ZooKeeper, Consul, etcd and many others are used, which take away the need to tackle this complexity yourself. These are all great tools in their own right, but all of them again require specific expertise. With Akka, all these concepts are simply natural; they are not retrofitted as an afterthought.
Finally, what are the top takeaways that you hope readers will get from reading your book?
RR: Here are my top 4 final points:
1. That an asynchronous, message-driven core (of actors) provides an efficient means to scale up and out, wasting as few resources as possible and taking full advantage of the available hardware, and that all the tools in the Akka toolkit benefit from this.
2. How to build resilient systems with Akka. This is really a core concept. How to build fault tolerant actor hierarchies. How to cluster nodes in case some fail. How to persist data and recover from it. How to continue in a degraded mode or fail gracefully.
3. That Akka has amazing test support and how to leverage it. Unit tests, integration tests, they are a big part of your day to day work. I think almost all chapters show how to test stuff. The fact that you can actually test all facets of a distributed actor system, both local and on actual nodes with the provided test kits is simply amazing and I hope we can get that across.
4. That there are quite a few tools in Akka right now; you might not need all of them at once, but the moment you do need one, I hope the book gives a good background story on why you should use it and what it's good at.
Thank you Raymond, it's been great speaking with you!
To get free sample chapters of Akka In Action emailed to you, as well as get a 40% discount on the physical book, continue to our EBooks page!