Akka message delivery part one: at most once
This article is the first in a series of three articles that dive into the some of the interesting aspects of messaging within distributed systems environments. While the particular focus of these articles covers messaging within an Akka distributed actor systems, the concepts covered here also apply to messaging within other distributed systems, such as messaging within microservice systems.
Object and Actor Messaging Basics
In object-oriented programming languages, objects respond to method calls. Object method calls are a form of sending messages to an object. The object philosophy is that “everything” is an object. In Akka, with its implementation of the actor model, the philosophy is that “everything” is an actor. (The term “everything” is used loosely here. Of course, not everything is an object or an actor; the idea is that these are the dominant players in these software systems.) Both objects and actors react to messages. However, things quickly diverge from there.
One of the most fundamental differences between the object philosophy and the actor model philosophy is the act of invoking an object method is a synchronous operation while the act of sending a message to an actor is an asynchronous operation.
When a client invokes an object method, the method caller waits for the method to complete and when the method is finished the caller resumes execution. On the other hand, when a message is sent to an actor the message sender is not suspended. The message sender sends a message, and it continues running. In fact, sending a message to an actor is handled by the actor system via method calls to what is referred to as an actor reference. Actor references represent a transparent reference to an actor that may be located somewhere within an actor system, and the actor system may actually be distributed across multiple JVMs that reside on multiple network nodes.
Objects typically process within a single thread of execution. Of course, using multiple threads is an option, but the most common case is that a single thread handles the flow of objects invoking methods of other objects. With actors, the sender of a message and the receiving actor are separated. That is the message sender does not directly interact with the message receiver. As a result, the message sender and the message receiver are running on separate threads. Also, the message sender and the message receiver may be running in separate processes, and those processes may be running on separate network nodes. This separation of the message sender and message receiver also applies to message passing between networked services, such as microservices.
You may want to skip to the next section if you are interested messaging details without getting into Akka actor specific code examples.
Shown in Figure 1 is an example of a simple actor written in Java. As you can see the actor is a composed of a class that extends an Akka base class. Note the
createReceive() method. This is the method that provides the incoming message handler. The Receive object returned from the createReceive() method is invoked by the actor system when an actor receives incoming messages. The receiver defines which messages the actor can handle, along with the implementation of how the messages should be processed. Note also that there is this class has an instance variable that is set by a constructor. An actor is much like any other class. The only Akka specific requirement is that the actor class must extend one of the Akka actor base classes and implement an abstract method.
An instance of the actor is created (as shown in Figure 2) with a call to the actor system
actorOf() method. Once the actor is created the
actorOf() method returns an actor reference. Notice the call to
DemoActor.props(42). This invokes the
DemoActor static method
props() method in turn calls the
Props.create() method. This method is used by the actor system to create an instance of the actor.
Now that an instance of an actor has been created and we have an actor reference we can send it a message. As shown in Figure 3, we send the actor a simple integer message by calling the
tell() method. The second parameter to the tell method is used to pass the actor reference of the sending actor to the receiving actor. Since in this example we are not sending a message from another actor we use
The key point is that while object methods are directly invoked synchronously by the caller messages sent to actors are sent asynchronously via the actor system.
Creating an actor in software is very much like creating an object. Each actor is written as a class with methods. The big difference is that the only way to communicate with an actor is by asynchronously sending it a message via the actor system. Only the actor system itself creates actor instances and directly invokes an actor’s methods. No user code directly invokes any actor methods.
At-Most-Once Asynchronous Message Delivery
With a synchronous object method invocation, the caller knows when the called method is done when the method call returns. When sending an asynchronous message, such as sending a message to an actor or to a microservice, the sender only knows that an attempt was made to send the message. There is no indication that the target recipient received the message.
This is how the at-most-once message delivery process works. An attempt is made to send a message to the receiver. However, there are no guarantees that the message will be delivered. Any number of things may happen that prevent the successful delivery of an asynchronous message.
It is possible for the recipient to send a reply back to the sender but this is a deliberate action done by the receiver and not a natural action that occurs as with an object method call. Also, as with the attempt to send the initial request message, sending a response message may not be received.
An asynchronous request, reaction, and a response is shown in Figure 4. The process involves three distinct steps. The first step is the sender on the left sends an asynchronous message to the receiver on the right. In the second step, the receiver on the right performs some reaction operation when the message is received. Finally, in the third step, the receiver on the right sends an asynchronous message back to the initial sender on the left.
In this three step process, there are three opportunities where a failure will break the asynchronous request response cycle. The initial message may never be sent to the receiver. The receiver may get the message, but it may fail before it can send a response back to the sender. Finally, in the third leg of the journey, the receiver may attempt to send a response message back to the sender, but the response message is never received.
From the perspective of the message sender, there are no clues as to where the failure occurred. As shown in Figure 5 above, a failure may occur in any one of the three legs of the request to response journey. One of the biggest issue is that the sender does not know if the receiver performed the requested re-action. There is no indication that the failure happened before, during, or after the receiver reacted to the message. In situations where the receiver is performing an operation that must occur, like persisting a required state change, the sender has no idea if the state change happened or not.
When dealing with message transmission failures, there are two categories of messages, messages that must be delivered and messages that may or may not be delivered.
Starting with the easy one, messages that may or may not be delivered. In this case, when a message is not delivered to the receiver, or if the receiver is unable to respond to the sender, the result of one of these failures does not lead to the overall system being in some inconsistent state. For example, in the case where the sender is requesting that the receiver retrieve some information and return it in a response message. For stateless operations, such as read-only operations, when a request operation fails, the requestor may give up, or it may retry the operation again until it the message is successfully delivered. However, we are getting ahead of ourselves. Resending a message after a failure is a form of at-least-once delivery. We will be looking at that in detail in Part 2 of this series.
Things are very different when a message must be processed by the receiver where a failure to deliver and process the message results in the system being in an inconsistent state. Consider a scenario with two collaborating microservices, such as the scenario described in the article Developing Transactional Microservices Using Aggregates, Event Sourcing and CQRS - Part 1. In this scenario, there is an order microservice and a customer microservice. When a new order is created by the order microservice, an order-created message is sent to the customer microservice. When the customer microservice receives the order-created message, it performs a customer credit check. If the customer has sufficient credit to place the order the customer microservice reserves the credit, which is a stateful persistent operation, and then it sends a credit-reserved response message back to the order service. This triggers the order service to change the state of the order from new to approved. It is also possible that the customer has insufficient credit to place the order. In this case, the customer service replies with an insufficient-credit response message, which triggers the order microservice to change the state of the order to canceled.
When to Use and Not Use At-Most-Once messaging
While there are certainly differences between sending messages synchronously versus asynchronously, the primary consideration here is what is happening on each end of the conversation. What you need to consider is this, are you trying to synchronize state changes on both sides of the conversation or not? Using at-most-once messaging is fine when there is no need to synchronize state changes between the message sender and the message receiver. However, using at-most-once message delivery when it is necessary to synchronize state changes will burn you when things break.
The fatal flaw with at-most-once delivery is that the problems only crop up when things break. When everything is working correctly, the message flow is not interrupted, and no messages are lost, which means synchronized state changes happen as they should. This all falls apart when things break, and some messages are not delivered. These message delivery failures will result in corrupted system state.
Consider the order and customer processing scenario. When an order is created the customer must be notified. The act of creating an order is a state change. When an order is created, it notifies the customer. This notification triggers a customer state change. The customer state change then triggers an event that causes the order to do another state change.
The sequence of events is a new order is created in the “new” state. The customer updates the “credit limit” state. This finally triggers the order to be changed to an “approved” or “canceled” state.
In this scenario, there are two messages that must be delivered, the initial order-created message and the credit-approved or insufficient-credit message. When either of these messages is not delivered the state of the system is corrupted.
Synchronous versus Asynchronous and Network Reliability
Hopefully, at this point, we have provided some guidance on when to use or not use at-most-once delivery but what about when to use synchronous or asynchronous messaging? Is there a difference between synchronous or asynchronous message delivery in regards to reliability?
Let’s again consider the order and customer message exchange. In the examples discussed so far, we walked through the process using an asynchronous message processing flow. Would this processing flow be more reliable if we used a synchronous messaging approach?
The answer is no. Using either a synchronous or an asynchronous messaging approach has no real impact when dealing with failures. Take a look at the diagram in Figure 6. Is this a synchronous or an asynchronous message flow? The answer is that is could be either synchronous or asynchronous. In both cases, the exchange of a request and response is a three step journey. There are three opportunities where failures will break the flow from a request, to a reaction, to response. It does not matter if the implementation of the message processing flow is synchronous or asynchronous.
When At-Most-Once is Not Good Enough
In Part 2 of this series, we will take a look at at-least-once message delivery. While the at-most-once approach indeed is useful and works with many processing scenarios it also has its limitations. With at-least-once, we will look at how to overcome the limitations of the at-most-once approach.
If you are interested, this topic is discussed in the Akka documentation - Message Ordering, Delivery Guarantees.