Summary: Kubernetes (K8s) and Akka Cluster are typically like two peas in a pod, but changes in the way that K8s handles rolling updates have challenged that friendship. Like any good friend, Akka has now adapted to the change in K8s behaviour and provides a new extension so deployments work as fast and reliably as ever.
Kubernetes and Akka have a symbiotic relationship when it comes to building scalable, resilient and distributed applications. Several Akka features, such as Akka Management and the Akka Discovery modules, have been developed specifically to work with Kubernetes, making it easier to deploy and manage Akka clusters in Kubernetes environments. Kubernetes, in turn, provides powerful features such as automatic scaling, rolling updates, and self-healing that are well-suited for managing Akka clusters. However, recent changes to the default behavior in Kubernetes can have a negative impact on Akka clusters when it comes to performance and reliability of rolling update operations.
Rolling updates are a Kubernetes feature that allows you to update an application (in the form of a Deployment or ReplicaSet) by gradually replacing old pods with new ones. This ensures that the application remains available throughout the update process, with minimal disruption to users. In previous versions of Kubernetes, the default behavior for scaling down replicas during a rolling update was to select the youngest pods first. This was a good fit for Akka clusters that use Cluster Singleton, as the oldest node in the cluster is responsible for hosting singletons. By selecting the youngest pods first, Kubernetes ensured that the oldest node was not affected by the rolling update until all other nodes had been updated. This reduced the number of times singletons had to move and minimized the impact on the overall system.
However, starting from Kubernetes v1.22 (see details here), the default behavior for scaling down ReplicaSets during a rolling update has changed to LogarithmicScaleDown. This behavior treats all pods that were brought up in the same time bucket as equally old, and selects a pod to scale down randomly from among them. This can have a significant impact on Akka clusters: when the pod hosting the Cluster Singleton (and thereby the Cluster Sharding coordinator) is stopped during a rolling update, the singleton moves to the next oldest pod in the cluster. Without a guarantee that the oldest pods will be rotated last, the Sharding coordinator may move frequently at the same time as shards are being rebalanced, which can lead to instability, reduced performance with dropped or delayed requests, and longer rolling update durations.
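To make the "same time bucket" idea concrete, here is a minimal Python sketch. It assumes base-2 buckets over pod age in seconds and a uniform random pick within the youngest bucket; the function names and pod ages are hypothetical, and the real Kubernetes controller logic is more involved than this.

```python
import math
import random

def age_bucket(age_seconds):
    """Assumed bucketing: floor(log2(age)), so each bucket spans a doubling of age."""
    return int(math.log2(max(age_seconds, 1.0)))

def pick_victim(pod_ages, rng):
    """Pick a pod to remove: youngest bucket first, random choice inside that bucket."""
    youngest = min(age_bucket(age) for age in pod_ages.values())
    candidates = sorted(name for name, age in pod_ages.items()
                        if age_bucket(age) == youngest)
    return rng.choice(candidates)

# Hypothetical ages in seconds; pod-a is the oldest and would host the singleton.
pods = {"pod-a": 7000.0, "pod-b": 6000.0, "pod-c": 5000.0, "pod-d": 4200.0}

# All four ages fall between 2**12 and 2**13 seconds, so they share one bucket.
buckets = {name: age_bucket(age) for name, age in pods.items()}

# Across repeated picks, the oldest pod is as likely to be chosen as any other.
victims = {pick_victim(pods, random.Random(seed)) for seed in range(50)}
```

Under strict youngest-first ordering, pod-d would always be removed first. With bucketing, pods started within a similar time window (as is common after a previous rollout) are tied, so the oldest pod can be picked at any step.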
To address this issue, Akka Management v1.3.0 introduces a new feature that sets the pod deletion cost annotation automatically. This annotation tells Kubernetes to prioritize scaling down pods that have a lower deletion cost, reducing the likelihood that the oldest node in the cluster will be scaled down early in the rolling update process. As the cluster topology changes, this Akka extension ensures the pod-deletion-cost value is updated for the N oldest nodes (3 by default).
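The idea can be sketched in a few lines of Python. The annotation key `controller.kubernetes.io/pod-deletion-cost` is the one Kubernetes defines; the concrete cost values, the step between them, and the function names below are assumptions chosen for illustration, not the values the Akka extension necessarily uses.

```python
# Kubernetes annotation key for pod deletion cost.
POD_DELETION_COST = "controller.kubernetes.io/pod-deletion-cost"

def deletion_costs(members_oldest_first, annotated_pods=3, highest_cost=10000, step=100):
    """Return {pod_name: cost} for the N oldest members; the oldest gets the highest cost.

    highest_cost and step are illustrative assumptions.
    """
    return {
        name: highest_cost - i * step
        for i, name in enumerate(members_oldest_first[:annotated_pods])
    }

def removal_order(members, costs):
    """Order pods as a cost-aware scale-down would prefer them: lowest cost first.

    Pods without the annotation are treated as cost 0.
    """
    return sorted(members, key=lambda name: costs.get(name, 0))

# Hypothetical cluster members, sorted oldest first.
members = ["pod-a", "pod-b", "pod-c", "pod-d", "pod-e"]
costs = deletion_costs(members)
order = removal_order(members, costs)
```

In this sketch the two youngest pods carry no annotation and are removed first, while the oldest pod, which hosts the singleton, goes last. When membership changes, recomputing `deletion_costs` for the new member list mirrors the "updated for the N oldest nodes" behavior described above.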
We tested the new feature under two different scenarios: a 5-node cluster with a steady load of 150 requests per second, and a 10-node cluster with a steady load of 100 requests per second. For each of these scenarios, we forced the application into a rolling update of all nodes while processing the incoming requests and ran the test against two versions of the same application (with and without the new feature enabled). For each scenario, we ran the test multiple times, recording the total number of requests processed and the response time for each request, and finally aggregating the runs into averages per scenario.
In the first scenario, where we used a 5-node cluster, the differences between the applications with and without the new feature were hardly noticeable, apart from very small decreases in P99 and max values for request times. Both with and without the feature, the number of failed requests was very similar and insignificant (less than 5 in 40000 total requests).
When we ran the same tests against a 10-node cluster, the results became more interesting. There, we saw significant improvements when the new feature was enabled: P95, P99, and max values for request time all decreased by around 50% (e.g., P99 dropped from ~5.7s to ~2.5s), and the percentage of failed requests fell from an average of ~8.5% to ~1.2%.
Note that these numbers are averages of multiple runs. In reality, when the singleton is forced to jump between nodes randomly during a rolling update, there is still a chance it ends up on a fresh node on the first try, in which case the behavior is similar to using the new feature. We observed exactly that in our runs without a pod-deletion-cost set: sometimes, we were lucky enough that the rolling update worked fine even with a 10-node cluster. However, it often went badly, and a bad run can produce a much higher rate of dropped and delayed requests than the above averages suggest.
Although we firmly believe that the use of this feature will translate into smoother and more efficient rolling updates for Akka cluster applications, such impact depends on the specifics of each application and deployment setup. Thus, we encourage you to try it out for yourself!
To take advantage of this extension, you will need to include the new Akka dependency akka-rolling-update-kubernetes (from Akka Management v1.3.0) and add some configuration. Additionally, in your Kubernetes setup, the pod will need to be able to annotate itself. For details on how to configure and use this Pod Deletion Cost extension, refer to the official documentation.
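For a sense of what "annotate itself" means at the Kubernetes API level, the Python sketch below builds the kind of JSON merge patch that sets the annotation on a pod, along with the API path it would be sent to. The helper names are hypothetical, and this is not how the extension is configured; the extension handles the call for you. The point is that the pod's service account needs permission to patch pods in its namespace.

```python
import json

# Kubernetes annotation key for pod deletion cost.
POD_DELETION_COST = "controller.kubernetes.io/pod-deletion-cost"

def deletion_cost_patch(cost):
    """Build a JSON merge patch body that sets the deletion-cost annotation."""
    # Annotation values are strings in the Kubernetes API, so the cost is stringified.
    return json.dumps({"metadata": {"annotations": {POD_DELETION_COST: str(cost)}}})

def pod_path(namespace, pod_name):
    """Core API path of a pod; a PATCH here needs the 'patch' verb on pods via RBAC."""
    return f"/api/v1/namespaces/{namespace}/pods/{pod_name}"

# Hypothetical namespace, pod name, and cost.
body = deletion_cost_patch(10000)
path = pod_path("default", "pod-a")
```

Such a body would be sent as a PATCH request with the content type `application/merge-patch+json`. If the service account lacks that permission, the request is rejected and the annotation is never set, so it is worth checking the RBAC rules before relying on the feature.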