With the release of 22.10 and specifically Akka Projections 1.3.0 we introduced a Brokerless Pub/Sub over gRPC. When we benchmarked it against Kafka we were very happy with the results seeing great reductions in latency.
We run this benchmark on typical cloud hardware, in this case in a GKE cluster in google cloud. We provision the hardware in advance and clear the database / message broker between runs.
We are using the latest version of Akka and a custom application for calculating the benchmark.
We run the software in 2 passes; Kafka and Akka’s Brokerless. We ignore the first messages until the liveness of both producer and consumer and we want to benchmark stable operation.
For the Kafka pass:
JAVA_TOOL_OPTIONS. Kafka is enabled by
For the Akka Brokerles pass:
JAVA_TOOL_OPTIONS command line argument value to switch broker to 'off'
Consumer lag (end-to-end latency) should be stable below a few seconds and not increase.
8 pods producer service (shopping-cart-service)
4 pods consumer service (shopping-analytics-service)
Akka Brokerless parameters:
This generated a stable load of 16,000 events/s. For both Kafka and Akka Brokerless the consumer lag (end-to-end latency) was below a few seconds and didn’t increase. The bottleneck was writing the events to the database on the producer side. CPU utilization of the database was close to 100%.
Throughput was measured by counting rows inserted into the event_journal table during a period of 60 seconds.
4 pods producer service (shopping-cart-service)
4 pods consumer service (shopping-cart-service)
Akka Brokerless parameters:
This configuration generated a stable load of 940 events/s with the following latency distribution.
View the results in the repo here.
Consumer lag (end-to-end latency) was measured by the difference between the timestamp when the consumer processed the event and the timestamp when the producer wrote the event. Note that this is comparing wall clock timestamps on different machines so clock skew may have introduced measurement errors, but it is assumed that clocks are fairly well synchronized in this Google Cloud GKE environment.
Consumer lag percentiles were collected with HdrHistogram over 3 minutes.
In our throughput tests we found that both Kafka and Akka Brokerless had great results and showed that the bottleneck was the database in both cases, however, latency results showed that Akka Brokerless Pub/Sub connections is vastly superior.
We use benchmarks as a way of validating ideas and looking for potential undiscovered issues under load. We want to avoid benchmarking to improve performance under very specific scenarios as this can lead to optimization fallacies. This test for example only validates one size of message and we may see changes for tiny and large messages.
We also find benchmarks to be a useful tool for development. In developing this benchmark we were able to diagnose and repair issues in an early version of this software. It enabled us to validate our hypothesis that latency would be positively affected by this change and we were able to verify the benefit of this feature both in overhead saving and performance gains. We are pleased with this result and it helps pave the way for more Akka features that will allow you to ensure your data is where it needs to be when it needs to be there.