Distributed Tracing, Split Brain Resolver, Ask Pattern, and more

We are pleased to announce a new release of Lightbend Telemetry, a suite of monitoring and observability tooling for Lightbend technologies. This release brings some highly requested capabilities to Telemetry, including:

  • Distributed tracing and context propagation over Kafka messages using Alpakka, including Lagom applications relying on Alpakka.
  • New metrics and Grafana dashboards for the Akka ask pattern and Lagom read-side processors.

Lightbend Telemetry provides deep observability into the inner workings of distributed applications built with Akka, Lagom, Play, Java, and Scala. This lets users tap into the “black box” of distributed systems via events, metrics, and distributed tracing. Telemetry integrates with a wide variety of tools, including Grafana, Prometheus, Elasticsearch, Jaeger, Zipkin, New Relic, and Datadog, as well as any tool that supports OpenTracing.

Alpakka Kafka Tracing

Telemetry 2.14 introduces support for automatic context propagation across instrumented services communicating via Kafka using Alpakka. Generation of tracing spans for these interactions can also be enabled.

In combination with the Akka Streams tracing support, this can provide end-to-end traces where your Akka streams communicate via Kafka (using Alpakka).

By default, context propagation across Alpakka Kafka is enabled but span generation is not. The documentation covers mechanisms for enabling span generation and other useful configuration options including filtering of sensitive headers/keys and disabling propagation altogether.
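To make the idea of context propagation concrete, here is a minimal conceptual sketch in plain Java. It is not Telemetry's or Alpakka's implementation: the class, the method names, and the `x-trace-id` header key are all hypothetical, and message headers are modelled as a simple map. It also assumes that "filtering of sensitive headers/keys" means keeping those keys out of what gets recorded on a span, which is only one plausible reading.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Conceptual sketch only: none of these names come from Telemetry or Alpakka.
class HeaderPropagationSketch {
    // Hypothetical header key that carries the trace identifier.
    static final String TRACE_HEADER = "x-trace-id";

    // Producer side: copy the outgoing headers and attach the current trace id.
    static Map<String, String> inject(Map<String, String> headers, String traceId) {
        Map<String, String> out = new LinkedHashMap<>(headers);
        out.put(TRACE_HEADER, traceId);
        return out;
    }

    // Consumer side: recover the trace id, if the producer attached one.
    static Optional<String> extract(Map<String, String> headers) {
        return Optional.ofNullable(headers.get(TRACE_HEADER));
    }

    // Assumed filtering behaviour: every header except the sensitive keys
    // may be recorded on a span.
    static Map<String, String> recordable(Map<String, String> headers, Set<String> sensitiveKeys) {
        Map<String, String> out = new LinkedHashMap<>();
        headers.forEach((key, value) -> {
            if (!sensitiveKeys.contains(key)) out.put(key, value);
        });
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> original = new LinkedHashMap<>();
        original.put("content-type", "application/json");
        original.put("auth-token", "secret");

        Map<String, String> sent = inject(original, "trace-42");
        System.out.println(extract(sent));
        System.out.println(recordable(sent, Set.of("auth-token")).keySet());
    }
}
```

With Telemetry the equivalent steps happen automatically inside the instrumentation, so application code does not need to touch the headers itself.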

Akka Ask Pattern Metrics

This release provides new metrics for the ask request-response pattern used in Akka and Lagom. The screenshots below show the Prometheus/Grafana-backed dashboard alongside an explanation of the metrics included.

Asked requests — The number of ask requests sent to actors in the selected time frame.

Requests failed — The number of times an actor responds with a Failure to a request message.

Expired requests — The number of messages that haven’t been answered before a specified timeout in the selected time frame.

Expired requests by-timeout — The number of requests that timed out in the selected time frame grouped by their specific timeout values.

Success response time — The time (in milliseconds) between a message being sent and an answer being received, before the specified timeout, counting only successful responses.

Late response time — The time (in milliseconds) between a message being sent and an answer being received, counting only responses received after their timeout.
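The relationship between the expired, success, and late measurements can be sketched in a few lines of plain Java. This is only an illustration of the definitions above, not how Telemetry computes them: the class and method names are hypothetical, and treating a reply that arrives exactly at the timeout as expired is an assumption.

```java
import java.util.OptionalLong;

// Conceptual sketch only: these names are illustrative, not Telemetry's API.
class AskMetricsSketch {
    // True when no reply arrived before the timeout (reply missing or too late).
    // Assumption: a reply landing exactly on the timeout counts as expired.
    static boolean isExpired(OptionalLong replyAfterMillis, long timeoutMillis) {
        return !replyAfterMillis.isPresent() || replyAfterMillis.getAsLong() >= timeoutMillis;
    }

    // Response time of a successful reply that arrived before the timeout.
    static OptionalLong successResponseTime(OptionalLong replyAfterMillis, long timeoutMillis, boolean failure) {
        if (failure || isExpired(replyAfterMillis, timeoutMillis)) return OptionalLong.empty();
        return replyAfterMillis;
    }

    // Response time of a reply that arrived only after the timeout had passed.
    static OptionalLong lateResponseTime(OptionalLong replyAfterMillis, long timeoutMillis) {
        if (replyAfterMillis.isPresent() && isExpired(replyAfterMillis, timeoutMillis)) return replyAfterMillis;
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        long timeout = 3000;
        System.out.println(successResponseTime(OptionalLong.of(120), timeout, false));
        System.out.println(lateResponseTime(OptionalLong.of(3500), timeout));
        System.out.println(isExpired(OptionalLong.empty(), timeout));
    }
}
```

In this model a late reply contributes to both the expired count and the late response time, while a request that never receives a reply contributes only to the expired count.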

Lagom Read-side Processor Metrics

This release provides new metrics for Lagom projection insights. The screenshots below show the Prometheus/Grafana-backed dashboard alongside an explanation of the metrics included.

Processing time — The time (in milliseconds) taken to process an event.

Events processed successfully (per projection) — The number of events processed successfully within the selected time period, broken down by projection.

Latency — The time (in milliseconds) between an event being created and the moment it is considered as fully processed (which includes updating the offset store in all cases).

Events processed successfully — The number of events processed successfully within the selected time period.

Query latency — The time (in milliseconds) between an event being created and the moment it is read for processing.

Event processing failures — The number of event processing failures that occurred within the selected time period.
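The three timing metrics differ only in which pair of timestamps they compare. The sketch below works through one event in plain Java. The class and field names are hypothetical, and measuring processing time from the moment the event is read until its handler finishes is an assumption, since the definition above does not say exactly where that interval starts and ends.

```java
// Conceptual sketch only: field names are illustrative, not Lagom or Telemetry APIs.
class ProjectionTimingSketch {
    final long createdAt;      // event was created (epoch millis)
    final long readAt;         // event was read by the read-side processor
    final long handledAt;      // event handler finished
    final long offsetStoredAt; // offset store was updated

    ProjectionTimingSketch(long createdAt, long readAt, long handledAt, long offsetStoredAt) {
        this.createdAt = createdAt;
        this.readAt = readAt;
        this.handledAt = handledAt;
        this.offsetStoredAt = offsetStoredAt;
    }

    // Query latency: from creation until the event is read for processing.
    long queryLatencyMillis() { return readAt - createdAt; }

    // Processing time (assumed interval): from read until the handler finishes.
    long processingTimeMillis() { return handledAt - readAt; }

    // Latency: from creation until fully processed, including the offset store update.
    long latencyMillis() { return offsetStoredAt - createdAt; }

    public static void main(String[] args) {
        ProjectionTimingSketch event = new ProjectionTimingSketch(1000, 1040, 1055, 1060);
        System.out.println(event.queryLatencyMillis());
        System.out.println(event.processingTimeMillis());
        System.out.println(event.latencyMillis());
    }
}
```

Under these assumptions, latency is always at least the sum of query latency and processing time, because it also covers the offset store update.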

Support for Open Source Split Brain Resolver

Akka 2.6.6 brings Lightbend’s Split Brain Resolver functionality into open-source Akka Cluster. Telemetry 2.14 supports emitting events from the now open-source Split Brain Resolver.

Upgrades to OpenTracing Dependencies

Telemetry 2.14 updates the versions of its OpenTracing-related dependencies:

  • OpenTracing now uses version 0.32
  • Datadog OpenTracing now uses version 0.43
  • Jaeger dependencies now use version 0.35.5
    • Jaeger dependencies are used to report tracing to both Jaeger and Zipkin backends

OpenTracing 0.32 emits deprecation warnings for methods that are removed in 0.33.

Fixes, Enhancements and Performance Improvements

This release also contains a number of fixes, enhancements and performance improvements that are fully documented in the release notes with links through to the relevant documentation where appropriate.

sbt Commercial Resolver Changes

To simplify the setup of Lightbend’s commercial libraries, from Lightbend Telemetry 2.13 onwards, the sbt-cinnamon plugin no longer requires the commercial resolver in the plugins.sbt file.

This change brings the behaviour in line with other commercial products (such as Akka Enhancements) and means that the commercial resolver only needs to be defined once at the project level in build.sbt. If the commercial resolver cannot be found, a warning will be generated in the logs.

Connect with Lightbend

As always, we’re interested in your feedback and ideas for improving the visibility Lightbend Telemetry provides into Reactive applications and distributed systems. We encourage our clients to reach out through the Lightbend Support Portal or to get in touch with their Lightbend representative.

Not a Lightbend client yet? Why not schedule a demo of Lightbend Telemetry with one of our excellent teammates? Let us know when the time is right for a chat:

SCHEDULE A DEMO
