By now you have likely heard the news that IBM has made a strategic investment in Lightbend to bring Reactive solutions to IBM Platforms. IBM and Lightbend have a history of collaboration, but this strategic investment and relationship will allow enterprise developers to reap the benefits of both Lightbend’s expertise in Reactive microservices and IBM’s expertise in AI and Machine Learning.
In 2016, IBM and Lightbend began cooperating on mass education initiatives for the developer community with free university-style curriculums focused on Scala, Apache Spark and Reactive Programming. Within IBM, teams began using Akka and Scala for initiatives including OpenWhisk and Watson Data Platform, to name a few. Today, we are excited to formalize our collaboration to jointly deliver an integrated platform for this new era of computing.
In this new era, we see that the fundamental shift from “data at rest” to “data in motion” has accelerated. Our data used to be batched offline, but now it’s streaming online. Applications today need to react to changes in data in as close to real time as possible to perform continuous queries or aggregations of inbound data and feed it—also in real time—back into the application to affect the way it is operating. Many of these use cases revolve around enabling advanced AI and Machine Learning capabilities, such as those pioneered by IBM, to be applied as part of the application.
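To make the idea of a continuous aggregation over data in motion concrete, here is a minimal sketch in Python. It is not taken from any IBM or Lightbend product; the class name `SlidingWindowAverage` and its parameters are hypothetical, and it assumes events arrive in timestamp order.

```python
from collections import deque


class SlidingWindowAverage:
    """Keeps a running average over the most recent events (hypothetical helper)."""

    def __init__(self, window_seconds):
        if window_seconds <= 0:
            raise ValueError("window_seconds must be positive")
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def add(self, timestamp, value):
        """Ingest one event and return the updated average for the window."""
        self.events.append((timestamp, value))
        self.total += value
        # Evict events that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self.events[0][0] <= cutoff:
            _, old_value = self.events.popleft()
            self.total -= old_value
        return self.total / len(self.events)
```

Each call to `add` updates the result as soon as the event arrives, so the application can act on the latest value right away instead of waiting for an offline batch job.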
This type of stream data processing is about more than just extracting information faster; it’s about embracing wholesale change in how we build data-centric applications. The demands for availability, scalability, and resilience are forcing Fast Data architectures to become more like microservice architectures. Conversely, successful organizations building microservices find their data needs grow with their organization. Hence, there is a unification happening between data and microservice architectures to support distributed streaming.
Distributed streaming can be defined as partitioned and distributed streams that work with infinite streams of data for maximum scalability, as done in Apache Flink, Apache Spark Streaming and Apache Beam. It is different from application-specific streaming, performed locally within the service, or between services and client/service in a point-to-point fashion, which includes techniques such as Reactive Streams, Akka Streams, Reactive Socket, WebSockets, HTTP/2 and gRPC.
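As a rough illustration of what "partitioned" means here, the sketch below routes events to partitions by hashing their key. This is a generic technique, not the actual partitioner of Flink, Spark or Beam; the function names `partition_for` and `route` are hypothetical.

```python
import zlib
from collections import defaultdict


def partition_for(key, num_partitions):
    """Map a key to a partition using a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events, num_partitions):
    """Group (key, value) events by partition, preserving arrival order."""
    partitions = defaultdict(list)
    for key, value in events:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return dict(partitions)
```

Because every event with the same key lands in the same partition, each partition can be processed by a separate worker while per-key ordering is kept, which is what lets this style of streaming scale out.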
If we look at things from a distributed streaming perspective, microservices make great stream pipeline endpoints, bridging the application side of things with the streaming side. Here they can either ingest data into the pipeline—data coming from a user, generated by the application itself, or from other systems—or query it, passing the results on to other applications or systems.
From a microservices perspective, distributed streaming has emerged as a powerful tool alongside the application, where it can be used to crunch application data and provide analytics functionality to the application itself. It can help with analyzing both user-provided business data and the metadata and metrics generated by the application itself, something that can be used to influence how the application behaves under load or failure by employing predictive actions, all in real time.
Lately, we have also started to see distributed streaming being used as the data distribution fabric for microservices, where it serves as the main communication backbone in the application. The growing use of Apache Kafka in microservices architecture is a good example of this pattern.
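The pattern can be sketched with a toy in-memory log. This is only an illustration of the idea and not the Apache Kafka API; the class `TopicLog` and its methods are hypothetical, and it leaves out persistence, partitioning and networking entirely.

```python
class TopicLog:
    """Toy append-only log; each consumer tracks its own read offset."""

    def __init__(self):
        self.records = []   # the shared, ordered log
        self.offsets = {}   # consumer name -> next offset to read

    def publish(self, record):
        """Append a record and return its offset in the log."""
        self.records.append(record)
        return len(self.records) - 1

    def poll(self, consumer, max_records=10):
        """Return the next unread records for this consumer and advance its offset."""
        start = self.offsets.get(consumer, 0)
        batch = self.records[start:start + max_records]
        self.offsets[consumer] = start + len(batch)
        return batch
```

A producing service only appends to the log, and each consuming service reads at its own pace from its own offset, so services stay decoupled from one another while still seeing the same ordered stream of events.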
Another important change is that while traditional (overnight) batch processing platforms, like Hadoop, could get away with high latency and unavailability at times, modern distributed streaming platforms like Apache Spark and Apache Flink need to be Reactive. They need to scale elastically, reacting adaptively to usage patterns and data volumes; be resilient, always available, and never lose data; and be responsive, always delivering results in a timely fashion.
We are also starting to see more microservices-based systems grow to be dominated by data, making their architectures look more like big pipelines of streaming data.
Distributed streaming and microservices are unifying, both relying on Reactive architectures and techniques to get the job done. Enabling the huge investments that organizations are making in data science to be operationalized into real-time “run-the-business” applications means that there has never been a better time for IBM and Lightbend to team up. To stay current with the latest news, see lightbend.com/IBM.