Going From Java EE To Cloud-Native, Real-Time Streaming Applications

Markus Eisele Director Developer Advocacy, Lightbend, Inc.

The Shift Towards Real-Time Streaming Systems

Addressing the shortcomings of traditional monolithic Java EE applications, and of the heavyweight middleware and infrastructure needed to run them, requires a shift in thinking. Both systems and the organizations that build them must become more flexible: they must adapt to complex environments, roll out changes quickly without rigid dependencies and coordination, adjust their behavior to the situation at hand, scale massively at any moment without compromising the infrastructure, and, most importantly, rapidly identify, isolate, and self-heal from failure at any level. The steps for achieving these goals include:

Design distributed systems, ideally following Reactive principles
On the path to microservices, take a lesson from Domain-Driven Design (DDD)
Prioritize resilience before thinking about scaling elastically in the cloud
Utilize a streaming architecture to achieve distribution, concurrency, supervision, and resilience

Design distributed systems following Reactive principles

Designers need to build systems for flexibility and resiliency, not just efficiency and robustness. This means redesigning existing Java EE applications into more flexible modules that are self-contained and autonomous and that can be scaled independently, because each module is responsible for its own business context, from individual features down to the relevant data. To make it easier for business leaders, IT professionals, and third-party vendors to innovate and collaborate around these new systems, the Reactive Manifesto established a common vocabulary for these requirements. Reactive systems are Responsive, Resilient, Elastic, and Message-Driven:

*Image courtesy of www.reactivemanifesto.org*

Message-driven means more than just non-blocking I/O. Reactive systems at their foundation are powered by asynchronous, non-blocking, message-driven communication. This enables supervision, isolation, and replication of failed processes.
Resilience goes further than fault tolerance. The ability to self-heal in an automated and predictable way is treated as part of the full service/application lifecycle and made possible by a message-driven approach to communication.
Elasticity means efficient, cost-conscious scalability. A message-driven foundation enables a level of indirection and loose coupling. This helps create systems that can boost performance by scaling out, as well as up, across all physical and cloud infrastructure during busy times, and lower costs by dynamically scaling in/down during slow times.
Responsive systems always serve customers. Reactive systems provide a consistently responsive user experience that remains highly available during busy times and isn’t susceptible to blocked processes and cascading failures.
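
To make the message-driven idea concrete, here is a minimal sketch in plain Java. It is not taken from any Lightbend product: `PriceRequest`, `PriceService`, and the pricing rule are hypothetical names invented for illustration, and a single-threaded executor stands in for a real mailbox. The point it tries to show is that the caller sends a message and gets a future back without blocking, and that a failure comes back as a value the caller can react to rather than as an exception unwinding the caller's stack.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical message: a request plus the future on which the reply is delivered. */
final class PriceRequest {
    final String sku;
    final CompletableFuture<Integer> reply = new CompletableFuture<>();

    PriceRequest(String sku) {
        this.sku = sku;
    }
}

/** A service that is reached only by sending it messages. */
final class PriceService implements AutoCloseable {
    // The single-threaded executor acts as a mailbox: messages are handled one at a time, in order.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();

    /** Enqueues the message and returns immediately, so the caller is never blocked. */
    CompletableFuture<Integer> tell(PriceRequest msg) {
        mailbox.execute(() -> {
            try {
                msg.reply.complete(lookup(msg.sku));
            } catch (RuntimeException e) {
                // The failure is delivered as a reply instead of being thrown at the sender.
                msg.reply.completeExceptionally(e);
            }
        });
        return msg.reply;
    }

    private int lookup(String sku) {
        if (sku.isEmpty()) {
            throw new IllegalArgumentException("empty sku");
        }
        return sku.length() * 100; // placeholder pricing rule, an assumption for this sketch
    }

    @Override
    public void close() {
        mailbox.shutdown(); // already queued messages are still processed
    }
}

public class MessageDrivenDemo {
    public static void main(String[] args) {
        try (PriceService service = new PriceService()) {
            CompletableFuture<String> ok = service.tell(new PriceRequest("abc"))
                    .thenApply(cents -> "price=" + cents);
            CompletableFuture<String> failed = service.tell(new PriceRequest(""))
                    .thenApply(cents -> "price=" + cents)
                    .exceptionally(e -> "fallback");
            System.out.println(ok.join());     // prints "price=300"
            System.out.println(failed.join()); // prints "fallback"
        }
    }
}
```

A production system would more likely use an actor or streaming toolkit with bounded mailboxes and backpressure; the sketch only illustrates the non-blocking request/reply shape.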

Take a lesson from Events-First Domain-Driven Design (DDD)

Rather than thinking of microservices architecture as Service-Oriented Architecture (SOA) 2.0, developers now have the Reactive Manifesto to help them apply the principles of Reactive systems to real-world domains. The requirements of a microservices architecture can best be identified with the help of Domain-Driven Design (DDD), a design approach that recommends modeling systems on real-world domains, taking into account the elements and behavior of each business domain and the interactions between them.

Microservices operate on principles similar to those of DDD. Each microservice owns its data and is responsible for a specific feature or piece of functionality, and the services must be able to work together as a system that provides a cohesive whole. A good rule of thumb is to group together services that change for the same reason and to separate services that change for different reasons. This can be achieved by designing systems that:

Use encapsulation to improve flexibility. Microservices must encapsulate all internal implementation details so that external systems utilizing them in the cloud or on-premise never need to worry about the internals. Encapsulation reduces the complexity and enhances the flexibility of the system, making it more amenable to changes.
Apply isolation to encourage loose coupling and avoid the cascade effect. Changes to a single microservice should have no negative impact on other services. Because synchronous communication introduces a host of interrelated dependencies, this principle aligns with the message-driven approach to distributed systems: microservices communicate through asynchronous, non-blocking streams. As in SOA, technology-neutral RESTful APIs are more suitable than Java RMI, because RMI forces a specific technology on other system components.
Separate domains of concern to reduce complexity. Creating microservices around distinct functions, with no overlap of concerns with other components, reduces the complexity of interactions between services. While it is important for each microservice to own its data, there is considerable flexibility in how that data is stored. The data may, of course, be stored in a single traditional database. However, it is also common for a microservice to store its data in multiple databases, for example in a SQL database for flexible queries and also in Elasticsearch for more advanced search. Another common approach is to save every data change request in an event log and also store the data in a more queryable form in one or more databases, a combination known as Event Sourcing and CQRS. The advantage is that the event log can be treated as a stream, allowing consistent and resilient propagation of state changes throughout the system.
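
The Event Sourcing and CQRS idea mentioned in the last item can be sketched in a few lines of plain Java. This is an illustrative, single-threaded, in-memory toy under stated assumptions, not a reference implementation: `AmountChanged`, `EventLog`, and `BalanceView` are hypothetical names, and a real system would persist the log and deliver events asynchronously. The write side only appends immutable events; the read side is a separate, queryable projection that is built by consuming the log as a stream, and it can be rebuilt at any time by replaying that log.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical domain event: an immutable fact about an account. */
final class AmountChanged {
    final String accountId;
    final long delta;

    AmountChanged(String accountId, long delta) {
        this.accountId = accountId;
        this.delta = delta;
    }
}

/** Write side: an append-only log that is the source of truth. */
final class EventLog {
    private final List<AmountChanged> events = new ArrayList<>();
    private final List<Consumer<AmountChanged>> subscribers = new ArrayList<>();

    /** Registers a consumer and first replays the history so late subscribers catch up. */
    void subscribe(Consumer<AmountChanged> subscriber) {
        events.forEach(subscriber);
        subscribers.add(subscriber);
    }

    /** Appends the event and pushes it to every current subscriber. */
    void append(AmountChanged event) {
        events.add(event);
        subscribers.forEach(s -> s.accept(event));
    }
}

/** Read side: a queryable projection derived purely from the events. */
final class BalanceView {
    private final Map<String, Long> balances = new HashMap<>();

    void apply(AmountChanged e) {
        balances.merge(e.accountId, e.delta, Long::sum);
    }

    long balance(String accountId) {
        return balances.getOrDefault(accountId, 0L);
    }
}

public class EventSourcingDemo {
    public static void main(String[] args) {
        EventLog log = new EventLog();
        BalanceView view = new BalanceView();
        log.subscribe(view::apply);

        log.append(new AmountChanged("acct-1", 100));
        log.append(new AmountChanged("acct-1", -30));
        System.out.println(view.balance("acct-1")); // prints 70

        // A projection added later is rebuilt from the log alone.
        BalanceView lateView = new BalanceView();
        log.subscribe(lateView::apply);
        System.out.println(lateView.balance("acct-1")); // prints 70
    }
}
```

Because the projection holds no state that cannot be derived from the log, a second store (for example a search index) could be added as just another subscriber.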

Prioritize resilience before thinking about elastic scaling in the cloud

Most applications are designed and developed for blue skies. But all software across all time has failed and will continue to fail. Today’s applications, therefore, must be designed with the inevitability of failure in mind.

With cloud-based microservices architectures, things are even more complex. These applications are composed of a large number of individual services, so measuring availability and recovering from failures touches every part of the application.

These new requirements force designers to reconsider how they incorporate error handling and fault tolerance into applications. Modern applications must be resilient on all levels, not just a few. Reactive systems, therefore, place a critical focus on resilience, which enables systems to self-heal automatically and lets teams treat routine errors and failures with confidence because they are handled quickly.

The key to achieving this is message-driven service interaction, using streams, which automatically provides the core building blocks that enable systems to be resilient against failures at many different levels. In turn, these building blocks serve as a rock-solid foundation that is capable of scaling in and out elastically across all system resources.

Automate supervision to minimize human intervention. The goal of building resilience against failures into systems is to minimize human intervention. Supervision—the ability to identify successful or unsuccessful task completion across the entire system—is at the core of system performance, endurance, and security. Supervision based on a message-driven approach enables location transparency so that processes can run and interact on completely different cluster nodes as easily as in-process on the same VM.
Isolate and contain failures to enable self-healing. When isolation is in place, systems can separate different types of work based on a number of factors, like the risk of failure, performance characteristics, CPU and memory usage, etc. Failure in one isolated component won’t impact the responsiveness of the overall system and the failing component will have a chance to heal. A dedicated separate error channel allows redirection of an error rather than just throwing it back to the caller.
Master resilience and elasticity to achieve system responsiveness. Modern applications must be resilient at their core in order to scale and remain responsive under a variety of real-world, less than ideal conditions. The result is a consistently responsive system ready for business.
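
The supervision and error-channel ideas above can be sketched as follows. This is a deliberately small, synchronous illustration in plain Java and not the API of any particular toolkit: `Supervisor`, `errorChannel`, and the restart budget are hypothetical, and real supervisors typically run children on separate threads or nodes and apply backoff between restarts. What the sketch tries to show is that a failing task is restarted without human intervention and that each failure is redirected to a dedicated channel instead of being thrown back at the caller.

```java
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical supervisor: restarts a failing task and reports failures on a separate channel. */
final class Supervisor {
    // Dedicated error channel: a monitoring component could consume failures from here.
    final BlockingQueue<Throwable> errorChannel = new LinkedBlockingQueue<>();
    private final int maxRestarts;

    Supervisor(int maxRestarts) {
        this.maxRestarts = maxRestarts;
    }

    /**
     * Runs the task and restarts it after a failure until the restart budget is used up.
     * Returns the result, or an empty Optional if every attempt failed.
     * Tasks are assumed to return a non-null value.
     */
    <T> Optional<T> supervise(Callable<T> task) {
        for (int attempt = 0; attempt <= maxRestarts; attempt++) {
            try {
                return Optional.of(task.call());
            } catch (Exception e) {
                errorChannel.add(e); // redirect the error rather than rethrowing it
            }
        }
        return Optional.empty();
    }
}

public class SupervisionDemo {
    public static void main(String[] args) {
        Supervisor supervisor = new Supervisor(3);
        AtomicInteger calls = new AtomicInteger();

        // A flaky task that fails on its first two invocations.
        Optional<String> result = supervisor.supervise(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("transient failure");
            }
            return "done";
        });

        System.out.println(result.orElse("gave up") + " after " + calls.get() + " calls, "
                + supervisor.errorChannel.size() + " errors recorded");
        // prints "done after 3 calls, 2 errors recorded"
    }
}
```

The caller only ever sees an `Optional`, so a failure inside the supervised task cannot cascade into the caller's own control flow.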

The Bottom Line

Traditional monolithic Java EE applications rely on technologies and architectural approaches that conflict with the properties needed to create resilient and scalable systems. This is seen most prominently in the use of a central database and distributed transactions to handle all distribution, concurrency, and supervision concerns. These techniques violate isolation: they couple together all the systems that use the database and coordinate transactions, which prevents resilience and scalability and leads to a lack of responsiveness.

A Reactive, streaming architecture provides distribution, concurrency, and supervision without a central database or distributed transactions, which allows the isolation between services that resilience and scalability depend on. Streaming architectures supervise operations and processes at the stream level, ensuring that operations progress through the consumption and distribution of streams.

This is where we can help. Lightbend helps developers create real-time streaming applications that are responsive, resilient, flexible, and message-driven, providing the perfect architecture for creating powerful, adaptable applications that thrive in the cloud.

Author

Twitter: @myfear GitHub: myfear

Markus is a Java Champion, former Java EE Expert Group member, founder of JavaLand, reputed speaker at Java conferences around the world, and a very well known figure in the Enterprise Java world.

May 14, 2019