The 7 Ways to Wash Dishes and the Case for Message-driven Reactive Systems
Understanding Async, Non-Blocking, Concurrent, Parallel and More
I’ve been working for several years to try to find a meaningful way to describe the core concepts of building efficient, Reactive applications - being asynchronous and non-blocking while minimizing concurrency and supporting linear scalability by enhancing parallelism. That is a veritable soup of esoteric terms that are difficult to grasp for even the most experienced developers. Yet understanding them is critical to building truly Reactive applications.
In the past few months, I think I may have found a way to express these concepts more clearly. It also highlights some other interesting concepts, such as pipelining, batching, fork/join and Amdahl’s Law via an everyday metaphor. This kind of real-world analogy has always helped me understand concepts, and when I’ve run through this one with customers, they’ve found it to be helpful as well.
I like to wash dishes. Seriously, I do. And while you may think to yourself, “That’s pretty odd,” anyone who has met me will tell you that sums me up pretty well. Not only that, I’m one of those particularly strange people who not only hand-washes dishes, but also puts them into the dishwasher when I’m done for further cleaning. I’m quite scarred from having a college roommate who would eat cheesy pasta, let the dishes and pots sit on the sink for days, and then think the 1980s era dishwasher would simply remove all of the crusty food he left behind. In using this analogy, I’ve discovered a few more people out there who have similar dish neuroses, which has been gratifying for me in coping with my obsessive-compulsive dishwashing behavior. But this story is not about my need for professional therapeutic help.
Each night after dinner, I have a stack of dishes on my sink that need to be cleaned. This process is much like what you see in the handling of streaming data. I use a pipelined process of transformations: rinsing the crud off of each dish, scrubbing it in soapy water, rinsing it again to remove the soap and putting the dish into the dishwasher. This process is purely synchronous (all being handled by one processor/person) and sequential (cannot be reordered), as I can only do one task at a time. I could send each dish into the work pipeline individually, or I could loosen my sequential guarantees and try to batch the dishes through each transformation in the pipeline by rinsing all of the dishes first, scrubbing all of the dishes next, rinsing all of them again and then placing them all into the dishwasher as a group. In doing so, I may have increased my performance marginally by increasing the locality of the data (each dish) to the place of execution (the faucet, the scrubbing sponge, the dishwasher, etc), but my performance is bound by the fact that I’m still only a single processor doing all of the work.
Imagine I have a good friend who also loves dishwashing. Knowing how much he or she also enjoys washing dishes, I invite that person over to my house to help out one evening. I’ve now created a thread pool - multiple threads of execution which may or may not contribute to the work I’d like to accomplish. When that person arrives, I show them the stack of dishes and ask them if they would start washing them. This friend of mine goes to the sink and starts working dishes through the pipelined process. I have now spawned asynchronous work, just as if I had used a Java Callable or Runnable on a thread pool. If I stand behind them and wait for them to finish and do not do anything else, I am a blocked thread. I have spawned the work to be done and delegated it to some thread of execution who will perform the work at an arbitrary time, but I am not doing anything until that other task is completed, much like a Java Future instance. Instead of standing around, I could go have a lemonade, and now I am non-blocking (like a Scala Future or Java 8 CompletableFuture) but also not productive to the task at hand. And my friend is likely becoming quite irritated with me. Moreover, unless they are a considerably more efficient washer of dishes than I am (a faster processor, for example), the work is not getting done much faster.
What I really want is for both of us to be contributing towards getting this work done more efficiently and faster. At this point, I join my friend in performing the work. My friend is responsible for grabbing a dish from the stack, rinsing it and scrubbing it. I take the dish from them at that point, rinse it again and put it into the dishwasher. I am now non-blocking and productive to the task, but by staging the work this way, we have shared resources that affect our ability to do our work optimally. As the thread of execution responsible for handling work delegated by my friend, I have to wait for each dish, which could take an indeterminate amount of time to be scrubbed depending on how dirty it is (the essence of CPU-intensive work). Worse, we both have to use the faucet to rinse dishes, so we have a mutually exclusive operation over state (the faucet) that must be arbitrated via communication, much like the arbitration of contended, mutually exclusive (mutex) locks by the kernel of a computer. This is the essence of concurrency, typically over shared mutable state. If the state (the faucet) is uncontended (not being used by either of us at the time it is required), we can quickly progress through our tasks. But if there is contention (one of us is using the faucet when the other one needs it), we are stuck waiting until they other is done to progress.
The way to ameliorate concurrency and contention is to increase footprint. If my house was big enough that I had another sink, I could take a stack of dishes and go do my work independently of my friend, and that would be more efficient. Think of this as similar to parallel collections in Java 8 or Scala using a ForkJoinPool. I’m forking the work by grabbing a stack of dishes, my friend and I are performing the work as the available cores of execution, and we will have a join when we need to reassemble the dishes into the dishwashing machine. However, like parallel collections, the fork and join phases are still concurrent - we must divide the data, the dishes, between ourselves to be processed, and we must join the data (again, the dishes) in the transformed collection (the dishwasher).
This fork can be cheap or expensive, depending on how the work is distributed. If I were to simply grab the top half of the plates, it’s a simple operation. However, what if I want to distribute them between my friend and I in some ordered fashion? That would be significantly more costly. And the join point can have similar costs depending on order as well. This brings up the cost of Amdahl’s Law - even though we parallelized the work, it could still take longer than if we did it sequentially if the time to fork and join the work is too high.
We need for our work to be as parallel as possible in order to maximize our processing efficiency. To do this, we need to increase our footprint even more. It would be ideal if we could somehow broker the dishes to both my friend and I without us trying to figure out who is doing what, or for us to steal work from a common queue. And if I have two dishwashers, I don’t have contention on a single join point. This increase in footprint does mean additional cost, but the decrease in concurrency and increase in parallelism means that my scalability has become linear - as I add more processors/sinks/dishwashers, I increase the number of dishes I can process (and data I can transform) by the same factor. The value of this increased scalability and efficiency may well justify the increase in cost of commodity hardware to my business.
In the end, my ultimate goal is to create asynchronous, non-blocking and parallelized execution with minimal points of concurrency. By brokering the work, I am treating each dish (or batch of dishes) as a message, where I do not care which pipeline performs the transformation (washing of dishes). This is the essence of location transparency, which drives elasticity in a Reactive system by allowing you to spin up additional nodes to handle increasing workloads, or shut down nodes as work decreases. Location transparency is also supportive of Resilience because you don’t care that a pipeline failed and a dish that never successfully completed its transformation to cleanliness can be re-handled. And in being both elastic in the face of changing load and resilient to failures of every kind in a system, a responsive user experience can be assured. In this way, Message Driven architectures are the essence of Reactive applications.