Audience: Developers and Architects
Technical level: Beginning-Intermediate
Why have real-time, stream-oriented data systems become so popular, when batch-oriented systems have served Big Data needs for many years? While batch-mode processing isn’t going away, it’s clear that exclusive use of these systems is now a competitive disadvantage.
In this 2nd Edition of the O’Reilly eBook, Dr. Dean Wampler examines the rise of streaming systems, known as Fast Data architectures, for handling time-sensitive problems like detecting fraudulent financial activity as it happens.
Using several open source tools, you’ll explore the characteristics needed to implement real-time, streaming data architectures. You will also learn that while these systems are much harder to build, they represent the state of the art for dealing with mountains of data that require immediate attention.
- Learn step-by-step how a basic Fast Data architecture works
- Understand why event logs are the core abstraction for streaming architectures, while message queues are the core integration tool
- Use methods for analyzing infinite data sets, where you don’t have all the data and never will
- Take a tour of open source streaming engines, and discover which ones work best for different use cases
- Get recommendations for making real-world streaming systems responsive, resilient, elastic, and message-driven
- Explore two example applications: data ETL and analysis, and predictive analytics in IoT (Internet of Things) for telemetry ingestion and anomaly detection in home automation systems