Spark in Action
by Marko Bonaći and Petar Zečević
Working with big data can be complex and challenging, in part because of the multiple analysis frameworks and tools required. Apache Spark is a big data processing framework perfect for analyzing near-real-time streams and discovering historical patterns in batched data sets. But Spark goes much further than other frameworks. By including machine learning and graph processing capabilities, it makes many specialized data processing platforms obsolete. Spark's unified framework and programming model significantly lowers the initial infrastructure investment, and Spark's core abstractions are intuitive for most Scala, Java, and Python developers.