In our 2019 survey, Streaming Data And The Future Tech Stack, over 800 software engineers from around the world shared their experiences. One of the most interesting results is that the share of companies processing data in real time for AI/ML use cases grew more than five-fold from 2017 to 2019.
In our 2017 survey, only 6% of respondents reported using stream processing for AI/ML applications. Just over one third have now adopted it, an increase of roughly 500% in two years.
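To make the arithmetic behind these figures concrete, here is a minimal sketch that separates "fold change" from "percent increase", two measures that are easy to conflate. The 6% figure comes from the survey. The exact 2019 share is not stated beyond "just over one third", so the sketch uses exactly one third as an assumed lower bound, and the function names are purely illustrative.

```python
def fold_change(old: float, new: float) -> float:
    """How many times larger the new value is than the old one."""
    return new / old


def percent_increase(old: float, new: float) -> float:
    """Growth relative to the old value, expressed in percent."""
    return (new - old) / old * 100.0


share_2017 = 6.0  # percent of respondents, as reported for 2017
# Assumption: "just over 1/3" is treated as exactly one third (a lower bound).
share_2019_lower_bound = 100.0 / 3.0

print(f"fold change:      {fold_change(share_2017, share_2019_lower_bound):.2f}x")
print(f"percent increase: {percent_increase(share_2017, share_2019_lower_bound):.0f}%")
```

Under that assumption the share is about 5.6 times its 2017 level, a rise of roughly 456%. An increase of exactly 500% would correspond to a 2019 share of 36%, which is also consistent with "just over one third".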
Respondents already using streaming for AI/ML also expect this trend to continue, with even broader use in the coming year. Other notable findings include:
Production-level adoption widened dramatically, with several use cases seeing big jumps over the last two years. The sharp rise in real-time processing for IoT pipelines, ETL and integration of different data streams indicates that organizations need to extract insights from their data and leverage advanced analytics (such as AI/ML) as quickly as possible.
Adoption of stream processing for business operations grew comparatively little, likely because operational insights and consolidated views of customers can usually tolerate the time lags associated with batch processing.
Similar stagnation is seen for traditional statistical analysis, which had previously seen wide consideration among companies that used Hadoop.
Dean Wampler, Ph.D. - VP of Fast Data Engineering at Lightbend
"These comparative results demonstrate the continued growth of stream processing, as we expected, due to the never-ending importance of finding a competitive advantage in all industries. Compared to batch, stream processing provides both faster access to valuable information and new capabilities that batch can’t provide, like interactive services that require near-instantaneous results.
The rapid growth of streaming ETL is evidence for the competitive advantage streaming provides. ETL is an old and established process, of course; however, the value of some information declines with time, so having it available sooner and leveraging it as soon as it’s available maximizes that value.
The combination of ML/AI and streaming is a good example of enabling new capabilities that were previously not possible. ML/AI automates and scales sophisticated analysis that previously required direct human intervention. Training of ML/AI models is still often done with periodic batch jobs, so they don’t grow stale, but using these models in streaming applications delivers immediate benefits.
None of these advantages would be possible without the right data sources integrated together quickly, which is why “integration of data streams” is the third area that has grown so much in the last two years."
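The pattern described in the quote, training a model in a periodic batch job and then applying it to records as they arrive, can be sketched in a few lines. This is only an illustration under stated assumptions: the "model" is a toy threshold rule rather than a real ML model, the stream is a plain Python iterable standing in for a streaming platform, and every name here (ThresholdModel, train_batch, score_stream) is hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class ThresholdModel:
    """Toy stand-in for a trained model: flags values far from the training mean."""
    center: float
    spread: float
    k: float = 3.0  # assumed sensitivity: how many standard deviations count as unusual

    def is_anomaly(self, value: float) -> bool:
        return abs(value - self.center) > self.k * self.spread


def train_batch(history: Iterable[float]) -> ThresholdModel:
    """Batch step: fit the model on accumulated historical data."""
    data = list(history)
    return ThresholdModel(center=mean(data), spread=pstdev(data))


def score_stream(model: ThresholdModel,
                 events: Iterable[float]) -> Iterator[Tuple[float, bool]]:
    """Streaming step: score each event as it arrives, without waiting for a batch."""
    for value in events:
        yield value, model.is_anomaly(value)


# Hypothetical data: a small historical batch and a few "live" events.
history = [10.0, 11.0, 9.0, 10.5, 9.5]
model = train_batch(history)

for value, flagged in score_stream(model, [10.2, 9.8, 25.0]):
    print(value, "anomaly" if flagged else "ok")
```

In a real deployment the batch step would be rerun on a schedule so the model does not grow stale, and the running stream would pick up each newly trained model as it becomes available.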