Going from concept to production in just 9 months...

Recently, we published a case study with our customer Swisscom, the leading mobile service provider in Switzerland. While the case study focuses more on the solution and implementation details of how Swisscom used Scala and Spark to build a fast data streaming platform from the ground up in just 9 months, there is more to the story.

To get a fuller picture, we sat down with Francois Garillot, Big Data Scientist at Swisscom.

In the following Q/A, Francois explains how the Mobility Insights  team at Swisscom unlocked new opportunities using comprehensive, real-time data for accurate population density mapping, city planning, and intelligent traffic routing, and discusses the decision making process behind working with Lightbend and technologies like Scala, Spark, Kafka, Cassandra and others.

How do a telecommunications provider’s customers benefit from real-time streaming data?

Francois Garillot (FG): To start with, Swisscom's network was deemed the overall best in the country in a 2015/2016 test by Connect magazine. Swisscom is committed to delivering a high-quality service to customers, and to that end uses data mining and machine learning techniques to analyze and monitor the functionality of its cellular network.

But a benefit just as valuable is being able to leverage the data generated by the network into a deep, dynamic analysis of the terrain and daily lives of the population of Switzerland. This sort of analysis becomes particularly insightful when it helps customers make decisions in real time, as movements and events happen.

We can derive aggregates of the location of populations from the regular functioning of the mobile network. The participants in this aggregated view represent a fraction of mobile devices across Switzerland roughly equal to Swisscom's market share. This asset can be turned into insight through the use of various data analytics and machine learning techniques, and the team in charge of this is called Mobility Insights.

The focus of this work is not on individuals: Swisscom is of course happy to comply with data privacy and telecommunications regulations, and the bread and butter of this team is hence analyzing the terrain and the geography of mobility for populations. Moreover, we are dealing only with anonymized data. Yet the opportunities for impactful insight are legion.

Why was it important to shift from annually-collected data to real-time origin destination matrices?

FG: Our mobile users’ devices contribute anonymously to a real-time view of the head count of a population group in a given location, which can be used to derive a density map of the area. This data can be refined into density maps with close to street-level granularity, superseding the census surveys of old, which tended to be static, outdated and limited. This information is far more relevant than a census: it can be obtained continuously, in real time, as opposed to a manual headcount collected by a government administration on a yearly basis (at best) and from a small sample.
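As a rough illustration of the aggregation involved (a sketch with invented names like LocationPing and densityMap, not Swisscom's actual pipeline), a density estimate can be obtained by counting distinct anonymized devices per grid cell and scaling by the operator's market share:

```scala
// Hypothetical event: an anonymized device observed in a coarse grid cell.
case class LocationPing(deviceHash: String, gridCell: (Int, Int))

// A density map is a count of distinct devices per grid cell, scaled by the
// inverse of the operator's market share to estimate the total population.
def densityMap(pings: Seq[LocationPing], marketShare: Double): Map[(Int, Int), Long] =
  pings.groupBy(_.gridCell).map { case (cell, ps) =>
    cell -> math.round(ps.map(_.deviceHash).distinct.size / marketShare)
  }

val sample = Seq(
  LocationPing("a", (0, 0)), LocationPing("a", (0, 0)), // same device, counted once
  LocationPing("b", (0, 0)),
  LocationPing("c", (1, 2))
)
// With a 50% market share, 2 observed devices in cell (0, 0) suggest ~4 people.
val density = densityMap(sample, marketShare = 0.5)
```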

A second insight that can be built from this stepping stone is an origin-to-destination matrix, which can be used to figure out where people come from and where they are going with respect to a given area of interest. This is valuable, for example, for civil engineers working on the infrastructure needed to travel through a location, but also for traffic planning and transportation planning, and ultimately it can serve as one of the building blocks of a smart city.
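The idea of an origin-destination matrix can be sketched in a few lines of Scala; the Trip record and zone names here are illustrative, not Swisscom's actual schema:

```scala
// Hypothetical trip record keyed by administrative zone; names are illustrative.
case class Trip(originZone: String, destinationZone: String)

// The origin-destination matrix: trip counts per (origin, destination) pair.
def odMatrix(trips: Seq[Trip]): Map[(String, String), Int] =
  trips.groupBy(t => (t.originZone, t.destinationZone)).view.mapValues(_.size).toMap

// Inflows into one zone of interest, broken down by origin.
def inflows(trips: Seq[Trip], zone: String): Map[String, Int] =
  trips.filter(_.destinationZone == zone).groupBy(_.originZone).view.mapValues(_.size).toMap

val trips = Seq(
  Trip("Bern", "Fribourg"), Trip("Bern", "Fribourg"),
  Trip("Lausanne", "Fribourg"), Trip("Fribourg", "Bern")
)
val intoFribourg = inflows(trips, "Fribourg") // Map("Bern" -> 2, "Lausanne" -> 1)
```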

The end result can be visualized in the next figure, which shows a dynamic display of sections of Switzerland that the user can select — here showing the city center of Fribourg — with the flow of people going in and out represented as arrows of proportional size:

Ideal users of this kind of visualization include organizations working in:

  • Civil engineering
  • Public transportation planning
  • Administration and location management
  • Marketing
  • Tourism

The above analysis is centered on administrative segments of Switzerland — in this case, neighborhoods. But some similar work can be performed focusing on a landmark or a particular point of interest.

We tested this recently on football and hockey stadiums — with the intent of helping people interested in those highly commercial locations understand where the spectators come from and where they go back to after a particular event. Here you can see the result when focusing on the Fribourg-Gotéron stadium:

Comparing the two displays produced for the origin-destination matrices of the town of Fribourg (CH) and of the corresponding Gotéron hockey stadium reveals a few key differences in the way the town and the stadium attract visitors.

On the one hand, the town of Fribourg is very metropolitan, attracting visitors from Swiss cities such as Bern, Zurich and Lausanne. The stadium, on the other hand, has a mostly local audience, attracting people from nearby towns including Payerne and Romont. And this even though the image above shows a match day against Bern.

The practical actions that can be taken from that apparently simple piece of insight are many: they include ideas on how public transport could differ during sporting events and whether some train or bus lines should run at special times. They also include which roads are the most important to work on, whether from the point of view of the city or of the stadium’s administrator.

This insight becomes even more interesting when it is generated over time, letting an analyst pick up on patterns as well as — given real-time processing — exceptional events.

How did Scala and Spark help you build a low-latency, scalable platform?

FG: It was critical that Swisscom be able to deliver to customers and stakeholders a detailed analysis at the scale of a country, with very low latency, on a scalable and productive data analytics platform.

Swisscom is able to reconstruct anonymized histories of visits, figuring out short trips out of a string of network events, using Apache Spark.
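The trip-extraction step amounts to sessionization: splitting a device's event stream wherever a long silence occurs. Here is a minimal sketch of that idea in plain Scala (the 30-minute gap threshold is an assumed parameter, and production code would run this per device inside Spark after grouping by key):

```scala
// Split a chronologically ordered list of event timestamps (in minutes) into
// trips: a new trip starts whenever the silence since the previous event
// exceeds maxGap.
def splitIntoTrips(timestamps: List[Long], maxGap: Long): List[List[Long]] =
  timestamps.foldLeft(List.empty[List[Long]]) {
    case (Nil, t) => List(List(t))
    case (current :: done, t) =>
      if (t - current.head <= maxGap) (t :: current) :: done
      else List(t) :: current :: done
  }.map(_.reverse).reverse

val events = List(0L, 5L, 12L, 200L, 204L)
val trips = splitIntoTrips(events, 30L) // two trips: 0..12 and 200..204
```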

Those histories are then aggregated into density maps and origin-destination matrices that allow thorough analysis and planning of areas. Users can drill down to the municipality level, or even further, down to specific points of interest. The code for the analysis is written in Scala, a programming language of which Lightbend is nowadays the principal maintainer.

In developing this analysis, Swisscom was happy to find that some promises of the language did indeed deliver significant advantages. We enjoyed the great interoperability with Java libraries and legacy Java code. Writing code for scientific computation today means leveraging the existing Scala ecosystem (such as the Breeze library) as well as the landmark Java libraries in the field, while also being able to plug in native code in the back end for performance (something Spark achieves through netlib-java). This seamless interface with the Java world saved a lot of time that would otherwise have been spent reimplementing basic computations in an optimal manner.

Moreover, Scala is a strongly typed language, which helped guard against common bugs such as unit mix-ups. But that is an advantage of types which needs no further introduction. On the other hand, we found ourselves using short time series repeatedly in our Spark and Scala code, and for various reasons (including performance and tail recursion) had to ask ourselves repeatedly about the ordering of these time series: is the list of events I’m receiving in chronological order? Is it in ante-chronological order (starting with the latest events)?

To ensure this — and avoid bugs — at the call site, we were able to implement a custom version of unboxed tagged types in less than thirty lines of code, helping us check our assumptions about the order of sequences of events. Without Scala’s rich type system and implicits, we would have had to rely on tests alone.

trait Chronological
trait AnteChronological

/**
 * Provides unboxed tagged types for tracking sortedness of user histories
 * @see http://etorreborre.blogspot.de/2011/11/practical-uses-for-unboxed-tagged-types.html
 */
type Tagged[U] = { type Tag = U }
type @@[T, U] = T with Tagged[U]

type ChronoHistory = List[UEupdate] @@ Chronological
type AnteChronoHistory = List[UEupdate] @@ AnteChronological

implicit class chrono(l: List[UEupdate]) {
  /**
   * These coercions should only be used in cases where it's locally clear the history is
   * (ante)chronological.
   */
  def asChrono: ChronoHistory = l.asInstanceOf[ChronoHistory]
  def asAnteChrono: AnteChronoHistory = l.asInstanceOf[AnteChronoHistory]
}

/**
 * Lets different contexts automatically reverse the user history in case
 * the opposite order is being passed.
 */
implicit def reverseChrono(l: ChronoHistory): AnteChronoHistory = l.reverse.asAnteChrono
implicit def reverseAnteChrono(l: AnteChronoHistory): ChronoHistory = l.reverse.asChrono
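A hypothetical usage sketch (with a stub UEupdate and a condensed version of the tags above) shows the payoff: a function demanding a ChronoHistory will not accept a plain, untagged List[UEupdate], so ordering assumptions are checked at compile time:

```scala
// Stub event type for illustration; the real UEupdate carries network fields.
case class UEupdate(timestamp: Long, cellId: Int)

// Condensed version of the tagged types shown above.
trait Chronological
type Tagged[U] = { type Tag = U }
type @@[T, U] = T with Tagged[U]
type ChronoHistory = List[UEupdate] @@ Chronological

implicit class chrono(l: List[UEupdate]) {
  def asChrono: ChronoHistory = l.asInstanceOf[ChronoHistory]
}

// A function that only accepts histories tagged as chronological:
def firstCell(h: ChronoHistory): Int = h.head.cellId

val events = List(UEupdate(1L, 101), UEupdate(2L, 102))
// firstCell(events)                 // would not compile: the list is untagged
val cell = firstCell(events.asChrono) // compiles once the assumption is asserted
```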

Naturally, we also noticed and enjoyed the concision — and semicolon-less-ness — of the language. In summary, Scala equipped developers with the right tools to face this analytics challenge.

And the challenge is not to be underestimated: even a quick look at the 3GPP standards reveals that the telecommunications industry is rife with standards that make parsing and interpreting data a difficult challenge, as it is received — "in production" — by the communication network.

This particular case was no exception: extracting geolocation data from mobile device histories and comparing it with existing topographical as well as publicly available data for Switzerland was no easy task. We found that Scala’s pattern matching, regular expression integration, and extractor objects are very well suited to this kind of task.
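To illustrate the style (with an invented, much-simplified record format rather than an actual 3GPP message), a regex-backed extractor object makes such parsing declarative at the match site:

```scala
// Invented, much-simplified line format: "timestamp;cellId;eventType".
// Real 3GPP-derived records are far richer; this only shows the mechanism.
object NetworkEvent {
  private val Line = """(\d+);(\d+);(\w+)""".r
  def unapply(s: String): Option[(Long, Int, String)] = s match {
    case Line(ts, cell, kind) => Some((ts.toLong, cell.toInt, kind))
    case _                    => None
  }
}

// Parsing and dispatching become a single declarative pattern match.
def describe(line: String): String = line match {
  case NetworkEvent(ts, cell, "HANDOVER") => s"handover to cell $cell at $ts"
  case NetworkEvent(_, _, kind)           => s"other event: $kind"
  case _                                  => "unparseable line"
}

val d1 = describe("1620000000;4242;HANDOVER")
val d2 = describe("garbage")
```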

But even with those advantages, all of this would have been moot if the data processing pipeline had not been able to handle the large amount of data generated by a mobile network — even at the scale of a relatively small country like Switzerland. The logs generated by Swisscom's internal data ingestion already consist of extremely structured and filtered data, yet they amount to 1.5 terabytes of new data per day, all of which has to be processed on a heterogeneous cluster that forms a multi-tenant environment shared across many teams.

This processing is orchestrated on Yet Another Resource Negotiator (YARN), the cluster manager of choice in Hadoop distributions. It was chosen mostly for maturity reasons, but Swisscom has been keen to look ahead in its technology choices and considers the advice of Lightbend on its preferred cluster manager, namely Apache Mesos. Swisscom is currently considering running some data processing on Apache Mesos in the future.

But this overview of the platform overlooks one key aspect of Swisscom’s analyses: the ability to deliver insight continuously, in real time. For this part, we need to go back to our latest analyses to give a bit more context.

How do these deep insights improve your business and UX?

FG: In order to help customers make better travel choices in a daily life where they share travel resources (roads and paths) with other people, Swisscom has been looking at what it could infer from the location of groups in real time: understanding the flow of people through different locations.

For example, research results on the 2014 tube strike in London show that the city was better off after the strike, despite having lost a lot of money to the delays incurred during it. Londoners reacted to the disruption and to the need to work around it in order to reach their workplaces, and actually optimized the paths of their daily commutes.

This is what the Mobility Insights team, together with Asesh Bhaumick and Andrea Schwaller from the Enterprise Customer Department, has recently been exploring at Swisscom. But a first step towards counting people traveling on a path of interest is being able to recognize those people as being on that path.

That may sound simple, but when dealing with data coming from a mobile network, this is not as obvious as it looks. In particular, users are located thanks to their association with successive cell towers.

Those cell towers cover a somewhat imprecise range, which makes positioning a user fuzzy. As a consequence, we need to look at a sequence of several communication events and see whether it is similar to the associations that would be produced by a user known to be on a path of interest. That latter part — which network communication events a well-located user would generate — is called the ground truth.

However, there may be many possible cell re-selections for a given path of interest: if you were traveling or commuting with a friend going to the same place, using the same path, from the same location, it’s quite possible that your mobile device traces would differ. The reasons include load balancing across the network, or the particular radio transmission properties of your mobile device, for example.

As a consequence, we need to consider the time series of cell tower associations and compare it to known patterns of cell tower associations that are considered to represent what real users traveling that path actually encounter.
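One simple way to make such a comparison tolerant to noise, shown here purely as an illustrative sketch rather than Swisscom's production algorithm, is an edit distance between the observed cell sequence and a ground-truth sequence:

```scala
// Levenshtein distance between two sequences of cell IDs: the number of
// insertions, deletions and substitutions turning one into the other.
def editDistance(a: Seq[Int], b: Seq[Int]): Int = {
  val dp = Array.tabulate(a.length + 1, b.length + 1) { (i, j) =>
    if (i == 0) j else if (j == 0) i else 0
  }
  for (i <- 1 to a.length; j <- 1 to b.length)
    dp(i)(j) =
      if (a(i - 1) == b(j - 1)) dp(i - 1)(j - 1)
      else 1 + math.min(dp(i - 1)(j - 1), math.min(dp(i - 1)(j), dp(i)(j - 1)))
  dp(a.length)(b.length)
}

val groundTruth = Seq(101, 102, 103, 104) // cells a well-located commuter hits
val observed    = Seq(101, 102, 999, 104) // one unexpected cell re-selection
// A small distance relative to the path length suggests the same path.
val dist = editDistance(observed, groundTruth) // 1
```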

But even comparison against this graph is too rigid: the inconsistencies of radio transmission, as well as the expense of collecting the variety of mobile device ground truth needed to generate that graph for a given path of interest, make this matching approximate. It is perfectly possible that you may be traveling on the intended path of interest for your commute and that, once in a while, your mobile device may choose for some reason to contact a cell that no experimenter has encountered.

How did Scala help you leverage Spark Streaming for real time fuzzy graph matching?

FG: As a consequence, Swisscom had to leverage the best of both the Scala and Spark ecosystems to face these challenges. The scala-graph open source library — originally created and now maintained by the EPFL — was a key asset for the construction, representation, modification, and easy computation on a large cell graph. Moreover, it scales well enough to represent all the paths of interest that we are testing, each numbering in the hundreds of cells.

To match against this graph in real time, however, we couldn’t wait for a user to arrive at the end of his commute: if a user is on an unfavorable or congested path, we want to alert him instantly, not when he’s stuck in traffic. We had to make instant decisions in classifying users based on their last few cells.

Because of the scale of our data analysis, we distributed the matching of users against paths of interest, using Spark Streaming and Kafka as an event delivery engine. In terms of algorithmics, crafting an interesting graph of cells representing a path of interest was done using scala-graph, but the actual matching had to switch to techniques based on locality-sensitive hashing, distributing an index of selected short paths across a cluster of machines. This technique, implemented by hand from techniques born in the 90s, let us do approximate matching against user histories.
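A toy, MinHash-flavored sketch of the general idea (not the hand-tuned production scheme): break a history into short overlapping cell paths, hash them under several seeds, and keep the minimum per seed, so that histories sharing many short paths tend to collide in an index:

```scala
import scala.util.hashing.MurmurHash3

// Break a cell history into overlapping short paths ("shingles") of k cells.
def shingles(history: Seq[Int], k: Int): Set[List[Int]] =
  history.sliding(k).map(_.toList).toSet

// MinHash signature: for each seed, the minimum hash over all shingles.
// Histories sharing many short paths tend to agree on signature positions,
// so candidate matches can be found by bucket lookup instead of a full scan.
def minHashSignature(history: Seq[Int], k: Int, seeds: Seq[Int]): Seq[Int] =
  seeds.map(seed => shingles(history, k).map(s => MurmurHash3.orderedHash(s, seed)).min)

val seeds = Seq(0x1234, 0x9e37, 0x85eb)
val a = minHashSignature(Seq(1, 2, 3, 4, 5), 3, seeds)
val b = minHashSignature(Seq(1, 2, 3, 4, 6), 3, seeds) // shares two of three shingles
// The fraction of agreeing positions estimates the overlap of the histories.
val agreement = a.zip(b).count { case (x, y) => x == y }
```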

That complex piece of algorithmics was helped by the numerous test libraries in the Scala/Spark ecosystem, including ScalaTest and Holden Karau’s spark-testing-base.

Finally, our streaming library is able to understand the occupation of paths of interest in a relatively small amount of code, which we take great care in maintaining, since Swisscom is a large organization with an interest in making every library part of reusable engineering know-how.

To that effect, we encountered issues with the simple build tool (SBT) around cross-building for compatibility with Scala 2.11, for which Lightbend support was a fantastic asset: having access to the creators of the language lets us make sure that Swisscom stays close to the cutting edge of technology in the Scala and Spark ecosystem.

Any final comments?

FG: Our platform, based on Spark and Scala, leverages many other components, among which we count Kafka, Cassandra and HDFS. With this Q/A, we hoped to make the link between our use case and our core technologies clearer: how producing and running real-time analytics quickly benefits from Spark Streaming’s agility, how the analytics involved benefit from Scala’s interoperability with the Java world, and how Scala’s type system and DSL-friendliness help us deal with complex structures (from graphs to ordered time series) safely and concisely.

For us, this was a good match.

READ THE SWISSCOM CASE STUDY

