With the General Data Protection Regulation (GDPR) in effect since May 2018, the systems we build must account for how personal data is obtained and stored and, most notably, how it can be deleted at a user's request.
Lightbend has investigated what you may need to take into account to keep your systems GDPR-compliant over the long term. To help our customers with this daunting task, we now offer a new module, GDPR for Akka Persistence, that will assist you in building a compliant system.
Within GDPR, technology can help most with conforming to Chapter 3: Rights of the data subject. The most technically challenging of these rights is the "Right to be Forgotten" (Art. 17).
Akka and Lagom can be used with Event Sourcing, which stores the entire sequence of events leading to the current state of an entity. Consequently, deleting or modifying historical state to comply with GDPR means modifying every event that carries personal information, rather than a single state record.
It can be difficult to modify all such events and all places where the information may be stored, such as denormalized projections, snapshots and backups. Data shredding can be used to forget information instead of deleting or modifying it. This is achieved by encrypting the data with a key for a given data subject id (e.g. person) and deleting the key when that data subject is to be forgotten.
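The data shredding idea described above can be sketched with nothing but the JDK's `javax.crypto` classes. This is only an illustration of the technique under simplifying assumptions, not the API of GDPR for Akka Persistence: the class `ShreddingSketch`, its methods, and the in-memory key map are hypothetical names chosen for this example, and a real system would keep the keys in durable, access-controlled storage that is separate from the encrypted data.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of data shredding; not the module's actual API. */
final class ShreddingSketch {
    private static final int IV_BYTES = 12;   // recommended IV size for GCM
    private static final int TAG_BITS = 128;  // GCM authentication tag length

    // Assumption: an in-memory map stands in for a real key store.
    private final Map<String, SecretKey> keys = new ConcurrentHashMap<>();
    private final SecureRandom random = new SecureRandom();

    /** Encrypts text with the data subject's key, creating the key on first use. */
    byte[] encrypt(String dataSubjectId, String plaintext) {
        SecretKey key = keys.computeIfAbsent(dataSubjectId, id -> newKey());
        byte[] iv = new byte[IV_BYTES];
        random.nextBytes(iv);
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            // Store the IV in front of the ciphertext so decryption can recover it.
            byte[] out = Arrays.copyOf(iv, iv.length + ciphertext.length);
            System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the plaintext, or empty if the key is gone or does not match. */
    Optional<String> decrypt(String dataSubjectId, byte[] payload) {
        SecretKey key = keys.get(dataSubjectId);
        if (key == null) {
            return Optional.empty(); // key was shredded: the data is forgotten
        }
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, payload, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(payload, IV_BYTES, payload.length - IV_BYTES);
            return Optional.of(new String(plain, StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            return Optional.empty(); // a different key cannot recover old data
        }
    }

    /** "Forgets" the data subject by deleting only the key. */
    void forget(String dataSubjectId) {
        keys.remove(dataSubjectId);
    }

    private static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ShreddingSketch store = new ShreddingSketch();
        byte[] stored = store.encrypt("user-42", "alice@example.com");
        System.out.println(store.decrypt("user-42", stored).isPresent()); // key exists
        store.forget("user-42");
        System.out.println(store.decrypt("user-42", stored).isPresent()); // key deleted
    }
}
```

Note that the encrypted bytes are never touched when a subject is forgotten. Copies in events, snapshots, projections, and backups all become unreadable at once because they depend on the same deleted key.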
GDPR for Akka Persistence provides serialization and encryption utilities for this approach of using cryptographic erasure to support the "Right to be Forgotten", and it can be used with Event Sourced data or other data storage approaches.
Let's look at an example: say that we have a persistent entity representing a blog post that many different users may add comments to. Each comment is associated with a user that may request to be forgotten, so this means that all their comments should become anonymous.
In this example, it would be difficult to find all comments by a given user, since the comments are added to the blog post rather than being indexed by the user identifier. We would have to maintain such an additional cross-index or traverse all blog posts to be able to find and update all comments.
Instead, we encrypt the author part of each comment event with an encryption key specific to the user who wrote the comment. The blog post entity then knows that if the author part of a comment can't be decrypted because the key has been removed, the author of that comment should no longer be shown.
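The blog post example can be sketched in the same spirit. The following is a simplified, hypothetical model rather than real Akka or Lagom code: `BlogPostSketch`, `CommentAdded`, and the in-memory event list and key map are names and structures assumed for illustration only. It shows that forgetting a user requires no search through blog posts, because only the user's key is removed.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical blog post entity whose comment authors can be forgotten. */
final class BlogPostSketch {
    /** Stored event: the comment text is plain, only the author name is encrypted. */
    static final class CommentAdded {
        final String userId;          // identifies which key protects the author
        final byte[] iv;              // per-event IV for AES-GCM
        final byte[] encryptedAuthor;
        final String text;

        CommentAdded(String userId, byte[] iv, byte[] encryptedAuthor, String text) {
            this.userId = userId;
            this.iv = iv;
            this.encryptedAuthor = encryptedAuthor;
            this.text = text;
        }
    }

    private static final String ANONYMOUS = "(anonymous)";
    // Assumption: in-memory stand-ins for the event journal and the key store.
    private final List<CommentAdded> events = new ArrayList<>();
    private final Map<String, SecretKey> keysByUser = new HashMap<>();
    private final SecureRandom random = new SecureRandom();

    void addComment(String userId, String authorName, String text) {
        try {
            SecretKey key = keysByUser.get(userId);
            if (key == null) {
                KeyGenerator generator = KeyGenerator.getInstance("AES");
                generator.init(128);
                key = generator.generateKey();
                keysByUser.put(userId, key);
            }
            byte[] iv = new byte[12];
            random.nextBytes(iv);
            byte[] encrypted = crypt(Cipher.ENCRYPT_MODE, key, iv,
                    authorName.getBytes(StandardCharsets.UTF_8));
            events.add(new CommentAdded(userId, iv, encrypted, text));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Forgets a user without touching any stored comment event. */
    void forgetUser(String userId) {
        keysByUser.remove(userId);
    }

    /** Renders comments; authors that cannot be decrypted are shown as anonymous. */
    List<String> render() {
        List<String> lines = new ArrayList<>();
        for (CommentAdded event : events) {
            lines.add(authorOf(event) + ": " + event.text);
        }
        return lines;
    }

    private String authorOf(CommentAdded event) {
        SecretKey key = keysByUser.get(event.userId);
        if (key == null) {
            return ANONYMOUS;
        }
        try {
            byte[] plain = crypt(Cipher.DECRYPT_MODE, key, event.iv, event.encryptedAuthor);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            return ANONYMOUS; // wrong or replaced key: treat the author as forgotten
        }
    }

    private static byte[] crypt(int mode, SecretKey key, byte[] iv, byte[] input)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(mode, key, new GCMParameterSpec(128, iv));
        return cipher.doFinal(input);
    }

    public static void main(String[] args) {
        BlogPostSketch post = new BlogPostSketch();
        post.addComment("u1", "Alice", "Nice post");
        post.addComment("u2", "Bob", "I disagree");
        post.forgetUser("u1");
        post.render().forEach(System.out::println);
    }
}
```

In this sketch the comment text itself stays readable after the author is forgotten, which matches the goal of making comments anonymous rather than removing them.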
GDPR for Akka Persistence also provides migration tools for existing Akka or Lagom applications that you want to move to encrypted payloads and, eventually, to the data shredding technique.