A technical series on getting long-term value from Machine Learning (ML) with Kubeflow on Lightbend Platform with Red Hat OpenShift

Lightbend Platform provides a set of capabilities leveraging streaming engines (Apache Spark and Apache Flink) and streaming libraries (Akka Streams and Kafka Streams). An important part of such streaming applications is the use of models created with machine learning. Currently our platform provides only limited support for machine learning, through Spark’s MLlib. Adding Kubeflow as a certified component of Lightbend Platform allows us to fill this gap and enable users to transition seamlessly from model training to model serving inside a single platform. Future releases will enhance the integration between Lightbend Platform and Kubeflow.

A foundation of Lightbend Platform is OpenShift, a PaaS for “enterprise Kubernetes” adopted by many organizations. Although based on Kubernetes, OpenShift adds many features, most importantly enhanced security. The additional security often makes it non-trivial to port applications from Kubernetes to OpenShift. We believe these enhancements result in more secure production deployments in the long term; however, for this blog series, we will simplify things a bit to keep the examples as straightforward as possible. We hope you enjoy it!

The Challenges Of Machine Learning In Production

A major challenge when developing ML systems is deploying them to production and maintaining them over time, for example by updating models as new ones are trained. One of the main reasons for this, as explained here, is that the actual machine learning is just the tip of the iceberg; the majority of the implementation has to do with “plumbing” (image source):

Kubeflow is a Kubernetes-based tool that was developed to address many of these “plumbing” concerns in a single comprehensive system. In this series of posts, I will show how to install and use Kubeflow’s main components (version 0.4.1 at the time of this writing) on Red Hat OpenShift.

What Is Kubeflow?

Kubeflow started as an OSS version of Google’s internal TensorFlow implementation, based on a pipeline called TensorFlow Extended (TFX). It began at the end of 2017 as a simple way to run TensorFlow jobs on Kubernetes, but has since expanded to be a multi-architecture, multi-cloud framework for running entire ML pipelines. Its current mission is to make it easy for everyone to develop, deploy, and manage composable, portable, and scalable machine learning applications on Kubernetes everywhere (see the Kubeflow source code on GitHub).

The main purpose of Kubeflow is to make data scientists’ work easier by simplifying the DevOps aspects of running ML in production. The picture below shows how a data scientist uses Kubeflow (image source).

In this series of posts, I will describe various topics, which we’ve broken down into nine parts (including this introduction) and released all at once, like Netflix.

That covers Part 1 of this series. Check out the next part, on installing Kubeflow on OpenShift, and thanks for reading!

p.s. If you would like professional guidance on best practices and how-tos with Machine Learning, simply contact us to learn how Lightbend can help. Lightbend is evaluating whether and how to provide built-in integration and support for Kubeflow in Lightbend Platform.

PART 2: KUBEFLOW INSTALLATION
