How To Deploy Kubeflow On Lightbend Platform With OpenShift - Part 1: Introduction

Boris Lublinsky Principal Architect, Lightbend, Inc.

A technical series on getting long-term value from Machine Learning (ML) with Kubeflow on Lightbend Platform with Red Hat OpenShift

Lightbend Platform provides a set of capabilities built on streaming engines (Apache Spark and Apache Flink) and streaming libraries (Akka Streams and Kafka Streams). An important part of such streaming applications is the use of models created with machine learning. Currently, our platform provides only limited support for machine learning, through Spark’s MLlib. Adding Kubeflow as a certified component of Lightbend Platform allows us to fill this void and enable users to transition seamlessly from model training to model serving inside a single platform. Future releases will enhance the integration between Lightbend Platform and Kubeflow.

A foundation of Lightbend Platform is OpenShift, a PaaS for “enterprise Kubernetes” adopted by many organizations. Although based on Kubernetes, OpenShift adds many features, most importantly enhanced security. The additional security often makes it non-trivial to port applications from Kubernetes to OpenShift. We believe these enhancements result in more secure production deployments in the long term; however, for this blog series, we will simplify things a bit to keep the examples as straightforward as possible. We hope you enjoy it!

The Challenges Of Machine Learning In Production

A major challenge when developing ML systems is deploying them to production and maintaining them over time, for example by updating models as new ones are trained. One of the main reasons for this, as explained here, is that the actual machine learning is just the tip of the iceberg; most of the implementation has to do with “plumbing” (image source).

Kubeflow is a Kubernetes-based tool that was developed to address many of these “plumbing” concerns in a single comprehensive system. In this series of posts, I will show how to install and use Kubeflow’s main components (version 0.4.1 at the time of this writing) on Red Hat OpenShift.

What Is Kubeflow?

Kubeflow started as an OSS version of Google’s internal TensorFlow implementation, based on a pipeline called TensorFlow Extended (TFX). It began at the end of 2017 as a simple way to run TensorFlow jobs on Kubernetes, but has since expanded into a multi-architecture, multi-cloud framework for running entire ML pipelines. Its current mission is to make it easy for everyone to develop, deploy, and manage composable, portable, and scalable machine learning applications on Kubernetes everywhere (see the Kubeflow source code on GitHub).

The main purpose of Kubeflow is to make data scientists’ work easier by simplifying the DevOps aspects of running ML in production. The picture below shows a data scientist’s view of using Kubeflow (image source).

In this series of posts, I will describe various topics, which we’ve broken down into nine parts (including this introduction) and released all at once, like Netflix.

That covers Part 1 of this series. Check out the next part, on installing Kubeflow on OpenShift, and thanks for reading!

P.S. If you would like professional guidance on best practices and how-tos with Machine Learning, simply contact us to learn how Lightbend can help. Lightbend is evaluating whether and how to provide built-in integration and support for Kubeflow in Lightbend Platform.

PART 2: KUBEFLOW INSTALLATION

Author

Boris Lublinsky is a Principal Architect at Lightbend. Boris has over 30 years of experience in enterprise, technical architecture, and software engineering. He is an active member of the OASIS SOA RM committee and co-author of Applied SOA: Service-Oriented Architecture and Design Strategies (Wiley), Professional Hadoop Solutions (Wiley), Serving Machine Learning Models (O’Reilly), and Kubeflow for Machine Learning: From Lab to Production (O’Reilly).

February 28, 2019