In Part 1 of this series “How To Deploy And Use Kubeflow On Red Hat OpenShift”, we discussed what Kubeflow is, and how it can be useful for running Machine Learning applications in production. In this post, we discuss installation of Kubeflow.
Kubeflow’s installation is currently based on ksonnet¹, a configurable, typed templating system for the Kubernetes application developer. The goal of ksonnet is to improve the developer experience by providing options beyond writing YAML or text templating. Additionally, ksonnet supports separation of Kubernetes object definitions from the actual cluster destination, which simplifies moving a developed system to any platform.
The first step is to install ksonnet following these instructions. On macOS, the installation can be done using Homebrew:
$ brew install ksonnet/tap/ks
Once this is done, you can use ks --help to validate that the installation was successful and that the ksonnet CLI is on your PATH.
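The same check can be scripted. The following is a minimal sketch rather than part of the official install steps; the helper name require_cmd is made up for illustration.

```shell
# Hypothetical helper: succeed if the named command is on PATH,
# otherwise print a hint to stderr and return a non-zero status.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found $1: $(command -v "$1")"
  else
    echo "missing: $1 is not on PATH" >&2
    return 1
  fi
}

# Usage: check both CLIs this post relies on before going further.
#   require_cmd ks && require_cmd oc
```

Running it for both ks and oc up front avoids discovering a missing tool halfway through the installation.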
The next step is to connect to the OpenShift cluster. We’ll use an OpenShift 3.11 cluster for the following steps. Once the OpenShift CLI tool oc is configured correctly, we can create a ksonnet Kubeflow project. Recall that I am using version 0.4.1 of Kubeflow for this blog post series.
Now we can run the following commands to download the installation shell script, kfctl.sh:
$ export KUBEFLOW_SRC=kubeflow
$ mkdir -p ${KUBEFLOW_SRC}
$ cd ${KUBEFLOW_SRC}
$ export KUBEFLOW_TAG=v0.4.1
$ curl https://raw.githubusercontent.com/kubeflow/kubeflow/${KUBEFLOW_TAG}/scripts/download.sh | bash
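Piping a remote script straight into bash works, but some readers may prefer to look at the script before running it. The sketch below shows one way to do that, assuming the same URL pattern as the curl command above; the function names are hypothetical, and the download step needs network access.

```shell
# Hypothetical helper: build the raw GitHub URL of download.sh for a Kubeflow tag.
kubeflow_download_url() {
  echo "https://raw.githubusercontent.com/kubeflow/kubeflow/$1/scripts/download.sh"
}

# Hypothetical helper: save the script to a local file instead of executing it.
# $1 is the tag, $2 the output file (defaults to download.sh).
fetch_download_script() {
  curl -fsSL "$(kubeflow_download_url "$1")" -o "${2:-download.sh}"
}

# Usage (KUBEFLOW_TAG must already be exported, as in the commands above):
#   fetch_download_script "${KUBEFLOW_TAG}" download.sh
#   less download.sh      # read it first
#   bash download.sh      # then run it
```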
Now run the following commands to set up Kubeflow:
$ export KFAPP=openshift
$ scripts/kfctl.sh init ${KFAPP} --platform none
$ cd ${KFAPP}
$ ../scripts/kfctl.sh generate k8s
This will copy all Kubeflow artifacts locally and will create a disk layout that should look as follows:
On the top level there are 4 main folders:
The Kubeflow install process has mostly been tested on GKE. As a result, deploying it on OpenShift requires relaxing some of the security constraints. Run the following commands, which grant the anyuid security context constraint to the relevant service accounts so that their pods can run as any user ID²:
$ oc adm policy add-scc-to-user anyuid -z ambassador -nkubeflow
$ oc adm policy add-scc-to-user anyuid -z jupyter -nkubeflow
$ oc adm policy add-scc-to-user anyuid -z katib-ui -nkubeflow
$ oc adm policy add-scc-to-user anyuid -z default -nkubeflow
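If Kubeflow is installed into a project other than kubeflow, the same four grants need a different namespace. The sketch below prints the commands for a given project so they can be reviewed before being run; the function name print_scc_commands is made up for illustration, and the service account list is simply the one used above.

```shell
# Hypothetical helper: print (not run) the anyuid grants for a project.
# $1 is the project name; it defaults to kubeflow.
print_scc_commands() {
  ns="${1:-kubeflow}"
  for sa in ambassador jupyter katib-ui default; do
    echo "oc adm policy add-scc-to-user anyuid -z ${sa} -n ${ns}"
  done
}

# Usage: review the output, then pipe it to a shell to apply it.
#   print_scc_commands my-project
#   print_scc_commands my-project | sh
```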
Refer to this blog post for what these commands mean. Additionally, the current version of Kubeflow uses the image gcr.io/kubeflow-images-public/tf_operator:v0.4.0, which has bugs. It is necessary to switch to the image gcr.io/kubeflow-images-public/tf_operator:latest, where those bugs are fixed³. To update the image version, run the following command:
$ ks param set tf-job-operator tfJobImage gcr.io/kubeflow-images-public/tf_operator:latest
Once this is done, we can use the following command to install Kubeflow on the cluster:
$ ../scripts/kfctl.sh apply k8s
NOTE: If you want to see what is installed by this command, take a look inside kfctl.sh. You will see:
ks apply default -c ambassador
ks apply default -c jupyter
ks apply default -c centraldashboard
ks apply default -c tf-job-operator
ks apply default -c pytorch-operator
ks apply default -c metacontroller
ks apply default -c spartakus
ks apply default -c argo
ks apply default -c pipeline
You can comment out any of these lines to skip components you do not need, or add lines to install additional components.
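As an alternative to editing kfctl.sh, the component list can be filtered in a small script. This is only a sketch: the function name is hypothetical, the component names are the ones listed above, and the printed commands are assumed to be run from the ksonnet application directory that kfctl.sh generated.

```shell
# Components applied by kfctl.sh, copied from the list above.
KF_COMPONENTS="ambassador jupyter centraldashboard tf-job-operator pytorch-operator metacontroller spartakus argo pipeline"

# Hypothetical helper: print one "ks apply" command per component,
# skipping every component named in the arguments.
print_apply_commands() {
  for c in $KF_COMPONENTS; do
    skip=no
    for s in "$@"; do
      if [ "$c" = "$s" ]; then skip=yes; fi
    done
    if [ "$skip" = no ]; then echo "ks apply default -c $c"; fi
  done
}

# Usage: preview the commands with spartakus and argo left out.
#   print_apply_commands spartakus argo
```

Printing the commands instead of running them keeps the selection easy to review, and the output can be piped to a shell once it looks right.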
The installation creates a new project on the cluster called kubeflow and deploys everything there.
Once the installation is complete, you should see the following components running:
To verify that the installation finished correctly:
1. Verify that none of the pods are failing by going to the OpenShift console and viewing the running pods in the kubeflow project:
2. Go to the ambassador service and create a route:
3. Go to the URL exposed by the route. You should see the main Kubeflow page allowing you to interact with different Kubeflow components:
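The same checks can be done from the command line instead of the console. The sketch below assumes the default kubeflow project and a service named ambassador, as in the steps above; the helper failing_pods is made up for illustration and only filters the text output of oc get pods.

```shell
# Hypothetical helper: read "oc get pods --no-headers" output on stdin
# (columns: NAME READY STATUS RESTARTS AGE) and print the name and status
# of every pod that is neither Running nor Completed.
failing_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 " " $3 }'
}

# Usage (requires a logged-in oc session):
#   oc get pods -n kubeflow --no-headers | failing_pods    # step 1: no output means no failing pods
#   oc expose service ambassador -n kubeflow               # step 2: create the route
#   oc get route ambassador -n kubeflow -o jsonpath='{.spec.host}'   # step 3: host name to open
```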
If you need to delete an existing Kubeflow installation, you can use the same script that you used for installation. Run the following command:
$ ../scripts/kfctl.sh delete k8s
NOTE: The delete script uninstalls all the Kubeflow applications and deletes the kubeflow namespace, so anything else you have deployed to this namespace will be deleted as well.
That’s all for this part. Check out the next post on Kubeflow’s support components, and thanks for reading!
1 Although the current version of Kubeflow uses ksonnet for installation, this might change moving forward.
2 Here I assume that installation is going to be done in the project kubeflow (default). If you want to install to another project, adjust the commands accordingly.
3 At the time of writing, the latest version is the only one available where these fixes exist.