lightbend / kubeflow-recommender

Kubeflow example of machine learning/model serving

License: Apache License 2.0

Dockerfile 0.41% Shell 0.12% Scala 28.46% Jupyter Notebook 50.28% Python 8.28% Jsonnet 11.76% Smarty 0.70%
kubeflow model-serving machine-learning

kubeflow-recommender's Introduction

Recommender example for kubeflow

This is an end-to-end example showing how to use Kubeflow for both machine learning and model serving.

The example consists of the following steps:

  • Building the machine learning model using Kubeflow's Jupyter. The machine learning implementation uses collaborative filtering to recommend products to a user (out of the set of products) based on their purchasing history. The implementation first converts the purchasing history to a rating matrix, based on this blog post, and then uses this matrix to build a prediction model, following this repository. The actual notebook can be found here.
  • Once the model is built, the Python code is exported (see here) and used for building a TFJob. The Dockerfile is here. In addition, there is a bash file.
  • Model serving is based on TF-serving. Due to limitations of TF-serving, I decided to run two instances of TF-serving and alternate their usage for serving.
  • A Kubeflow Pipeline is used to coordinate the execution of the steps. The notebook for creating and executing the pipeline is here. The Python code for pipeline creation is here. When the Python code runs, it creates a file called pipeline.tar.gz. Following this example, the definition can be uploaded to the Pipelines UI, where the pipeline can then be viewed: [image: Pipelines]. The pipeline can also be run from there, which produces the following result: [image: Pipelines]. Currently, pipelines do not allow a periodic schedule to be specified in the pipeline definition, but from the UI it is possible to configure a run as recurring and specify how often it is executed: [image: Pipelines].
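
To make the first step more concrete, here is a minimal sketch of how a purchasing history might be turned into a rating matrix. It is not the code from the notebook or the referenced blog post: the function name `build_rating_matrix`, the `(user, product, quantity)` record layout, and the scaling rule (each user's purchase counts divided by that user's largest count) are all assumptions made for illustration.

```python
from collections import defaultdict


def build_rating_matrix(transactions):
    """Turn (user, product, quantity) records into a user x product matrix.

    Assumption: a "rating" is the purchase count scaled by the user's
    most-purchased product, so every rating falls in [0, 1].
    Quantities are assumed to be positive.
    """
    # Accumulate how many units of each product every user bought.
    counts = defaultdict(lambda: defaultdict(int))
    for user, product, quantity in transactions:
        counts[user][product] += quantity

    # Fix a stable row (user) and column (product) order.
    users = sorted(counts)
    products = sorted({p for row in counts.values() for p in row})

    matrix = []
    for user in users:
        top = max(counts[user].values())
        matrix.append([counts[user].get(p, 0) / top for p in products])
    return users, products, matrix


# Hypothetical sample data, not taken from the repository.
transactions = [
    ("u1", "apples", 4),
    ("u1", "bread", 2),
    ("u2", "bread", 1),
    ("u2", "milk", 1),
    ("u1", "apples", 4),
]
users, products, matrix = build_rating_matrix(transactions)
for user, row in zip(users, matrix):
    print(user, dict(zip(products, row)))
```

A matrix of this shape is what a collaborative filtering model would typically be trained on; the actual normalization used in the notebook may differ.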

Additional components included in this implementation include the following:

  • Data Publisher is a project used for preparing new data for machine learning. Whatever code is necessary to get the list of users and their current purchasing history goes here. In this simple implementation I am not doing anything; it only gives a code sample of how to update the data used for learning.
  • Model server is a project implementing the actual model serving. It gets a stream of data and leverages TF-serving for the actual model serving. Additionally, it implements a second stream that allows changing the TF-serving URL when the model is updated.
  • Model Publisher is a project responsible for updating the model for the Model server. It reads the current TF-server from a data file, makes sure that it is operational (by sending it an HTTP request) and, if it is, publishes the new model (new URL) to the model server. The actual code is here.
  • Client is a project responsible for publishing recommendation requests to the model server. Code is here.
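
The Model Publisher logic described above (check that a TF-serving instance is operational, then publish its URL) together with the alternation between two serving instances can be sketched as follows. This is an illustration, not the project's Scala code: the host names reuse the `recommender` and `recommender1` component names from the installation steps, while port 8501, the model name in the URL, and all function names are assumptions. The health check and the publish step are passed in as callables so the sketch can be exercised without a cluster.

```python
import urllib.error
import urllib.request

# Assumed REST endpoints of the two TF-serving instances (hypothetical URLs).
SERVERS = [
    "http://recommender:8501/v1/models/recommender",
    "http://recommender1:8501/v1/models/recommender",
]


def other_server(current, servers=SERVERS):
    """Return the instance to use next when alternating between servers."""
    index = servers.index(current)
    return servers[(index + 1) % len(servers)]


def http_is_healthy(url, timeout=5.0):
    """Send an HTTP GET to the server and report whether it answered 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def publish_if_healthy(url, is_healthy, publish):
    """Publish the URL to the model server only if the instance responds."""
    if is_healthy(url):
        publish(url)
        return True
    return False


# Usage with stand-ins: a health check that always succeeds and a list
# that plays the role of the stream read by the model server.
published = []
publish_if_healthy(SERVERS[1], lambda url: True, published.append)
print(published)
```

In a real deployment `http_is_healthy` would be passed as the health check and the publish callable would write to the stream consumed by the model server.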

For storing data used in the project (models, data) we are using Minio, which is part of Kubeflow installation.

Finally we are using Kubeflow pipelines for organizing and scheduling overall execution.

The overall architecture of the implementation is presented below: [image: Overall Architecture]

Building

Different pieces are built differently. The Python code (recommender ML) is built directly into a Docker image (see above). The rest of the code leverages the [SBT Docker plugin] and can be built using the following command:

sbt docker

This produces all images locally. These images have to be pushed to a repository accessible from the cluster. I was using Docker Hub.

Installation

Installation requires several steps:

  • Install Kubeflow following the blog posts
  • Install Kafka as described here
  • Populate Minio with test data following this post
  • Start Jupyter, following this blog post, and test the notebook
  • Try usage of TFJob for machine learning, following this blog post. Ksonnet definitions for these can be found here
  • Deploy model serving components recommender and recommender1 following this blog post. Ksonnet definitions for these can be found here
  • Deploy Strimzi following this documentation. After the operator is installed, use this yaml file to create the Kafka cluster
  • Deploy the model server and request provider using this chart
  • Enable usage of Argo following this blog post
  • Enable usage of Kubeflow pipelines following this blog post
  • Test the pipeline from the notebook
  • Build the pipeline definition using Python code and upload it to the pipeline UI
  • Start recurring pipeline execution.

Installation update for version 0.6

  • Install Kubeflow following this documentation. To run successfully on OpenShift (4.1), add the following service accounts to the anyuid SCC (this list is a superset):
system:serviceaccount:kubeflow:admission-webhook-service-account,
system:serviceaccount:kubeflow:default,
system:serviceaccount:kubeflow:katib-controller,
system:serviceaccount:kubeflow:katib-ui,
system:serviceaccount:kubeflow:ml-pipeline,
system:serviceaccount:istio-system:prometheus,
system:serviceaccount:kubeflow:argo-ui,
system:serviceaccount:istio-system:istio-citadel-service-account,
system:serviceaccount:istio-system:istio-galley-service-account,
system:serviceaccount:istio-system:istio-mixer-service-account,
system:serviceaccount:istio-system:istio-pilot-service-account,
system:serviceaccount:istio-system:istio-egressgateway-service-account,
system:serviceaccount:istio-system:istio-ingressgateway-service-account,
system:serviceaccount:istio-system:istio-sidecar-injector-service-account,
system:serviceaccount:istio-system:grafana,
system:serviceaccount:istio-system:default,
system:serviceaccount:kubeflow:jupyter,
system:serviceaccount:kubeflow:jupyter-notebook,
system:serviceaccount:kubeflow:jupyter-hub,
system:serviceaccount:boris:default-editor,
system:serviceaccount:kubeflow:tf-job-operator,
system:serviceaccount:istio-system:kiali-service-account,
system:serviceaccount:boris:strimzi-cluster-operator

Add the following service accounts to the privileged SCC (this list is a superset):

system:serviceaccount:openshift-infra:build-controller,
system:serviceaccount:kubeflow:admission-webhook-service-account,
system:serviceaccount:kubeflow:default,
system:serviceaccount:kubeflow:katib-controller,
system:serviceaccount:kubeflow:katib-ui,
system:serviceaccount:kubeflow:ml-pipeline,
system:serviceaccount:istio-system:jaeger,
system:serviceaccount:bookinfo:default,
system:serviceaccount:kubeflow:jupyter-web-app-service-account,
system:serviceaccount:kubeflow:argo,
system:serviceaccount:kubeflow:pipeline-runner
  • To get Kiali running, update the Kiali cluster role as follows:
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kiali
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/kiali
  uid: cd0de883-b9ff-11e9-bd33-023708277e46
  resourceVersion: '11149746'
  creationTimestamp: '2019-08-08T17:13:11Z'
  labels:
    app: kiali
    chart: kiali
    heritage: Tiller
    release: istio
rules:
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - ''
    resources:
      - configmaps
      - endpoints
      - namespaces
      - nodes
      - pods
      - services
      - replicationcontrollers
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - extensions
      - apps
      - apps.openshift.io
    resources:
      - deployments
      - deploymentconfigs
      - statefulsets
      - replicasets
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - project.openshift.io
    resources:
      - projects
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - autoscaling
    resources:
      - horizontalpodautoscalers
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - batch
    resources:
      - cronjobs
      - jobs
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - watch
    apiGroups:
      - config.istio.io
    resources:
      - apikeys
      - authorizations
      - checknothings
      - circonuses
      - deniers
      - fluentds
      - handlers
      - kubernetesenvs
      - kuberneteses
      - listcheckers
      - listentries
      - logentries
      - memquotas
      - metrics
      - opas
      - prometheuses
      - quotas
      - quotaspecbindings
      - quotaspecs
      - rbacs
      - reportnothings
      - rules
      - solarwindses
      - stackdrivers
      - statsds
      - stdios
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - watch
    apiGroups:
      - networking.istio.io
    resources:
      - destinationrules
      - gateways
      - serviceentries
      - virtualservices
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - watch
    apiGroups:
      - authentication.istio.io
    resources:
      - policies
      - meshpolicies
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - watch
    apiGroups:
      - rbac.istio.io
    resources:
      - clusterrbacconfigs
      - rbacconfigs
      - serviceroles
      - servicerolebindings
  - verbs:
      - get
    apiGroups:
      - monitoring.kiali.io
    resources:
      - monitoringdashboards
  • Populate Minio with test data using the following commands:
mc mb minio/data
mc mb minio/models
mc cp /Users/boris/Projects/Recommender/data/users.csv minio/data/recommender/users.csv
mc cp /Users/boris/Projects/Recommender/data/transactions.csv minio/data/recommender/transactions.csv
mc cp /Users/boris/Projects/Recommender/data/directory.txt minio/data/recommender/directory.txt
  • Starting the Jupyter server. I had to do several things:

    • Following this issue, update notebooks-controller-role to include notebooks/finalizers and os adm policy
    • Following this issue, create a service account and give it the anyuid role
  • Creating a TFJob. Several things:

    • Without ksonnet, it is necessary to create a yaml file for the TFJob. Note that the container name has to be tensorflow
    • tf-job-operator has to be added to anyuid
    • Add tfjobs/finalizers to the tf-job-operator role
    • The TFJob UI is not integrated yet. Go to /tfjobs/ui/
  • According to the Kubeflow documentation, TensorFlow Serving has not yet been converted to kustomize, so we are using a custom deployment (modeled after the deployment in Kubeflow 0.4).

  • Argo. Several things:

    • Update the argo and argo-ui roles to add workflows/finalizers
    • Update workflow-controller-configmap to add - containerRuntimeExecutor: k8sapi
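
Since the TFJob definition has to be written by hand without ksonnet, the sketch below shows one way such a manifest might be generated. It is not the definition used in this repository: the job name, the image name, and the default `api_version` are placeholders (the right apiVersion depends on the installed tf-job-operator), and the manifest is emitted as JSON, which `kubectl apply -f` and `oc apply -f` accept alongside YAML. The one detail taken from the notes above is that the container is named tensorflow.

```python
import json


def tfjob_manifest(name, image, namespace="kubeflow",
                   api_version="kubeflow.org/v1", replicas=1):
    """Build a minimal TFJob manifest with a single Worker replica spec.

    api_version is an assumption; check which TFJob version the installed
    operator serves before applying the result.
    """
    return {
        "apiVersion": api_version,
        "kind": "TFJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "tfReplicaSpecs": {
                "Worker": {
                    "replicas": replicas,
                    "template": {
                        "spec": {
                            "containers": [
                                # The container name has to be "tensorflow".
                                {"name": "tensorflow", "image": image}
                            ]
                        }
                    },
                }
            }
        },
    }


# Hypothetical job and image names, for illustration only.
manifest = tfjob_manifest("recommender-train", "example/recommender-ml:latest")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `oc apply -f`.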

Installation update for version 0.7 on OpenShift 4.1

First, install Istio following this documentation. Next, install Knative following this documentation. Follow the installation steps here, setting up for the later deployment. Go to kfctl_k8s_istio.0.7.0.yaml and comment out the Istio and Knative installs. Also go to the generated kustomize files and update the last definition of the file to:

apiVersion: rbac.istio.io/v1alpha1
kind: RbacConfig
metadata:
  name: default
spec:
  mode: $(clusterRbacConfig)

Finally, run the following commands:

oc adm policy add-scc-to-user anyuid -z admission-webhook-service-account -nkubeflow
oc adm policy add-scc-to-user anyuid -z katib-controller -nkubeflow
oc adm policy add-scc-to-user anyuid -z katib-ui -nkubeflow
oc adm policy add-scc-to-user anyuid -z default -nkubeflow
oc adm policy add-scc-to-user anyuid -z ml-pipeline -nkubeflow
oc adm policy add-scc-to-user anyuid -z pipeline-runner -nkubeflow
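
The commands above follow a regular pattern, so for longer lists (such as the `system:serviceaccount:<namespace>:<name>` entries in the 0.6 section) they can be generated rather than typed. The helper below is a small sketch of that idea; the function name `scc_command` is hypothetical, and it only prints the commands instead of running them.

```python
def scc_command(service_account, scc):
    """Build the oc command that adds one service account to an SCC.

    Expects the "system:serviceaccount:<namespace>:<name>" form; a trailing
    comma (as in a copied list) is tolerated.
    """
    parts = service_account.strip().rstrip(",").split(":")
    if len(parts) != 4 or parts[:2] != ["system", "serviceaccount"]:
        raise ValueError(f"unexpected service account format: {service_account!r}")
    namespace, name = parts[2], parts[3]
    return f"oc adm policy add-scc-to-user {scc} -z {name} -n {namespace}"


# A few of the accounts listed in this document.
accounts = [
    "system:serviceaccount:kubeflow:katib-controller,",
    "system:serviceaccount:kubeflow:katib-ui,",
    "system:serviceaccount:kubeflow:pipeline-runner",
]
for account in accounts:
    print(scc_command(account, "anyuid"))
```

Printing the commands first makes it easy to review them before pasting them into a shell.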

Install Kubeflow using the kfctl command. Make sure that your Istio configuration contains the kubeflow namespace in the ServiceMeshMemberRoll:

apiVersion: maistra.io/v1
kind: ServiceMeshMemberRoll
metadata:
  selfLink: /apis/maistra.io/v1/namespaces/istio-system/servicemeshmemberrolls/default
  resourceVersion: '63163135'
  name: default
  uid: eedb5e27-da19-11e9-aa18-12a7ea357834
  creationTimestamp: '2019-09-18T13:40:52Z'
  generation: 2
  namespace: istio-system
  ownerReferences:
    - apiVersion: maistra.io/v1
      kind: ServiceMeshControlPlane
      name: basic-install
      uid: a2fde1b5-d9a1-11e9-aa18-12a7ea357834
  finalizers:
    - maistra.io/istio-operator
spec:
  members:
    - knative-serving
    - kfserving-system
    - kubeflow
status:
  configuredMembers:
    - knative-serving
    - kubeflow
  meshGeneration: 2
  observedGeneration: 2

Also make sure that a kubeflow-gateway is added to the ServiceMeshControlPlane:

apiVersion: maistra.io/v1
kind: ServiceMeshControlPlane
metadata:
  creationTimestamp: '2019-09-17T23:19:45Z'
  finalizers:
    - maistra.io/istio-operator
  generation: 2
  name: basic-install
  namespace: istio-system
  resourceVersion: '63163077'
  selfLink: >-
    /apis/maistra.io/v1/namespaces/istio-system/servicemeshcontrolplanes/basic-install
  uid: a2fde1b5-d9a1-11e9-aa18-12a7ea357834
spec:
  istio:
    gateways:
      istio-egressgateway:
        autoscaleEnabled: false
      istio-ingressgateway:
        autoscaleEnabled: false
      kubeflow-gateway:
        autoscaleEnabled: false
....

Once this is done you can use Istio Ingress to access Kubeflow.

Notebook image

gcr.io/kubeflow-images-public/tensorflow-1.13.1-notebook-cpu:v-base-08f3cbc-1166369568336121856

License

Copyright (C) 2019 Lightbend Inc. (https://www.lightbend.com).

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this project except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

kubeflow-recommender's People

Contributors

deanwampler
