Sparse Conditional Linear Regression

This project runs a system of linear regression problems on a dataset in order to find hidden patterns in a subset of the data.

Deployment

This project is designed to deploy an Akka Cluster on Kubernetes. By using Kubernetes, we aim to scale out our ComputeActor instances so that the linear regression fits run quickly.

The project uses Docker to build several container images. The provided YAML files deploy these images to Kubernetes. StatefulSet, a Kubernetes feature, lets us instantiate our nodes in a specific order so that we can hard-code a set of "known" seed nodes for our Akka Cluster to use. These seed nodes give every other node a known address to register with, which lets the cluster bootstrap.
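To make the seed-node idea concrete, here is a minimal sketch of how ordered StatefulSet pod names map to fixed Akka addresses. The StatefulSet name `compute` and the `default.svc.cluster.local` suffix appear in the commands later in this README; the headless service name `compute`, the actor system name `sclr`, the `akka.tcp` protocol, and port 2551 are assumptions made for illustration and may not match the project's actual configuration.

```shell
# Print Akka seed-node addresses for the first N pods of a StatefulSet.
# ASSUMPTIONS (hypothetical, not taken from the project's config):
#   actor system "sclr", headless service "compute", remoting port 2551.
seed_nodes() {
  count=$1
  set_name=${2:-compute}
  service=${3:-compute}
  i=0
  while [ "$i" -lt "$count" ]; do
    # StatefulSet pods are named <set>-<ordinal> and are reachable at
    # <pod>.<service>.<namespace>.svc.cluster.local via a headless service.
    printf 'akka.tcp://sclr@%s-%d.%s.default.svc.cluster.local:2551\n' \
      "$set_name" "$i" "$service"
    i=$((i + 1))
  done
}

# Addresses for the first two pods, compute-0 and compute-1.
seed_nodes 2
```

Because the ordinals are assigned in order and the names never change, a list like this can be written into the cluster configuration before any pod exists.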

Components

  • Akka: A free and open-source toolkit and runtime simplifying the construction of concurrent and distributed applications on the JVM.
  • Kubernetes: An open-source system for automating deployment, scaling, and management of containers.
  • Docker: A platform for building, shipping, and running containers.
  • sbt: The interactive build tool for Scala.
  • Scala: A general-purpose programming language providing support for functional programming and a strong static type system.

Notes

The setup for this project was inspired by the following projects:

Steps

We can deploy and run on Google Kubernetes Engine (GKE) or locally!

Note that these instructions were written for a Mac.

  1. Clone project somewhere.
    • git clone https://github.com/johnhainline/sclr.git
    • cd sclr
  2. Install the Kubernetes CLI.
    • Mac: brew cask install google-cloud-sdk, which installs gcloud and other utilities. Then run gcloud components install kubectl to get kubectl.
  3. Install docker.
    • Mac: brew cask install docker
  4. Run docker.
  5. Build our base docker image:
    • cd src/main/resources/docker/; docker build -t local/openjdk-custom:latest .; cd ../../../../;
    • This builds the docker image referenced in our build.sbt as "local/openjdk-custom".
  6. Create a secret in kubectl for our MySQL password.
    • kubectl create secret generic mysql-password --from-literal=password=MYSQL_PASSWORD
  7. Check which cluster kubectl points to. It can target the cloud or a local minikube instance.
    • kubectl config get-contexts and kubectl cluster-info
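Since the same kubectl commands act on whichever cluster is currently selected, it can help to classify the active context before deploying. The following is a minimal sketch; the function name is hypothetical, and it relies on the usual naming conventions (minikube names its context `minikube`, and `gcloud container clusters get-credentials` creates contexts prefixed with `gke_`). The project ID `my-project` in the example is a placeholder.

```shell
# Classify a kubectl context name as local (minikube), GKE, or unknown.
# The function name and the patterns are illustrative; adjust to your setup.
describe_context() {
  case "$1" in
    minikube) echo "local" ;;
    gke_*)    echo "gke" ;;
    *)        echo "unknown" ;;
  esac
}

# Example with literal names ("my-project" is a placeholder project ID).
describe_context minikube
describe_context gke_my-project_us-central1-a_sclr-01

# Usage against the live configuration (requires kubectl on PATH):
#   describe_context "$(kubectl config current-context)"
```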

Build and Run Locally

  1. Install minikube, a locally running Kubernetes cluster.
    • Mac: brew cask install minikube
  2. Start minikube, enable DNS support, connect to docker, and open the dashboard.
    • minikube start
    • minikube addons enable kube-dns
    • eval $(minikube docker-env)
    • minikube dashboard
  3. Build the project and publish two Docker images to the local Docker daemon.
    • sbt manage/docker:publishLocal
    • sbt compute/docker:publishLocal
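Step 2 runs `eval $(minikube docker-env)` so that the images built in step 3 land in minikube's Docker daemon rather than the host's. The sketch below shows the mechanism only: `minikube docker-env` prints `export` statements, and `eval` applies them to the current shell. The stand-in function and the address it prints are illustrative, not real minikube output for any particular machine.

```shell
# Stand-in for `minikube docker-env`; the values are illustrative only.
fake_docker_env() {
  echo 'export DOCKER_TLS_VERIFY="1"'
  echo 'export DOCKER_HOST="tcp://192.168.99.100:2376"'
}

# eval executes the printed export statements in the current shell,
# so later docker (and sbt docker:publishLocal) calls use that daemon.
eval "$(fake_docker_env)"
echo "docker client now targets: $DOCKER_HOST"
```

The variables only apply to the shell in which `eval` was run, so the sbt commands need to be started from that same shell.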

Build and Run on Google Kubernetes Engine

  1. See GKE Quickstart
  2. Get short-lived access to us.gcr.io, the Google Container Registry.
    • gcloud docker -a
  3. Build project and publish it to the Google Container Registry.
    • sbt manage/docker:publish
    • sbt compute/docker:publish
  4. Create a remote cluster for running our Kubernetes scripts on.
    • gcloud container clusters list
    • gcloud container clusters create sclr-01 --zone us-central1-a --num-nodes 1 --cluster-version=1.9.2-gke.1
    • gcloud container clusters get-credentials sclr-01
    • gcloud container clusters describe sclr-01
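Images published to the Google Container Registry are addressed as `HOST/PROJECT_ID/IMAGE:TAG`. As a small sketch of what the published references look like, assuming the `us.gcr.io` host from step 2; the project ID `my-project` and the image names `compute` and `manage` are placeholders, and the names actually produced by the build may differ.

```shell
# Compose a Google Container Registry image reference.
# $1 = project ID, $2 = image name, $3 = tag (defaults to "latest").
# The function name and example arguments are hypothetical.
gcr_image() {
  printf 'us.gcr.io/%s/%s:%s\n' "$1" "$2" "${3:-latest}"
}

gcr_image my-project compute
gcr_image my-project manage 0.1.0
```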

Common Deploy/Run commands

  1. Deploy using Kubernetes scripts.
    • cd src/main/resources/kubernetes/; kubectl create -f mysql.yaml; kubectl create -f compute-pods.yaml; kubectl create -f manage-pods.yaml; cd ../../../..;
  2. Check running pods, services, etc.
    • kubectl get all -o wide
  3. Send a single POST request (from the compute-0 pod) to the http-service endpoint. This kicks off the job.
    • kubectl exec -ti compute-0 -- curl -vH "Content-Type: application/json" -X POST -d '{"name":"m5000","dnfSize":2,"optionalSample":200,"useLPNorm":true,"mu":0.24}' http-service.default.svc.cluster.local:8080/begin
  4. Scale the compute nodes to 50.
    • kubectl scale statefulsets compute --replicas=50
  5. Make a connection to the MySQL server.
    • kubectl run -it --rm --image=mysql:5.7 --restart=Never mysql-client -- mysql -h mysql-service -pMYSQL_PASSWORD
  6. Dump an entire schema from the MySQL server to our local directory.
    • kubectl exec -ti MYSQL_POD_NAME -- mysqldump --add-drop-database --databases medium -pMYSQL_PASSWORD > backup.sql
  7. Delete all local Kubernetes pods, including MySQL, etc.
    • kubectl delete pvc mysql-pv-claim; kubectl delete all -l app=sclr
    • Note: DOUBLE CHECK EVERYTHING IS DOWN. If a command fails, resources may keep running and costing money.
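The JSON body sent in step 3 is easy to mistype when the parameters change between runs. Here is a minimal sketch of a helper that builds it from positional arguments. The field names are taken from the curl command in step 3; the function name and argument order are hypothetical, and the helper does no escaping, so it assumes the job name contains no quotes or backslashes.

```shell
# Build the JSON body for the /begin request.
# $1 = name, $2 = dnfSize, $3 = optionalSample, $4 = useLPNorm, $5 = mu
# Illustrative helper; values are inserted as-is, without JSON escaping.
begin_payload() {
  printf '{"name":"%s","dnfSize":%s,"optionalSample":%s,"useLPNorm":%s,"mu":%s}\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Reproduces the body used in step 3.
begin_payload m5000 2 200 true 0.24
```

It can then be passed to curl in place of the literal string, for example `-d "$(begin_payload m5000 2 200 true 0.24)"`.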

Benchmarks

We use the sbt-jmh plugin for benchmarking. See sbt-jmh.
