KubeCon EU 2018 talk on automating GPU infrastructure for Kubernetes on Container Linux

License: MIT License

KubeCon EU 2018

This repository contains the demo code for my KubeCon EU 2018 talk about automating GPU infrastructure for Kubernetes on Container Linux.

Prerequisites

You will need a Google Cloud account with available quota for NVIDIA GPUs.
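
One way to check the GPU quota before starting is to look at the region's quota list. The following is a minimal sketch, not part of the demo: it assumes the `gcloud` CLI is installed and authenticated, that `gcloud compute regions describe` prints its default YAML with `limit`, `metric` and `usage` keys per quota entry, and that GPU metrics contain `NVIDIA` in their name. The `gpu_quota` helper and the `us-central1` default are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# gpu_quota [REGION]: print the NVIDIA GPU quota entries for one region.
# REGION defaults to us-central1, which is only an example value.
gpu_quota() {
  region="${1:-us-central1}"
  # Each quota entry is assumed to span three YAML lines (limit, metric,
  # usage), so one line of context on either side of the metric line shows
  # the limit and the current usage for that GPU type.
  gcloud compute regions describe "$region" | grep -i -B1 -A1 'metric: NVIDIA'
}

# Example usage (commented out so that sourcing this file runs nothing):
#   gpu_quota europe-west4
```

A limit of 0 for every NVIDIA metric in the chosen region would mean a quota increase is needed before the cluster can get a GPU node.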

Getting Started

Edit the require.tf Terraform file: uncomment the relevant lines and fill in the details for your Google Cloud project:

$EDITOR require.tf

Modify the provided terraform.tfvars file to suit your project:

$EDITOR terraform.tfvars

Running

  1. create cluster:

    terraform apply --auto-approve
  2. get nodes:

    export KUBECONFIG="$(pwd)"/assets/auth/kubeconfig
    watch -n 1 kubectl get nodes
  3. create GPU manifests:

    kubectl apply -f manifests
  4. check status of driver installer:

    kubectl logs $(kubectl get pods -n kube-system | grep nvidia-driver-installer | awk '{print $1}') -c modulus -n kube-system -f
  5. check status of device plugin:

    kubectl logs $(kubectl get pods -n kube-system | grep nvidia-gpu-device-plugin | awk '{print $1}' | head -n1 | tail -n1) -n kube-system -f
  6. verify worker node has allocatable GPUs:

    kubectl describe node $(kubectl get nodes | grep worker | awk '{print $1}')
  7. let's inspect the GPU workload:

    less manifests/darkapi.yaml
  8. let's see if the GPU workload has been scheduled:

    watch -n 2 kubectl get pods
    kubectl logs $(kubectl get pods | grep darkapi | awk '{print $1}') -f
  9. for fun, let's test the GPU workload:

    export INGRESS=$(terraform output | grep ingress_static_ip | awk '{print $3}')
    ~/code/darkapi/client http://$INGRESS/api/yolo
  10. finally, let's clean up:

    terraform destroy --auto-approve
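
The pod lookups in steps 4 and 5 and the node check in step 6 repeat the same `grep | awk` pattern, so they can be wrapped in small helpers. The sketch below is an illustration rather than part of the demo: `first_pod` and `gpu_allocatable` are hypothetical names, and the `nvidia.com/gpu` resource name is an assumption about what the device plugin advertises on the node.

```shell
#!/bin/sh
# first_pod NAMESPACE PATTERN: print the name of the first pod in NAMESPACE
# whose name matches PATTERN (a grep pattern), or nothing if none matches.
first_pod() {
  kubectl get pods -n "$1" --no-headers -o custom-columns=NAME:.metadata.name \
    | grep -- "$2" | head -n1
}

# gpu_allocatable: list every node with its allocatable GPU count.
# Assumes GPUs are exposed as the nvidia.com/gpu resource; the dot in the
# resource name is escaped so it is not read as a field separator.
gpu_allocatable() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu'
}

# Example usage (commented out so that sourcing this file runs nothing):
#   kubectl logs -n kube-system -c modulus -f "$(first_pod kube-system nvidia-driver-installer)"
#   kubectl logs -n kube-system -f "$(first_pod kube-system nvidia-gpu-device-plugin)"
#   gpu_allocatable
```

Nodes without GPUs should show `<none>` in the GPUS column, which makes the worker node easy to spot without reading the full `kubectl describe` output.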

Projects Leveraged In This Demo

Component                  URL
Kubernetes installer       https://github.com/poseidon/typhoon
GPU driver installer       https://github.com/squat/modulus
Kubernetes device plugin   https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/device-plugins/nvidia-gpu/daemonset.yaml
Sample workload            https://github.com/squat/darkapi
