huifu1018 / kubeshare

This project forked from nthu-lsalab/kubeshare

Share GPU between Pods in Kubernetes

License: Apache License 2.0

Makefile 0.56% Go 93.53% Shell 0.51% Dockerfile 2.44% C 0.55% Python 2.42%

KubeShare

Features

  • Treat GPU as a first class resource.
  • Compatible with native "nvidia.com/gpu" system.
  • Extensible architecture supports custom scheduling policies without modifying KubeShare.

Prerequisite & Limitation

  • A Kubernetes cluster with garbage collection and DNS enabled, and the NVIDIA GPU device plugin installed.
  • GPUs must be attached to containers through the NVIDIA_VISIBLE_DEVICES environment variable (docker and nvidia-docker2 version < 19).
  • Only one GPU model per node.

CUDA Version Compatibility

CUDA Version    Status
9.0             Unknown
9.1             Unknown
9.2             Unknown
10.0            Yes
10.1            Unknown
10.2            Unknown

Run

Installation

kubectl create -f https://lsalab.cs.nthu.edu.tw/~ericyeh/KubeShare/v0.9/crd.yaml
kubectl create -f https://lsalab.cs.nthu.edu.tw/~ericyeh/KubeShare/v0.9/device-manager.yaml
kubectl create -f https://lsalab.cs.nthu.edu.tw/~ericyeh/KubeShare/v0.9/scheduler.yaml

Uninstallation

kubectl delete -f https://lsalab.cs.nthu.edu.tw/~ericyeh/KubeShare/v0.9/crd.yaml
kubectl delete -f https://lsalab.cs.nthu.edu.tw/~ericyeh/KubeShare/v0.9/device-manager.yaml
kubectl delete -f https://lsalab.cs.nthu.edu.tw/~ericyeh/KubeShare/v0.9/scheduler.yaml

SharePod

SharePod Lifecycle

  1. The user creates a SharePod to request a portion of a GPU.
  2. kubeshare-scheduler schedules pending SharePods.
  3. kubeshare-device-manager creates a corresponding Pod object behind the SharePod, with the same namespace and name plus some extra critical settings. (The Pod starts running.)
  4. kubeshare-device-manager synchronizes the Pod's ObjectMeta and PodStatus to SharePodStatus.
  5. The user deletes the SharePod. (The Pod is also garbage collected by K8s.)
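
Step 3 can be pictured with the minimal Go sketch below. It uses simplified stand-in types rather than the real Kubernetes or KubeShare types, and every identifier in it (objectMeta, sharePod, pod, podFor) is hypothetical. Recording the SharePod as the Pod's owner is an assumption about how the garbage collection in step 5 is triggered; the "extra critical settings" are not modelled.

```go
package main

import "fmt"

// objectMeta is a simplified stand-in for Kubernetes ObjectMeta.
type objectMeta struct {
	Namespace string
	Name      string
	OwnerKind string // stand-in for an ownerReferences entry (assumption)
	OwnerName string
}

// sharePod and pod are stand-ins; Spec is a placeholder for the PodSpec.
type sharePod struct {
	Meta objectMeta
	Spec string
}

type pod struct {
	Meta objectMeta
	Spec string
}

// podFor builds the Pod behind a SharePod (step 3): same namespace, same
// name, same spec. Marking the SharePod as owner is an assumption about how
// the Pod ends up garbage collected in step 5.
func podFor(sp sharePod) pod {
	return pod{
		Meta: objectMeta{
			Namespace: sp.Meta.Namespace,
			Name:      sp.Meta.Name,
			OwnerKind: "SharePod",
			OwnerName: sp.Meta.Name,
		},
		Spec: sp.Spec,
	}
}

func main() {
	sp := sharePod{
		Meta: objectMeta{Namespace: "default", Name: "sharepod1"},
		Spec: "cuda container",
	}
	p := podFor(sp)
	fmt.Printf("%s/%s owned by %s %s\n",
		p.Meta.Namespace, p.Meta.Name, p.Meta.OwnerKind, p.Meta.OwnerName)
}
```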

SharePod Specification

apiVersion: kubeshare.nthu/v1
kind: SharePod
metadata:
  name: sharepod1
  annotations:
    "kubeshare/gpu_request": "0.5" # required if allocating GPU
    "kubeshare/gpu_limit": "1.0" # required if allocating GPU
    "kubeshare/gpu_mem": "1073741824" # required if allocating GPU # 1Gi, in bytes
    "kubeshare/sched_affinity": "red" # optional
    "kubeshare/sched_anti-affinity": "green" # optional
    "kubeshare/sched_exclusion": "blue" # optional
spec: # PodSpec
  containers:
  - name: cuda
    image: nvidia/cuda:9.0-base
    command: ["nvidia-smi", "-L"]
    resources:
      limits:
        cpu: "1"
        memory: "500Mi"

Because K8s forbids floating-point requests for custom devices, the GPU resource usage definitions are moved to annotations.

  • kubeshare/gpu_request (required if allocating GPU): guaranteed GPU usage of the Pod, gpu_request <= "1.0".
  • kubeshare/gpu_limit (required if allocating GPU): upper bound on the Pod's GPU usage, reachable only while the GPU still has free resources, gpu_request <= gpu_limit <= "1.0".
  • kubeshare/gpu_mem (required if allocating GPU): maximum GPU memory usage of the Pod, in bytes.
  • spec (required): a normal PodSpec definition to run in K8s.
  • kubeshare/sched_affinity (optional): the SharePod is scheduled only together with SharePods that carry the same sched_affinity label, or onto an idle GPU.
  • kubeshare/sched_anti-affinity (optional): SharePods with the same sched_anti-affinity label are not scheduled together.
  • kubeshare/sched_exclusion (optional): only one sched_exclusion label may exist on a device; the empty label counts as a label.
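
The constraints on the three GPU annotations can be checked with a few lines of Go. The following is a minimal sketch, not KubeShare's actual validation code: the GPUSpec type and the parseGPUSpec function are hypothetical, and rejecting negative values or a non-positive gpu_mem is an additional assumption that the list above does not state.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// Annotation keys as documented above.
const (
	keyGPURequest = "kubeshare/gpu_request"
	keyGPULimit   = "kubeshare/gpu_limit"
	keyGPUMem     = "kubeshare/gpu_mem"
)

// GPUSpec holds the parsed values. The type and its field names are
// hypothetical; they are not taken from the KubeShare code base.
type GPUSpec struct {
	Request  float64 // guaranteed share of one GPU
	Limit    float64 // upper bound while the GPU has free resources
	MemBytes int64   // maximum GPU memory, in bytes
}

// parseGPUSpec enforces gpu_request <= gpu_limit <= 1.0 and parses gpu_mem
// as a byte count. Rejecting negative requests and a non-positive gpu_mem
// is an extra assumption of this sketch.
func parseGPUSpec(ann map[string]string) (GPUSpec, error) {
	var s GPUSpec
	var err error
	for _, k := range []string{keyGPURequest, keyGPULimit, keyGPUMem} {
		if _, ok := ann[k]; !ok {
			return s, fmt.Errorf("missing annotation %q", k)
		}
	}
	if s.Request, err = strconv.ParseFloat(ann[keyGPURequest], 64); err != nil {
		return s, fmt.Errorf("%s: %w", keyGPURequest, err)
	}
	if s.Limit, err = strconv.ParseFloat(ann[keyGPULimit], 64); err != nil {
		return s, fmt.Errorf("%s: %w", keyGPULimit, err)
	}
	if s.MemBytes, err = strconv.ParseInt(ann[keyGPUMem], 10, 64); err != nil {
		return s, fmt.Errorf("%s: %w", keyGPUMem, err)
	}
	// Written as a negated chain so that NaN values are rejected as well.
	if !(0 <= s.Request && s.Request <= s.Limit && s.Limit <= 1.0) {
		return s, errors.New("need 0 <= gpu_request <= gpu_limit <= 1.0")
	}
	if s.MemBytes <= 0 {
		return s, errors.New("gpu_mem must be a positive number of bytes")
	}
	return s, nil
}

func main() {
	spec, err := parseGPUSpec(map[string]string{
		keyGPURequest: "0.5",
		keyGPULimit:   "1.0",
		keyGPUMem:     "1073741824", // 1 GiB = 1 << 30 bytes
	})
	fmt.Println(spec, err)
}
```

The annotation values are strings, so gpu_mem has to be written out in bytes; 1Gi is 1 << 30 = 1073741824.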

SharePod usage demo clip

All yaml files shown in the clip are located in REPO_ROOT/doc/yaml.

SharePod with NodeName and GPUID (advanced)

This section explains how to place a SharePod on a GPU that is already used by other SharePods.
To schedule a SharePod, kubeshare-scheduler fills in metadata.annotations["kubeshare/GPUID"] and spec.nodeName.

apiVersion: kubeshare.nthu/v1
kind: SharePod
metadata:
  name: sharepod1
  annotations:
    "kubeshare/gpu_request": "0.5"
    "kubeshare/gpu_limit": "1.0"
    "kubeshare/gpu_mem": "1073741824" # 1Gi, in bytes
    "kubeshare/GPUID": "abcde"
spec: # PodSpec
  nodeName: node01
  containers:
  - name: cuda
    image: nvidia/cuda:9.0-base
    command: ["nvidia-smi", "-L"]
    resources:
      limits:
        cpu: "1"
        memory: "500Mi"

A GPU is shared between multiple SharePods if the SharePods own the same <nodeName, GPUID> pair.
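
To make the pairing rule concrete, the sketch below groups SharePods by their <nodeName, GPUID> pair. It is only an illustration: sharePod and gpuKey are simplified, hypothetical types, not the ones generated for KubeShare, and sharepod2 and sharepod3 are made-up names.

```go
package main

import "fmt"

// gpuKey is the <nodeName, GPUID> pair that identifies one shared GPU.
type gpuKey struct {
	NodeName string
	GPUID    string
}

// sharePod is a simplified stand-in for the real SharePod object.
type sharePod struct {
	Name     string
	NodeName string // spec.nodeName
	GPUID    string // metadata.annotations["kubeshare/GPUID"]
}

// groupByGPU returns, for each <nodeName, GPUID> pair, the names of the
// SharePods bound to it. SharePods in the same group share one GPU.
func groupByGPU(pods []sharePod) map[gpuKey][]string {
	groups := make(map[gpuKey][]string)
	for _, p := range pods {
		k := gpuKey{NodeName: p.NodeName, GPUID: p.GPUID}
		groups[k] = append(groups[k], p.Name)
	}
	return groups
}

func main() {
	pods := []sharePod{
		{Name: "sharepod1", NodeName: "node01", GPUID: "abcde"},
		{Name: "sharepod2", NodeName: "node01", GPUID: "abcde"}, // same pair: shares the GPU
		{Name: "sharepod3", NodeName: "node02", GPUID: "abcde"}, // other node: a different GPU
	}
	for k, names := range groupByGPU(pods) {
		fmt.Printf("%s/%s: %v\n", k.NodeName, k.GPUID, names)
	}
}
```

The GPUID alone is not enough to identify a GPU in this sketch: the same string on two different nodes refers to two different devices.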

The following demonstrates how kubeshare-scheduler schedules SharePods with the GPUID mechanism on a single node with two physical GPUs:

Initial status

GPU1(null)       GPU2(null)
+--------------+ +--------------+
|              | |              |
|              | |              |
|              | |              |
+--------------+ +--------------+

Pending list: Pod1(0.2)
kubeshare-scheduler decides to bind Pod1 on an idle GPU:
    randomString(5) => "zxcvb"
    Register Pod1 with GPUID: "zxcvb"

GPU1(null)       GPU2(zxcvb)
+--------------+ +--------------+
|              | |   Pod1:0.2   |
|              | |              |
|              | |              |
+--------------+ +--------------+

Pending list: Pod2(0.3)
kubeshare-scheduler decides to bind Pod2 on an idle GPU:
    randomString(5) => "qwert"
    Register Pod2 with GPUID: "qwert"

GPU1(qwert)      GPU2(zxcvb)
+--------------+ +--------------+
|   Pod2:0.3   | |   Pod1:0.2   |
|              | |              |
|              | |              |
+--------------+ +--------------+

Pending list: Pod3(0.4)
kubeshare-scheduler decides to share the GPU which Pod1 is using with Pod3:
    Register Pod3 with GPUID: "zxcvb"

GPU1(qwert)      GPU2(zxcvb)
+--------------+ +--------------+
|   Pod2:0.3   | |   Pod1:0.2   |
|              | |   Pod3:0.4   |
|              | |              |
+--------------+ +--------------+

Delete Pod2 (GPUID qwert no longer exists)

GPU1(null)       GPU2(zxcvb)
+--------------+ +--------------+
|              | |   Pod1:0.2   |
|              | |   Pod3:0.4   |
|              | |              |
+--------------+ +--------------+

Pending list: Pod4(0.5)
kubeshare-scheduler decides to bind Pod4 on an idle GPU:
    randomString(5) => "asdfg"
    Register Pod4 with GPUID: "asdfg"

GPU1(asdfg)      GPU2(zxcvb)
+--------------+ +--------------+
|   Pod4:0.5   | |   Pod1:0.2   |
|              | |   Pod3:0.4   |
|              | |              |
+--------------+ +--------------+
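
The bookkeeping behind this walkthrough can be replayed with a small Go sketch. It is written under several assumptions and is not the scheduler's code: the node type and its methods are hypothetical, the caller supplies the GPUID instead of generating a random string, a new GPUID is placed on the first idle physical GPU (so the physical placement may differ from the diagrams), and the sum of gpu_request values on one GPU is capped at 1.0.

```go
package main

import "fmt"

// node tracks which GPUID is registered on each physical GPU and which pods
// are bound to each GPUID. All names here are hypothetical.
type node struct {
	slots []string                      // GPUID per physical GPU; "" means idle
	usage map[string]map[string]float64 // GPUID -> pod name -> gpu_request
}

func newNode(gpus int) *node {
	return &node{
		slots: make([]string, gpus),
		usage: map[string]map[string]float64{},
	}
}

// bind registers pod under gpuid. An unknown GPUID needs an idle physical
// GPU; a known GPUID means the pod shares that GPU with the pods on it.
func (n *node) bind(pod string, request float64, gpuid string) error {
	total := request
	for _, r := range n.usage[gpuid] {
		total += r
	}
	if total > 1.0+1e-9 { // capacity check is an assumption of this sketch
		return fmt.Errorf("GPUID %q would be over-committed", gpuid)
	}
	if _, known := n.usage[gpuid]; !known {
		slot := -1
		for i, id := range n.slots {
			if id == "" {
				slot = i
				break
			}
		}
		if slot < 0 {
			return fmt.Errorf("no idle GPU for new GPUID %q", gpuid)
		}
		n.slots[slot] = gpuid
		n.usage[gpuid] = map[string]float64{}
	}
	n.usage[gpuid][pod] = request
	return nil
}

// remove unbinds pod; a GPUID with no pods left disappears and its physical
// GPU becomes idle again.
func (n *node) remove(pod string) {
	for gpuid, pods := range n.usage {
		if _, ok := pods[pod]; !ok {
			continue
		}
		delete(pods, pod)
		if len(pods) == 0 {
			delete(n.usage, gpuid)
			for i, id := range n.slots {
				if id == gpuid {
					n.slots[i] = ""
				}
			}
		}
		return
	}
}

func main() {
	check := func(err error) {
		if err != nil {
			panic(err)
		}
	}
	n := newNode(2)
	check(n.bind("Pod1", 0.2, "zxcvb")) // new GPUID on an idle GPU
	check(n.bind("Pod2", 0.3, "qwert")) // new GPUID on the other idle GPU
	check(n.bind("Pod3", 0.4, "zxcvb")) // shares the GPU Pod1 is using
	n.remove("Pod2")                    // GPUID qwert no longer exists
	check(n.bind("Pod4", 0.5, "asdfg")) // the freed GPU gets a new GPUID
	fmt.Println(n.slots, n.usage)
}
```

The point the sketch illustrates is that a GPUID lives only as long as at least one pod is bound to it; once Pod2 is gone, the freed GPU is registered under a fresh GPUID rather than reusing "qwert".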

More details in System Architecture

Build

Compiling

git clone https://github.com/NTHU-LSALAB/KubeShare.git
cd KubeShare
make
  • bin/kubeshare-scheduler: schedules pending SharePods to a node and device, i.e. <nodeName, GPUID>.
  • bin/kubeshare-device-manager: handles scheduled SharePods and creates the Pod object. Communicates with kubeshare-config-client on every node.
  • bin/kubeshare-config-client: a daemonset on every node which configures the GPU isolation settings.

Directories & Files

  • cmd/: where the main functions of the three binaries are located.
  • crd/: CRD specification yaml file.
  • docker/: materials for all docker images used in the yaml files.
  • pkg/: includes KubeShare core components, SharePod, and the API server clientset produced by code-generator.
  • code-gen.sh: code-generator script.
  • go.mod: KubeShare dependencies.

GPU Isolation Library

Please refer to Gemini.

TODO

Convert vGPU UUID update trigger method from dummy Pod creation handler to dummy Pod sending data to controller.
Add PodSpec.SchedulerName support to kubeshare-scheduler.

Issues

If you run into any problems, please open a GitHub issue. Thanks.
