comcast / kube-yarn

Running YARN on Kubernetes with PetSet controller.

License: Apache License 2.0

Makefile 64.33% Shell 35.67%

kube-yarn's Introduction

YARN on Kubernetes

Tags and Dockerfiles

v1.0.0, latest (Dockerfile), Hadoop 2.7.2, Zeppelin 0.7.0

Docker image for kube-yarn

Image for deploying the kube-yarn artifacts without any local dependencies.

Example usage:

Invoke the default make target to deploy the full stack using the kubeconfig from your current home directory.

docker run -it --rm -v ${HOME}/.kube/config:/root/.kube/config:ro danisla/kube-yarn:latest

Make sure to mount any additional volumes for files referenced within the kube config.

To remove the resources, invoke the clean target:

docker run -it --rm -v ${HOME}/.kube/config:/root/.kube/config:ro danisla/kube-yarn:latest clean

Architecture Diagram

StatefulSet Overview

The Hadoop components are bootstrapped using files from a ConfigMap, which provides the init script and config XML files. This allows users to fully customize their distribution for their use cases.

The ACP env var can also be set to a comma-separated list of URLs to be downloaded and added to the classpath at runtime.
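As a rough illustration of how a comma-separated ACP value could be turned into classpath entries, here is a minimal sketch. It is not taken from the repo's bootstrap script: the function name `acp_classpath`, the target directory, and the example URLs are all hypothetical, and the download step is only indicated in a comment.

```shell
#!/bin/sh
# Hypothetical helper, not the repo's actual bootstrap logic.
# Maps each URL in a comma-separated list to <jar_dir>/<file name>
# and joins the results with ':' so they can be appended to a classpath.
acp_classpath() (
    acp="$1"
    jar_dir="$2"
    cp=""
    set -f      # disable globbing so '?' or '*' in a URL is not expanded
    IFS=','     # split on commas (the function body is a subshell, so this stays local)
    for url in $acp; do
        [ -n "$url" ] || continue          # skip empty entries such as "a,,b"
        file="${url##*/}"                  # keep only the file name part of the URL
        # A real script would download here, e.g.:
        #   curl -fsSL -o "${jar_dir}/${file}" "$url"
        cp="${cp:+$cp:}${jar_dir}/${file}"
    done
    printf '%s\n' "$cp"
)

# Example with placeholder URLs and a placeholder directory.
acp_classpath "http://example.com/a.jar,http://example.com/lib/b.jar" /tmp/acp-jars
```

The function body runs in a subshell so that the changes to `IFS` and globbing do not leak into the calling script.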

Logs from the ${HADOOP_PREFIX}/logs directory are tailed so that they can be viewed by attaching to the container.

`hdfs-nn` - HDFS Name Node

The namenode daemon runs in this pod container.

Currently, only 1 namenode is supported (no HA w/Zookeeper or secondary namenode).

`hdfs-dn` - HDFS Data Node

The datanode daemon runs in this pod container.

There can be 1 or more of these, scaled by changing the number of replicas in the spec.
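Besides editing the replica count in the spec, a running StatefulSet can usually be scaled with `kubectl scale`. The sketch below assumes the StatefulSet is named `hdfs-dn` and lives in the `yarn-cluster` namespace; the helper name `scale_statefulset` and the `RUN` switch are hypothetical. By default it only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print (or, with RUN=1, execute) the kubectl command
# that scales a StatefulSet. The default namespace is an assumption.
scale_statefulset() {
    name="$1"
    replicas="$2"
    ns="${3:-yarn-cluster}"
    # Build the command as positional parameters so it can be printed or run as-is.
    set -- kubectl --namespace "$ns" scale statefulset "$name" --replicas="$replicas"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Dry run: prints the command for three datanode replicas.
scale_statefulset hdfs-dn 3
```

Setting `RUN=1` in the environment executes the command against the cluster selected by the current kubeconfig.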

`yarn-rm` - YARN Resource Manager

The resource manager daemon runs in this pod container.

Currently, only 1 resource manager is supported (no HA w/Zookeeper).

The WebUI can be accessed using the service port 8088 or at localhost:8088 after running make pf.

`yarn-nm` - YARN Node Manager

The node manager daemon runs in this pod container.

There can be 1 or more of these, scaled by changing the number of replicas in the pod spec.

The number of vcores and the amount of memory registered with the resource manager are inferred from the pod spec resources using the Downward API. These are made available to the container via the env vars MY_CPU_LIMIT and MY_MEM_LIMIT respectively, and the bootstrap script then adds them to yarn-site.xml at runtime.
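To make the mapping concrete, the following sketch renders the two standard YARN node manager properties from the MY_CPU_LIMIT and MY_MEM_LIMIT values. It is an illustration rather than the repo's bootstrap script: the function name is hypothetical, the fallback values are assumptions, and it assumes the memory value is already expressed in MiB (which depends on the divisor used in the Downward API field reference).

```shell
#!/bin/sh
# Illustrative sketch only; the repo's bootstrap script may work differently.
# Assumption: the memory value passed in is already expressed in MiB.

# Print the yarn-site.xml properties for the given vcores ($1) and memory in MiB ($2).
yarn_nm_resource_properties() {
    printf '<property><name>yarn.nodemanager.resource.cpu-vcores</name><value>%s</value></property>\n' "$1"
    printf '<property><name>yarn.nodemanager.resource.memory-mb</name><value>%s</value></property>\n' "$2"
}

# The fallbacks (1 vcore, 1024 MiB) are assumptions for running outside a pod.
yarn_nm_resource_properties "${MY_CPU_LIMIT:-1}" "${MY_MEM_LIMIT:-1024}"
```

The printed properties would still need to be placed inside the `<configuration>` element of yarn-site.xml.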

`zeppelin` - Zeppelin Notebook

The Zeppelin notebook is run in this pod container and can be used to run Spark jobs or hadoop jobs using the %sh shell interpreter.

The K8S yarn cluster config is mounted from the same ConfigMap over the top of the Zeppelin image dir to give it access to the cluster without modifying the base image.

The Zeppelin web app can be accessed using the service port 8080 or at localhost:8081 after running make pf.

Running locally

This repo uses minikube to start a k8s cluster locally.

The Makefile contains targets for starting the cluster and helper targets for kubectl to apply the K8S manifests and interact with the pods.

When running locally with minikube, make sure your VM has enough resources: set the number of CPUs to 8 and memory to 8192 MB. Otherwise, the pods won't have enough resources to fully start and will be stuck in the Pending phase.

Starting minikube manually:

minikube start --cpus 8 --memory 8192

Or, start with the Makefile which will download minikube and start the cluster:

make minikube

Start the YARN cluster:

make

This will create all of the components for the cluster.
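If the VM is undersized, some pods may stay in the Pending phase as described earlier. A small filter like the one below can list them from the default `kubectl get pods` table output. The helper name `pending_pods` is hypothetical, the namespace in the usage comment is an assumption, and the filter relies on STATUS being the third column of that table.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods` table output on stdin and
# print the names of pods whose STATUS column (third field) is Pending.
pending_pods() {
    awk 'NR > 1 && $3 == "Pending" { print $1 }'
}

# Usage against a live cluster (namespace is an assumption):
#   kubectl --namespace yarn-cluster get pods | pending_pods
```

An empty result means no pod is reported as Pending.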

Run this to create port forwards to localhost:

make pf

You should now be able to access the following:

YARN WebUI: http://localhost:8088
Zeppelin: http://localhost:8081

Full stack test

Test hdfs, yarn and mapred using TestDFSIO, submitted from one of the node managers:

make test

This runs the following command on yarn-nm-0:

/usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.6.0-tests.jar TestDFSIO -write -nrFiles 5 -fileSize 128MB -resFile /tmp/TestDFSIOwrite.txt

Spark on YARN in Zeppelin

In your browser, go to Zeppelin at: http://localhost:8081

Create a new note and run this in a paragraph:

sc.parallelize(1 to 1000).count

Press shift-enter to execute the paragraph.

The first command executed creates the Spark job on YARN and takes a few seconds. When it completes, you should get the result 1000.

Make targets:

`init`

Creates the namespace, configmaps, service account and hosts-disco service.

`create-apps`

Creates hdfs, yarn and zeppelin apps.

`pf`

Creates local port forwards for yarn and zeppelin.

`dfsreport`

Gets the state of HDFS.

`get-yarn-nodes`

Lists the registered yarn nodes by executing this in the node manager pod: yarn node -list

`shell-hdfs-nn-0`

Drops into a shell on the namenode.

`shell-yarn-rm-0`

Drops into a shell on the resource manager.

`shell-zeppelin-0`

Drops into a shell on the zeppelin container.

Shutting down

make clean

Shut down the cluster:

minikube stop

Or:

make stop-minikube

kube-yarn's Issues

Are you all still using this?

Just wondering what the support / community looks like

Manifest is not available when running the official Docker image

I followed the instructions and ran: docker run -it --rm -v ${HOME}/.kube/config:/root/.kube/config:ro danisla/kube-yarn:latest

It always fails with the error below:
kubectl --namespace yarn-cluster create -f manifests/yarn-cluster-namespace.yaml
error: error validating "manifests/yarn-cluster-namespace.yaml": error validating data: the server could not find the requested resource; if you choose to ignore these errors, turn validation off with --validate=false
Makefile:48: recipe for target 'manifests/yarn-cluster-namespace.yaml' failed
make: *** [manifests/yarn-cluster-namespace.yaml] Error 1

According to the Dockerfile, the manifests are built into the Docker image along with the Makefile. It looks like the Makefile is available but the manifests are not?

update hyperkube version in Dockerfile

It seems the current build cannot start a YARN cluster.

HDFS data/metadata is lost when a pod crashes or restarts.

I saw that the hdfs nn/dn StatefulSets store HDFS data and metadata directly in container-local directories (/root/hdfs/namenode, /root/hdfs/datanode) instead of using a k8s volume or PersistentVolume.
As a result, HDFS data/metadata is lost when a pod crashes or restarts. We need to find a way to persist the data by leveraging k8s storage features such as PV/PVC.

Not able to run the examples programs on YARN

Hi - I created a Kubernetes cluster (3 nodes, 8 vCPUs and 30 GB RAM on each node) and ran the Makefile in this repo to create the HDFS and YARN clusters. I am able to read from and write to HDFS. When I run the sample programs on YARN, the job gets stuck. Please see the output below. Any help would be really appreciated.

root@yarn-rm-0:/usr/local/hadoop# ./bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 16 1000
Number of Maps  = 16
Samples per Map = 1000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Wrote input for Map #10
Wrote input for Map #11
Wrote input for Map #12
Wrote input for Map #13
Wrote input for Map #14
Wrote input for Map #15
Starting Job
17/06/11 08:15:27 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/06/11 08:15:28 INFO input.FileInputFormat: Total input files to process : 16
17/06/11 08:15:28 INFO mapreduce.JobSubmitter: number of splits:16
17/06/11 08:15:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1497166645429_0006
17/06/11 08:15:28 INFO impl.YarnClientImpl: Submitted application application_1497166645429_0006
17/06/11 08:15:28 INFO mapreduce.Job: The url to track the job: http://yarn-rm-0:8088/proxy/application_1497166645429_0006/
17/06/11 08:15:28 INFO mapreduce.Job: Running job: job_1497166645429_0006

DataNode registers with the NameNode using the node machine's IP address

After I started the HDFS cluster on Kubernetes, the DataNode registered with the NameNode using the local machine's IP address, so the NameNode cannot communicate with the DataNode. The HDFS cluster is effectively unavailable, because I can't put data into it!

For example

Pod IP: 172.30.10.22
Node IP: 172.20.0.115
The DataNode registers with the NameNode using IP and port 172.20.0.115:50010
In fact, the real address of the DataNode is 172.30.10.22

How to fix this problem?

Any plan to support HA w/Zookeeper ?

Any plan to support HA w/Zookeeper for Name Node and Resource Manager?

Namenode start failed

The namenode fails to start because the address hdfs-nn:9000 cannot be assigned.
There is a bug in bootstrap.sh.

sed -i s/hdfs://hdfs-nn:9000/0.0.0.0:9000/ /usr/local/hadoop/etc/hadoop/core-site.xml

Exposing namenode to external application

I deployed your setup on our existing Kubernetes cluster. As of now, I am only able to access the Zeppelin UI externally. I want my external client application to push data to HDFS, but the namenode service appears to be headless and is not exposed outside the cluster. If I change the service type to NodePort or LoadBalancer, the jobs fail. What changes can I make in order to expose the namenode to external applications?

Have you tried running a Spark Cluster on top of this?

Sorry not really an issue.

why not use volumeClaimTemplates in dn?

expose dn replicas config, meanwhile using one pvc????
