
hhy5277 / raspi-cluster

This project forked from rcarmo/raspi-cluster


Notes and scripts for setting up (yet another) Raspberry Pi computing cluster

License: Other

Makefile 8.43% Go 26.88% Dockerfile 6.43% Python 25.64% C 32.63%


raspi-cluster

[Image: Pi 2]

What?

A while ago I decided to build a small cluster of Raspberry Pi boards. I've since upgraded to Pi 2 boards, and this repository is used for versioning design notes, configuration files and sundry.

Why?

I wanted something challenging to do in terms of distributed processing, and lacked dedicated hardware to do it. There's a lot to be learned even from simple, unsophisticated solutions, and virtual machines can only do so much.

How?

The cluster consists of five nodes: a master and four slaves. The master acts as gateway, DHCP server and NFS server, and the slave nodes get their IP addresses and their /srv/jobs directory from it.

All slave nodes are identical -- completely identical, except for hostname and MAC address, and there is no need to log in and configure things manually for each node.
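Since hostname and MAC address are the only things that differ between nodes, the per-node state can live entirely on the master. Here is a minimal sketch of how the dnsmasq `dhcp-host` entries could be generated from a small inventory. The MAC addresses, hostnames and IP addresses are placeholders invented for illustration, not values from this cluster, and `dhcp_host_lines` is a hypothetical helper.

```python
# Hypothetical inventory: these MAC addresses, hostnames and IPs are
# placeholders, not the values used on the actual cluster.
NODES = [
    ("b8:27:eb:00:00:01", "node1", "10.0.0.11"),
    ("b8:27:eb:00:00:02", "node2", "10.0.0.12"),
    ("b8:27:eb:00:00:03", "node3", "10.0.0.13"),
    ("b8:27:eb:00:00:04", "node4", "10.0.0.14"),
]


def dhcp_host_lines(nodes):
    """Return one dnsmasq `dhcp-host` line per (MAC, hostname, IP) tuple."""
    return [f"dhcp-host={mac},{name},{ip}" for mac, name, ip in nodes]


if __name__ == "__main__":
    # Print the lines so they can be pasted into a dnsmasq configuration file.
    print("\n".join(dhcp_host_lines(NODES)))
```

Each generated line pins one MAC address to a hostname and a fixed IP, so the SD card images themselves can stay identical.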

Here are a few more shots of the original version, with the 5-port PSU and the old Model B boards:

[Photos: Cabled, Power cords, First assembly]

In retrospect I probably ought to have gone for longer USB cables and moved all of the cabling to the USB side (since it leaves the SD card slot clear), but I also need to be able to see the activity lights, and the Pi isn't exactly designed for easy stacking.

A larger cluster is certainly feasible, but 5 boards is as much as I can power with the PSU I have.

Hardware

This is a partial list of the stuff I'm using (Amazon UK affiliate links):

Software

As a base OS, I'm currently using the official Ubuntu 16.04 image for the Pi 2, which works much better than Raspbian for my purposes (nevertheless, the configuration files in this repo should work on both systems).

The cluster is now running a mix of Docker Swarm and the occasional Clojure program using Hazelcast atop JDK 1.8, as well as Jupyter, which runs very nicely indeed and provides me with an agnostic, notebook-oriented front-end.

I have also set up Disco (and now Spark) on it and intend to fiddle with MPI, but so far I have plenty of ways to parallelize things.
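For readers unfamiliar with the map/reduce pattern that Disco and Spark implement, here is a minimal sketch in plain Python. It does not use the Disco or Spark APIs. A local thread pool stands in for the cluster nodes, and the function names (`map_chunk`, `reduce_counts`, `word_count`) are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(text):
    """Map step: count the words in one chunk of text."""
    return Counter(text.lower().split())


def reduce_counts(left, right):
    """Reduce step: merge two partial word counts."""
    return left + right


def word_count(chunks, workers=4):
    """Run the map step on each chunk in a thread pool, then reduce the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    print(word_count(["the quick brown fox", "the lazy dog", "The end"]))
```

On a real cluster the map step would run on the worker nodes and only the partial counts would travel back over the network, which is the point of the pattern.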

It's a bit ironic to do some kinds of processing on merely 5GB of aggregated RAM, but I'm interested in the algorithms themselves and don't plan on doing something silly like tackling the next Netflix Prize with this -- besides, running things on low-end hardware is often the only way to do proper optimization.

List of packages involved so far:

  • etcd, which I'm now using to store (and distribute) configurations across nodes
  • Docker, which ships with Ubuntu 14.04 and makes it a lot easier to build and tear down environments. Currently trying to get 1.7 to build so I can use swarm and other niceties.
  • OpenVSwitch, which I'm using for playing around with network topologies
  • Jupyter, which provides me with a nice web front-end and basic Python parallel computing.
  • Spark, which has mostly replaced Disco for map/reduce jobs.
  • Dash, a real-time status dashboard (rewritten in Go, available under the dashboard folder here, and still being worked on)
  • A custom daemon that sends out a JSON-formatted multicast packet with system load, CPU usage and RAM statistics (written in raw C, available in tools)
  • ElasticSearch, which I'm using for storing metrics.
  • Oracle JDK 8
  • leiningen (which fetches Hazelcast and other dependencies for me, via this library)
  • Nightcode as a development environment (LightTable doesn't run on ARM, and a lot of my hobby coding these days is actually done on an ODROID)
  • distcc for building binaries slightly faster
  • dnsmasq for DHCP and DNS service
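The statistics daemon in the list above is written in C and lives in tools. What follows is only a rough Python sketch of the same idea, not a port of that code. The multicast group, port and JSON field names are assumptions made for illustration, and CPU usage is left out to keep it short.

```python
import json
import os
import socket

# Assumed values: this group address and port are placeholders, not the ones
# used by the C daemon in tools.
MCAST_GROUP = "239.255.0.1"
MCAST_PORT = 5000


def read_meminfo(path="/proc/meminfo"):
    """Parse Linux /proc/meminfo into a {field: value} dictionary (mostly kB)."""
    info = {}
    with open(path) as f:
        for line in f:
            key, _, rest = line.partition(":")
            parts = rest.split()
            if parts:
                info[key] = int(parts[0])
    return info


def build_payload(hostname, load, meminfo):
    """Serialize hostname, load averages and RAM figures as UTF-8 JSON."""
    return json.dumps({
        "host": hostname,
        "load": list(load),
        "mem_total_kb": meminfo.get("MemTotal"),
        "mem_free_kb": meminfo.get("MemFree"),
    }).encode("utf-8")


def send_stats(group=MCAST_GROUP, port=MCAST_PORT, ttl=1):
    """Send one statistics packet to the multicast group and return it."""
    payload = build_payload(socket.gethostname(), os.getloadavg(), read_meminfo())
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # A TTL of 1 keeps the packet on the local network segment.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    try:
        sock.sendto(payload, (group, port))
    finally:
        sock.close()
    return payload


if __name__ == "__main__":
    print(send_stats().decode("utf-8"))
```

Multicast means each node sends a single packet no matter how many listeners there are, so a dashboard can join the group without any per-node configuration.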

Here's what the cluster dashboard looks like:

[Image: Updated dashboard]

But isn't the Raspberry Pi slow?

Well spotted, young person. It was, and the Pi 2, despite being a marked improvement, isn't exactly a supercomputer. But it's also cheap, and beggars can't be choosers.

Nevertheless, the current configuration provides me with 20 ARMv7 cores clocked at 1GHz, and that's nothing to sneeze at.

But I'm open to sponsorship so that I can upgrade this to at least twice as many boards...

Contributors

ax42, rcarmo, waffle-iron
