- Requirements
- Contributing
- Issues
- Mailing List
- Binary downloads
- Making a build
- Sparkling Shell
- Running Examples
- Additional Examples
- Docker Support
- FAQ
Sparkling Water integrates H2O's fast, scalable machine learning engine with Spark.
- Linux or OS X (Windows support is pending)
- Java 7
- Spark 1.2.0
- The SPARK_HOME shell variable must point to your local Spark installation
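A quick way to sanity-check these requirements (a minimal sketch; the Spark path below is a placeholder for your installation):

```bash
# Java 7 must be available on the PATH
java -version

# SPARK_HOME must point to a Spark 1.2.0 installation
export SPARK_HOME="/path/to/spark-1.2.0"
ls "$SPARK_HOME/bin/spark-shell"   # should exist
```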
Look at our list of JIRA tasks for new contributors or send your idea to [email protected].
To report issues, please use JIRA at http://jira.h2o.ai/.
Follow our H2O Stream.
Use the provided gradlew script to build the project:
./gradlew build
To avoid running tests, use the -x test option.
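For example:

```bash
./gradlew build -x test
```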
The Sparkling Shell provides a regular Spark shell that supports creation of an H2O cloud and execution of H2O algorithms.
First, build a package containing Sparkling Water:
./gradlew assemble
Configure the location of the Spark cluster:
export SPARK_HOME="/path/to/spark/installation"
export MASTER="local-cluster[3,2,1024]"
In this case, local-cluster[3,2,1024] points to an embedded cluster of 3 worker nodes, each with 2 cores and 1 GB of memory.
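The MASTER variable accepts any standard Spark master URL, not just the embedded local-cluster form. For example (the host and port below are placeholders):

```bash
# Run everything inside a single local JVM using all available cores
export MASTER="local[*]"

# Or point at an existing standalone Spark cluster (placeholder host and port)
export MASTER="spark://spark-master:7077"
```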
Then run the Sparkling Shell:
bin/sparkling-shell
Sparkling Shell accepts common Spark Shell arguments. For example, to increase the memory allocated to each executor, use the spark.executor.memory parameter:
bin/sparkling-shell --conf "spark.executor.memory=4g"
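Because these options are forwarded to Spark, several of them can be combined in a single invocation (a sketch; the values are only examples):

```bash
bin/sparkling-shell \
  --master "$MASTER" \
  --conf "spark.executor.memory=4g"
```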
Build a package that can be submitted to the Spark cluster:
./gradlew assemble
Set the configuration of the demo Spark cluster, for example, local-cluster[3,2,1024]:
export SPARK_HOME="/path/to/spark/installation"
export MASTER="local-cluster[3,2,1024]"
In this example, local-cluster[3,2,1024] creates an embedded cluster consisting of 3 worker nodes, each with 2 cores and 1 GB of memory.
Then run the example:
bin/run-example.sh
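Putting the steps together, a complete demo run looks like this (the Spark installation path is a placeholder):

```bash
./gradlew assemble
export SPARK_HOME="/path/to/spark/installation"
export MASTER="local-cluster[3,2,1024]"
bin/run-example.sh
```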
For more details about the demo, please see the README.md file in the examples directory.
You can find more examples in the examples folder.
See docker/README.md to learn about Docker support.
- Where do I find the Spark logs?
Look in $SPARK_HOME/work/app-XXX. The last part of the path is the name of your application.
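For example, on a standalone Spark deployment each executor writes its logs under the application's work directory (the application id below is a placeholder):

```bash
# List the per-application work directories
ls "$SPARK_HOME/work/"

# Inspect the stderr log of one executor of a given application (placeholder app id)
cat "$SPARK_HOME/work/app-20150101120000-0000/0/stderr"
```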
- Spark is too slow during start, or H2O is not able to form a cluster.
Configure the Spark variable SPARK_LOCAL_IP. For example:
export SPARK_LOCAL_IP='127.0.0.1'
- How do I increase the amount of memory assigned to the Spark executors in Sparkling Shell?
Sparkling Shell accepts common Spark Shell arguments. For example, to increase the amount of memory allocated to each executor, use the spark.executor.memory parameter:
bin/sparkling-shell --conf "spark.executor.memory=4g"