Hydrosphere Mist is a service for exposing analytical jobs and machine learning models as web services.
Mist provides an API for Scala & Python Apache Spark jobs and for machine learning models trained in Apache Spark.
It implements Spark as a Service and creates a unified API layer for building enterprise solutions and services on top of a big data stack.
Discover more Hydrosphere Mist use cases.
Features:
- Real-time, low-latency model serving/scoring
- Spark context orchestration (a cluster of Spark clusters): manages multiple Spark contexts in separate JVMs or Docker containers
- Exposing Apache Spark jobs through a REST API
- Spark 2.1.0 support!
- HTTP & Messaging (MQTT) API
- Scala and Python Spark jobs support
- Support for Spark SQL and Hive
- High Availability and Fault Tolerance
- Self-healing after driver program failure
- Powerful logging
- Clear end-user API
- JDK 8
- Spark >= 1.5.2 (earlier versions have not been tested)
- MQTT server (optional)
Run Docker:
docker run -p 2003:2003 -v /var/run/docker.sock:/var/run/docker.sock -d hydrosphere/mist:master-2.1.0 mist
Run Jar:
sbt -DsparkVersion=${SPARK_VERSION} mistRun
sbt "project examples" package
curl --header "Content-Type: application/json" -X POST http://localhost:2003/api/simple-context --data '{"numbers": [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]}'
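For callers that prefer a script to curl, the same request can be sketched in Python with only the standard library. The URL, route and payload are taken from the curl command above. The helper names `build_job_request` and `run_job` are hypothetical, and the sketch assumes the reply is JSON, which this README does not state.

```python
import json
import urllib.request

# Host and port match the `docker run -p 2003:2003` command above;
# `simple-context` is the route used in the curl example.
MIST_URL = "http://localhost:2003/api/simple-context"


def build_job_request(numbers, url=MIST_URL):
    """Build (but do not send) the POST request that the curl example issues."""
    body = json.dumps({"numbers": list(numbers)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_job(numbers, url=MIST_URL, timeout=30):
    """Send the request and return the parsed reply.

    Assumption: the reply body is JSON. Its shape is not described here,
    so it is returned unchanged.
    """
    request = build_job_request(numbers, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Needs a running Mist instance with the examples package built.
    print(run_job([1, 2, 3, 4, 5, 6, 7, 8, 9, 0]))
```

Building the request separately from sending it makes it possible to inspect the payload without a running Mist instance.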
Check out the complete Getting Started guide.
- Build the project
git clone https://github.com/hydrospheredata/mist.git
cd mist
sbt -DsparkVersion=2.1.0 assembly
- Run
./bin/mist start master
# clone mist repo
git clone https://github.com/Hydrospheredata/mist
# available spark versions: 1.5.2, 1.6.2, 2.0.2, 2.1.0
export SPARK_VERSION=2.1.0
docker create --name mist-${SPARK_VERSION} -v /usr/share/mist hydrosphere/mist:tests-${SPARK_VERSION}
docker run --name mosquitto-${SPARK_VERSION} -d ansi/mosquitto
docker run --name hdfs-${SPARK_VERSION} --volumes-from mist-${SPARK_VERSION} -d hydrosphere/hdfs start
# run tests
docker run -v /var/run/docker.sock:/var/run/docker.sock --link mosquitto-${SPARK_VERSION}:mosquitto --link hdfs-${SPARK_VERSION}:hdfs -v "$PWD":/usr/share/mist hydrosphere/mist:tests-${SPARK_VERSION} tests
# or run mist
docker run -v /var/run/docker.sock:/var/run/docker.sock --link mosquitto-${SPARK_VERSION}:mosquitto --link hdfs-${SPARK_VERSION}:hdfs -v "$PWD":/usr/share/mist hydrosphere/mist:tests-${SPARK_VERSION} mist
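When the test environment has to be set up for several Spark versions, the commands above can be generated rather than retyped. The following is a minimal sketch: the image names, container names and flags are copied from the steps above, while the function name `build_test_commands` and its `workdir` parameter (standing in for `$PWD`) are hypothetical.

```python
import os
import shlex

# Versions listed in the "available spark versions" comment above.
AVAILABLE_SPARK_VERSIONS = ("1.5.2", "1.6.2", "2.0.2", "2.1.0")


def build_test_commands(spark_version, workdir):
    """Return the docker commands for one Spark version as argument lists."""
    if spark_version not in AVAILABLE_SPARK_VERSIONS:
        raise ValueError(f"unsupported Spark version: {spark_version}")

    image = f"hydrosphere/mist:tests-{spark_version}"
    mist = f"mist-{spark_version}"
    mqtt = f"mosquitto-{spark_version}"
    hdfs = f"hdfs-{spark_version}"

    return [
        # Data container, MQTT broker and HDFS, in the order used above.
        ["docker", "create", "--name", mist, "-v", "/usr/share/mist", image],
        ["docker", "run", "--name", mqtt, "-d", "ansi/mosquitto"],
        ["docker", "run", "--name", hdfs, "--volumes-from", mist,
         "-d", "hydrosphere/hdfs", "start"],
        # The test run itself, linked to the two services.
        ["docker", "run", "-v", "/var/run/docker.sock:/var/run/docker.sock",
         "--link", f"{mqtt}:mosquitto", "--link", f"{hdfs}:hdfs",
         "-v", f"{workdir}:/usr/share/mist", image, "tests"],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for command in build_test_commands("2.1.0", os.getcwd()):
        print(shlex.join(command))
```

Returning argument lists instead of one shell string means the commands can be passed to `subprocess.run` without extra quoting.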
- Complete Getting Started Guide
- Learn from Use Cases and Tutorials
- Learn about Mist Routers
- Configure Mist to make it fast and reliable
Mist Version | Scala Version | Python Version | Spark Version |
---|---|---|---|
0.1.4 | 2.10.6 | 2.7.6 | >=1.5.2 |
0.2.0 | 2.10.6 | 2.7.6 | >=1.5.2 |
0.3.0 | 2.10.6 | 2.7.6 | >=1.5.2 |
0.4.0 | 2.10.6, 2.11.8 | 2.7.6 | >=1.5.2 |
0.5.0 | 2.10.6, 2.11.8 | 2.7.6 | >=1.5.2 |
0.6.5 | 2.10.6, 2.11.8 | 2.7.6 | >=1.5.2 |
0.7.0 | 2.10.6, 2.11.8 | 2.7.6 | >=1.5.2 |
0.8.0 | 2.10.6, 2.11.8 | 2.7.6 | >=1.5.2 |
0.9.1 | 2.10.6, 2.11.8 | 2.7.6 | >=1.5.2 |
0.10.0 | 2.10.6, 2.11.8 | 2.7.6 | >=1.5.2 |
master | 2.10.6, 2.11.8 | 2.7.6 | >=1.5.2 |
- Persist job state for self healing
- Super parallel mode: run Spark contexts in separate JVMs
- Powerful logging
- RESTification
- Support streaming contexts/jobs
- Reactive API
- Realtime ML models serving/scoring
- CLI
- Web Interface
- Apache Kafka support
- Bi-directional streaming API
- AMQP support
- Getting Started
- Use Cases & Tutorials
- CLI
- Scala & Python Mist DSL
- REST API
- Streaming API
- Code Examples
- Configuration
- License
- Logging
- Low level API Reference
- Namespaces
- Changelog
- Tests
Please report bugs/problems to: https://github.com/Hydrospheredata/mist/issues.