Eigenflow

Eigenflow is an orchestration platform for building resilient and scalable data pipelines.

Pipelines can be split into multiple process stages which are persisted, resumed and monitored automatically.

Quick example:

import scala.concurrent.duration._ // provides the 1.minute syntax used below

case object Download extends ProcessStage
case object Transform extends ProcessStage
case object Analyze extends ProcessStage
case object SendReport extends ProcessStage

val download = Download {
  downloadReport() // returns file path/url to downloaded report
} retry (1.minute, 10) // in case of error retry every minute 10 times before failing

val transform = Transform { reportFile =>
  buildParquetFile(reportFile) // returns file path/url
}

val analyze = Analyze { parquetFile =>
  callSparkToAnalyze(parquetFile) // returns new report file path/url
}

val sendReport = SendReport { newReportFile =>
  sendReportToDashboard(newReportFile)
}

override def executionPlan = download ~> transform ~> analyze ~> sendReport

Once the stage methods (downloadReport, buildParquetFile, etc. in the example above) are defined, the rest is done automatically; see the list of main features below.
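To make the type-safe chaining and the retry call in the quick example more concrete, here is a minimal, self-contained sketch of how such a DSL could be modelled. It is not Eigenflow's implementation: `Stage`, its `run` field, the blocking `Thread.sleep` retry loop and all file names are assumptions made up for illustration.

```scala
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object PipelineSketch {
  // Hypothetical model of a typed stage; not Eigenflow's actual classes.
  final case class Stage[A, B](name: String, run: A => B) {
    // Chain two stages. The compiler checks that this stage's output type
    // matches the next stage's input type.
    def ~>[C](next: Stage[B, C]): Stage[A, C] =
      Stage[A, C](s"$name ~> ${next.name}", run.andThen(next.run))

    // Wrap the stage so that a failure is retried up to `attempts` times,
    // sleeping `interval` between tries, before the error is rethrown.
    def retry(interval: FiniteDuration, attempts: Int): Stage[A, B] = {
      @scala.annotation.tailrec
      def attempt(input: A, remaining: Int): B =
        Try(run(input)) match {
          case Success(value) => value
          case Failure(_) if remaining > 0 =>
            Thread.sleep(interval.toMillis)
            attempt(input, remaining - 1)
          case Failure(error) => throw error
        }
      Stage[A, B](name, (input: A) => attempt(input, attempts))
    }
  }

  // Stand-ins for the real stage methods; every value below is made up.
  var downloadAttempts = 0
  val download: Stage[Unit, String] =
    Stage[Unit, String]("Download", (_: Unit) => {
      downloadAttempts += 1
      if (downloadAttempts < 3) sys.error("network glitch") // fail twice, then succeed
      "/tmp/report.csv"
    }).retry(10.millis, 10)

  val transform: Stage[String, String] =
    Stage[String, String]("Transform", (file: String) => file.replace(".csv", ".parquet"))

  val analyze: Stage[String, Int] =
    Stage[String, Int]("Analyze", (file: String) => file.length)

  // `analyze ~> download` would not compile: Int is not Unit.
  val plan: Stage[Unit, Int] = download ~> transform ~> analyze
  val result: Int = plan.run(())
}
```

In this sketch the whole plan is just a composed function, so a mismatch between one stage's output and the next stage's input is rejected at compile time. A real platform would additionally persist the state between stages instead of running them in a single call.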

What it is good for

Eigenflow was created for managing periodic, long-running ETL processes with automatic recovery from failures. It fits cases where stage performance is important and there is a need to collect statistics and monitor processes.

What it may not be good for

Eigenflow is a platform somewhere between "simple cron jobs" and complex enterprise processes, where ESB software would usually be used. Thus, it probably should not be considered for primitive jobs or for very complex processes where SOA is involved.

Main Features

  • Stages: a DSL for building type-safe data pipelines (Scala only).
  • Time Management: configurable strategy for catching up when "run cycles" have been missed.
  • Recovery: if a process fails, the platform by default resumes by replaying the failed stage.
  • Error Handling: configurable recovery strategies per stage.
  • Metrics: each process run, stage switch, failure and custom message is published as an event to a messaging system. Kafka is supported by default, but an adapter for a custom messaging system can be integrated.
  • Monitoring: provided as a module, which uses Grafana and InfluxDB to store and display statistics.
  • Notifications: provided as a module, supports email and Slack notifications.
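The Recovery item above can be pictured as follows: record the output of every completed stage, and on restart skip the stages that are already recorded, so the run picks up at the stage that failed. The sketch below only illustrates that idea under stated assumptions. `Step`, `Journal` and `runProcess` are hypothetical names, and an in-memory map stands in for whatever durable storage the platform really uses.

```scala
import scala.collection.mutable
import scala.util.Try

object RecoverySketch {
  // Hypothetical stage: a name plus a String => String step.
  final case class Step(name: String, run: String => String)

  // Stand-in for durable storage: the output of each completed stage, by name.
  final class Journal {
    private val completed = mutable.LinkedHashMap.empty[String, String]
    def outputOf(stage: String): Option[String] = completed.get(stage)
    def record(stage: String, output: String): Unit = completed.update(stage, output)
    def completedStages: List[String] = completed.keys.toList
  }

  // Run the steps in order. Stages already in the journal are skipped and
  // their stored output is reused, so a restart replays from the failed stage.
  def runProcess(steps: List[Step], journal: Journal, input: String): Try[String] =
    Try {
      steps.foldLeft(input) { (current, step) =>
        journal.outputOf(step.name).getOrElse {
          val output = step.run(current)
          journal.record(step.name, output)
          output
        }
      }
    }

  // Made-up example: Transform fails on the first run and works on the second.
  var downloadCalls = 0
  var transformShouldFail = true

  val steps: List[Step] = List(
    Step("Download", _ => { downloadCalls += 1; "report.csv" }),
    Step("Transform", file =>
      if (transformShouldFail) sys.error("disk full") else file + ".parquet"),
    Step("Analyze", file => s"analysis of $file")
  )

  val journal = new Journal
  val firstRun: Try[String] = runProcess(steps, journal, "") // fails in Transform
  val stagesAfterFirstRun: List[String] = journal.completedStages

  transformShouldFail = false
  val secondRun: Try[String] = runProcess(steps, journal, "") // resumes at Transform
}
```

The second run does not call the Download step again; it reuses the recorded output and continues with Transform. Per-stage error handling strategies would decide what happens before a failure reaches this point, for example by retrying the stage first.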

Custom monitoring and notification systems can be developed. Messages are pushed to a message queue (Kafka is supported out of the box) and can be consumed by a message queue consumer.
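As a rough illustration of what a custom notification consumer might look like, the sketch below defines a few event types, a small bus interface, and a subscriber that reacts only to failures. Everything in it is an assumption: the event shapes, the `EventBus` trait and the process name are invented, and an in-memory bus replaces Kafka so that the example runs on its own.

```scala
import scala.collection.mutable.ListBuffer

object EventsSketch {
  // Hypothetical event shapes; the real message format is not described here.
  sealed trait ProcessEvent
  final case class StageSwitched(process: String, from: String, to: String) extends ProcessEvent
  final case class ProcessFailed(process: String, stage: String, reason: String) extends ProcessEvent
  final case class CustomMessage(process: String, text: String) extends ProcessEvent

  // What a messaging adapter might expose: publish on one side, subscribe on the other.
  trait EventBus {
    def publish(event: ProcessEvent): Unit
    def subscribe(handler: ProcessEvent => Unit): Unit
  }

  // In-memory stand-in for a message queue; delivers synchronously.
  final class InMemoryBus extends EventBus {
    private val handlers = ListBuffer.empty[ProcessEvent => Unit]
    def publish(event: ProcessEvent): Unit =
      handlers.foreach(handler => handler(event))
    def subscribe(handler: ProcessEvent => Unit): Unit = {
      handlers += handler
      ()
    }
  }

  // A custom notifier: collects an alert line for every failure event.
  val alerts: ListBuffer[String] = ListBuffer.empty[String]
  val bus: EventBus = new InMemoryBus

  bus.subscribe {
    case ProcessFailed(process, stage, reason) =>
      alerts += s"$process failed in $stage: $reason"
      ()
    case _ => () // ignore every other event type
  }

  bus.publish(StageSwitched("daily-report", "Download", "Transform"))
  bus.publish(ProcessFailed("daily-report", "Transform", "disk full"))
  bus.publish(CustomMessage("daily-report", "retrying soon"))
}
```

With a real message queue, the subscriber would live in a separate consumer process and decode messages from a topic, but the filtering logic would stay the same.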

Note: there are no connectors to third-party systems out of the box.

System Requirements

Runtime

  • JVM 8

Development

  • Scala 2.11.0 or higher
  • sbt 0.13.7
  • DevOps scripts are currently tested on Mac OS X 10.10 only
