Comments (7)

thieman commented on July 19, 2024

Hey there! Unfortunately, I'm not familiar with Chronos or Azkaban, but from a quick glance at their product pages they seem to be good examples of why other scheduling projects weren't great fits for me. Namely:

  • Azkaban, like a bunch of other schedulers, was created with a specific task in mind (managing Hadoop jobs). Spotify's Luigi, and a bunch of others, also seem to target map-reduce workflows. Dagobah is meant to be pretty much a cron replacement.
  • Chronos is a cron replacement! That's awesome, but it seems like total overkill for what I needed to do. I wanted something simple that I could run on one machine to manage my nightly analytics updates.

Now, to your questions:

  1. Could you go into a bit more detail here? Dagobah is currently made for running on one machine, so if you want to influence other machines you'll need to do that in a different framework. Fabric, as you suggest, or something like Celery would be good for this. If I'm misunderstanding you, please let me know.
  2. The task logs do get permanently stored in your backend. There's no way to examine them currently in the web app, though.
  3. Yes! This is a good idea. I'll make an issue for it.

from dagobah.

utsengar commented on July 19, 2024

> Azkaban, like a bunch of other schedulers, was created with a specific task in mind (managing Hadoop jobs). Spotify's Luigi, and a bunch of others, also seem to target map-reduce workflows. Dagobah is meant to be pretty much a cron replacement. Chronos is a cron replacement! That's awesome, but it seems like total overkill for what I needed to do. I wanted something simple that I could run on one machine to manage my nightly analytics updates.

This is the exact reason why I liked Dagobah. It's simple and flexible enough to be used for other use cases. I am also looking for a cron replacement. Azkaban is meant for Hadoop job automation, but it can also act as a cron replacement (with retries, DAGs, etc.). Chronos is awful and complete overkill.

> Could you go into a bit more detail here? Dagobah is currently made for running on one machine, so if you want to influence other machines you'll need to do that in a different framework. Fabric, as you suggest, or something like Celery would be good for this. If I'm misunderstanding you, please let me know.

You got it right. Dagobah currently runs on one machine, but I am trying to add Fabric to Dagobah so that it can execute commands on a remote machine. Celery is good too, but the management overhead is lower if I just use Fabric and execute remote commands.

> The task logs do get permanently stored in your backend. There's no way to examine them currently in the web app, though.

Good to know. I will try to expose this data in the web app.

utsengar commented on July 19, 2024

I have added remote task execution (https://github.com/utkarsh2012/dagobah/blob/master/dagobah/core/core.py#L634). It works nicely with the existing code, though it still needs tests and some UI work, such as a way to update the remote machine endpoint. It uses paramiko and spawns a process for every remote request.
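For anyone skimming the thread, here is a rough sketch of what running a task either locally or over SSH with paramiko could look like. It is not the code from the linked branch: the function name `run_task` and its parameters are made up for illustration, and it assumes key-based SSH access to a host already listed in the system known_hosts file. It also leaves out the process-per-request part mentioned above.

```python
import subprocess


def run_task(command, host=None, username=None, timeout=60):
    """Run a shell command locally, or over SSH when a host is given.

    Hypothetical helper for illustration only.
    Returns a tuple of (exit_status, stdout_text).
    """
    if host is None:
        # Local execution: the single-machine case discussed in this thread.
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout
        )
        return result.returncode, result.stdout

    # Remote execution. paramiko is imported lazily so that local-only use
    # does not require it to be installed.
    import paramiko

    client = paramiko.SSHClient()
    # Only trust hosts already present in the system known_hosts file.
    client.load_system_host_keys()
    try:
        client.connect(host, username=username, timeout=timeout)
        _stdin, stdout, _stderr = client.exec_command(command, timeout=timeout)
        output = stdout.read().decode()
        # Blocks until the remote command has finished.
        status = stdout.channel.recv_exit_status()
        return status, output
    finally:
        client.close()
```

Calling `run_task("echo hello")` runs the command on the local machine, while passing `host="some-host"` (a placeholder) would send the same command over SSH. Returning the exit status alongside the output keeps the two paths interchangeable for whatever records task results.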

What do you think?

thieman commented on July 19, 2024

Hey @utkarsh2012, this looks awesome! I will review your branch when I get some time and get back to you.

utsengar commented on July 19, 2024

Looks like the Travis CI build is broken; I forgot to add the paramiko dependency to setup.py and requirements.txt.

Also, the UI might need some work. I hacked up the solution in a day to see how it would work. I might submit more fixes if I find bugs.

rclough commented on July 19, 2024

Did this ever get included? I can't seem to find remote options in the web UI.

levonk commented on July 19, 2024

Just a minor correction: Azkaban can be used for anything, not just Hadoop-based MapReduce. It can be used purely as a superior cron (dependencies, transparency, partial workflows, etc.). You don't need Hadoop at all.
