Comments (7)
Hey there! Unfortunately, I'm not familiar with Chronos or Azkaban, but from a quick glance at their product pages they seem to be good examples of why other scheduling projects weren't great fits for me. Namely:
- Azkaban, like a bunch of other schedulers, was created with a specific task in mind (managing Hadoop jobs). Spotify's Luigi, and a bunch of others, also seem to target map-reduce workflows. Dagobah is meant to be pretty much a cron replacement.
- Chronos is a cron replacement! That's awesome, but it seems like total overkill for what I needed to do. I wanted something simple that I could run on one machine to manage my nightly analytics updates.
Now, to your questions:
- Could you go into a bit more detail here? Dagobah is currently made for running on one machine, so if you want to influence other machines you'll need to do that in a different framework. Fabric, as you suggest, or something like Celery would be good for this. If I'm misunderstanding you, please let me know.
- The task logs do get permanently stored in your backend. There's no way to examine them currently in the web app, though.
- Yes! This is a good idea. I'll make an issue for it.
from dagobah.
Azkaban, like a bunch of other schedulers, was created with a specific task in mind (managing Hadoop jobs). Spotify's Luigi, and a bunch of others, also seem to target map-reduce workflows. Dagobah is meant to be pretty much a cron replacement. Chronos is a cron replacement! That's awesome, but it seems like total overkill for what I needed to do. I wanted something simple that I could run on one machine to manage my nightly analytics updates.
This is exactly why I liked Dagobah. It's simple and flexible enough to be used for other use cases. I am also looking for a cron replacement. Azkaban is meant for Hadoop job automation, but it can also act as a cron replacement (with retries, DAGs, etc.). Chronos is awful and complete overkill.
Could you go into a bit more detail here? Dagobah is currently made for running on one machine, so if you want to influence other machines you'll need to do that in a different framework. Fabric, as you suggest, or something like Celery would be good for this. If I'm misunderstanding you, please let me know.
You got it right. Dagobah currently runs on one machine, but I am trying to add Fabric to Dagobah so that it can execute commands on a remote machine. Celery is good too, but the management overhead is lower if I just use Fabric and execute remote commands.
The task logs do get permanently stored in your backend. There's no way to examine them currently in the web app, though.
Good to know, I will try to expose this data in the web app.
I have added remote task execution here: https://github.com/utkarsh2012/dagobah/blob/master/dagobah/core/core.py#L634. It works nicely with the existing code, though the UI needs some work (such as updating the remote machine endpoint) and it still needs tests. It uses paramiko and spawns a process for every remote request.
What do you think?
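For anyone following along, here is a minimal sketch of what running a single command on a remote machine with paramiko can look like. It is not the code from the linked branch: `run_remote`, `client_factory`, and the host name in the usage note are hypothetical names chosen for illustration, and the sketch assumes key-based SSH authentication is already set up and that command output is small enough to read in one go.

```python
from typing import Callable, Optional, Tuple


def _default_client():
    """Build a paramiko SSH client (imported lazily; paramiko is a third-party package)."""
    import paramiko

    client = paramiko.SSHClient()
    client.load_system_host_keys()
    # Assumption: refuse hosts that are not already listed in known_hosts.
    client.set_missing_host_key_policy(paramiko.RejectPolicy())
    return client


def run_remote(
    host: str,
    command: str,
    username: Optional[str] = None,
    timeout: float = 30.0,
    client_factory: Optional[Callable[[], object]] = None,
) -> Tuple[int, str, str]:
    """Run `command` on `host` over SSH and return (exit_code, stdout, stderr).

    `client_factory` is a hypothetical hook that lets tests swap in a fake client.
    """
    client = (client_factory or _default_client)()
    try:
        client.connect(host, username=username, timeout=timeout)
        _stdin, stdout, stderr = client.exec_command(command)
        # Read both streams, then ask the channel for the command's exit status.
        out = stdout.read().decode("utf-8", errors="replace")
        err = stderr.read().decode("utf-8", errors="replace")
        exit_code = stdout.channel.recv_exit_status()
        return exit_code, out, err
    finally:
        # Always release the connection, even if the command fails.
        client.close()
```

A call might look like `run_remote("analytics-box.example.com", "uptime", username="deploy")`. A scheduler could treat a non-zero exit code as a failed task, much as it would for a local command.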
Hey @utkarsh2012, this looks awesome! I will review your branch when I get some time and get back to you.
Looks like the Travis CI build is broken: I forgot to add paramiko as a dependency in setup.py and requirements.txt.
Also, the UI might need some work. I hacked up the solution in a day to see how it would work. I might submit more fixes if I find bugs.
Did this ever get included? I can't seem to find any remote options in the web UI.
Just a minor correction: Azkaban can be used for anything, not just Hadoop-based MapReduce. It can be used purely as a superior cron (dependencies, transparency, partial workflows, etc.). You don't need Hadoop at all.
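To make the "dependencies" point concrete: what separates a dependency-aware scheduler from plain cron is that each job runs only after the jobs it depends on. The sketch below uses Python's standard `graphlib` module (Python 3.9+) and is not Azkaban or Dagobah code; `run_order` and the task names are hypothetical.

```python
from graphlib import TopologicalSorter


def run_order(dependencies):
    """Return task names ordered so every task comes after its dependencies.

    `dependencies` maps each task name to the set of tasks it depends on.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical nightly workflow: each step depends on the previous one.
nightly = {
    "extract": set(),
    "transform": {"extract"},
    "report": {"transform"},
}

print(run_order(nightly))
```

If the mapping contains a cycle, `static_order()` raises `graphlib.CycleError`, which is how a scheduler could reject an invalid workflow before running anything.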