walkerh / pipe-o-matic

a framework for combining third party executables to construct data pipelines

License: GNU General Public License v3.0

Shell 27.04% Python 72.96%

pipe-o-matic's People

Contributors

divyakalra, jerryatmda, walkerh

pipe-o-matic's Issues

Documentation for new users

Need to develop the first version of a "getting started" guide. Probably the best way to do this is for a pair of developers to work through a deployment. One will do the work while the other captures a tutorial.

Add tool for automatic generation of deployments.yaml

I continue to think about how to make setup of POM easier, especially deployments.yaml. Here is a more concrete description of how I see things working; please let me know if this matches what you think should happen:

A command ("pmaticsetup"?) should be available to generate a new deployments.yaml. This command would take a few parameters and behave as follows:

  • An optional location (the base directory) to write the deployments.yaml to, defaulting to somewhere like ~/.pmatic/ if not specified
  • If the directory doesn't exist, it will be created.
  • If a pipelines directory doesn't already exist inside the base directory, it will be created.
  • If the deployments.yaml already exists, the program will abort with an appropriate error message unless a force flag is passed (maybe '-f').
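
The behaviour listed above could be sketched roughly as follows. This is only an illustration, assuming Python 3 and the standard library: pmaticsetup does not exist yet, and the function names (parse_args, prepare_base), the positional base_dir argument, and the --force long option are all hypothetical.

```python
import argparse
import os


def parse_args(argv):
    """Parse the command line of the proposed (hypothetical) pmaticsetup."""
    parser = argparse.ArgumentParser(prog="pmaticsetup")
    parser.add_argument(
        "base_dir", nargs="?", default=os.path.join("~", ".pmatic"),
        help="base directory that will hold deployments.yaml")
    parser.add_argument(
        "-f", "--force", action="store_true",
        help="overwrite an existing deployments.yaml")
    return parser.parse_args(argv)


def prepare_base(base_dir, force=False):
    """Create base_dir and base_dir/pipelines; return the deployments.yaml path.

    Raises FileExistsError if deployments.yaml exists and force is False.
    """
    base_dir = os.path.abspath(os.path.expanduser(base_dir))
    # makedirs also creates base_dir, because pipelines lives inside it.
    os.makedirs(os.path.join(base_dir, "pipelines"), exist_ok=True)
    target = os.path.join(base_dir, "deployments.yaml")
    if os.path.exists(target) and not force:
        raise FileExistsError(
            "%s already exists; pass -f to overwrite it" % target)
    return target
```

A caller would run prepare_base(args.base_dir, args.force) and turn the FileExistsError into the "appropriate error message" before exiting.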

When run, pmaticsetup will parse a YAML database of application names and parameters (interrogators? as we discussed in a previous email), possibly called interrogators.yaml. Each executable will be looked up using 'which' and, if found, added to deployments.yaml. For each program, deployments.yaml will list the name of the program, the command to invoke it, the flags necessary to print its version number, and the regex necessary to extract the version number from the command's output.
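
A minimal sketch of that discovery step, assuming Python 3. The format of interrogators.yaml has not been decided, so the dictionary below (and its keys command, version_flags and version_regex) is a hypothetical stand-in for one parsed entry; shutil.which plays the role of 'which'. The running Python interpreter is used as the example program only because it is guaranteed to be present.

```python
import re
import shutil
import subprocess
import sys

# Hypothetical shape of interrogators.yaml after parsing.
INTERROGATORS = {
    "python": {
        "command": sys.executable,
        "version_flags": ["--version"],
        "version_regex": r"Python (\d+(?:\.\d+)*)",
    },
}


def interrogate(name, spec):
    """Return a deployments entry for one program, or None if it is not found."""
    path = shutil.which(spec["command"])
    if path is None:
        return None
    # Some tools print their version on stderr, so capture both streams.
    result = subprocess.run(
        [path] + spec["version_flags"],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        universal_newlines=True)
    match = re.search(spec["version_regex"], result.stdout)
    return {
        "name": name,
        "command": path,
        "version_flags": spec["version_flags"],
        "version_regex": spec["version_regex"],
        "detected_version": match.group(1) if match else None,
    }


def discover(interrogators):
    """Interrogate every known program and keep the ones that were located."""
    found = {}
    for name, spec in interrogators.items():
        entry = interrogate(name, spec)
        if entry is not None:
            found[name] = entry
    return found
```

The dictionary returned by discover(INTERROGATORS) is what would then be serialized into deployments.yaml.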

When pmaticrun is invoked and cannot find a deployments.yaml, it should suggest how to configure a proper directory or invoke pmaticsetup.
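
One possible shape for that check, as a sketch only: the function name require_deployments and the wording of the message are assumptions, not existing pmaticrun code.

```python
import os


def require_deployments(base_dir):
    """Return the path to deployments.yaml, or exit with a hint on how to create it."""
    path = os.path.join(base_dir, "deployments.yaml")
    if not os.path.isfile(path):
        raise SystemExit(
            "pmaticrun: no deployments.yaml found in %s\n"
            "Set PMATIC_BASE to a configured directory, or run "
            "'pmaticsetup %s' to generate one." % (base_dir, base_dir))
    return path
```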

Possible enhancements in the future:

  • When run, any applications that are found with this discovery process could be added to an existing deployments.yaml if they are not already in the file. This would allow you to update your existing deployments.yaml after installing a new application without overwriting any manual entries.
  • Add a parameter to manually specify the location of the interrogators.yaml.
  • Add a parameter to add search directories other than $PATH.
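
Two of these enhancements are small enough to sketch, again assuming Python 3. The names merge_deployments, search_path and find_program are hypothetical; the only library behaviour relied on is the path argument of shutil.which.

```python
import os
import shutil


def merge_deployments(existing, discovered):
    """Add newly discovered programs without touching existing entries.

    Returns the merged mapping and the sorted list of names that were added.
    """
    merged = dict(existing)
    added = []
    for name in sorted(discovered):
        if name not in merged:  # manual or earlier entries always win
            merged[name] = discovered[name]
            added.append(name)
    return merged, added


def search_path(extra_dirs=()):
    """Return $PATH with the extra search directories appended."""
    dirs = os.environ.get("PATH", "").split(os.pathsep) + list(extra_dirs)
    return os.pathsep.join(d for d in dirs if d)


def find_program(command, extra_dirs=()):
    """Look up command in $PATH first, then in the extra directories."""
    return shutil.which(command, path=search_path(extra_dirs))
```

Because existing entries are never replaced, re-running the discovery after installing a new application only appends to the file's contents.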

Create sensible defaults for PYTHONPATH & PMATIC_BASE

Use the location of the bin directory containing the top-level scripts. Assume the following deployed layout by default:

  • bin/
    • pmaticrevert
    • pmaticrun
    • pmaticstatus
    • etc
  • lib/
    • pmatic.py
    • yaml & other dependencies
  • pmatic_base/
    • deployments.yaml
    • pipelines/

Implementation

PMATIC_BASE = absolute path of "pmatic_base"

If PYTHONPATH is not set, we actually update sys.path as if...

PYTHONPATH = absolute path of "lib"
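
A rough sketch of how a top-level script might apply these defaults, assuming Python 3 and the layout above. The helper names (default_layout, apply_defaults) are hypothetical, and the sketch treats PMATIC_BASE as a default that an already-set environment variable overrides.

```python
import os
import sys


def default_layout(script_path):
    """Derive the default pmatic_base and lib directories from a script in bin/."""
    bin_dir = os.path.dirname(os.path.abspath(script_path))
    root = os.path.dirname(bin_dir)
    return os.path.join(root, "pmatic_base"), os.path.join(root, "lib")


def apply_defaults(script_path, environ=os.environ, path=sys.path):
    """Fill in PMATIC_BASE and the import path only where the user set nothing."""
    base, lib = default_layout(script_path)
    environ.setdefault("PMATIC_BASE", base)
    # Only touch the import path when PYTHONPATH is not set.
    if "PYTHONPATH" not in environ and lib not in path:
        path.insert(0, lib)
    return environ["PMATIC_BASE"]
```

Each script in bin/ would call apply_defaults(__file__) before importing pmatic. If the scripts are installed as symlinks, os.path.realpath may be a better choice than os.path.abspath.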
