GithubHelp home page GithubHelp logo

global19-atlassian-net / pypeflow Goto Github PK

View Code? Open in Web Editor NEW

This project forked from pacificbiosciences/pypeflow

0.0 2.0 0.0 2.31 MB

a simple lightweight workflow engine for data analysis scripting

License: BSD 3-Clause Clear License

Python 15.57% HTML 83.81% Shell 0.23% Makefile 0.05% Java 0.34%

pypeflow's Introduction

What is pypeFLOW

pypeFLOW is light weight and reusable make / flow data process library written in Python.

Most of bioinformatics analysis or general data analysis includes various steps combining data files, transforming files between different formats and calculating statistics with a variety of tools. Ian Holmes has a great summary and opinions about bioinformatics workflow at http://biowiki.org/BioinformaticsWorkflows. It is interesting that such analysis workflow is really similar to constructing software without an IDE in general. Using a "makefile" file for managing bioinformatics analysis workflow is actually great for generating reproducible and reusable analysis procedure. Combining with a proper version control tool, one will be able to manage to work with a divergent set of data and tools over a period of time for a project especially when there are complicate dependence between the data, tools and customized code for the analysis tasks.

However, using "make" and "makefile" implies all data analysis steps are done by some command line tools. If you have some customized analysis tasks, you will have to write some scripts and to make them into command line tools. In my personal experience, I find it is convenient to bypass such burden and to combine those quick and simple steps in a single scripts. The only caveat is that if an analyst does not save the results of any intermediate steps, he or she has to repeat the computation all over again for every steps from the beginning. This will waste a lot of computation cycles and personal time. Well, the solution is simple, just like the traditional software building process, one have to track the dependencies and analyze them and only reprocess those parts that are necessary to get the most up-to-date final results.

General Design Principles

  • Explicitly modeling data and task dependency
  • Support declarative programming style within Python while maintaining some thing that imperative programming dose the best
  • Utilize RDF meta-data framework
  • Keep it simple if possible

Features

  • Multiple concurrent task scheduling and running
  • Support task as simple shell script (considering clustering job submission in mind)
  • reasonable simple interface for declarative programming

General Installation

pypeFlow uses the standard python setup.py for installation:

python setup.py install

Once install, a brief documentation can be generated by:

cd doc
make html

The generate sphinx html document can be viewed by point your web browser to _build/html/index.html in the doc directory.

DISCLAIMER

THIS WEBSITE AND CONTENT AND ALL SITE-RELATED SERVICES, INCLUDING ANY DATA, ARE PROVIDED "AS IS," WITH ALL FAULTS, WITH NO REPRESENTATIONS OR WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTIES OF MERCHANTABILITY, SATISFACTORY QUALITY, NON-INFRINGEMENT OR FITNESS FOR A PARTICULAR PURPOSE. YOU ASSUME TOTAL RESPONSIBILITY AND RISK FOR YOUR USE OF THIS SITE, ALL SITE-RELATED SERVICES, AND ANY THIRD PARTY WEBSITES OR APPLICATIONS. NO ORAL OR WRITTEN INFORMATION OR ADVICE SHALL CREATE A WARRANTY OF ANY KIND. ANY REFERENCES TO SPECIFIC PRODUCTS OR SERVICES ON THE WEBSITES DO NOT CONSTITUTE OR IMPLY A RECOMMENDATION OR ENDORSEMENT BY PACIFIC BIOSCIENCES.

pypeflow's People

Contributors

cschin avatar pb-jchin avatar pb-dseifert avatar cdunn2001 avatar bli2023 avatar pb-isovic avatar bredelings avatar raj76 avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.