pushkardakle / exgen

An automated pipeline for parallel analysis of whole Exome and Genome data in an HPC environment

Shell 26.82% HTML 12.60% Perl 41.15% Groovy 19.44%

exgen's Introduction

ExGen [Archived]

ExGen is a bpipe- and Perl-based whole exome and genome analysis pipeline. The pipeline was designed primarily to take advantage of a multi-node HPC environment, though it also supports single-node environments. The pipeline can be used for

Major features of the pipeline are:-

  • Easy single command launch for any number of input samples
  • Customizable mail notifications on completion of the pipeline or of individual stages
  • Monitoring of the run with sample-wise and stage-wise status
  • Out-of-the-box support for running in an HPC environment, with multiple job managers supported (e.g. LSF, Torque)
  • Inbuilt validation of tools, parameters and input files
  • Customizable tool flow and easy resume from any intermediate step in case of failure
  • Integrated collection of relevant statistics from every stage and collation into a single Excel file for easy comparison and summarization
  • All FastQC images are collected into a single directory for easy comparison of pre- and post-QC results
  • Analysis of time taken per module/tool
  • Logging of all the executed commands for easy traceback of parameters
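
As a rough illustration of what the inbuilt validation of tools and input files listed above might involve, here is a minimal shell sketch. It is not taken from the ExGen source: the function names `check_tools` and `check_pair` and the file names in the usage note are hypothetical, and the real pipeline may validate things quite differently.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight checks; an assumption-laden sketch, not ExGen's code.

# Succeed only if every named tool can be found on PATH.
check_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Succeed only if both mates of a paired-end sample exist and are non-empty.
check_pair() {
    local r1=$1 r2=$2
    [ -s "$r1" ] && [ -s "$r2" ]
}
```

A caller could run something like `check_tools perl java && check_pair sample_R1.fastq.gz sample_R2.fastq.gz` before launching any stage, so that a missing dependency or input file is reported up front rather than partway through a long run.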

Limitations:-

  • Currently supports only paired-end Illumina data out of the box, though the pipeline can easily be edited for single-end mode, a different toolbase, etc.

TODO:-

  • Reduction of consumed disk space by compressing BAM files using CRAM or an alternate toolkit
  • Relevant plots for the collected statistics
  • Creation of a dockerised container with all the dependencies
  • Add test cases for multi node/single node use case scenarios
