raguirreg / gap

This project forked from molgenis/gap

Genotyping Array Pipeline

License: GNU Lesser General Public License v3.0

HTML 7.79% Shell 66.62% R 13.43% Python 5.52% Perl 4.16% FreeMarker 2.47%

GAP

GAP is short for Genotyping Array Pipeline. It consists of the following workflow:

   ⎛¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯⎞
   ⎜                    iScan writes IDAT files to GATTACA {01,02} machines               ⎜
   ⎜                                                                                      ⎜
   ⎝______________________________________________________________________________________⎠
                                         v
                                         v
                                         v
   ⎛¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯⎞
   ⎜                    DARWIN conversion of IDAT files to GTC files                      ⎜
   ⎜                    takes place on GATTACA {01,02} machines                           ⎜
   ⎝______________________________________________________________________________________⎠
                                         v
                                         v  > > > > > > > > > > GAP_Automated CopyRawDataToPRM [stores .IDAT and .GTC files on permanent storage system]
                                         v  > > > > > > > > > > GAP_Automated Start Pipeline [automatically starts the pipeline when a new run has finished]
                                         v
   ⎛¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯⎞
   ⎜                         Run GAP Pipeline                                             ⎜
   ⎜                         takes place on [zinc-finger or leucine-zipper]               ⎜
   ⎝______________________________________________________________________________________⎠
                                         v
                                         v  > > > > > > > > > > GAP_Automated CopyProjectDataToPRM [stores data produced by pipeline on permanent storage system]
                                         v
   ⎛¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯⎞
   ⎜                  DARWIN split pipeline output file per sample,                       ⎜
   ⎜                  calculate standard deviation and store all data in array database   ⎜
   ⎝______________________________________________________________________________________⎠

GAP Pipeline

The GAP pipeline consists of 3 steps:

1 Create_Callrate_file

This step creates a file containing call rate information per sample.
The call rate indicates what fraction of the SNPs on the array perform well, that is, yield a genotype call.
If this number is below 0.97, the data is considered to be of poor quality.

File format:

Sample ID '\t' Call Rate '\t' Gender
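
As an illustration of how such a file could be checked against the 0.97 threshold, here is a minimal Python sketch. It assumes the file is tab-separated and starts with a header row naming the columns exactly Sample ID, Call Rate and Gender. The function name flag_low_callrate and the sample values are hypothetical and not part of GAP.

```python
import csv
import io

CALLRATE_THRESHOLD = 0.97  # threshold mentioned in the step description


def flag_low_callrate(handle, threshold=CALLRATE_THRESHOLD):
    """Return (sample_id, call_rate) pairs whose call rate is below threshold.

    Assumes a tab-separated file with a header row containing the columns
    'Sample ID', 'Call Rate' and 'Gender' (an assumption about the layout).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    flagged = []
    for row in reader:
        call_rate = float(row["Call Rate"])
        if call_rate < threshold:
            flagged.append((row["Sample ID"], call_rate))
    return flagged


# Made-up example data, for illustration only.
example = io.StringIO(
    "Sample ID\tCall Rate\tGender\n"
    "S1\t0.995\tFemale\n"
    "S2\t0.912\tMale\n"
)
low_quality = flag_low_callrate(example)
print(low_quality)
```

With a real file, an open file handle (for example from open(path, newline="")) would be passed in place of the in-memory example.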

2 Make_Final_PennCNV_report

This step creates a file containing, for each SNP, the log R ratio and the B allele frequency of every sample.

File format:
Name '\t' Chromosome '\t' Position '\t' Sample1.GType '\t' Sample1.LogRratio '\t' Sample1.B Allele Freq '\t' Sample2.GType '\t' Sample2.LogRratio '\t' Sample2.B Allele Freq

The log R ratio and B allele frequency per SNP are used by Nexus (commercial software) to call CNVs.
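
To make the wide, one-column-group-per-sample layout more concrete, the following Python sketch regroups such a report into one list of SNP records per sample. It assumes a tab-separated file whose header row uses the column names listed above. The function name split_report_per_sample and all values in the example are hypothetical; this is not how GAP, DARWIN or Nexus process the file.

```python
import csv
import io
from collections import defaultdict


def split_report_per_sample(handle):
    """Group the wide per-SNP report into one list of records per sample.

    Assumes a tab-separated header of the form
    Name, Chromosome, Position, <sample>.GType, <sample>.LogRratio,
    <sample>.B Allele Freq, ... (an assumption about the layout).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Map each per-sample column index to (sample name, field name).
    # rsplit keeps sample names that themselves contain a dot intact.
    columns = {}
    for index, column in enumerate(header[3:], start=3):
        sample, field = column.rsplit(".", 1)
        columns[index] = (sample, field)

    per_sample = defaultdict(list)
    for row in reader:
        name, chromosome, position = row[0], row[1], int(row[2])
        fields_by_sample = defaultdict(dict)
        for index, (sample, field) in columns.items():
            fields_by_sample[sample][field] = row[index]
        for sample, fields in fields_by_sample.items():
            per_sample[sample].append({
                "Name": name,
                "Chromosome": chromosome,  # kept as text, e.g. "X"
                "Position": position,
                "GType": fields["GType"],
                "LogRratio": float(fields["LogRratio"]),
                "B Allele Freq": float(fields["B Allele Freq"]),
            })
    return dict(per_sample)


# Made-up example data, for illustration only.
example = io.StringIO(
    "Name\tChromosome\tPosition\t"
    "Sample1.GType\tSample1.LogRratio\tSample1.B Allele Freq\t"
    "Sample2.GType\tSample2.LogRratio\tSample2.B Allele Freq\n"
    "rs1\t1\t1000\tAA\t0.02\t0.0\tAB\t-0.10\t0.51\n"
    "rs2\t1\t2000\tBB\t-0.05\t1.0\tAA\t0.07\t0.02\n"
)
per_sample = split_report_per_sample(example)
for sample, records in per_sample.items():
    print(sample, len(records))
```

Keeping the three shared columns (Name, Chromosome, Position) in every per-sample record means each sample's list can be written out or analysed on its own.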

3 CopyToResultsDir

This step copies the results to ${projectname}/${resultsDir}.
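
The copy step can be pictured with a short Python sketch. It is only an illustration under assumptions: the function name copy_to_results_dir, the file name callrate.txt and the directory names are hypothetical, and the actual pipeline step may be implemented differently.

```python
import shutil
import tempfile
from pathlib import Path


def copy_to_results_dir(files, project_name, results_dir):
    """Copy each file into <project_name>/<results_dir>, creating it if needed."""
    target = Path(project_name) / results_dir
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in files:
        destination = target / Path(source).name
        shutil.copy2(source, destination)  # copy2 also preserves timestamps
        copied.append(destination)
    return copied


# Illustration in a temporary directory with a made-up file name.
with tempfile.TemporaryDirectory() as workdir:
    report = Path(workdir) / "callrate.txt"
    report.write_text("Sample ID\tCall Rate\tGender\n")
    project = Path(workdir) / "myproject"
    copied = copy_to_results_dir([report], project, "results")
    copied_names = [path.name for path in copied]
    copied_ok = all(path.exists() for path in copied)
    print(copied_names, copied_ok)
```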

GAP_Automated steps that are not implemented yet:

GAP_Automated CopyProjectDataToPRM
GAP_Automated CopyRawDataToPRM

GAP_pipeline steps yet to come:

Create array input for concordance check NGS_Data 
Add extra location to put output files from a project so DARWIN can use this as input

gap's People

Contributors

gerbenvandervries, benjaminsm, marieke-bijlsma, kdelange, ealopera, pneerincx, roankanninga, raguirreg
