This project forked from solida-core/diva

DiVA (DNA Variant Analysis) is a pipeline for Next-Generation Sequencing Exome data analysis

License: GNU General Public License v3.0

Python 92.17% R 7.83%

DiVA.wes

This is a fork of DiVA (DNA Variant Analysis), a Snakemake-based pipeline for Next-Generation Sequencing Exome data analysis, developed at the CRS4 Next Generation Sequencing Core Facility. Software dependencies are managed directly by Snakemake using Conda, ensuring the reproducibility of the workflow according to FAIR principles.

This repository retains the first part of the analysis, from FASTQ to the recalibrated VCF following GATK Best Practices, together with quality control. Run this pipeline to generate a master VCF that includes all samples, and re-run it when new samples become available.

Annotation is implemented in DiVA.annotate, which can be used to extract subsets of samples from the master VCF for variant annotation and prioritization.

This is an example of folder organization, with the pipeline executed in each folder given in parentheses:

   ROOT
    │
    ├── wes_master (diva.wes)
    │
    ├── project_A (diva.annotate)
    │
    ├── project_B (diva.annotate)
    │
    └── project_N (diva.annotate)
    
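The folder organization above could be prepared with a few lines of Python. This is only a sketch: the function name `make_layout` is hypothetical, the root path and project names are the placeholders from the example tree, and each pipeline still has to be cloned into its folder separately.

```python
from pathlib import Path


def make_layout(root, projects):
    """Create ROOT/wes_master plus one folder per annotation project."""
    root = Path(root)
    # wes_master is meant for the diva.wes run that produces the master VCF.
    folders = [root / "wes_master"]
    # Each project folder is meant for a separate diva.annotate run.
    folders += [root / name for name in projects]
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
    return folders


# Example call (creates the folders under ./ROOT):
# make_layout("ROOT", ["project_A", "project_B", "project_N"])
```

Existing folders are left untouched, so the call can be repeated when a new project is added.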

Running DiVA.wes

  • Clone the repository from GitHub:
git clone https://github.com/igg-bioinfo/diva.wes.git
  • Rename the folder, from diva.wes to your PROJECT_NAME:
mv diva.wes PROJECT_NAME
  • cd into the newly created folder:
cd PROJECT_NAME
  • Edit the configuration files in conf subfolder:

    • config.yaml - paths to your reference files: genome, target regions, etc.
    • samples.tsv - associate samples to FASTQ files
    • samples.ped - pedigree file in ped format
    • units.tsv - paths to FASTQ files
  • Edit the Snakefile and uncomment the output files you need

  • If the conda package manager is not available, install Miniconda.

  • Create a virtual environment containing snakemake, as suggested in the Snakemake installation documentation. First install mamba as a replacement for the default conda solver:

conda install -c conda-forge mamba
  • Then, install snakemake:
mamba env create --name snakemake --file environment.yaml
  • Activate the environment:
conda activate snakemake
  • Run snakemake in dry-run mode to check that the workflow is set up correctly. YOUR_WORKING_DIR can follow the format YYYY-MM-DD.
snakemake --cores 32 --use-conda --configfile conf/config.yaml --printshellcmds -d YOUR_WORKING_DIR --rerun-incomplete --keep-going --dryrun
  • For verbose output:
snakemake --cores 32 --use-conda --configfile conf/config.yaml --printshellcmds -d YOUR_WORKING_DIR --rerun-incomplete --keep-going --verbose --reason --dryrun
  • If the dry run looks correct, run snakemake:
snakemake --cores 32 --use-conda --configfile conf/config.yaml --printshellcmds -d YOUR_WORKING_DIR --rerun-incomplete --keep-going --conda-frontend mamba
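
The three snakemake calls above differ only in a few flags, so they could be assembled by a small helper. The sketch below is one possible way to do this, not part of the pipeline: the function names `snakemake_command` and `as_shell` are hypothetical, only flags that appear in the commands above are used, and the working directory defaults to today's date in the suggested YYYY-MM-DD format.

```python
import shlex
from datetime import date


def snakemake_command(workdir=None, cores=32, dry_run=True, verbose=False):
    """Assemble one of the snakemake calls shown above as an argument list."""
    # Default working directory follows the suggested YYYY-MM-DD format.
    workdir = workdir or date.today().isoformat()
    cmd = [
        "snakemake",
        "--cores", str(cores),
        "--use-conda",
        "--configfile", "conf/config.yaml",
        "--printshellcmds",
        "-d", workdir,
        "--rerun-incomplete",
        "--keep-going",
    ]
    if verbose:
        cmd += ["--verbose", "--reason"]
    if dry_run:
        cmd.append("--dryrun")
    else:
        # The real run in the steps above uses mamba as the conda frontend.
        cmd += ["--conda-frontend", "mamba"]
    return cmd


def as_shell(cmd):
    """Render the argument list as a single shell-quoted string."""
    return shlex.join(cmd)
```

The returned list can be printed with `as_shell` for inspection or passed to `subprocess.run(cmd, check=True)` from inside the activated snakemake environment.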

Tip: For large projects, we suggest running snakemake in a screen session.
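
Since units.tsv holds the paths to the FASTQ files, it can be worth checking that those paths exist before a long run. The following is a minimal sketch under stated assumptions: the file is tab-separated with a header row, and the FASTQ columns are called `fq1` and `fq2`. These column names and the function name `missing_fastqs` are hypothetical and should be adapted to the actual layout of your units.tsv.

```python
import csv
from pathlib import Path


def missing_fastqs(units_tsv, fastq_columns=("fq1", "fq2")):
    """Return FASTQ paths listed in units.tsv that do not exist on disk."""
    missing = []
    with open(units_tsv, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            for column in fastq_columns:
                # Column names are assumptions; empty or absent cells are skipped.
                path = (row.get(column) or "").strip()
                if path and not Path(path).is_file():
                    missing.append(path)
    return missing


# Example call from the project folder:
# print(missing_fastqs("conf/units.tsv"))
```

Relative paths are resolved against the current directory, so the check should be run from the location the paths in units.tsv are relative to.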

Contributors

massiddamt, gmauro, puva, martarusmini, miciac, ordet76, vincenzorallo
