This project forked from googlingthecancergenome/sv-callers

A user-friendly, portable and scalable workflow to detect structural variants in whole genome sequencing data

License: Apache License 2.0

Python 100.00%

sv-callers

Structural variants (SVs) are an important class of genetic variation implicated in a wide array of genetic diseases. sv-callers is a Snakemake-based workflow that combines several state-of-the-art tools for detecting SVs in whole genome sequencing (WGS) data. The workflow is easy to use and deploy on any Linux-based machine. In particular, it supports automated software deployment, easy configuration and the addition of new analysis tools, and it scales from a single computer to different HPC clusters with minimal effort.

Dependencies

1. Clone this repo.

git clone https://github.com/GooglingTheCancerGenome/sv-callers.git
cd sv-callers/snakemake

2. Install dependencies.

wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh # python 3
bash miniconda.sh # install & add conda to your PATH
source ~/.bashrc
conda update -y conda # update conda
conda create -y -n wf && source activate wf # create & activate the environment
conda install -y -c bioconda snakemake
conda install -y -c nlesc xenon-cli # optional but recommended;)

3. Configure and execute the workflow.

  • config files: analysis.yaml and environment.yaml
  • input files:
    • example data provided in the sv-callers/data directory
    • tumor/normal (T/N) samples in *.bam (incl. index files)
      • list T/N sample pairs to compare in samples.csv
    • reference genome in .fasta (incl. index files)
  • output files: somatic SVs in .vcf (incl. index files)

Note: One pair of T/N samples generates eight SV calling jobs (1 x Manta, 1 x LUMPY, 1 x GRIDSS and 5 x DELLY) plus one post-processing job that merges the per-SV-type DELLY call sets into a single VCF file. A workflow instance can be found here.
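
The arithmetic in the note can be checked with a few lines of Python. This is only a sketch: the names `CALLER_JOBS`, `MERGE_JOBS` and `count_jobs` are hypothetical, the per-caller counts are copied from the note rather than read from the workflow, and scaling linearly with the number of T/N pairs is an assumption.

```python
# Jobs per tumor/normal pair, copied from the note (not read from the workflow).
CALLER_JOBS = {"manta": 1, "lumpy": 1, "gridss": 1, "delly": 5}
MERGE_JOBS = 1  # one post-processing job merges the DELLY call sets


def count_jobs(n_pairs, callers=CALLER_JOBS):
    """Return (calling_jobs, total_jobs) for n_pairs T/N sample pairs.

    Assumes every pair triggers the same set of jobs and that the merge
    job only exists when DELLY is among the enabled callers.
    """
    calling = n_pairs * sum(callers.values())
    merging = n_pairs * MERGE_JOBS if "delly" in callers else 0
    return calling, calling + merging


if __name__ == "__main__":
    print(count_jobs(1))  # (8, 9)
```

For a single pair the total is nine, the same number passed to `--jobs` in the cluster commands below.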

# dry run: executes nothing, only checks the I/O files
snakemake -np
# dummy run (default): executes 'echo' for each caller and writes (dummy) *.vcf files
snakemake -C echo_run=1

Submit to Grid Engine-based cluster

#   The command below is a dummy run (echo_run=1).
#   For actual SV calling:
#     set echo_run=0 and increase the runtime limit (--max-run-time), e.g. to 60 (in minutes),
#     and/or selectively enable callers, e.g. enable_callers="['manta','delly']"
snakemake -C echo_run=1 --use-conda --latency-wait 30 --jobs 9 \
--cluster 'xenon scheduler gridengine --location local:// submit --name smk.{rule} --inherit-env --option parallel.environment=threaded --option parallel.slots={threads} --max-run-time 1 --max-memory {resources.mem_mb} --working-directory . --stderr stderr-\\\$JOB_ID.log --stdout stdout-\\\$JOB_ID.log' &>smk.log&

Submit to Slurm-based cluster

snakemake -C echo_run=1 --use-conda --latency-wait 30 --jobs 9 \
--cluster 'xenon scheduler slurm --location local:// submit --name smk.{rule} --inherit-env --procs-per-node {threads} --start-single-process --max-run-time 1 --max-memory {resources.mem_mb} --working-directory . --stderr stderr-%j.log --stdout stdout-%j.log' &>smk.log&
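
The two submit commands differ only in the scheduler name, the way threads are requested, and the job ID placeholder used in the log file names. The sketch below makes that explicit by assembling the `--cluster` string for either scheduler. The function name `cluster_command` and its `max_run_time` parameter are hypothetical; every option is copied verbatim from the commands above, and the function only builds the string without submitting anything.

```python
def cluster_command(scheduler, max_run_time=1):
    """Build the string passed to snakemake --cluster for 'gridengine' or 'slurm'.

    Placeholders such as {rule} and {threads} are left untouched so that
    Snakemake can fill them in per job.
    """
    specific = {
        "gridengine": {
            "threads": "--option parallel.environment=threaded "
                       "--option parallel.slots={threads}",
            "job_id": r"\\\$JOB_ID",  # escaped exactly as in the README command
        },
        "slurm": {
            "threads": "--procs-per-node {threads} --start-single-process",
            "job_id": "%j",
        },
    }
    opts = specific[scheduler]  # raises KeyError for an unknown scheduler
    parts = [
        f"xenon scheduler {scheduler} --location local:// submit",
        "--name smk.{rule} --inherit-env",
        opts["threads"],
        f"--max-run-time {max_run_time}",  # runtime limit in minutes
        "--max-memory {resources.mem_mb} --working-directory .",
        f"--stderr stderr-{opts['job_id']}.log --stdout stdout-{opts['job_id']}.log",
    ]
    return " ".join(parts)


if __name__ == "__main__":
    for name in ("gridengine", "slurm"):
        print(cluster_command(name, max_run_time=60))
```

Printing both strings side by side is a quick way to confirm that a change such as a longer runtime limit has been applied consistently to both schedulers.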
