vaquita's Introduction

Vaquita accurately identifies structural variations using split-reads, discordant read-pairs, soft-clipped reads, and read-depth information. Vaquita does not depend on external tools and is very fast: you can analyze a 50x WGS sample within an hour.

Download & Compile

git clone https://github.com/seqan/vaquita.git
mkdir vaquita-build && cd vaquita-build
cmake ../vaquita && make vaquita -j 4

Vaquita supports GCC≥4.9 and Clang≥3.8.

Usage

vaquita call -r [reference.fa] [input.bam] > [output.vcf]

The .bam file must be sorted by coordinate (e.g., with samtools sort). You can find more options using vaquita call --help.
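As a rough sketch of the preparation step, the helper below coordinate-sorts a BAM file, indexes it, and then runs the call command shown above. The function name run_vaquita and the ".sorted.bam" naming are hypothetical, and the indexing step is an assumption based on the issue reported further down this page about missing BAI files; it assumes samtools (1.3 or later) and vaquita are on PATH.

```shell
#!/bin/sh
# Hypothetical helper, not part of Vaquita: sort, index, then call SVs.
# Arguments: reference FASTA, input BAM, output VCF (all placeholder paths).
run_vaquita() {
    ref=$1
    bam=$2
    vcf=$3
    sorted="${bam%.bam}.sorted.bam"

    # samtools sort orders by coordinate by default.
    samtools sort -o "$sorted" "$bam" || return 1
    # Writes the index next to the sorted file as "$sorted.bai".
    samtools index "$sorted" || return 1
    # Same invocation as in the Usage section, on the sorted file.
    vaquita call -r "$ref" "$sorted" > "$vcf"
}
```

A call would look like run_vaquita reference.fa input.bam output.vcf.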

Citation

Jongkyu Kim and Knut Reinert, Vaquita: Fast and Accurate Identification of Structural Variations using Combined Evidence. Workshop on Algorithms in Bioinformatics (WABI) 2017.

DOI: 10.4230/LIPIcs.WABI.2017.13

  • You can find all the scripts and information about raw datasets that I used for benchmarking at this repository.

Contact

Jongkyu Kim ([email protected])

vaquita's People

Contributors

xenigmax

vaquita's Issues

Vaquita takes too much time to run

Hello,
I ran Vaquita on an NGS BAM file; it has been running for two days and is still not finished.
The command looks like this:
vaquita call -r {ref} {bam} >{dir}/vaquita.vcf
In my tests, it takes far too long when the BAM file is aligned to the GRCh38 reference, but it has worked well so far with hg19.
Do you have any idea why?

Converting to seqan3

I think now that SeqAn3 is officially out, we can start looking at converting to it.

I've started on my own fork and have already converted the Argument Parser and added seqan3 as a submodule.

Is there anything vaquita uses in seqan2 that is not yet in seqan3?

Use new SeqAn library

The current version uses a slightly modified version of SeqAn for BAM I/O.
This modification resolves

  1. Potential deadlock
  2. Thread limitation

So, discuss this with Rene first.

Don't segfault when no bam index file exists.

Currently, if there isn't a BAI file, vaquita will run all the way through and then exit on a segfault. It would be less confusing if this were checked at the very beginning and the program stopped with an error.
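Until such a check exists in the tool itself, a wrapper script could do it up front. The following is a minimal sketch, not Vaquita's behavior: has_bam_index and checked_call are hypothetical names, and the check assumes the two common samtools naming conventions for the index (sample.bam.bai and sample.bai).

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed only if a BAI index sits next to the BAM.
has_bam_index() {
    bam=$1
    # Accept both "sample.bam.bai" and "sample.bai".
    [ -f "$bam.bai" ] || [ -f "${bam%.bam}.bai" ]
}

# Hypothetical wrapper: refuse to start the analysis when the index is missing.
checked_call() {
    ref=$1
    bam=$2
    if ! has_bam_index "$bam"; then
        echo "error: no index found for $bam (try: samtools index $bam)" >&2
        return 1
    fi
    vaquita call -r "$ref" "$bam"
}
```

With this in place, checked_call reference.fa input.bam > output.vcf fails immediately with a readable message instead of running to completion first.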

seqan::UnexpectedEnd

Hi Jongkyu,

when I run Vaquita:
vaquita -cg hg19.fa input.bam > output.vcf

I'm getting the following exception:

[2017-3-2 15:15:51] [START] Split-read analysis
[2017-3-2 15:15:57] [END] Split-read analysis (6 seconds.)
[2017-3-2 15:15:57] [START] Paired-end analysis
[2017-3-2 15:15:57] [END] Paired-end analysis (0 seconds.)
[2017-3-2 15:15:57] [START] Clipped-read analysis
[2017-3-2 15:15:57] [END] Clipped-read analysis (0 seconds.)
[2017-3-2 15:15:57] [END] BREAKPOINT IDENTIFICATION (0 seconds.)
[2017-3-2 15:15:57] Found breakpoints
[2017-3-2 15:15:57] 24428 from split-read evidences.
[2017-3-2 15:15:57] 0 from read-pair evidences.
[2017-3-2 15:15:57] 1708878 from soft-clipped evidences.
[2017-3-2 15:15:57] [START] BREAKPOINT MERGING
[2017-3-2 15:15:57] [START] From split-read evidences.
[2017-3-2 15:15:57] [END] From split-read evidences. (0 seconds.)
[2017-3-2 15:15:57] [START] From read-pair evidences.
[2017-3-2 15:15:57] [END] From read-pair evidences. (0 seconds.)
[2017-3-2 15:15:57] [START] From soft-clipped evidences.
terminate called after throwing an instance of 'seqan::UnexpectedEnd'
  what():  Unexpected end of input.
Aborted

I already tried with different input files and recreated hg19.fa.fai but all with the same result. Any idea why this is happening? 🙂

Better .vcf outputs.

  • Better headers
  • More information for each SV (e.g., heterozygous/homozygous)
  • What is the standard way of representing duplications and translocations?
  • Are the coordinates accurate? Possible confusion between 0-based and 1-based coordinates (double check)
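
For the coordinate question, the arithmetic worth double-checking is small. As an illustration only (it does not describe which convention Vaquita currently uses internally), the hypothetical helper below converts a 0-based, half-open interval into the 1-based, closed form used by VCF positions:

```shell
#!/bin/sh
# Convert a 0-based, half-open interval [start, end) to 1-based, closed [start, end].
# Only the start shifts by one; the end value stays numerically the same.
to_one_based() {
    printf '%s\t%s\n' "$(( $1 + 1 ))" "$2"
}

to_one_based 99 200   # prints "100<TAB>200"
```

An off-by-one error typically shows up when the shift is applied to both ends or to neither.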
