This project forked from mehrdadbakhtiari/advntr

A tool for genotyping Variable Number Tandem Repeats (VNTR) from sequence data

Home Page: http://advntr.readthedocs.io/

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

adVNTR - A tool for genotyping VNTRs

adVNTR is a tool for genotyping Variable Number Tandem Repeats (VNTR) from sequence data. It works with both NGS short reads (Illumina HiSeq) and SMRT reads (PacBio). It finds diploid repeat counts for VNTRs and identifies possible mutations in the VNTR sequences.

Software Requirements

  1. The following libraries are required:
    • python2.7
    • python-pip
    • python-tk
    • libz-dev
    • samtools

You can install these requirements on Ubuntu Linux by running:

sudo apt-get install python2.7 python-pip python-tk libz-dev samtools

  2. The following python2.7 packages are required:
    • biopython
    • pysam version 0.9.1.4 or above
    • cython
    • networkx version 1.11
    • scipy
    • joblib

You can install the required Python libraries by running:

pip install -r requirements.txt

  3. In addition, ncbi-blast version 2.2.29 or above is required.
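
Before running the tool, it can help to confirm that the external programs are reachable and that installed versions satisfy the minimums listed above. The following is a minimal sketch, not part of adVNTR: the helper names are hypothetical, the executable name `blastn` is an assumption about how ncbi-blast is installed, and the script is written for Python 3 so it runs independently of the python2.7 interpreter that adVNTR itself needs.

```python
import itertools
import shutil

# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = ["samtools", "blastn"]


def missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def version_tuple(version):
    """Turn a string such as '2.2.29+' into (2, 2, 29) for numeric comparison."""
    parts = []
    for piece in version.split("."):
        # Keep only the leading digits of each piece, so '29+' becomes 29.
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def meets_minimum(found, minimum):
    """Return True when version string `found` is at least `minimum`."""
    return version_tuple(found) >= version_tuple(minimum)


if __name__ == "__main__":
    for tool in missing_tools(REQUIRED_TOOLS):
        print("missing:", tool)
```

Comparing versions as integer tuples avoids the trap of plain string comparison, where "2.2.3" would sort after "2.2.29".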

Data Requirements

  • To run adVNTR on trained VNTR models:
    • Download vntr_data.zip and extract it inside the project directory.

Alternatively, you can add a model for a custom VNTR. See the section on adding a custom VNTR in the documentation (http://advntr.readthedocs.io/) for more information.
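
The extraction step for the trained models can also be scripted. This is a minimal sketch using Python's standard `zipfile` module. It assumes vntr_data.zip has already been downloaded (no download URL is given here, so none is used), and the function name `extract_models` is hypothetical.

```python
import zipfile


def extract_models(archive_path, project_dir):
    """Extract the archive into project_dir and return the member names."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(project_dir)
        return archive.namelist()


if __name__ == "__main__":
    # Assumes vntr_data.zip sits in the current (project) directory.
    for name in extract_models("vntr_data.zip", "."):
        print(name)
```

Printing the member names makes it easy to confirm where the model files ended up inside the project directory.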

Execution:

Use the following command to see the help for running the tool:

python advntr.py --help

The program outputs the repeat unit (RU) count genotypes for all VNTRs in the vntr_data directory. To specify a single VNTR by its ID, use the --vntr_id <id> option.

Demo 1: input in BAM format

  • --alignment_file specifies the alignment file containing mapped and unmapped reads:
python advntr.py --alignment_file aligned_illumina_reads.bam --working_directory ./log_dir/
  • With --pacbio, adVNTR assumes the alignment file contains PacBio sequencing data:
python advntr.py --alignment_file aligned_pacbio_reads.bam --working_directory ./log_dir/ --pacbio
  • Use --frameshift to find possible frameshifts in the VNTR:
python advntr.py --alignment_file aligned_illumina_reads.bam --working_directory ./log_dir/ --frameshift

Demo 2: input in FASTA format

  • Use the following command to genotype the RU count from a FASTA file:
python advntr.py --fasta unaligned_illumina_reads.fasta --working_directory ./log_dir/
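
When running many samples, the commands from both demos can be assembled programmatically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, only the flags shown above are used, and the rule that exactly one of --alignment_file or --fasta is given is an assumption of this sketch rather than documented behavior.

```python
import subprocess


def build_advntr_command(working_directory, alignment_file=None, fasta=None,
                         pacbio=False, frameshift=False, vntr_id=None,
                         interpreter="python"):
    """Assemble the argument list for one advntr.py run."""
    # Assumption of this sketch: exactly one input source is supplied.
    if (alignment_file is None) == (fasta is None):
        raise ValueError("give exactly one of alignment_file or fasta")
    command = [interpreter, "advntr.py"]
    if alignment_file is not None:
        command += ["--alignment_file", alignment_file]
    else:
        command += ["--fasta", fasta]
    command += ["--working_directory", working_directory]
    if pacbio:
        command.append("--pacbio")
    if frameshift:
        command.append("--frameshift")
    if vntr_id is not None:
        command += ["--vntr_id", str(vntr_id)]
    return command


def run_advntr(**options):
    """Run advntr.py from the project directory; raises if it exits non-zero."""
    return subprocess.run(build_advntr_command(**options), check=True)
```

For example, `build_advntr_command("./log_dir/", alignment_file="aligned_pacbio_reads.bam", pacbio=True)` yields the same arguments as the PacBio command in Demo 1. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.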

Citation:

Bakhtiari, M., Shleizer-Burko, S., Gymrek, M., Bansal, V. and Bafna, V., 2017. Targeted Genotyping of Variable Number Tandem Repeats with adVNTR. bioRxiv, p.221754.
