This project forked from oicr-ibc/riser

Evaluation of Alignment Algorithms for Discovery and Identification of Pathogens Using RNA-Seq

License: GNU General Public License v3.0

RiSER

Downloading and using RiSER is free. If you use RiSER or its code in your work, please acknowledge RiSER by referring to its GitHub homepage: https://github.com/oicr-ibc/riser

This is important for us, since obtaining grants is one significant way to fund the planning and implementation of our projects. Also, if you find RiSER useful in your research, feel free to let us know.

RiSER is brought to you by:

  • Vincent Ferretti
  • Ivan Borozan
  • Stuart Watt

RiSER was originally developed by:

  • Ivan Borozan

Getting Started

Minimum Requirements

Tested on Ubuntu 12.04.

R (2.14.1):

$ apt-get install r-base

Python (2.7.3) - the program assumes that Python is in /usr/bin/python

Perl (5.14.2)

samtools (0.1.18)

The following Perl module needs to be installed:

bioperl:

$ apt-get install bioperl

The following Python modules need to be installed in the order shown below:

If you do not have pip installed, install it as shown below:

$ sudo apt-get install python-pip python-dev  

numpy(1.6.2):

$ sudo pip install numpy

BioPython(1.6):

$ sudo pip install biopython 

rpy2(2.3.1):

$ sudo pip install rpy2

setuptools(1.0):

$ sudo easy_install -U distribute

Cython(0.17.4):

$ sudo pip install cython

pysam(0.6):

$ sudo pip install pysam
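
Once everything above is installed, a quick check can confirm that the modules import and that the command-line tools are on the PATH. The following is only a sketch and not part of RiSER: the function names `check_modules` and `find_on_path` are made up for illustration, and the script reports whatever versions it finds without comparing them to the versions listed above. It is written to run under both Python 2.7 and Python 3, and it should be run with the same interpreter that RiSER will use.

```python
import importlib
import os

# Import names of the Python modules listed above, in the stated install order.
# BioPython is imported as "Bio".
REQUIRED_MODULES = ["numpy", "Bio", "rpy2", "setuptools", "Cython", "pysam"]

# Command-line tools named in the requirements.
REQUIRED_TOOLS = ["R", "perl", "samtools"]


def check_modules(names):
    """Return {module name: version string, or None if the import fails}."""
    status = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Missing module, or a module that fails while loading.
            status[name] = None
            continue
        # Not every module exposes __version__, so fall back to "unknown".
        status[name] = str(getattr(module, "__version__", "unknown"))
    return status


def find_on_path(tool):
    """Return the first executable called `tool` found on PATH, or None."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, tool)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    for name, version in sorted(check_modules(REQUIRED_MODULES).items()):
        print("module %-12s %s" % (name, version or "MISSING"))
    for tool in REQUIRED_TOOLS:
        print("tool   %-12s %s" % (tool, find_on_path(tool) or "MISSING"))
    # The requirements state that Python is expected at this location.
    found = os.path.exists("/usr/bin/python")
    print("/usr/bin/python     %s" % ("found" if found else "MISSING"))
```

Any line reported as MISSING points back to the corresponding install command above.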

Installation

This version of RiSER has been tested under Linux (Ubuntu 12.04).

To install:

Option 1:

$ sudo apt-get install git
$ git clone https://github.com/oicr-ibc/riser.git 
$ cd riser

Option 2:

$ wget https://github.com/oicr-ibc/riser/archive/master.zip
$ unzip master.zip
$ mv riser-master riser
$ cd riser

Done! No installation is required. All Python scripts are in $RISER_DIR/bin and should be compatible with your system.

Usage:

The steps below assume that you are working in the RiSER directory.

  1. Simulate data for a particular set of genomes.

    The default config.ini file is in the config directory, and you need to modify it to run RiSER. Before doing so, first run RiSER with the example configuration file config/config_simulation.example, which shows how to run the simulation:

    $ cp config/config.ini config/config.ini.save
    $ cp config/config_simulation.example config/config.ini

    Edit the config.ini file.

    Run the simulation script:

    $ python bin/run_simulator.py

    (i) Note: simulated results are written to the directory specified in the config.ini file (see config_simulation.example). Also check whether you are running a 32-bit or 64-bit machine (see config_simulation.example under [aligners]).

    (ii) Note: in the GenBank flat file, if both the 'gene' and 'CDS' entries are present under 'FEATURES', each of them needs to have an associated /db_xref="GeneID:XXXXX".

  2. Compare the aligner's output to the truth file (from simulated data):

    First run RiSER with the example configuration file config/config_analysis.example, which shows how to run the analysis:

    $ cp config/config.ini config/config.ini.save
    $ cp config/config_analysis.example config/config.ini

    Edit the config.ini file.

    Run the analysis script:

    $ python bin/run_analysis.py

    The summary of results is written to the aligner's directory specified in the config.ini file (see config_analysis.example).

    In the example provided, results for the NC_001357.1 genome and the BFAST aligner will be output to:

    examples/aligners/NC_001357.1/BFAST/simulated_transcripts_0.fa/Rdata_multi/

    examples/aligners/NC_001357.1/BFAST/simulated_transcripts_10.fa/Rdata_multi/

    Note that the results are output as R data files. To view them, launch R and load the results as shown below:

    $ cd examples/aligners/NC_001357.1/BFAST/simulated_transcripts_0.fa/Rdata_multi/

    In R, type:

    > # To load the data
    > load("aligner_stats.gzip")
    > 
    > # To display the analysis statistics
    > aligner_stats
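
Both steps above depend on a correctly edited config/config.ini, so it can help to print what the file contains before launching a run. The sketch below is not part of RiSER. It assumes that config.ini is a standard INI file that Python's configparser can read, and `summarize_config` is a hypothetical helper name. The only section name this README mentions is [aligners]; the sketch does not assume any particular option names.

```python
try:
    import configparser                  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2.7


def summarize_config(path):
    """Return {section: {option: value}} for the INI file at `path`."""
    # RawConfigParser leaves values untouched (no "%" interpolation).
    parser = configparser.RawConfigParser()
    if not parser.read(path):
        raise IOError("could not read %s" % path)
    # Option names are lower-cased by the parser; values stay as strings.
    return {
        section: dict(parser.items(section))
        for section in parser.sections()
    }


def print_config(path):
    """Print every section and option so the edited values can be reviewed."""
    for section, options in sorted(summarize_config(path).items()):
        print("[%s]" % section)
        for option, value in sorted(options.items()):
            print("  %s = %s" % (option, value))
```

Calling `print_config("config/config.ini")` from the RiSER directory lists the settings that run_simulator.py or run_analysis.py would pick up, which makes it easier to spot an output directory or [aligners] entry that still holds the example value.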

File format for user-specified transcript files:

If you specify your own transcript file (see also examples/genomes/NC_001357.1_transcripts.txt), each row in the file should describe a single transcript, and the tab-delimited columns should be in the order shown below:

transcript_id (e.g. GI number or any other unique id) \t transcript_name \t genome_id (e.g. GenBank Accession) \t strand \t transcript_START \t transcript_END \t transcript_START \t transcript_END \t numb_exons \t exons_START (the START positions of the exons, separated by commas) \t exons_END (the END positions of the exons, separated by commas and in the same order as the START positions)
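
The column layout above can be checked before a file is handed to RiSER. The following is a minimal sketch, not RiSER's own parser. It assumes 11 tab-delimited columns, integer coordinates, and no header row. The format description lists transcript_START and transcript_END twice, so the sketch keeps both pairs (as the hypothetical fields `start`, `end`, `start_2`, `end_2`) without interpreting the second pair. The names `Transcript`, `parse_transcript_line`, and `read_transcripts` are made up for illustration.

```python
from collections import namedtuple

Transcript = namedtuple("Transcript", [
    "transcript_id", "transcript_name", "genome_id", "strand",
    "start", "end",        # columns 5-6: transcript_START, transcript_END
    "start_2", "end_2",    # columns 7-8: the same names, listed a second time
    "numb_exons", "exon_starts", "exon_ends",
])


def _int_list(field):
    """Split a comma-separated field into ints, ignoring empty items."""
    return [int(item) for item in field.split(",") if item.strip()]


def parse_transcript_line(line):
    """Parse one row of a transcript file into a Transcript record."""
    cols = line.rstrip("\r\n").split("\t")
    if len(cols) != 11:
        raise ValueError("expected 11 tab-delimited columns, found %d" % len(cols))
    record = Transcript(
        cols[0], cols[1], cols[2], cols[3],
        int(cols[4]), int(cols[5]), int(cols[6]), int(cols[7]),
        int(cols[8]), _int_list(cols[9]), _int_list(cols[10]),
    )
    # The exon START and END lists must both match the declared exon count.
    if not (record.numb_exons == len(record.exon_starts) == len(record.exon_ends)):
        raise ValueError("numb_exons does not match the exon lists for %s"
                         % record.transcript_id)
    return record


def read_transcripts(path):
    """Parse every non-blank row of the transcript file at `path`."""
    with open(path) as handle:
        return [parse_transcript_line(line) for line in handle if line.strip()]
```

Running `read_transcripts` on a transcript file raises a ValueError at the first row whose column count or exon lists do not match the layout described above.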

Datasets

More datasets are available on the wiki at: https://github.com/oicr-ibc/riser/wiki/Datasets.

License and Copyright

Licensed under the GNU General Public License, Version 3.0. See LICENSE for more details.

Copyright 2013 The Ontario Institute for Cancer Research.

Acknowledgement

This project is supported by the Ontario Institute for Cancer Research (OICR) through funding provided by the government of Ontario, Canada.
