This project forked from statalign/statalign

StatAlign

Home Page: http://statalign.github.com/

License: GNU General Public License v3.0

Improvement of the biological realism of insertion and deletion in StructAlign

This fork contains a modification to StructAlign that improves the biological realism of insertion and deletion in the Bayesian model for the joint estimation of phylogeny and protein structure alignment. It was a group project with Jeff Thorne (NCSU), Gary Larson and Scott Schmidler (Duke), and Jotun Hein (Oxford), hosted by the Statistical and Applied Mathematical Sciences Institute.

The primary change to the model is in StructAlign.java here.

The original model is described in the original paper.

The write up of our modification to the model is here.

StatAlign v3.2

http://statalign.github.io/

*** INTRODUCTION ***

StatAlign is an extendable software package for Bayesian analysis of protein, DNA and RNA sequences. Multiple alignments, phylogenetic trees and evolutionary parameters are co-estimated in a Markov chain Monte Carlo (MCMC) framework, yielding posterior distributions for quantities such as tree topologies, branch lengths, alignments, and indel rates.

Traditional methods conduct phylogenetic analysis on a single, fixed alignment, which can lead to strong biases in the inferred trees. In contrast, the joint estimation approach accounts for the interdependence between the various parameters.

The models behind the analysis permit the comparison of evolutionarily distant sequences: the TKF92 insertion-deletion model can be coupled to a variety of different substitution models. A broad range of models for nucleotide and amino acid data is included in the package and the plug-in management system ensures that new models can be easily added.

Because joint sampling of alignments and trees is more computationally intensive than analysis with a single alignment, StatAlign requires significantly more runtime than tree-only MCMC analyses and is best suited to the analysis of small to medium datasets (10-30 sequences). We are currently working on software developments that will extend this range of applicability.

*** USAGE ***

StatAlign is written in Java and requires no installation, but you must have Java 6 or newer on your system. If you do not have a suitable Java runtime, please download the most recent one from:

http://www.java.com/getjava/
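
To check the Java requirement from a terminal, something like the following sketch may help. The helper name java_major is hypothetical and not part of StatAlign. It assumes the usual version string formats, where legacy releases report "1.6.0_45" (major version 6) and newer ones report strings such as "17.0.1".

```shell
#!/bin/sh
# Hypothetical helper: extract the major Java version from a version string.
# Legacy strings look like "1.6.0_45" (major 6); newer ones like "17.0.1".
java_major() {
    case $1 in
        1.*) v=${1#1.}; printf '%s\n' "${v%%.*}" ;;
        *)   printf '%s\n' "${1%%[.+-]*}" ;;
    esac
}

# If java is on the PATH, read its version (printed on stderr) and compare
# the major version against the Java 6 minimum stated above.
if command -v java >/dev/null 2>&1; then
    ver=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    major=$(java_major "$ver")
    if [ "${major:-0}" -ge 6 ]; then
        echo "Java $major found: OK"
    else
        echo "Java 6 or newer is required" >&2
    fi
fi
```

If no java executable is found, the script prints nothing, which is the cue to install one from the address above.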

Once you have Java, extract the StatAlign zip archive, and double-click the StatAlign.jar file. This will launch the graphical interface where sequences to analyse can be easily loaded from the menus -- see the Help page for more information.

Alternatively, StatAlign can be run from the command line, which is recommended for automated, script-driven scenarios. To print the list of command line options, use the following command:

java -jar StatAlign.jar -help

An example command line setup:

java -Xmx512m -jar StatAlign.jar -ot=Fasta -mcmc=10k,100k,1k seqs.fasta

The -Xmx512m option is a standard JVM argument that sets the program's maximum heap size to 512 MiB. This is our recommended minimum; increase it as necessary for large inputs. The -ot option selects the output alignment format, and -mcmc sets MCMC parameters such as the number of burn-in steps and the number of samples; see the user documentation in the Help menu and online for tips on how to set these values. The input file must contain the sequences to align in Fasta format.
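
For script-driven runs over several input files, a small wrapper along these lines could be used. It is only a sketch: the function name statalign_cmd and the data/ directory are assumptions made for illustration, and it uses only the options shown in the example above. By default it performs a dry run and prints the commands instead of executing them.

```shell
#!/bin/sh
# Hypothetical helper: print the StatAlign command line for one input file.
# Only the -ot and -mcmc options shown in this README are used.
statalign_cmd() {
    input=$1
    heap=${2:-512m}            # JVM heap limit, passed to -Xmx
    mcmc=${3:-10k,100k,1k}     # value passed unchanged to -mcmc
    printf 'java -Xmx%s -jar StatAlign.jar -ot=Fasta -mcmc=%s %s\n' \
        "$heap" "$mcmc" "$input"
}

# Dry run: print one command per Fasta file in the (assumed) data/ directory.
# Assumes file names without whitespace.
for f in data/*.fasta; do
    [ -e "$f" ] || continue    # skip when the glob matches nothing
    statalign_cmd "$f"
done
```

Reviewing the printed commands before running them makes it easy to raise the heap limit for larger inputs, for example with statalign_cmd big.fasta 2g.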

*** HOW TO COMPILE ***

If you would like to compile and package your own runnable jar file you can do so by following these steps:

  1. Install Gradle (unless you already have version 2.0 or later installed; check by running "gradle -version"). There are two ways to install it:
     a) (Any system) Manual installation: instructions can be found at http://www.gradle.org/installation
     b) (Linux) If you have a Linux system with bash, you can run the "install-gradle.sh" script (found in the root of the repo). Then run "source ~/.bashrc" to update the environment, or just start a new terminal.
  2. Compile the sources with Gradle:
     a) (Any system) Use the command "gradle shadowJar" to create a runnable jar file. Its location will be "build/libs/StatAlign*.jar".
     b) (Linux / Windows+Cygwin) Alternatively, run the "build-jar.sh" script. It will copy the newly created jar file to the root directory of the project with the name "StatAlign.jar".
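
The manual build path above can be sketched as a short script. The function names find_built_jar and build_statalign_jar are hypothetical and not part of the repository; the sketch only assumes the "gradle shadowJar" task and the "build/libs/StatAlign*.jar" output location described in the steps.

```shell
#!/bin/sh
# Print the path of the first jar matching build/libs/StatAlign*.jar under
# the given project root; return non-zero if none exists.
find_built_jar() {
    root=${1:-.}
    for jar in "$root"/build/libs/StatAlign*.jar; do
        if [ -f "$jar" ]; then
            printf '%s\n' "$jar"
            return 0
        fi
    done
    return 1
}

# Build with Gradle if it is on the PATH, then report where the jar landed.
build_statalign_jar() {
    if ! command -v gradle >/dev/null 2>&1; then
        echo "gradle not found; see the installation step" >&2
        return 1
    fi
    gradle shadowJar && find_built_jar .
}

# To build, call the function from the root of the repository:
# build_statalign_jar
```

Keeping the lookup in its own function means it can also be used to locate a jar that was built earlier, without starting a new build.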

*** LICENSE ***

StatAlign is distributed under the GNU General Public License Version 3. A copy of the license is found in LICENSE.txt.

*** CONTRIBUTORS ***

1j, novadam, cjchallis, michaelgoldendev, ingolfured, preetiarunapuram, jrash, ador