
j3xugit / gremlin_cpp


This project forked from sokrypton/gremlin_cpp


GREMLIN: learn an MRF/Potts model from an input multiple sequence alignment. Implementations are available in C++ and TensorFlow/Python.

Home Page: http://solab.org

C++ 4.26% Jupyter Notebook 95.74%

gremlin_cpp's Introduction

GREMLIN_CPP v1.0

Installation

$ g++ -O3 -std=c++0x -o gremlin_cpp gremlin_cpp.cpp -fopenmp

Compile with -fopenmp to allow the use of multiple CPUs; the code is OpenMP-friendly.

Usage

Note: OpenMP uses the environment variable OMP_NUM_THREADS to decide how many threads (CPUs) to use.
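When the binary is driven from a script, the thread count can be set per call instead of exporting it in the shell. The following is a minimal Python sketch, not part of this repository: the helper names `gremlin_command`, `gremlin_env` and `run_gremlin` are hypothetical, the binary is assumed to sit at `./gremlin_cpp`, and only the `-i` and `-o` flags are used.

```python
import os
import subprocess


def gremlin_command(alignment, output, binary="./gremlin_cpp"):
    """Build the argument list for a basic run (-i input, -o output)."""
    return [binary, "-i", alignment, "-o", output]


def gremlin_env(num_threads):
    """Copy the current environment and set OMP_NUM_THREADS for the child."""
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(num_threads)
    return env


def run_gremlin(alignment, output, num_threads=16, binary="./gremlin_cpp"):
    """Run the binary; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(
        gremlin_command(alignment, output, binary),
        env=gremlin_env(num_threads),
        check=True,
    )
```

Setting the variable only in the child environment leaves the calling process and any other jobs it launches unaffected.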

$ export OMP_NUM_THREADS=16
$ ./gremlin_cpp -i alignment_file -o results
# ---------------------------------------------------------------------------------------------
#                                GREMLIN_CPP v1.0                                              
# ---------------------------------------------------------------------------------------------
#   -i            input alignment (either one sequence per line or in fasta format)
#   -o            save output to
# ---------------------------------------------------------------------------------------------
#  Optional settings                                                                           
# ---------------------------------------------------------------------------------------------
#   -only_neff    only compute neff (effective num of seqs)      [Default=0]
#   -only_v       only compute v (1body-term)                    [Default=0]
#   -gap_cutoff   remove positions with > X fraction gaps        [Default=0.5]
#   -alphabet     select: [protein|rna|binary]                   [Default=protein]
#   -eff_cutoff   seq id cutoff for downweighting similar seqs   [Default=0.8]
#   -lambda       L2 regularization weight                       [Default=0.01]
#   -mrf_i        load MRF
#   -mrf_o        save MRF
#   -pair_i       load list of residue pairs (one pair per line, index 0)
# ---------------------------------------------------------------------------------------------
#  Minimizer settings                                                                          
# ---------------------------------------------------------------------------------------------
#   -min_type     select: [lbfgs|cg|none]                        [Default=lbfgs]
#   -max_iter     number of iterations                           [Default=100]
# ---------------------------------------------------------------------------------------------
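
The -gap_cutoff and -eff_cutoff options can be illustrated with a small Python sketch. It is an approximation under stated assumptions, not the code in gremlin_cpp.cpp: it assumes that a column is kept when its gap fraction is at most the cutoff, that sequence identity is the fraction of matching columns (gaps included), and that each sequence is weighted by one over the number of sequences (itself included) at or above the identity cutoff. The function names are hypothetical.

```python
def filter_gappy_columns(msa, gap_cutoff=0.5, gap="-"):
    """Drop columns with more than gap_cutoff fraction gaps.

    Returns the filtered sequences and the indices of the kept columns.
    """
    n_seqs = len(msa)
    n_cols = len(msa[0])
    keep = [
        c for c in range(n_cols)
        if sum(seq[c] == gap for seq in msa) / n_seqs <= gap_cutoff
    ]
    return ["".join(seq[c] for c in keep) for seq in msa], keep


def sequence_identity(a, b):
    """Fraction of aligned columns where the two sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def sequence_weights(msa, eff_cutoff=0.8):
    """Weight each sequence by 1 / (number of sequences at >= eff_cutoff identity)."""
    weights = []
    for a in msa:
        neighbors = sum(sequence_identity(a, b) >= eff_cutoff for b in msa)
        weights.append(1.0 / neighbors)
    return weights


def neff(msa, eff_cutoff=0.8):
    """Effective number of sequences: the sum of the sequence weights."""
    return sum(sequence_weights(msa, eff_cutoff))
```

Under this weighting, a cluster of near-identical sequences contributes roughly one effective sequence in total, so heavily sampled families do not dominate the fit. The pairwise loop is quadratic in the number of sequences and is meant for illustration only.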

Parsing output

i j raw apc ii jj
i = index i
j = index j
raw = l2norm(W)
apc = raw - mean(row) * mean(col) / mean(all)
ii = char-position i
jj = char-position j
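
The column layout and the APC formula above can be sketched in Python as follows. This is a hedged illustration rather than the tool's own code: `parse_gremlin_line` and `apc_correct` are hypothetical names, the rows are assumed to be whitespace-separated, and the means are assumed to be taken over the full square matrix of raw scores (diagonal included), which may differ from what gremlin_cpp does internally.

```python
def parse_gremlin_line(line):
    """Split one 'i j raw apc ii jj' row into a dict with typed fields."""
    i, j, raw, apc, ii, jj = line.split()
    return {
        "i": int(i),
        "j": int(j),
        "raw": float(raw),
        "apc": float(apc),
        "ii": ii,  # kept as text, since its exact format is not specified here
        "jj": jj,
    }


def apc_correct(raw):
    """Apply apc = raw - mean(row) * mean(col) / mean(all) to a square matrix."""
    n = len(raw)
    row_mean = [sum(raw[r]) / n for r in range(n)]
    col_mean = [sum(raw[r][c] for r in range(n)) / n for c in range(n)]
    all_mean = sum(row_mean) / n
    return [
        [raw[r][c] - row_mean[r] * col_mean[c] / all_mean for c in range(n)]
        for r in range(n)
    ]
```

The correction subtracts the score expected from each position's average coupling strength, so pairs involving generally "loud" positions are not ranked highly for that reason alone.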

The output MRF (saved with -mrf_o) contains 21 values for each position (V) and 21 x 21 values for each pair of positions (W).

The order of the values is as follows: "ARNDCQEGHILKMFPSTWYV-" (where "-" is the gap).

For RNA the order is "AUCG-", with 5 values for V and 5x5 for W.

For Binary the order is "01-", with 3 values for V and 3x3 for W.
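
The three orderings above determine which entry of V or W belongs to which character. A minimal Python sketch of that mapping is shown below; the names `ALPHABETS`, `encode` and `mrf_value_counts` are hypothetical, and the sketch makes no assumption about how the MRF file itself is laid out on disk.

```python
# Character orderings as listed above; "-" is the gap state.
ALPHABETS = {
    "protein": "ARNDCQEGHILKMFPSTWYV-",
    "rna": "AUCG-",
    "binary": "01-",
}


def encode(sequence, alphabet="protein"):
    """Map each character to its index in the chosen ordering.

    Raises KeyError for characters outside the alphabet.
    """
    lookup = {ch: idx for idx, ch in enumerate(ALPHABETS[alphabet])}
    return [lookup[ch] for ch in sequence]


def mrf_value_counts(alphabet="protein"):
    """Return (values per position in V, values per position pair in W)."""
    q = len(ALPHABETS[alphabet])
    return q, q * q
```

For example, with the protein ordering the V entry for alanine at a given position is index 0 and the gap entry is index 20.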

Alternative implementations
