gardner-binflab / tisigner_paper_2019

Code related to TIsigner manuscript and webserver

Home Page: https://tisigner.otago.ac.nz

License: Other

Jupyter Notebook 68.04% Python 1.04% Perl 0.67% PHP 7.96% HTML 15.24% CSS 1.97% Hack 0.68% JavaScript 4.39% Java 0.01% Classic ASP 0.01%
bioinformatics rna-structure multiprocessing protein structural-biology biotechnology simulated-annealing translation-initiation-site

tisigner_paper_2019's Introduction

Bhandari BK, Lim CS, Remus DM, Chen A, van Dolleweerd C, Gardner PP. (2021). Analysis of 11,430 recombinant protein production experiments reveals that protein yield is tunable by synonymous codon changes of translation initiation sites. PLOS Computational Biology. 17(10), e1009461. DOI:10.1371/journal.pcbi.1009461

  • This repository contains the scripts and Jupyter notebooks needed to reproduce the results and figures of this manuscript. The source code of the TIsigner webserver is available here.
  • Dependencies can be installed using Anaconda3, for example: conda install -c bioconda viennarna. ViennaRNA can also be installed by following the instructions here.
  • IXnos requires Python 2 to run.
  • openen.py is a wrapper that runs RNAplfold across multiple processes. It is useful for calculating the opening energy of sequences in a multi-FASTA file. The output can be analysed as in Fig1_2_S1_S2.ipynb. The help message lists the available options:
$ python openen.py -h
usage: openen.py [-h] -s STR [-U STR/INT] [-x] [-W INT] [-u INT] [-S] [-n INT]
                 [-t INT] [-e] [-i INT] [-l INT] [-r] [-o STR] [-p INT]

RNAplfold wrapper using multiprocesses

optional arguments:
  -h, --help            show this help message and exit
  -s STR, --sequence STR
                        Sequences in fasta or csv format
  -U STR/INT, --utr STR/INT
                        Use an integer if 5UTR presence, e.g., -U 1. Use your
                        own 5UTR sequence if your plasmid backbone is not of
                        pET vector. Default = GGGGAATTGTGAGCGGATAACAATTCCCCTCT
                        AGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACAT
  -x, --execute         Run RNAplfold multiprocessing
  -W INT, --winsize INT
                        Average the pair probabilities over windows of given
                        size. An RNAplfold option. Default = 210
  -u INT, --ulength INT
                        Compute the mean probability that subsegments of
                        length 1 to a given length are unpaired. An RNAplfold
                        option. Default = 210
  -S, --stack           Stack _openen dataframes to single-column dataframes,
                        concatenate them as a single pandas dataframe and
                        output it as a .pkl pickle file. Requires i and j
                        options
  -n INT, --utrlength INT
                        The length of 5UTR. Related to option -S and -e.
                        Default = 71
  -t INT, --distance INT
                        Downstream distance to start codon to include when
                        stacking. Related to option -S. Default = 100
  -e, --parse           Parsing _openen dataframes to get opening energy of
                        unpaired subsegments. Requires i and l options
  -i INT, --ipos INT    Position i centered at start codon of an input
                        sequence. Related to option -e. Default = 18
  -l INT, --length INT  Subsegment l as in _openen file. Related to option -e.
                        Default = 48
  -r, --remove          Remove _openen and .ps files
  -o STR, --output STR  Output file name for .pkl. Related to -S. Default =
                        openen
  -p INT, --processes INT
                        Number of processes to spawn. Default = half of the
                        number of CPU
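
The idea behind the wrapper can be illustrated with a minimal sketch. This is not the code of openen.py: the function names (read_fasta, plfold_command, run_plfold, run_all, opening_energy) are hypothetical, and the assumed layout of the _openen file (comment lines starting with "#", then tab-separated rows holding the position i followed by one column per subsegment length l, with NA where no value exists) should be checked against the output of your ViennaRNA version. The sketch splits a multi-FASTA text into records, folds each record in a separate worker process with RNAplfold -W/-u/-O, and looks up a single opening energy in the resulting table.

```python
import shutil
import subprocess
from multiprocessing import Pool


def read_fasta(text):
    """Parse FASTA text into a list of (identifier, sequence) tuples."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            words = line[1:].split()
            name, chunks = (words[0] if words else "unnamed"), []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def plfold_command(winsize=210, ulength=210):
    """Build an RNAplfold call; -O switches the output to opening energies."""
    return ["RNAplfold", "-W", str(winsize), "-u", str(ulength), "-O"]


def run_plfold(record):
    """Fold one record; RNAplfold reads the sequence from standard input."""
    name, seq = record
    subprocess.run(plfold_command(), input=f">{name}\n{seq}\n",
                   text=True, check=True)
    return name


def run_all(records, processes=2):
    """Distribute the records over a pool of worker processes."""
    if shutil.which("RNAplfold") is None:
        raise RuntimeError("RNAplfold not found on PATH")
    with Pool(processes) as pool:
        return pool.map(run_plfold, records)


def opening_energy(openen_text, row, length):
    """Return the value at position `row`, column l=`length` (None if NA)."""
    for line in openen_text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and the assumed '#' header lines
        fields = line.split("\t")
        if int(fields[0]) == row:
            if length >= len(fields):
                return None
            value = fields[length].strip()
            return None if value in ("", "NA") else float(value)
    return None
```

Calling run_all(read_fasta(open("sequences.fa").read())) from a script guarded by if __name__ == "__main__": would write one _openen file per sequence into the working directory, provided RNAplfold is installed; sequences.fa is a placeholder file name. How the -i and -n options map to a row of the table is not stated in the help text, so opening_energy takes the absolute row directly.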

© Bikash Kumar Bhandari, Chun Shem Lim, Paul P Gardner (2019-)

tisigner_paper_2019's People

Contributors

bkb3, lcscs12345

tisigner_paper_2019's Issues

missing rf_model

Hello,
I looked at the code in optimization.py, and rf_model seems to be missing:
self.rf_model = self.validate_model(rf_model)
It looks like you used pre-trained features.
Could you share the code used for training?
Thanks
