lrsoenksen / cl_rna_synthbio

Code to reproduce Angenent-Mari, N. et al 2020. Deep Learning for RNA Synthetic Biology

License: MIT License

CL_RNA_SynthBio

Code to reproduce Angenent-Mari, N. et al 2019. Deep Learning for RNA Synthetic Biology

DATA STRUCTURE (INPUT / OUTPUT)

Data is loaded from a Toehold Sensor Database (data/2019-03-30_toehold_dataset_proc_with_params.csv), a comma-delimited table with the following columns of DNA-encoded sub-sequences: organism, sequence_class, sequence_id, pre_seq, promoter, trigger, loop1, switch, loop2, stem1, atg, stem2, linker, post_linker, output
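
As a rough illustration of reading this table, the sketch below uses Python's standard csv module. The helper names (read_toehold_rows, load_toehold_csv) are hypothetical and not part of this repository, and the exact header spelling in the file is assumed to match the column list above, so the code reports any columns it cannot find instead of guessing. The in-memory sample row is a synthetic placeholder, not real data.

```python
import csv
import io

# Path as given in this README; adjust it to wherever the file was downloaded.
DATA_PATH = "data/2019-03-30_toehold_dataset_proc_with_params.csv"

# Column names as listed in the README. Their exact spelling in the CSV
# header is an assumption, so missing names are reported, not guessed.
EXPECTED_COLUMNS = [
    "organism", "sequence_class", "sequence_id", "pre_seq", "promoter",
    "trigger", "loop1", "switch", "loop2", "stem1", "atg", "stem2",
    "linker", "post_linker", "output",
]


def read_toehold_rows(handle):
    """Read all rows from an open CSV handle.

    Returns (rows, missing), where rows is a list of dicts keyed by the
    header names and missing lists expected columns absent from the header.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    return list(reader), missing


def load_toehold_csv(path=DATA_PATH):
    """Open the CSV at `path` and read it with read_toehold_rows."""
    with open(path, newline="") as handle:
        return read_toehold_rows(handle)


# Synthetic placeholder table (NOT real data), only to show the call shape.
demo = io.StringIO("organism,trigger,switch\nplaceholder,ACGT,TGCA\n")
rows, missing = read_toehold_rows(demo)
print(len(rows), "row(s) read; expected columns not found:", missing)
```

With the real file in place, load_toehold_csv() returns the same kind of (rows, missing) pair, and an empty missing list would confirm that the header matches the column names above.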

Input tensor is defined as (DS=Data_Style):

DS) Toehold Nucleotide Sequence
NOTE: The base toehold sequence string spans indices [0,144].

  • GGG     - Trigger - Loop1   - Switch  - Loop2   - Stem1   - AUG     - Stem2     - Linker    - Post-linker
    
  • [-3,-1]   [0,29]    [30,49]   [50,79]   [80,90]   [91,96]   [97,99]   [100,108]   [109,134]   [135,144]
    
  • For training, the input sequence vector starts with GGG followed by everything from "Loop1" through "Post-linker", i.e. seq_SwitchOFF_GFP = ggg + seq[30:145].
    
  • Also, the pre_seq and promoter sub-sequences are NEVER used because they are not transcribed into mRNA (they are in the plasmid but never in the functional toehold module), so they do not contribute to secondary structure at all. For this example in particular we use DS_1.
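
The index layout and the ggg + seq[30:145] construction described above can be sketched in plain Python as follows. This is a minimal sketch, not the repository's code: the function names are hypothetical, the region bounds are the inclusive ranges from the list above converted to half-open slices, and the one-hot step over the DNA alphabet ACGT is an assumed encoding that this README does not specify. The demo sequence is a synthetic placeholder.

```python
# Inclusive index ranges from the README, written as half-open Python slices
# (for example, Trigger [0,29] becomes the slice 0:30).
REGIONS = {
    "trigger": (0, 30),
    "loop1": (30, 50),
    "switch": (50, 80),
    "loop2": (80, 91),
    "stem1": (91, 97),
    "atg": (97, 100),
    "stem2": (100, 109),
    "linker": (109, 135),
    "post_linker": (135, 145),
}
BASE_LENGTH = 145  # the base toehold string covers indices 0..144


def split_regions(seq):
    """Return a dict mapping each region name to its sub-sequence."""
    if len(seq) != BASE_LENGTH:
        raise ValueError(f"expected {BASE_LENGTH} nt, got {len(seq)}")
    return {name: seq[start:end] for name, (start, end) in REGIONS.items()}


def build_switch_off_input(seq):
    """GGG followed by everything from Loop1 through the post-linker."""
    if len(seq) != BASE_LENGTH:
        raise ValueError(f"expected {BASE_LENGTH} nt, got {len(seq)}")
    return "GGG" + seq[30:145]


def one_hot(seq, alphabet="ACGT"):
    """One-hot encode a sequence as a list of rows, one row per base.

    ASSUMPTION: this encoding is illustrative; the README does not say how
    the input tensor is encoded. Bases outside `alphabet` raise KeyError.
    """
    index = {base: i for i, base in enumerate(alphabet)}
    rows = []
    for base in seq.upper():
        row = [0] * len(alphabet)
        row[index[base]] = 1
        rows.append(row)
    return rows


# Synthetic 145-nt placeholder (NOT a real toehold), only to show the shapes.
placeholder = ("ACGT" * 37)[:BASE_LENGTH]
model_input = build_switch_off_input(placeholder)
encoded = one_hot(model_input)
print(len(model_input), len(encoded), len(encoded[0]))
```

Because the trigger occupies the first 30 positions and is dropped, the resulting input is 3 + 115 = 118 nucleotides long, so the assumed one-hot tensor would have 118 rows of 4 entries each.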
    

Output vector is defined as:

OUT) ON, OFF and/or ON-OFF state values derived from experimental testing of the toehold switch RNA sequence

PROBLEM DEFINITION

The goal is to investigate whether a deep learning network can predict toehold switch ON/OFF functionality. If it can, that would suggest the network is learning secondary structure prediction that would be transferable to other RNA-based problems.

DATASET (Direct Download)

Due to occasional high download demand on this Git repository, its LFS bandwidth or egress limit may be exceeded. In that case, download our data directly from this link: https://drive.google.com/file/d/1t_OXvtW-hEGRt3-mgNlyBKHBqro2Z572/view?usp=sharing
