the-hyp0cr1t3 / rna-secondary-structure-prediction

Given a nucleic acid sequence of RNA, find a maximum matching of {A,U} or {C,G} base pairs without knots or sharp turns.

Home Page: https://lucent-lebkuchen-97223f.netlify.app/report.html

RNA secondary structure prediction

Problem: Given a nucleic acid sequence of RNA, find a maximum matching of $\{A,U\}$ or $\{C,G\}$ base pairs without knots or sharp turns.

This is a modern C++ implementation that employs (iterative) dynamic programming on intervals to find the cardinality of the maximum matching of base pairs as well as the base pairs in the matching.
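To make the interval recurrence concrete, here is a minimal sketch of a Nussinov-style dynamic program with traceback. It is an illustration under stated assumptions, not the code behind `RNA::Predictor::find_max_matching()`: the names `can_pair`, `max_matching`, and `kMinGap` are hypothetical, indices are 0-based, and "no sharp turns" is taken to mean the common textbook rule that paired positions $i < j$ satisfy $j - i > 4$.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: true if a and b form an {A,U} or {C,G} pair.
bool can_pair(char a, char b) {
    return (a == 'A' && b == 'U') || (a == 'U' && b == 'A') ||
           (a == 'C' && b == 'G') || (a == 'G' && b == 'C');
}

// Assumed "no sharp turns" rule: a pair (i, j) with i < j needs j - i > kMinGap.
constexpr int kMinGap = 4;

// Returns the pairs of one maximum matching (0-based indices).
// The size of the result is the cardinality of the matching.
std::vector<std::pair<int, int>> max_matching(const std::string& s) {
    const int n = static_cast<int>(s.size());
    // opt[i][j] = maximum number of pairs within s[i..j]; 0 for short intervals.
    std::vector<std::vector<int>> opt(n, std::vector<int>(n, 0));

    // Fill intervals in order of increasing length (iterative, no recursion).
    for (int len = kMinGap + 1; len < n; ++len) {
        for (int i = 0; i + len < n; ++i) {
            const int j = i + len;
            int best = opt[i][j - 1];  // case 1: position j stays unpaired
            for (int t = i; t < j - kMinGap; ++t) {  // case 2: j pairs with t
                if (!can_pair(s[t], s[j])) continue;
                const int left = (t > i) ? opt[i][t - 1] : 0;
                best = std::max(best, left + 1 + opt[t + 1][j - 1]);
            }
            opt[i][j] = best;
        }
    }

    // Traceback with an explicit stack of intervals still to be resolved.
    std::vector<std::pair<int, int>> pairs;
    std::vector<std::pair<int, int>> todo;
    if (n > 0) todo.emplace_back(0, n - 1);
    while (!todo.empty()) {
        const int i = todo.back().first;
        const int j = todo.back().second;
        todo.pop_back();
        if (j - i <= kMinGap) continue;  // too short to contain any pair
        if (opt[i][j] == opt[i][j - 1]) {
            todo.emplace_back(i, j - 1);
            continue;
        }
        for (int t = i; t < j - kMinGap; ++t) {
            if (!can_pair(s[t], s[j])) continue;
            const int left = (t > i) ? opt[i][t - 1] : 0;
            if (left + 1 + opt[t + 1][j - 1] == opt[i][j]) {
                pairs.emplace_back(t, j);
                if (t > i) todo.emplace_back(i, t - 1);
                todo.emplace_back(t + 1, j - 1);
                break;
            }
        }
    }
    // Sanity check: the recovered pairs match the optimal count.
    assert(n == 0 || static_cast<int>(pairs.size()) == opt[0][n - 1]);
    return pairs;
}
```

The sketch considers on the order of $n^2$ intervals and up to $n$ split points per interval, so it runs in cubic time and uses quadratic memory. Splitting at the partner of $j$ keeps the left part and the enclosed part independent, which is what rules out knots (crossing pairs).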

View the report at https://lucent-lebkuchen-97223f.netlify.app/report.html.

Install Dependencies

To build the project you must have CMake installed.

To install the Python dependencies:

pip install -r requirements.txt

Build Project

Configure:

cmake -DCMAKE_BUILD_TYPE=Release -S . -B build

Build:

cmake --build build --config Release

Build Documentation

To build the documentation you must have Doxygen installed.

cmake --build build --target docs

Output will be in docs/html.

Usage

./bin/app [inputfile]

Example:

./bin/app sample.txt

Note: inputfile may also be relative to ./data.

Input Format

Input must contain the description of a nucleic acid sequence of RNA in the following format.

The first and only line must contain a string $s$ with $s_i \in \{A,C,G,U\}$: the nucleic acid sequence.
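A check of this format could look like the sketch below. The function name `is_valid_sequence` is hypothetical and not part of the project. Rejecting lowercase letters and accepting an empty string are assumptions made only for the illustration, since the format description does not cover either case.

```cpp
#include <algorithm>
#include <string>

// Hypothetical validator: true if every character is one of A, C, G, U.
// Assumption: only uppercase letters count; an empty string passes vacuously.
bool is_valid_sequence(const std::string& s) {
    return std::all_of(s.begin(), s.end(), [](char c) {
        return c == 'A' || c == 'C' || c == 'G' || c == 'U';
    });
}
```

For instance, `is_valid_sequence("ACGU")` holds, while a DNA-style string such as `"ACGT"` is rejected because of the `T`.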

Output Format

The output will contain the description of the maximum matching.

The first line will contain a single integer $m$, the cardinality of the maximum matching in the sequence. Each of the next $m$ lines will contain two integers: the indices of the two bases that form one pair in the matching.
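As an illustration of this layout, the sketch below renders a list of pairs as text. `format_matching` is a hypothetical name. The format description does not say whether indices are 0-based or 1-based or how the two integers are separated, so the offset is left to the caller and a single space is assumed.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical formatter: first line is m, then one "i j" line per pair.
// Assumptions: pairs are stored 0-based, index_base is added on output,
// and the two integers on a line are separated by one space.
std::string format_matching(const std::vector<std::pair<int, int>>& pairs,
                            int index_base = 1) {
    std::ostringstream out;
    out << pairs.size() << '\n';
    for (const auto& p : pairs) {
        out << p.first + index_base << ' ' << p.second + index_base << '\n';
    }
    return out.str();
}
```

With a single pair stored as `(0, 5)` and `index_base = 1`, the result is the two lines `1` and `1 6`.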

Visualization

A Python script runs the app against an input file and plots the resulting matching as a graph with matplotlib.

cd scripts
./run.py [inputfile]

Note: inputfile may also be relative to ./data.

Testing

This project uses GoogleTest for its unit tests and GoogleBenchmark for benchmarking.

Testing RNA::Predictor::find_max_matching() against various cases:

cd build
ctest -R PredictorTests -j6

Benchmarking against varying input sizes:

./bin/bench --benchmark_counters_tabular=true
Benchmark output
2022-04-18T03:23:59+05:30
Running ./bin/bench
Run on (12 X 3000 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x6)
  L1 Instruction 32 KiB (x6)
  L2 Unified 512 KiB (x6)
  L3 Unified 4096 KiB (x2)
Load Average: 0.91, 0.73, 0.65
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
-------------------------------------------------------------------
Benchmark              Time             CPU   Iterations          n
-------------------------------------------------------------------
BM_Random/4         1369 ns         1323 ns       526810          4
BM_Random/54      150777 ns       150695 ns         4718         54
BM_Random/104    1046964 ns      1046092 ns          672        104
BM_Random/154    3359127 ns      3356692 ns          203        154
BM_Random/204    7698778 ns      7692782 ns           90        204
BM_Random/254   14885152 ns     14874592 ns           47        254
BM_Random/304   25571931 ns     25553385 ns           27        304
BM_Random/354   40506849 ns     40484808 ns           17        354
BM_Random/404   59950695 ns     59919766 ns           11        404
BM_Random/454   85870123 ns     85813275 ns            8        454
--------------------------------------------------------
Benchmark              Time             CPU   Iterations
--------------------------------------------------------
BM_Random_BigO       0.91 N^3        0.91 N^3
BM_Random_RMS          1 %             1 %
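As a rough consistency check of the fitted $0.91\,N^3$ curve against the table above, the largest input and the ratio between two rows can be compared by hand. All figures are taken from the benchmark output, and the arithmetic is rounded.

```latex
\[
0.91 \cdot 454^3\ \text{ns} = 0.91 \cdot 93\,576\,664\ \text{ns}
\approx 8.52 \times 10^{7}\ \text{ns},
\qquad
\text{measured: } 85\,870\,123\ \text{ns} \approx 8.59 \times 10^{7}\ \text{ns}
\]
\[
\frac{T(454)}{T(104)} = \frac{85\,870\,123}{1\,046\,964} \approx 82.0,
\qquad
\left(\frac{454}{104}\right)^{3} \approx 83.2
\]
```

Both comparisons agree to within about 1 to 2 percent, which is in line with the reported RMS error and with the cubic growth expected from dynamic programming over all intervals.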
