juleskers / fastq_extractor_proof_of_principle

This project forked from olipelz/fastq_extractor_proof_of_principle

A place to build proofs of principle and benchmark our own FASTQ parsers/extractors, optimized for speed.

Makefile 0.58% C 13.93% Rust 85.48%

fastq_extractor_proof_of_principle

This repo tries to optimize some Perl scripts that are part of the CRISPRAnalyzer R Shiny package and are too slow for production use in web applications. The benchmark of the original code is shown below (please note: the Perl script is not included here).

All benchmarks were run on a ThinkPad T420s (Core i7 vPro, 16 GB RAM, Evo 850 SSD).

PERL

time perl CRISPR-extract.pl "ACC(.{20,21})G" ./data/TRAIL-Replicate1.fastq no
real	0m45.006s
user	0m44.191s
sys	0m0.682s

RUST

$ time fastq_parser ./data/TRAIL-Replicate1.fastq 

output

real	0m3.866s
user	0m3.414s
sys	0m0.438s
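
For illustration, the extraction step benchmarked above can be sketched in plain Rust. This is a minimal sketch under assumptions, not the code in this repo: it hard-codes the pattern `ACC(.{20,21})G` from the Perl command line as a hand-written matcher (standing in for a regex engine), assumes plain four-line ASCII FASTQ records, and the function names `extract_target` and `extract_all` are hypothetical.

```rust
/// Hand-rolled stand-in for the regex `ACC(.{20,21})G` applied to one read.
/// Returns the captured 20 or 21 bases, or `None` if the read does not match.
/// Assumes ASCII input, as is usual for FASTQ sequence lines.
fn extract_target(seq: &str) -> Option<&str> {
    let bytes = seq.as_bytes();
    let mut start = 0;
    // Try each occurrence of the "ACC" anchor from left to right.
    while let Some(pos) = seq[start..].find("ACC") {
        let cap_start = start + pos + 3;
        // Greedy like the regex quantifier: try 21 bases first, then 20.
        for len in [21usize, 20] {
            let g_idx = cap_start + len;
            if g_idx < bytes.len() && bytes[g_idx] == b'G' {
                return Some(&seq[cap_start..g_idx]);
            }
        }
        // No match at this anchor; resume the search one byte further on.
        start += pos + 1;
    }
    None
}

/// Applies `extract_target` to the sequence line (second line) of every
/// four-line FASTQ record and collects the captured targets.
fn extract_all(fastq: &str) -> Vec<&str> {
    fastq
        .lines()
        .skip(1) // the sequence is the second line of each record
        .step_by(4) // then every fourth line after that
        .filter_map(extract_target)
        .collect()
}
```

Calling `extract_all` on the contents of a FASTQ file yields one captured target per matching read; reads without a match are skipped. Whether the real tools report non-matching reads differently is not stated here.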

C

time ./extractor default ./data/TRAIL-Replicate1.fastq  no

output

real	0m4.409s
user	0m3.953s
sys	0m0.445s

TODO: make the regexp parsing multithreaded in Rust for very large input files
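
One possible shape for this TODO, as a rough sketch only: split the reads into chunks and match each chunk on its own scoped thread. It assumes the reads are already in memory as string slices (a very large file would more likely be processed in batches), and the function `count_matches_parallel` and its `is_match` callback are hypothetical names, not part of this repo.

```rust
use std::thread;

/// Counts the reads accepted by `is_match`, spreading the work over
/// `n_threads` scoped worker threads (standard library only).
fn count_matches_parallel<F>(reads: &[&str], n_threads: usize, is_match: F) -> usize
where
    F: Fn(&str) -> bool + Sync,
{
    let n_threads = n_threads.max(1);
    // Ceiling division so every read lands in exactly one chunk;
    // clamp to 1 because `chunks` panics on a chunk size of 0.
    let chunk_len = ((reads.len() + n_threads - 1) / n_threads).max(1);
    // Share the matcher by reference so each worker can call it.
    let is_match = &is_match;

    thread::scope(|s| {
        // Spawn one worker per chunk; each returns its local match count.
        let handles: Vec<_> = reads
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().filter(|r| is_match(**r)).count()))
            .collect();

        // Join the workers and add up the per-chunk counts.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<usize>()
    })
}
```

For example, `count_matches_parallel(&reads, 4, |r| r.contains("ACC"))` counts the reads containing `ACC` using up to four threads. Scoped threads are used here so the workers can borrow the reads without copying them.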

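Surprisingly, the Rust code beat the low-level C code (3.87 s vs. 4.41 s real time), and both are more than ten times faster than the Perl original (45.0 s). Pretty amazing!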

sam_mapper in RUST

PERL (not included in this repo)

time perl CRISPR-mapping.pl ./data/pilotscreen.fasta ./data/TRAIL-Replicate1_extracted.sam "M{20,21}$" "_"

output

real	1m17.280s
user	1m16.820s
sys	0m0.192s

RUST

time sam_mapper -f ./data/pilotscreen.fasta -s ./data/TRAIL-Replicate1_extracted.sam

output

real	0m7.590s
user	0m7.461s
sys	0m0.111s

Contributors

juleskers, olipelz
