This project forked from williamritchie/irfinder

Detecting intron retention from RNA-Seq experiments

License: MIT License

IRFinder

User Manual

1.3.1

  1. IRFinder now exits immediately after an error, instead of trying to complete the remaining processes.
  2. Improved Perl version detection during Phase 3 of reference preparation.

1.3.0

New features:

  1. New BuildRefFromSTARRef mode. This allows users to build an IRFinder reference from an existing STAR reference, which significantly reduces the total preparation time. This mode also tries to automatically identify the original FASTA and GTF files used to generate the existing STAR reference. Call IRFinder -h for more details.
  2. BuildRef and BuildRefProcess modes now support a -j option, which takes an integer that overrides the default value of the --sjdbOverhang argument in STAR.
  3. FASTQ mode now supports a -y option to pass extra arguments to STAR to control alignment behavior.

Improvements:

  1. FASTQ mode now outputs a full BAM file in "Unsorted.bam", instead of a BAM file with a trimmed QS column.
  2. IRFinder no longer automatically generates "unsorted.frag.bam", which saves disk space and avoids redundancy with "Unsorted.bam". Instead, IRFinder now provides a tool at bin/TrimBAM4IGV to generate this kind of trimmed BAM file for visualization in IGV.
  3. Redesigned the standard output during IRFinder reference preparation. Errors are now easier to recognize.
  4. Usage information can now be viewed with the -h option.

Bug fixes:

  1. The mappability calculation during the IRFinder reference preparation stage has been redesigned. The previous algorithm encountered buffer size issues when dealing with genomes with a very large number of chromosomes/scaffolds. This has been fixed. Please note that the new algorithm requires a samtools (>=1.4) executable in $PATH.
  2. Since Perl 5.28.0, sort '_mergesort' is no longer supported. IRFinder now checks the Perl version and uses the appropriate sort function accordingly.
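Both fixes above depend on comparing a tool's version against a threshold (samtools >= 1.4, Perl 5.28.0). The following is a minimal Python sketch of such a check, not IRFinder's own implementation. The function names `parse_version` and `meets_minimum` are hypothetical, and the sample strings only imitate the usual shape of `samtools --version` and `perl -v` output.

```python
import re


def parse_version(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(text, minimum):
    """Return True if the version found in text is at least `minimum`."""
    return parse_version(text) >= minimum


# Illustrative strings only; real tool output may differ in detail.
samtools_line = "samtools 1.10"
perl_line = "This is perl 5, version 28, subversion 0 (v5.28.0)"

# Tuples compare numerically, so 1.10 is correctly treated as newer than 1.4.
print(meets_minimum(samtools_line, (1, 4)))
# From 5.28.0 onward, a script would have to avoid sort '_mergesort'.
print(meets_minimum(perl_line, (5, 28, 0)))
```

Comparing integer tuples rather than raw strings avoids the pitfall where "1.10" sorts before "1.4" lexicographically.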

1.2.6

  1. IRFinder now keeps introns with the same effective regions as separate entries in the reference.
  2. IRFinder now automatically checks if the reference preparation stage generates empty reference files, which indicates process failure.
  3. The R object generated by the Differential IR Analysis script now includes an additional slot named "MaxSplice", which represents the maximum splice reads at either end of introns. Each value is the larger of Columns 17 and 18 in the IR quantification output.
  4. During differential IR analysis, values in "MaxSplice" are now used as the denominators in the GLM, instead of using the values of Column 19 in the IR quantification output. This makes the IR ratio in the differential IR analysis more consistent with the values of Column 20 in the IR quantification output.
  5. User manual has been updated.
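The "MaxSplice" value described in items 3 and 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used by the Differential IR Analysis script: it assumes a tab-separated row whose Columns 17 and 18 (1-based) hold integer counts, and the function name `max_splice` and the example row are made up.

```python
def max_splice(row):
    """Return the larger of Column 17 and Column 18 (1-based) of one row."""
    fields = row.rstrip("\n").split("\t")
    left = int(fields[16])   # Column 17 (1-based)
    right = int(fields[17])  # Column 18 (1-based)
    return max(left, right)


# Made-up row: 20 placeholder columns, with Columns 17 and 18 set to 12 and 30.
example_fields = ["0"] * 20
example_fields[16] = "12"
example_fields[17] = "30"
example_row = "\t".join(example_fields)

print(max_splice(example_row))  # prints 30
```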

1.2.5

  1. Headers are now correctly added to output files IRFinder-IR-dir.txt and IRFinder-IR-nondir.txt.

1.2.4

  1. In the GLM-based method for differential IR comparison, the original matrix for DESeq2 is now made up of IR depth and correct splicing depth. In previous versions, the latter was the sum of splicing depth and IR depth. This change is expected to give a smoother dispersion estimation across all introns.
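The difference between the old and new matrix entries can be illustrated with a small Python sketch. The function `count_pair` and its `legacy` flag are hypothetical names for illustration only. The real matrix is built in R, and its exact layout is not described here.

```python
def count_pair(ir_depth, splice_depth, legacy=False):
    """Return the (IR, splicing) count pair for one intron in one sample."""
    if legacy:
        # Before 1.2.4: the second entry was splicing depth plus IR depth.
        return (ir_depth, splice_depth + ir_depth)
    # From 1.2.4: the second entry is the splicing depth alone.
    return (ir_depth, splice_depth)


print(count_pair(5, 40))               # prints (5, 40)
print(count_pair(5, 40, legacy=True))  # prints (5, 45)
```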

1.2.3

  1. IRFinder now supports the GTF attribute tags gene_type and transcript_type, in addition to the originally required Ensembl tags gene_biotype and transcript_biotype. One of these two pairs is required to correctly build an IRFinder reference.

1.2.2

  1. In GLM-based differential IR comparison, fixed an error caused by duplicated row names when creating a DESeq2 object with a version of DESeq2 later than 1.10.

1.2.1

  1. Improved the performance of DESeq2-based GLM analysis for differential IR. This new approach should improve the estimation of dispersion. Normal splicing from the IRFinder result is now used as a variable in the GLM, instead of as an offset. This approach is adapted from Michael Love's method for detecting allele-specific expression. See the Wiki page for details.
  2. Updated some out-of-date usage information.

1.2.0

  1. IRFinder is now compatible with GLM-based analysis. This is achieved by passing IRFinder results to DESeq2 using the function in bin/DESeq2Constructor.R. See the Wiki page for details.
  2. Fixed a conflict with the latest version of "bedtools complement" that caused IRFinder reference preparation to fail.
  3. Improved memory usage when passing lines to bedtools genomecov. This should also benefit reference preparation for genomes with many chromosomes or contigs. Thanks to Andreas @andpet0101 for the smart solution.
  4. The GTF file to download during automatic reference preparation is now specified explicitly. Ensembl currently holds several versions of GTF files for the same genome release, which confused the IRFinder BuildRefDownload function in the previous version.
  5. Added a -v option to print the version number.
