Comments (5)
Sounds like it could be related to the bug that was fixed in v1.1.2 (the latest version). Could you give the latest version a try on the same data and report back if you still get the error?
Unfortunately, I'm getting the same error with v1.1.2 as well:
stringtie --version
1.1.2
On Mon, Dec 14, 2015 at 10:03 PM, Geo Pertea [email protected] wrote:
Sounds like it could be related to the bug that was fixed in v1.1.2 (last version). Could you give the last version a try on the same data and report back if you still get the error?
Any chance of sharing with us the data that triggers this error? If you run with the -v option (and no multi-threading), it should be possible to identify the bundle and the genomic location where the error happens. That in turn should make it possible to extract only the reads and the reference transcripts from the location that triggers the problem. If you agree to share those data for debugging, please contact me at [email protected] to discuss the data transfer details (unless you can already extract the data and upload it somewhere for me to retrieve).
Thanks!
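For anyone following along, the procedure described above could look roughly like the sketch below. It is only an illustration under assumptions: the file names, the REGION value, and the `run` and `subset_gtf` helpers are hypothetical, and passing the annotation with -G is assumed because the comment mentions reference transcripts. The region itself has to be read by hand from the last bundle reported in the verbose output before the crash.

```shell
#!/bin/sh
# Sketch only: every file name and the REGION value below are placeholders.
BAM=${BAM:-alignments.sorted.bam}      # coordinate-sorted BAM that triggers the crash
GTF=${GTF:-annotation.gtf}             # reference transcripts (assumed to be passed with -G)
REGION=${REGION:-chr1:100000-200000}   # replace with the bundle location seen in the log

# Hypothetical helper: print each command unless DRY_RUN=0, so the steps can be reviewed first.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: keep GTF records on the region's chromosome that overlap its span.
# Reads a GTF on stdin, writes the matching lines to stdout.
subset_gtf() {
    region=$1
    chr=${region%%:*}
    span=${region#*:}
    start=${span%-*}
    end=${span#*-}
    awk -F '\t' -v c="$chr" -v s="$start" -v e="$end" \
        '$1 == c && $5 >= s && $4 <= e'
}

# 1. Verbose, single-threaded run. For a real run, capture the messages by
#    appending "> stringtie.log 2>&1" to the command.
run stringtie "$BAM" -v -p 1 -G "$GTF" -o crash_test.gtf

# 2. Extract only the alignments that overlap the failing bundle.
run samtools index "$BAM"
run samtools view -b -o region.bam "$BAM" "$REGION"

# 3. Extract the matching reference transcripts (uncomment once GTF exists):
# subset_gtf "$REGION" < "$GTF" > region.gtf
```

With the default DRY_RUN=1 the script only prints the commands; setting DRY_RUN=0 runs them. The resulting region.bam and region.gtf should be small enough to share.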
Thank you for providing the data, we were able to reproduce the crash in that particular region which has very high coverage and a huge number of splice sites, causing StringTie to crash after taking more than 32GB RAM -- very unusual.
Upon investigation of that particularly dense bundle we noticed that the STAR alignments in that region were very messy, with many (probably false) splicing events seemingly caused by STAR forcefully aligning reads with mismatches + soft clipping.
The same reads aligned with HISAT2 (with the --dta option, of course) produced much better alignments in that region (without losing coverage depth!), and StringTie was able to assemble that region from the HISAT2 output without problems (using only a few hundred megabytes of RAM in the process!).
Not sure if it was something about the particular version of STAR used for generating those alignments, or the options used, but we had a hard time looking for ways to filter the "bad" alignments there, due to somewhat incomplete SAM records (missing MD tags; number of mismatches reported per pair not per read etc.).
Anyway, the latest version of StringTie, which we just released (v1.2.0), is able to finish processing the BAM file you provided (with the STAR alignments for the whole genome) in about 12 minutes, using about 13 GB of RAM (with 8 CPUs). Again, using the alignments produced by HISAT2 would drastically reduce the RAM usage and running time.
So please give the new version a try on your existing STAR alignments, but we strongly recommend using HISAT2 with the --dta option for mapping the reads in the future. If you really have to use STAR, please at least use more stringent alignment options for it, e.g. lower the maximum number of mismatches per read to no more than 3 and perhaps also limit soft clipping somehow, to reduce the false-positive spliced alignment rate.
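As a rough illustration of the recommendation above, the sketch below prints a HISAT2 --dta, samtools sort, StringTie command sequence, plus one possible set of stricter STAR options. The index, FASTQ, and output names and the two helper functions are hypothetical. The STAR choices are assumptions, not the options used in this thread: --outFilterMismatchNmax 3 caps mismatches (for paired-end data STAR appears to count this per pair, so check the STAR manual), --alignEndsType EndToEnd turns soft clipping off entirely, which may be stricter than needed, and adding MD to --outSAMattributes emits the MD tag that was missing from the original records.

```shell
#!/bin/sh
# Sketch only: index, FASTQ and output names are placeholders.
IDX=${IDX:-genome_index}
R1=${R1:-reads_1.fastq.gz}
R2=${R2:-reads_2.fastq.gz}
THREADS=${THREADS:-8}

# Hypothetical helper: print the HISAT2 -> samtools sort -> StringTie commands for review.
print_hisat2_pipeline() {
    echo "hisat2 --dta -p $THREADS -x $IDX -1 $R1 -2 $R2 -S hisat2.sam"
    echo "samtools sort -@ $THREADS -o hisat2.sorted.bam hisat2.sam"
    echo "stringtie hisat2.sorted.bam -p $THREADS -o hisat2.gtf"
}

# Hypothetical helper: stricter STAR options to append to an existing STAR command line
# (assumed choices, see the note above).
print_star_strict_options() {
    echo "--outFilterMismatchNmax 3 --alignEndsType EndToEnd --outSAMattributes NH HI AS nM MD"
}

print_hisat2_pipeline
print_star_strict_options
```

The commands are printed rather than executed so they can be adapted to local paths before running.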
This should've been fixed in v1.2.0.