Hi, I am running stringtie with gencode v22 annotation on one of my

Stringtie error about stringtie HOT 6 CLOSED

gpertea commented on September 28, 2024

Stringtie error

from stringtie.

Comments (6)

gpertea commented on September 28, 2024

It is quite unusual to see such high memory usage in StringTie. I am wondering if it's just some overexpressed gene on chrM or whether genes on other chromosomes are involved here.
To pinpoint where it gets stuck or crashes, you should run with a single thread, use the -v option, and capture the stderr messages. That way you can tell where it gets stuck (the last genomic region being processed). Hopefully you can then extract only the read alignments from that region and send them to us for further analysis.
I just uploaded a pre-release version v1.0.4 with the latest fixes: http://ccb.jhu.edu/software/stringtie/dl/stringtie-1.0.4.Linux_x86_64.tar.gz
Could you give this version a try with only one CPU (i.e. no -p option) and using -v as suggested above, and let me know which bundle it gets stuck or crashes on?
(The source is at http://ccb.jhu.edu/software/stringtie/dl/stringtie-1.0.4.tar.gz if you need to build from source.)
There is also an option to ignore specific chromosome(s) in this version, so you can add "-x chrM" to skip processing of alignments on chrM (which are generally troublesome due to oversampling but rarely useful for gene expression analysis, unless one really cares about mitochondrial genes). Be careful with the spelling of the chromosome name for this one: it matters whether it's "chrM" or "ChrM".
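The debugging run described above could be scripted roughly as follows. This is only a sketch: the helper names and file paths are hypothetical, and it assumes a `stringtie` binary on the PATH that accepts the -v, -o, -G and -x options mentioned in this thread (with -x taking a comma-separated list, as in the current StringTie documentation).

```python
import subprocess
from typing import List, Optional, Sequence


def build_stringtie_debug_cmd(
    bam_path: str,
    out_gtf: str,
    guide_gtf: Optional[str] = None,
    skip_chroms: Sequence[str] = (),
) -> List[str]:
    """Build a single-threaded, verbose StringTie command line.

    No -p option is added, so StringTie runs on one CPU as suggested above.
    """
    cmd = ["stringtie", bam_path, "-v", "-o", out_gtf]
    if guide_gtf:
        # Reference annotation (e.g. a GENCODE GTF) used as guides.
        cmd += ["-G", guide_gtf]
    if skip_chroms:
        # Names must match the BAM header exactly ("chrM" is not "ChrM").
        cmd += ["-x", ",".join(skip_chroms)]
    return cmd


def run_and_capture_stderr(cmd: List[str], log_path: str) -> int:
    """Run the command and write its stderr (the -v messages) to log_path."""
    with open(log_path, "w") as log:
        return subprocess.run(cmd, stderr=log).returncode
```

For example, `run_and_capture_stderr(build_stringtie_debug_cmd("sample.bam", "out.gtf", "gencode.v22.gtf", ["chrM"]), "stringtie.log")` would leave the verbose bundle messages in `stringtie.log`, where the last bundle reported is the one to inspect.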


srithegreat commented on September 28, 2024

Hi,

Thanks for the quick response. I have tried with the newer version on one of the files. It gets stuck at the bundle chr21:8182911-8595722. The last few lines are as shown:
[05/18 19:44:51]^bundle chr21:8161340-8161490(3) done (0 processed potential transcripts).
[05/18 19:44:51]>bundle chr21:8162457-8162596(3) (0js, 0 guides) loaded, begins processing...
[05/18 19:44:51]^bundle chr21:8162457-8162596(3) done (0 processed potential transcripts).
[05/18 19:44:51]>bundle chr21:8167909-8168134(5) (0js, 0 guides) loaded, begins processing...
[05/18 19:44:51]^bundle chr21:8167909-8168134(5) done (0 processed potential transcripts).
[05/18 19:44:51]>bundle chr21:8170441-8170607(3) (0js, 0 guides) loaded, begins processing...
[05/18 19:44:51]^bundle chr21:8170441-8170607(3) done (0 processed potential transcripts).
[05/18 19:46:21]>bundle chr21:8182911-8595722(12097263) (4774js, 39 guides) loaded, begins processing...
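
For a long log, the stuck bundle can be found by pairing each "loaded" line with its "done" line. The sketch below assumes the verbose log format shown in this post (a `>bundle` marker when a bundle starts and a `^bundle` marker when it finishes); the function name is hypothetical and the format may differ in other StringTie versions.

```python
import re
from typing import Iterable, List

# Matches e.g. "[05/18 19:44:51]>bundle chr21:8162457-8162596(3) ..."
# Group 1 is the marker (">" = loaded, "^" = done), group 2 is the region.
BUNDLE_RE = re.compile(r"\]([>^])bundle (\S+:\d+-\d+)\(\d+\)")


def unfinished_bundles(log_lines: Iterable[str]) -> List[str]:
    """Return regions that were loaded but never reported as done."""
    pending: List[str] = []
    for line in log_lines:
        match = BUNDLE_RE.search(line)
        if match is None:
            continue  # wrapped continuation lines and other messages
        marker, region = match.groups()
        if marker == ">":
            pending.append(region)
        elif region in pending:
            pending.remove(region)
    return pending
```

Calling `unfinished_bundles(open("stringtie.log"))` on a captured stderr file (the file name is an assumption) would list the bundle or bundles still being processed when the run stalled.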

Also, I have extracted the reads for this region only. Let me know your
thoughts. I do see that another sample gets stuck at chromosome M, which
makes sense, but I have no idea what is going on with this one.
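
One common way to extract the reads for a single region is `samtools view` on a coordinate-sorted, indexed BAM file. The sketch below only builds that command line; the helper name and file names are hypothetical, and it is not necessarily how the files shared here were produced.

```python
import re
from typing import List

# Region strings look like "chr21:8182911-8595722".
REGION_RE = re.compile(r"^[^\s:]+:\d+-\d+$")


def build_region_extract_cmd(bam_path: str, region: str, out_bam: str) -> List[str]:
    """Build a samtools command that writes only the alignments in `region`.

    Assumes bam_path is coordinate-sorted and indexed (samtools index).
    """
    if not REGION_RE.match(region):
        raise ValueError(f"unexpected region format: {region!r}")
    # -b writes BAM output, -o names the output file.
    return ["samtools", "view", "-b", "-o", out_bam, bam_path, region]
```

The resulting list can be passed to `subprocess.run`, for instance with the region reported in the log above.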

Files accessed here:
https://jh.box.com/s/cfxpfxrrb6q6y1yo2crgl0vjsm8w0kwl

Regards,
Srikanth


Srikanth S. Manda
Research Scholar
Pandey Lab
McKusick-Nathans Institute of Genetic Medicine
Johns Hopkins University School of Medicine
Miller Research Building, Room 560
733 North Broadway
Baltimore, Maryland 21205


gpertea commented on September 28, 2024

Ah, I see, this is another monster cluster courtesy of HISAT. It's so dense that IGV started to complain about memory when I tried to visualize the read alignments. StringTie gets bogged down badly on this bundle, using about 33GB of RAM.
Interestingly, Cufflinks simply gives up: after filtering a lot of alignments, it ends up producing no transcript assemblies at all (I haven't tried to use it with -g because Cufflinks cheats badly with that option).
We should probably implement a new "aggressive alignment filtering" option in StringTie, to be enabled at the user's request for exuberant aligners like HISAT. HISAT is an excellent aligner with great speed, but at this point I have to recommend TopHat instead, because we sometimes get too many spurious alignments from HISAT, creating clusters like these which StringTie cannot handle properly.
So for now I am sorry to say we cannot provide a fix for this situation (besides recommending TopHat), but thank you for sharing this example data set. We'll use it to devise a better alignment filtering strategy in a future version.


srithegreat commented on September 28, 2024

Thanks for the feedback. I will see if TopHat solves my issue.

Do you think it's a good idea to mix alignments from HISAT and TopHat?

mpertea commented on September 28, 2024

This should be fixed in StringTie version 1.1.0 that was just released.


Liuy12 commented on September 28, 2024

Hi,

I encountered similar errors when running StringTie version 1.3.3. StringTie was basically stuck at a certain bundle for days. I looked at this region in IGV, and it seems to contain a very large number of duplicate and multi-mapped reads (over 15000). I guess this is what causes StringTie to get stuck at this position. I tried the -M option with a relatively low value (0.1), but that didn't help. Do you have any suggestions regarding this issue? Do I need to pre-filter the BAM files to remove multi-mapped reads? Any suggestion is appreciated. Thanks

Yuanhang(Leo) Liu
Informatics Specialist
Division of Biomedical Statistics and Informatics

Mayo Clinic
200 First Street SW
Rochester, MN 55905
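
For reference, the kind of pre-filtering asked about above could look like the following sketch, which keeps only alignments whose optional SAM `NH` tag (number of reported alignments for the read) equals 1. It assumes the aligner writes `NH` tags, the function names are hypothetical, and nothing in this thread confirms that such filtering resolves the stuck bundle.

```python
from typing import Iterable, Iterator


def is_uniquely_mapped(sam_line: str) -> bool:
    """True if the record's NH tag is 1, or if it has no NH tag at all."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags start at the 12th column of a SAM record.
    for tag in fields[11:]:
        if tag.startswith("NH:i:"):
            return int(tag[len("NH:i:"):]) == 1
    # Assumption: records without an NH tag are kept rather than dropped.
    return True


def filter_unique(sam_lines: Iterable[str]) -> Iterator[str]:
    """Yield header lines unchanged and only uniquely mapped records."""
    for line in sam_lines:
        if line.startswith("@") or is_uniquely_mapped(line):
            yield line
```

In practice this would sit between `samtools view -h` (BAM to SAM text) and a second `samtools view -b` that converts the filtered text back to BAM.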
