Hi, the new suite of RNA-Seq tools looks great and I'm excited to start using them.

Multi-threading not working? about stringtie HOT 7 CLOSED

gpertea commented on September 28, 2024

Multi-threading not working?

from stringtie.

Comments (7)

gpertea commented on September 28, 2024

Indeed it's strange; I have noticed that too sometimes on fast machines. I am assuming that the CPU usage may be very low because most of the time the threads are just waiting for data (alignment bundles): the I/O is just not keeping up with the fast processing of those small bundles. This is the case when there are a lot of very small bundles to process (which unfortunately happens often). On the other hand, it's also often the case that a high-coverage, large-span bundle keeps just 1 thread busy for a long time after all the others have finished; in our experience the many alignments on chrM seem to do that quite often. I suspect you do have such a high-coverage (and perhaps large-span) bundle in your relatively small data set.
I am actually curious about the kind of alignments/coverage you have there which would make StringTie so slow for just 4 million reads. If you could share that .bam file, I would be interested to take a look (let me know if you want an FTP location to upload that file; in that case please write to me directly at gpertea at jhu.edu).
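To make the "one big bundle keeps one thread busy" effect concrete, here is a toy model in Python. It is only a sketch of the scheduling argument, not StringTie's actual code: the function name `makespan` and all bundle costs are invented for illustration, and it assumes each free thread simply takes the next bundle in order.

```python
import heapq

def makespan(bundle_costs, n_threads):
    """Wall time when every free thread takes the next bundle in order."""
    # Each heap entry is the time at which one worker thread becomes free.
    free_at = [0.0] * n_threads
    heapq.heapify(free_at)
    for cost in bundle_costs:
        start = heapq.heappop(free_at)        # earliest available thread
        heapq.heappush(free_at, start + cost)
    return max(free_at)

# Hypothetical workload: 10,000 tiny bundles plus one expensive bundle
# (standing in for something like a high-coverage chrM bundle).
bundles = [0.001] * 10_000 + [60.0]

for threads in (1, 4, 48):
    print(threads, "threads:", round(makespan(bundles, threads), 2), "s")
```

With these made-up costs the total run time can never drop below the 60 s of the single large bundle, so going from 4 to 48 threads saves only a couple of seconds.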


cmonger commented on September 28, 2024

Unfortunately I cannot share this particular dataset, but I can say that it is 2x 300bp data from a MiSeq and was prepared using a ribo-reduction method rather than poly(A) selection, and it does indeed have a large number of reads mapping to chrM. So your suspicion is probably correct!

Thanks for the reply, I'll try further testing with a 'more appropriate' dataset. If there's any other information I could provide you with (without revealing the data itself), please let me know!
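One summary that can be shared without revealing the reads is the fraction of mapped reads per reference sequence. The sketch below assumes the tab-separated output of `samtools idxstats` on an indexed BAM (reference name, sequence length, mapped read-segments, unmapped read-segments); the helper name `mapped_fractions` and the example numbers are made up for illustration.

```python
def mapped_fractions(idxstats_text):
    """Return {reference_name: fraction of all mapped reads}."""
    counts = {}
    for line in idxstats_text.splitlines():
        if not line.strip():
            continue
        name, _length, mapped, _unmapped = line.split("\t")
        if name == "*":  # the '*' row holds reads with no reference
            continue
        counts[name] = int(mapped)
    total = sum(counts.values())
    return {name: (n / total if total else 0.0) for name, n in counts.items()}

# Invented example text in the idxstats layout; not real data.
example = "contig_1\t50000\t1000\t0\nchrM\t16000\t9000\t0\n*\t0\t0\t500\n"

fractions = mapped_fractions(example)
for name, frac in sorted(fractions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}\t{frac:.1%}")
```

A single reference holding most of the mapped reads would be consistent with one very large bundle dominating the run.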


cmonger commented on September 28, 2024

Also, if it helps you with any diagnostics, the genome the reads were aligned to is in ~100k contigs, and the existing annotation is in its early stages compared to model organisms.


gpertea commented on September 28, 2024

I took another look at this and I found there were indeed some mutex usage issues in the code which caused the threads to idle much more than needed. I fixed some of those and I can see a serious increase in the efficiency of the multi-threading code now -- this fix should be in v1.0.4


zhanghao-njmu commented on September 28, 2024

I also have this issue with StringTie v1.3.3b. I'm using the '-p 48' flag and my machine has 54 cores. However, I see only one core running while StringTie is processing. CPU usage is about 100-200%, whereas it is usually around 4000% when I use the same parameter with hisat2.


gpertea commented on September 28, 2024

Unfortunately, StringTie's -p option does not scale well at all; there is rarely a need to run with more than 4 CPUs, and 48 is definitely overkill. There are too many small bundles and StringTie blows through those very quickly. Because the current implementation takes one bundle per worker thread, and large, complex bundles are quite rare, most threads will spend their time waiting for each other in order to either grab the next bundle or write the results.

This situation can be improved with some involved rewriting of the multi-threading code (to allow threads to grab many small bundles at a time), but I still think the benefit for most pipelines would be minimal, because it's still going to be a better (more efficient) use of computing cores to just run multiple stringtie processes (i.e. for multiple samples) with a small number of cores each, than using many threads on a single sample.
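As a rough illustration of that last suggestion, the Python sketch below launches one StringTie process per sample with a small `-p` value and limits how many run at once. The helper names (`stringtie_cmd`, `run_samples`), the file names, and the 16-core budget are assumptions for the example; only the `-o`, `-p` and `-G` options are taken from StringTie's command line.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def stringtie_cmd(bam, out_gtf, threads=4, annotation=None):
    """Build the command line for one sample."""
    cmd = ["stringtie", bam, "-o", out_gtf, "-p", str(threads)]
    if annotation is not None:
        cmd += ["-G", annotation]  # optional reference annotation
    return cmd

def run_samples(samples, threads_per_job=4, total_cores=16,
                runner=subprocess.run):
    """Run one process per (bam, gtf) pair, a few at a time."""
    max_jobs = max(1, total_cores // threads_per_job)
    cmds = [stringtie_cmd(bam, gtf, threads_per_job) for bam, gtf in samples]
    # The pool threads only wait on child processes, so they stay cheap.
    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        return list(pool.map(lambda c: runner(c, check=True), cmds))

# Dry run with hypothetical file names: the stand-in runner returns the
# command instead of executing it, so nothing is launched here.
samples = [("s1.bam", "s1.gtf"), ("s2.bam", "s2.gtf")]
for cmd in run_samples(samples, runner=lambda cmd, check: cmd):
    print(" ".join(cmd))
```

Dropping the `runner` argument makes `run_samples` call `subprocess.run`, which with `check=True` raises if any StringTie process exits with an error.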


cryptic0 commented on September 28, 2024

If this is the case, the "-p 8" referenced in the documentation (Pertea et al. 2016) is also overkill. I am experiencing this same issue with v1.3.4.

