Comments (16)

ljyanesm commented on August 23, 2024

Hi!
This looks like a DISCOVAR log. Could you share which version of w2rap you are using and how you are running it? If you're running DISCOVAR, I'm afraid we won't be able to help you, but you can always give w2rap-contigger a go.

from w2rap-contigger.

jcgrenier commented on August 23, 2024

Oh,
Ah yes, that's right. I have tried so many different things with my dataset so far.
Unfortunately I can't find the log anymore, so I would need to rerun it. I know that the first step went well and ran for about 3 hours to load the dataset. Then it crashed at the k-mer step because it ran out of memory.

I will regenerate it, but if you know how much memory is needed for my files (2x150bp, and not PCR-free libraries, by the way), that would be really helpful!

Thanks a lot.

JC

bjclavijo commented on August 23, 2024

We can't really guesstimate, but as a useful hint, disk batches will reduce the memory needed for step 2. Usually 16 disk batches will be fine.

ljyanesm commented on August 23, 2024

Hi,

Have a look at using the "-d" flag for step_2 which should reduce the amount of memory needed. It will count the kmers in batches using the disk to store tmp hashes.

Try with 16 batches and if it still fails increase that number.

You can run from step_2 onwards by using --from_step 2, so that step_1 is not repeated.
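
As a rough illustration of the advice above, the sketch below prints (and optionally runs) a step 2 only command with disk batches. The binary path, read file names, output directory and prefix are placeholders; the flags and the `-t 48 -m 500` values are simply copied from the command posted elsewhere in this thread, so treat this as a starting point and not as a recommended configuration.

```shell
#!/bin/sh
# Sketch: rerun only step 2 with disk-based kmer counting.
# Paths, file names and the prefix are placeholders; the flags are those quoted in this thread.
set -eu

CONTIGGER="${CONTIGGER:-w2rap-contigger}"  # placeholder path to the binary
BATCHES="${BATCHES:-16}"                   # start at 16; raise it if step 2 still runs out of memory

# Print a command for review, and execute it only when RUN=1.
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}

run "$CONTIGGER" -t 48 -m 500 \
  -r reads_R1.fq.gz,reads_R2.fq.gz \
  -o contigs -p my_prefix \
  -d "$BATCHES" --dump_all 1 \
  --from_step 2 --to_step 2
```

By default this only prints the command; calling it with `RUN=1 BATCHES=32` would execute it with more batches.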

Best,

jcgrenier commented on August 23, 2024

Hi @ljyanesm,

Where can I recover the temporary files coming from step1? I was launching that on a compute node, but it crashed. Was it in memory? Can I save them in a temporary folder?
Thanks for your help.
JC

ljyanesm commented on August 23, 2024

The files should be in the output directory if the --dump_all flag was used; they are named pe_data.fastb and pe_data.cqual.

EDIT: dump_all flag is required to get the intermediate files.
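
A small sketch of how one might check for those files before resuming. It assumes the file names are exactly as given above and that `contigs` is the directory passed with `-o`; both are assumptions, so adjust them if your run writes the files under different names.

```shell
#!/bin/sh
# Sketch: check that the step 1 intermediate files named above are present.
set -eu

# Return 0 if both files exist and are non-empty in the given directory.
have_step1_files() {
  dir=$1
  for f in pe_data.fastb pe_data.cqual; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      return 1
    fi
  done
  return 0
}

OUT_DIR="${OUT_DIR:-contigs}"   # placeholder: the directory passed with -o
if have_step1_files "$OUT_DIR"; then
  echo "step 1 output found in $OUT_DIR; step 2 can be started with --from_step 2"
else
  echo "step 1 output not found; rerun step 1 with --dump_all"
fi
```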

jcgrenier commented on August 23, 2024

Awesome,
thanks @ljyanesm !

jcgrenier commented on August 23, 2024

Hello @ljyanesm,

Sorry to reopen this topic, but it seems that I'm running into problems with the same thing.

I will ask the question first and then put the log of my run. So I ran w2rap-contigger step by step, keeping all the temporary files, in order to make sure that I could resume in case it crashes.

I was running it with the on-disk splitting option set to 16 (-d 16), but that turned out not to be enough. It ran well for 1 sample, but now I'm merging 2 samples together, so I have twice the number of reads. So I specified -d 30. It seemed to run well, but it crashed close to the end of step 2, at the merging step.

Here's the log :
~/Programs/w2rap-contigger/bin/w2rap-contigger -t 48 -m 500 -r M007-M008_R1.trimmed.fq.gz,M007-M008_R2.trimmed.fq.gz -o contigs -p m_k200_trimmed -d 30 --dump_all 1 --from_step 2 --to_step 2

Welcome to w2rap-contigger
WARNING: you are running the code with omp_proc_bind_false, parallel performance may suffer
Loading reads in fastb/qualp format...
DONE!
--== Step 2: Building first (small K) graph ==--
Tue May 30 03:01:17 2017: creating kmers from reads...
Tue May 30 03:01:17 2017: disk-based kmer counting with 30 batches
Tue May 30 04:45:06 2017: batch 0 done and dumped with 2703840497 kmers
Tue May 30 06:11:17 2017: batch 1 done and dumped with 2675797008 kmers
Tue May 30 07:31:03 2017: batch 2 done and dumped with 2720373598 kmers
Tue May 30 08:03:30 2017: batch 3 done and dumped with 2872143701 kmers
Tue May 30 09:19:06 2017: batch 4 done and dumped with 2489537761 kmers
Tue May 30 09:45:26 2017: batch 5 done and dumped with 2477138396 kmers
Tue May 30 10:25:39 2017: batch 6 done and dumped with 2506687754 kmers
Tue May 30 11:43:34 2017: batch 7 done and dumped with 2588948991 kmers
Tue May 30 13:07:47 2017: batch 8 done and dumped with 2582684623 kmers
Tue May 30 14:31:09 2017: batch 9 done and dumped with 2641273776 kmers
Tue May 30 15:59:46 2017: batch 10 done and dumped with 2768253865 kmers
Tue May 30 16:26:02 2017: batch 11 done and dumped with 2649639881 kmers
Tue May 30 16:50:44 2017: batch 12 done and dumped with 2563868736 kmers
Tue May 30 18:12:00 2017: batch 13 done and dumped with 2946466737 kmers
Tue May 30 19:15:46 2017: batch 14 done and dumped with 3654065673 kmers
Tue May 30 20:37:30 2017: batch 15 done and dumped with 2929339080 kmers
Tue May 30 21:50:28 2017: batch 16 done and dumped with 2862097653 kmers
Tue May 30 23:09:32 2017: batch 17 done and dumped with 2867445244 kmers
Wed May 31 00:28:52 2017: batch 18 done and dumped with 2904585817 kmers
Wed May 31 00:54:30 2017: batch 19 done and dumped with 2919453621 kmers
Wed May 31 02:13:44 2017: batch 20 done and dumped with 2841856618 kmers
Wed May 31 03:32:56 2017: batch 21 done and dumped with 2864796621 kmers
Wed May 31 04:52:31 2017: batch 22 done and dumped with 2901847126 kmers
Wed May 31 06:22:51 2017: batch 23 done and dumped with 3261754449 kmers
Wed May 31 07:49:59 2017: batch 24 done and dumped with 3205541894 kmers
Wed May 31 09:14:50 2017: batch 25 done and dumped with 3067587988 kmers
Wed May 31 10:00:45 2017: batch 26 done and dumped with 2955175947 kmers
Wed May 31 11:25:03 2017: batch 27 done and dumped with 2819848661 kmers
Wed May 31 12:49:44 2017: batch 28 done and dumped with 2849583374 kmers
Wed May 31 14:14:36 2017: batch 29 done and dumped with 2881152158 kmers
Wed May 31 14:14:37 2017: merging from disk
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
w2rap-contigger.M008-007.sh: line 1: 31410 Aborted (core dumped) ~/Programs/w2rap-contigger/bin/w2rap-contigger -t 48 -m 500 -r M007-M008_R1.trimmed.fq.gz,M007-M008_R2.trimmed.fq.gz -o contigs -p meerkat_k200_trimmed -d 30 --dump_all 1 --from_step 2 --to_step 2
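
To get a feel for the scale of what the merge has to handle, the per-batch counts in a log like the one above can be totalled. The sketch below assumes the log lines have the same "batch N done and dumped with X kmers" shape as shown here; `step2.log` is a placeholder file name. The total only describes the size of the dumped batches and does not by itself say how much memory the merge will need.

```shell
#!/bin/sh
# Sketch: total the per-batch kmer counts reported in a step 2 log.
set -eu

# Read a log on stdin and print "<number of batches> <total kmers>".
sum_batch_kmers() {
  awk '/batch [0-9]+ done and dumped with [0-9]+ kmers/ {
         n++
         total += $(NF - 1)      # the count is the field just before "kmers"
       }
       END { printf "%d %.0f\n", n, total }'
}

LOG="${LOG:-step2.log}"   # placeholder log file name
if [ -f "$LOG" ]; then
  sum_batch_kmers < "$LOG"
fi
```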

I was working on a 512 GB fat node, but memory use went up to 521 GB, so the node killed the process even though I specified -m 500. Will it run well if I specify -m 450, for example? Or will it still go over the limit?

And another question, is it possible to start the process from the merging part?

Thanks for your help.
JC

bjclavijo commented on August 23, 2024

jcgrenier commented on August 23, 2024

Hi @bjclavijo,

So for now, playing with min_freq is the only way to reduce the memory usage in step 2, right? So it will include fewer small kmers in the analysis?

Thanks for your help and for responding so quickly.

JC

bjclavijo commented on August 23, 2024

jcgrenier commented on August 23, 2024

In reality, it is the same individual, but sequenced over two lanes. So in this case it's probably OK to proceed like this, increasing the representation diversity, I guess.

Thanks!

bjclavijo commented on August 23, 2024

shamshad1987 commented on August 23, 2024

Hello All,
Is there a pause option in this program? For example, when running on a supercomputer where a job can run for only 24 hours, could the run be paused and then resumed from where it stopped? Thanks

jonwright99 commented on August 23, 2024

You can run one step at a time with the --from_step and --to_step options. Individual steps may still take more than 24 hours if your genome is large.
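
One way to use this on a cluster with a wall-time limit is to generate one command per step and submit each as its own job. The sketch below only prints the commands. The binary path, read files, output directory, prefix and `LAST_STEP` are placeholders (set `LAST_STEP` to the last step of your version), and `--dump_all 1` is included on the assumption, based on the earlier comments in this thread, that it is needed to keep the intermediate files between steps.

```shell
#!/bin/sh
# Sketch: print one w2rap-contigger command per step, so that each step can be
# submitted as its own job on a cluster with a wall-time limit.
set -eu

CONTIGGER="${CONTIGGER:-w2rap-contigger}"  # placeholder path to the binary
FIRST_STEP="${FIRST_STEP:-1}"
LAST_STEP="${LAST_STEP:-2}"                # placeholder: set to the last step of your version

# Print the command line for a single step (placeholder inputs and prefix).
step_command() {
  echo "$CONTIGGER -t 48 -m 500 -r reads_R1.fq.gz,reads_R2.fq.gz" \
       "-o contigs -p my_prefix --dump_all 1 --from_step $1 --to_step $1"
}

step=$FIRST_STEP
while [ "$step" -le "$LAST_STEP" ]; do
  step_command "$step"
  step=$((step + 1))
done
```

Each printed line can then be wrapped in whatever job script your scheduler expects, with each job submitted only after the previous one has finished.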

bjclavijo commented on August 23, 2024

Closing this as it seems to be solved.
