Hello there,
Thanks for providing this tool. Is there any way to estimate how much memory an assembly will need for input files of a given size? I'm working with two paired-end samples that were generated on an X10 machine. Each fastq.gz file is under 40 GB; unzipped, each is about 160 GB. So far I have tried a fat node with 512 GB of memory, and it crashed every time at the second step:
Performing re-exec to adjust stack size.
Tue May 02 07:52:25 2017 run on cp0302, pid=8127 [Apr 13 2017 11:04:02 R52488 ]
DiscovarDeNovo READS="sample:M008 :: HGTGYCCXX_8_160403_FR07921224_Other__R_151123_JEFWAL_M008_R{1,2}.fastq" OUT_DIR=Discovar_Denovo NUM_THREADS=48 MAX_MEM_GB=500
SYSTEM INFO
- OS: Linux :: 2.6.32-642.6.2.el6.x86_64 :: #1 SMP Wed Oct 26 06:52:09 UTC 2016
- node name: cp0302
- hardware type: x86_64
- cache size: 512 KB
- cpu MHz: 2200.000
- cpu model name: AMD Opteron(tm) Processor 6174
- physical memory: 504.75 GB
Omitting memory check. If you run into problems with memory,
you might try rerunning with MEMORY_CHECK=True.
Tue May 02 07:52:25 2017: finding input files
Tue May 02 07:52:25 2017: reading 2 files (which may take a while)
INPUT FILES:
[1a,type=frag,sample=M008,lib=1,frac=1] M008_R1.fastq
[1b,type=frag,sample=M008,lib=1,frac=1] M008_R2.fastq
Tue May 02 12:38:11 2017: found 1 samples
Tue May 02 12:38:11 2017: starts = 0
Tue May 02 13:37:30 2017: using 964,997,086 reads
Tue May 02 13:37:31 2017: data extraction complete, peak mem = 375.88 GB
5.75 hours used extracting reads
Tue May 02 13:37:46 2017: see total physical memory of 541,975,564,288 bytes
Tue May 02 13:37:46 2017: see user-imposed limit on memory of 536,870,912,000 bytes
Tue May 02 13:37:46 2017: 3.74 bytes per read base, assuming max memory available
We need 46 passes.
Expect 1343834 keys per batch.
Provide 1517886 keys per batch.
There were 21 buffer overflows.
Fatal error (pid=8127) at Tue May 02 18:25:36 2017:
Insufficient memory.
Tue May 02 18:25:36 2017. Abort. Stopping.
Generating a backtrace...
Dump of stack:
- CRD::exit(int), in Exit.cc:30
- run, in MapReduceEngine.h:408
- (...), in BuildReadQGraph.cc:179
- buildReadQGraph(...), in BuildReadQGraph.cc:1311
- GapToyCore(int, char**), in GapToyCore.cc:584
- main, in DiscovarDeNovo.cc:43
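For what it's worth, here is a rough back-of-the-envelope check of the "bytes per read base" line in the log. It is only a sketch: the read count and memory limit are copied from the log above, but the read length of 150 bases is my own assumption (the log does not report it), and I don't know how the tool actually computes this figure.

```python
# Figures copied from the log above.
n_reads = 964_997_086
mem_limit_bytes = 536_870_912_000  # user-imposed limit (MAX_MEM_GB=500)

# ASSUMPTION: 150 bases per read; the log does not report the read length.
assumed_read_length = 150

total_bases = n_reads * assumed_read_length
bytes_per_base = mem_limit_bytes / total_bases

print(f"total bases (assumed): {total_bases:,}")
print(f"bytes per read base:   {bytes_per_base:.2f}")
```

With 150-base reads this comes out at roughly 3.7 bytes per base, in the same range as the 3.74 reported in the log; the small difference presumably comes from the real read lengths.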
I haven't tried the trimmed files yet, but I suspect they won't work with my current settings either.
One more question: is there a way to combine two samples? The only approach I can think of so far is concatenating the fastq files, but that could cause problems with the library characteristics, right?
Thanks a lot for your help
JC