strange segfault about minia (CLOSED)

ekg commented on August 16, 2024
strange segfault

from minia.

Comments (12)

ekg commented on August 16, 2024

I should note that this machine has 256G of RAM and I don't appear to have any limitations as far as disk space.

rchikhi commented on August 16, 2024

Hi Erik!

Curious. That's definitely not tied to the number of fastq files: at this stage minia only considers the counted kmers and no longer the input files. This is the first time I've seen this stage fail; it could be due to some special unhandled case in the graph structure.

I'd be happy to assist with debugging.

  1. Can the data be sent by any chance?
  2. Does it complete with a different kmer size?

ekg commented on August 16, 2024

It does complete with a shorter kmer size (41) and the same abundance. I would love to share the data, but I will need to ask my collaborators. I will send you an email if I can share. I definitely understand that's the only way to really resolve this problem. Interestingly, the k=41 assembly has very few contigs (on the order of a few hundred kb) while the unitig set is several hundred mb. Also, this data is a little strange: by library design (GBS/RADseq) it consists of many small fragments.

rchikhi commented on August 16, 2024

I see. Well, I'll keep an eye out for that email. Otherwise, just let me know if you encounter that bug again in another dataset. I'd like to get a sense of whether this is a one-of-a-kind thing.

Mozart1776 commented on August 16, 2024

Hello,
I am also experiencing a segmentation fault in minia at the "Iterating DSK partitions" step at k=41 (see the log file below). I am using a computer cluster with nearly 2 TB of RAM and plenty of disk space. Minia runs without issue on the GATB test dataset. Any help?
Thanks!

(2018-08-13 22:18:53) GATB-pipeline starting
(2018-08-13 22:18:53) Command line: /home/pavel/bin/gatb-minia-pipeline/gatb -1 /home/pavel/Run_1837_Tyrophagus_putrescentiae_Mex/Sample_79070/2a-removeDuplicates_clumpify/79070_TGACCA_S1_L003_R1_001_bbduk_no_adapters_Results/79070_TGACCA_S1_L003_R1_001_bbduk_no_adapters_deduplicated.fastq.gz -2 /home/pavel/Run_1837_Tyrophagus_putrescentiae_Mex/Sample_79070/2a-removeDuplicates_clumpify/79070_TGACCA_S1_L003_R1_001_bbduk_no_adapters_Results/79070_TGACCA_S1_L003_R2_001_bbduk_no_adapters_deduplicated.fastq.gz -o 79070_TGACCA_S1_L003_R1_001_bbduk_no_adapters_deduplicated.assembly

(2018-08-13 22:18:53) Setting maximum kmer length to: 151 bp
(2018-08-13 22:18:53) Multi-k values and cutoffs: [(21, 2), (41, 2), (61, 2), (81, 2), (101, 2), (121, 2), (141, 2)]

(2018-08-13 22:18:53) Minia assembling at k=21 min_abundance=2
(2018-08-13 22:18:53) Execution of 'minia/minia'. Command line:
/home/pavel/bin/gatb-minia-pipeline/tools/memused /home/pavel/bin/gatb-minia-pipeline/minia/minia -in 79070_TGACCA_S1_L003_R1_001_bbduk_no_adapters_deduplicated.assembly.list_reads -kmer-size 21 -abundance-min 2 -out 79070_TGACCA_S1_L003_R1_001_bbduk_no_adapters_deduplicated.assembly_k21
(2018-08-14 03:36:01) Finished Minia k=21

(2018-08-14 03:36:01) Minia assembling at k=41 min_abundance=2
(2018-08-14 03:36:01) Execution of 'minia/minia'. Command line:
/home/pavel/bin/gatb-minia-pipeline/tools/memused /home/pavel/bin/gatb-minia-pipeline/minia/minia -in 79070_TGACCA_S1_L003_R1_001_bbduk_no_adapters_deduplicated.assembly.list_reads -kmer-size 41 -abundance-min 2 -out 79070_TGACCA_S1_L003_R1_001_bbduk_no_adapters_deduplicated.assembly_k41
(2018-08-14 05:06:33) Execution of 'minia/minia' failed. Command line:
/home/pavel/bin/gatb-minia-pipeline/tools/memused /home/pavel/bin/gatb-minia-pipeline/minia/minia -in 79070_TGACCA_S1_L003_R1_001_bbduk_no_adapters_deduplicated.assembly.list_reads -kmer-size 41 -abundance-min 2 -out 79070_TGACCA_S1_L003_R1_001_bbduk_no_adapters_deduplicated.assembly_k41
pavel@sagarana:~/Run_1837_Tyrophagus_putrescentiae_Mex/Sample_79070/5d-gatb-pipeline/79070_TGACCA_S1_L003_R1_001_bbduk_no_adapters_deduplicated> cat 0-gatb_79070_TGACCA_S1_L003_R1_001_bbduk_no_adapters_deduplicated.pbs.e108512
[Approximating frequencies of minimizers ] 100 % elapsed: 0 min 48 sec remaining: 0 min 0 sec cpu: 99.9 % mem: [ 20, 20, 20] MB
[DSK: nb solid kmers found : 595537654 ] 212 % elapsed: 71 min 3 sec remaining: 0 min 0 sec cpu: 1813.0 % mem: [7845, 8032, 8050] MB
[Iterating DSK partitions ] 99.7 % elapsed: 21 min 6 sec remaining: 0 min 4 sec
[Building BooPHF] 100 % elapsed: 0 min 11 sec remaining: 0 min 0 sec
[removing tips, pass 1 ] 100 % elapsed: 8 min 49 sec remaining: 0 min 0 sec cpu: 25282.3 % mem: [28646, 28646, 33965] MB
[removing tips, pass 2 ] 100 % elapsed: 0 min 39 sec remaining: 0 min 0 sec cpu: 24498.1 % mem: [24143, 28675, 33965] MB
[removing tips, pass 3 ] 100 % elapsed: 0 min 21 sec remaining: 0 min 0 sec cpu: 7850.8 % mem: [24082, 24134, 33965] MB
[removing tips, pass 4 ] 100 % elapsed: 0 min 22 sec remaining: 0 min 0 sec cpu: 8396.4 % mem: [24082, 24082, 33965] MB
[removing tips, pass 5 ] 100 % elapsed: 0 min 21 sec remaining: 0 min 0 sec cpu: 7820.0 % mem: [24082, 24082, 33965] MB
[removing bulges, pass 1 ] 100 % elapsed: 10 min 29 sec remaining: 0 min 0 sec cpu: 22194.7 % mem: [24194, 24194, 33965] MB
[removing bulges, pass 2 ] 100 % elapsed: 9 min 7 sec remaining: 0 min 0 sec cpu: 23147.3 % mem: [23158, 24203, 33965] MB
[removing bulges, pass 3 ] 100 % elapsed: 9 min 10 sec remaining: 0 min 0 sec cpu: 23453.0 % mem: [22105, 23163, 33965] MB
[removing bulges, pass 4 ] 100 % elapsed: 9 min 12 sec remaining: 0 min 0 sec cpu: 23583.9 % mem: [21399, 22098, 33965] MB
[removing bulges, pass 5 ] 100 % elapsed: 9 min 44 sec remaining: 0 min 0 sec cpu: 23581.3 % mem: [20617, 21390, 33965] MB
[removing ec, pass 1 ] 100 % elapsed: 5 min 10 sec remaining: 0 min 0 sec cpu: 24824.4 % mem: [21343, 21343, 33965] MB
[removing ec, pass 2 ] 100 % elapsed: 1 min 4 sec remaining: 0 min 0 sec cpu: 20397.3 % mem: [20668, 21375, 33965] MB
[removing ec, pass 3 ] 100 % elapsed: 0 min 27 sec remaining: 0 min 0 sec cpu: 22962.5 % mem: [20167, 20698, 33965] MB
[removing ec, pass 4 ] 100 % elapsed: 0 min 27 sec remaining: 0 min 0 sec cpu: 23310.6 % mem: [19924, 20170, 33965] MB
[removing ec, pass 5 ] 100 % elapsed: 0 min 28 sec remaining: 0 min 0 sec cpu: 23301.5 % mem: [19860, 19925, 33965] MB
[removing tips, pass 6 ] 100 % elapsed: 0 min 55 sec remaining: 0 min 0 sec cpu: 25142.7 % mem: [19974, 19974, 33965] MB
[removing bulges, pass 6 ] 100 % elapsed: 1 min 36 sec remaining: 0 min 0 sec cpu: 22626.6 % mem: [19948, 20002, 33965] MB
[removing ec, pass 6 ] 100 % elapsed: 0 min 27 sec remaining: 0 min 0 sec cpu: 22708.7 % mem: [19920, 19947, 33965] MB
[removing tips, pass 7 ] 100 % elapsed: 0 min 21 sec remaining: 0 min 0 sec cpu: 24399.3 % mem: [19909, 19921, 33965] MB
[removing bulges, pass 7 ] 100 % elapsed: 1 min 37 sec remaining: 0 min 0 sec cpu: 23080.5 % mem: [19853, 19911, 33965] MB
[removing ec, pass 7 ] 100 % elapsed: 0 min 26 sec remaining: 0 min 0 sec cpu: 23476.6 % mem: [19837, 19848, 33965] MB
[removing tips, pass 8 ] 100 % elapsed: 0 min 16 sec remaining: 0 min 0 sec cpu: 8285.7 % mem: [19837, 19841, 33965] MB
[removing bulges, pass 8 ] 100 % elapsed: 1 min 37 sec remaining: 0 min 0 sec cpu: 23144.2 % mem: [19833, 19837, 33965] MB
[removing ec, pass 8 ] 100 % elapsed: 0 min 26 sec remaining: 0 min 0 sec cpu: 23530.7 % mem: [19827, 19828, 33965] MB
[removing tips, pass 9 ] 100 % elapsed: 0 min 15 sec remaining: 0 min 0 sec cpu: 8222.7 % mem: [19826, 19827, 33965] MB
[removing bulges, pass 9 ] 100 % elapsed: 1 min 36 sec remaining: 0 min 0 sec cpu: 23186.3 % mem: [19831, 19831, 33965] MB
[removing ec, pass 9 ] 100 % elapsed: 0 min 27 sec remaining: 0 min 0 sec cpu: 23680.5 % mem: [19825, 19826, 33965] MB
[removing tips, pass 10 ] 100 % elapsed: 0 min 16 sec remaining: 0 min 0 sec cpu: 8274.7 % mem: [19825, 19826, 33965] MB
[removing bulges, pass 10 ] 100 % elapsed: 1 min 36 sec remaining: 0 min 0 sec cpu: 23208.0 % mem: [19829, 19829, 33965] MB
[removing ec, pass 10 ] 100 % elapsed: 0 min 28 sec remaining: 0 min 0 sec cpu: 23634.8 % mem: [19825, 19825, 33965] MB
[Minia : assembly ] 100 % elapsed: 8 min 6 sec remaining: 0 min 0 sec cpu: 100.2 % mem: [19798, 19798, 33965] MB
[Approximating frequencies of minimizers ] 100 % elapsed: 1 min 17 sec remaining: 0 min 0 sec cpu: 99.8 % mem: [ 28, 28, 28] MB
[DSK: nb solid kmers found : 678092331 ] 201 % elapsed: 88 min 9 sec remaining: 0 min 0 sec cpu: 1630.6 % mem: [7642, 7848, 7871] MB
[Iterating DSK partitions ] 0 % elapsed: 0 min 0 sec remaining: 0 min 0 sec
/home/pavel/bin/gatb-minia-pipeline/tools/memused: line 18: 249751 Segmentation fault "$@"
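
For long logs like the one above, it can help to pull out the last stage reached and the largest memory figure printed before the crash. The following is only a rough sketch: the regular expressions are written against the progress lines as they appear in this log, the function name `summarize` is made up for illustration, and the three numbers in `mem: [...]` are simply maximized without assuming what each one means.

```python
import re

# Progress lines look like:
# "[removing tips, pass 1 ] 100 % elapsed: ... mem: [28646, 28646, 33965] MB"
PROGRESS = re.compile(r"^\[(?P<stage>[^\]]*)\]\s*(?P<pct>[\d.]+)\s*%")
MEM = re.compile(r"mem:\s*\[\s*(\d+),\s*(\d+),\s*(\d+)\s*\]\s*MB")


def summarize(log_text):
    """Return (last_stage, last_percent, peak_mem_mb, crashed) for a progress log."""
    last_stage, last_pct, peak, crashed = None, None, 0, False
    for line in log_text.splitlines():
        if "Segmentation fault" in line:
            crashed = True
        m = PROGRESS.match(line.strip())
        if not m:
            continue
        last_stage = m.group("stage").strip()
        last_pct = float(m.group("pct"))
        mem = MEM.search(line)
        if mem:
            # Largest of the three reported values seen so far.
            peak = max(peak, *(int(v) for v in mem.groups()))
    return last_stage, last_pct, peak, crashed


# Two lines taken from the k=41 part of the log above.
sample = """\
[DSK: nb solid kmers found : 678092331 ] 201 % elapsed: 88 min 9 sec remaining: 0 min 0 sec cpu: 1630.6 % mem: [7642, 7848, 7871] MB
[Iterating DSK partitions ] 0 % elapsed: 0 min 0 sec remaining: 0 min 0 sec
/home/pavel/bin/gatb-minia-pipeline/tools/memused: line 18: 249751 Segmentation fault "$@"
"""
print(summarize(sample))
# ('Iterating DSK partitions', 0.0, 7871, True)
```

On this excerpt the crash happens at 0 % of "Iterating DSK partitions", with under 8 GB reported up to that point in the k=41 run.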

rchikhi commented on August 16, 2024

Hi Mozart, thanks for reporting it. I'm assuming you cannot share the data either, so I'd be curious to see whether the problem occurs with a different k-mer combination. Can you please try the following command line?
./gatb --kmer-sizes 31,51,71 -1 [..] -2 [..] -o [..]
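
A related way to narrow things down is to run minia once per k and record how each run ends. The sketch below is only an illustration: it reuses the minia flags visible in the log earlier in the thread (`-in`, `-kmer-size`, `-abundance-min`, `-out`), while the function names and the default output prefix are made up here. It runs each k independently on the same reads list, which may not reproduce what the gatb pipeline does between k values.

```python
import signal
import subprocess


def minia_command(minia, reads_list, k, abundance_min, out_prefix):
    """Build a minia command line from the flags seen in the pipeline log."""
    return [minia, "-in", reads_list, "-kmer-size", str(k),
            "-abundance-min", str(abundance_min), "-out", f"{out_prefix}_k{k}"]


def describe_exit(returncode):
    """Translate a subprocess return code into a short status string."""
    if returncode == 0:
        return "ok"
    if returncode < 0:  # on POSIX, -N means the process was killed by signal N
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exit code {returncode}"


def try_kmer_sizes(minia, reads_list, kmer_sizes, abundance_min=2, out_prefix="test"):
    """Run minia once per k (independently) and record how each run ended."""
    results = {}
    for k in kmer_sizes:
        cmd = minia_command(minia, reads_list, k, abundance_min, out_prefix)
        results[k] = describe_exit(subprocess.run(cmd).returncode)
    return results


# Inspect the command lines without running anything:
for k in (31, 51, 71):
    print(" ".join(minia_command("minia/minia", "sample.list_reads", k, 2, "sample")))
```

Calling `try_kmer_sizes("minia/minia", "sample.list_reads", [31, 51, 71])` would actually launch the runs; the paths here are placeholders. A segfault in a directly launched process would show up as "killed by SIGSEGV". If minia is started through a wrapper script such as `memused`, the code seen by Python is whatever the wrapper returns instead.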

Mozart1776 commented on August 16, 2024

rchikhi commented on August 16, 2024

Hi Pavel,
Very nice! Please email me at [email protected]. Any way to get the data is fine, if you can share it on your end. Otherwise I'll send a shared dropbox link.
Rayan

adigenova commented on August 16, 2024

Hi Rayan,
I have a similar problem with DSK (core dumped). The log is the following:

Minia 3, git commit 4b32fec
setting storage type to hdf5
[Approximating frequencies of minimizers ] 100 % elapsed: 2 min 45 sec remaining: 0 min 0 sec cpu: 99.9 % mem: [ 25, 25, 25] MB
[DSK: Pass 1/1, Step 2: counting kmers ] 73 % elapsed: 145 min 33 sec remaining: 53 min 49 sec cpu: 553.2 % mem: [36580, 36580, 36580] MB
/home/adigenova/binaries/gatb-minia-pipeline/tools/memused: line 18: 33019 Segmentation fault (core dumped) "$@"
maximal memory used: 41023 MB
(2018-09-18 19:12:07) Execution of 'minia/minia' failed. Command line:
/home/adigenova/binaries/gatb-minia-pipeline/tools/memused /home/adigenova/binaries/gatb-minia-pipeline/minia/minia -in MMK.list_reads -kmer-size 61 -abundance-min 2 -out MMK_k61 -max-memory 35000

The command line that I used was the following:
~/binaries/gatb-minia-pipeline/gatb -l sequences.txt --nb-cores 20 --max-memory 35000 -o MMK --kmer-sizes 31,61,91 --abundance-mins 2,2,2

and the sequences.txt file contains the following reads:
stLFR.split_read.1.fq.gz
stLFR.split_read.2.fq.gz

the reads are from GIAB:
ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/NA12878/stLFR/stLFR.split_read.1.fq.gz
ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/NA12878/stLFR/stLFR.split_read.2.fq.gz

Let me know if you need more information to reproduce the error.
Thanks in advance.

All the best,
Alex

rchikhi commented on August 16, 2024

Hi Alex,
Thanks for the very detailed bug report, and sorry for the delayed answer. Can you reproduce this problem on another machine? Unfortunately I cannot: the pipeline finished without crashing on my server using the command line you provided, on the master branch of the gatb-pipeline repo.

adigenova commented on August 16, 2024

rchikhi commented on August 16, 2024

After offline discussion with Alex, it seems that the problem he reported was caused by memory usage exceeding what was available on the system.
I'm closing this thread as it's aggregating a bunch of unrelated segfault problems.
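
For the memory case, a quick sanity check is to compare the peak reported by the pipeline with both the requested limit and what the machine can actually provide. In Alex's log, `maximal memory used: 41023 MB` was reported for a run started with `-max-memory 35000`, so the option was evidently not acting as a hard cap there. The sketch below assumes a Linux system where available memory can be read from `/proc/meminfo`. The function names and the sample meminfo text are hypothetical, not taken from the thread.

```python
def mem_available_mb(meminfo_text):
    """Extract MemAvailable (reported in kB) from /proc/meminfo-style text, in MB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) // 1024
    raise ValueError("MemAvailable not found")


def memory_report(peak_used_mb, max_memory_mb, available_mb):
    """Compare an observed peak with the requested limit and the available RAM."""
    return {
        "over_limit_mb": max(0, peak_used_mb - max_memory_mb),
        "fits_in_available": peak_used_mb <= available_mb,
    }


# Hypothetical meminfo excerpt; on a real machine use open("/proc/meminfo").read().
sample_meminfo = "MemTotal:       65536000 kB\nMemAvailable:   33554432 kB\n"
available = mem_available_mb(sample_meminfo)

# 41023 and 35000 come from the log above; `available` is the made-up value.
print(memory_report(41023, 35000, available))
# {'over_limit_mb': 6023, 'fits_in_available': False}
```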
