Comments (10)

riederd avatar riederd commented on June 12, 2024

Hi,
we have never run into this problem. It seems to be an issue with mixcr that we cannot do much about.
Which kind of filesystem is /scratch/u/kfang/ChenHZ_lab/Neoantigen/test2/work using?

from nextneopi.

KunFang93 avatar KunFang93 commented on June 12, 2024

Hi

Our cluster admin doesn't know how to solve this issue either... The filesystem is NFS. Thanks~

riederd avatar riederd commented on June 12, 2024

Is the NFS lock daemon running on the system? Usually it should be, but maybe you can check this as well.
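
A minimal sketch of how both checks could be done from a shell, assuming GNU `stat` and, for the second function, that `rpcinfo` is installed. The function names `fs_type` and `nfs_lock_status` are made up for this example. A missing `nlockmgr` entry matters mainly for NFSv3 mounts, since NFSv4 handles locking within the protocol itself.

```shell
#!/usr/bin/env bash

# Print the type of the filesystem that holds the given path.
fs_type() {
    # GNU stat: -f queries the filesystem, %T prints its type as a name
    stat -f -c %T "$1"
}

# Report whether an NFS lock manager is registered with rpcbind on a host.
# The host defaults to localhost (an assumption for this sketch).
nfs_lock_status() {
    local host="${1:-localhost}"
    if ! command -v rpcinfo >/dev/null 2>&1; then
        echo "rpcinfo not available"
        return 2
    fi
    if rpcinfo -p "$host" 2>/dev/null | grep -q nlockmgr; then
        echo "nlockmgr registered on $host"
    else
        echo "nlockmgr not found on $host"
        return 1
    fi
}
```

For example, `fs_type /scratch/u/kfang/ChenHZ_lab/Neoantigen/test2/work` should print `nfs` on an NFS mount, and `nfs_lock_status` can be pointed at the NFS server's hostname.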

You can also try running the mixcr process manually to see whether it was only a transient problem:

cd /scratch/u/kfang/ChenHZ_lab/Neoantigen/test2/work/c3/5a4b303cc544c8f790079d0754d082
bash .command.run

If that works, you can resume nextNEOpi with -resume.

If you cannot fix the issue, you can also skip the TCR analysis by using --TCR false

HTH

KunFang93 avatar KunFang93 commented on June 12, 2024

Thanks for your suggestion! Will try.

KunFang93 avatar KunFang93 commented on June 12, 2024

Hi,

I tried skipping the TCR analysis by using --TCR false. The pipeline works fine initially but gets stuck in MarkDuplicates, just like in issue #17. Is there anything I could do to solve the problem? Thanks for your help!

Best,
Kun

riederd avatar riederd commented on June 12, 2024

Hi, this is strange.

What happens when you cd to the work directory of the MarkDuplicates process and run the .command.run script manually?

First use Ctrl+C to stop the pipeline, then look in the .nextflow.log file to find the work dir of the MarkDuplicates process. Look for something like TaskHandler[id: 70; name: MarkDuplicates and note down the directory listed after workDir:

Then cd into this directory and run bash .command.run. You can monitor the activity with top.

Can you also send the output of ls -la in that workDir?
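
The log lookup described above could be sketched roughly as follows. This assumes the log lines have the shape `TaskHandler[id: ...; name: MarkDuplicates ...; workDir: /some/path ...]` and that the path contains no spaces. The function name `find_workdirs` is hypothetical.

```shell
#!/usr/bin/env bash

# Print the unique work directories logged for a given process name.
# Arguments: process name, then the log file (defaults to .nextflow.log).
find_workdirs() {
    local process="$1" log="${2:-.nextflow.log}"
    # Keep only lines mentioning the process, then cut out whatever follows
    # "workDir: " up to the next space, semicolon or closing bracket.
    grep -F "name: ${process}" "$log" \
        | sed -n 's/.*workDir: \([^] ;]*\).*/\1/p' \
        | sort -u
}
```

Running `find_workdirs MarkDuplicates` in the directory where the pipeline was launched should list the candidate directories. From there it is `cd <dir>`, `bash .command.run`, and `ls -la`.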

KunFang93 avatar KunFang93 commented on June 12, 2024

Hi,

Thanks for your reply! This is the output of ls -al in the workDir

(base) [kun@g1400png-ap01lp 1f1771d1843dfa04c9ab2159038b5a]$ ls -la
total 48
drwxrwxr-x 2 kun kun  4096 Nov  8 14:35 .
drwxrwxr-x 3 kun kun  4096 Oct 26 12:00 ..
-rw-rw-r-- 1 kun kun     0 Nov  8 14:35 .command.begin
-rw-rw-r-- 1 kun kun   946 Nov  8 14:35 .command.err
-rw-rw-r-- 1 kun kun  1490 Nov  8 14:32 .command.log
-rw-rw-r-- 1 kun kun     0 Nov  8 14:35 .command.out
-rw-rw-r-- 1 kun kun 11019 Oct 26 12:13 .command.run
-rw-rw-r-- 1 kun kun   650 Oct 26 12:13 .command.sh
-rw-rw-r-- 1 kun kun     0 Nov  8 14:35 .command.trace
lrwxrwxrwx 1 kun kun    97 Nov  8 14:35 GRCh38.d1.vd1.dict -> /data/kun/software/nextNEOpi/resources/references/hg38/gdc/GRCh38.d1.vd1/fasta/GRCh38.d1.vd1.dict
lrwxrwxrwx 1 kun kun    95 Nov  8 14:35 GRCh38.d1.vd1.fa -> /data/kun/software/nextNEOpi/resources/references/hg38/gdc/GRCh38.d1.vd1/fasta/GRCh38.d1.vd1.fa
lrwxrwxrwx 1 kun kun    99 Nov  8 14:35 GRCh38.d1.vd1.fa.fai -> /data/kun/software/nextNEOpi/resources/references/hg38/gdc/GRCh38.d1.vd1/fasta/GRCh38.d1.vd1.fa.fai
lrwxrwxrwx 1 kun kun   138 Nov  8 14:35 Patient353_T1star_normal_DNA_aligned_uBAM_merged.bam -> /data/kun/ChenHZ_lab/Neoantigens/patient353/T1/work/bb/6282fe6f5845e1a2dc962465ab05c4/Patient353_T1star_normal_DNA_aligned_uBAM_merged.bam

When I try to run bash .command.run, the screen freezes with this output:

(base) [kun@g1400png-ap01lp 1f1771d1843dfa04c9ab2159038b5a]$ bash .command.run

sambamba 0.7.1
 by Artem Tarasov and Pjotr Prins (C) 2012-2019
    LDC 1.20.0 / DMD v2.090.1 / LLVM7.0.0 / bootstrap LDC - the LLVM D compiler (0.17.6)

finding positions of the duplicate reads in the file...
22:35:42.344 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/opt/conda/share/gatk4-4.2.6.1-1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
[Tue Nov 08 22:35:42 UTC 2022] SetNmMdAndUqTags --INPUT /dev/stdin --OUTPUT Patient353_T1star_normal_DNA_aligned_sort_mkdp.bam --TMP_DIR /tmp/Kun/nextNEOpi --VALIDATION_STRINGENCY LENIENT --MAX_RECORDS_IN_RAM 4194304 --CREATE_INDEX true --REFERENCE_SEQUENCE GRCh38.d1.vd1.fa --IS_BISULFITE_SEQUENCE false --SET_ONLY_UQ false --VERBOSITY INFO --QUIET false --COMPRESSION_LEVEL 2 --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false

When I use top, I only see the java process:

top - 14:58:45 up 326 days,  5:51,  3 users,  load average: 5.13, 4.99, 4.96
Tasks: 601 total,   1 running, 600 sleeping,   0 stopped,   0 zombie
%Cpu(s):  9.8 us,  0.8 sy,  0.0 ni, 89.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 39465558+total,  1187048 free, 14884032 used, 37858451+buff/cache
KiB Swap:  2094076 total,  1415656 free,   678420 used. 37851708+avail Mem

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 67928 yufan     20   0 5604544   5.2g   1176 S 399.3  1.4   2578:19 bwa
240190 yufan     20   0   36.2g   3.3g  19228 S 107.6  0.9   1085:30 java
  5215 gdm       20   0  806208  87512    704 S   2.6  0.0   6112:34 gsd-color
272272 kun       20   0  162548   2816   1588 R   0.7  0.0   0:00.48 top
248733 kun       20   0   70.8g 373588  15280 S   0.3  0.1   0:12.51 java
     1 root      20   0  192024   3364   1632 S   0.0  0.0  11:28.58 systemd
     2 root      20   0       0      0      0 S   0.0  0.0   0:39.49 kthreadd

However, when I checked with ps -ax, it looks like several commands were started:

248492 pts/0    S+     0:00 bash .command.run
248513 pts/0    S+     0:00 tee .command.out
248514 pts/0    S+     0:00 tee .command.err
248515 pts/0    S+     0:00 bash .command.run
248518 pts/0    Sl+    0:00 Singularity runtime parent
248539 pts/0    S+     0:00 /bin/bash /data/kun/ChenHZ_lab/Neoantigens/patient353/T1/work/89/1f1771d1843dfa04c9ab2159038b5a/.command.run nxf_trace
248551 ?        S<     0:00 [loop0]
248575 pts/0    S+     0:00 /bin/bash -ue /data/kun/ChenHZ_lab/Neoantigens/patient353/T1/work/89/1f1771d1843dfa04c9ab2159038b5a/.command.sh
248577 pts/0    S+     0:06 /bin/bash /data/kun/ChenHZ_lab/Neoantigens/patient353/T1/work/89/1f1771d1843dfa04c9ab2159038b5a/.command.run nxf_trace
248584 pts/0    Sl+    6:39 sambamba markdup -t 20 --tmpdir /tmp/Kun/nextNEOpi --hash-table-size=1048576 --overflow-list-size=1000000 --io-buffer-size=1024 Patient353_T1s
248585 pts/0    S+     0:00 samtools sort -@20 -m 8G -O BAM -l 0 /dev/stdin
248586 pts/0    S+     0:00 python /opt/conda/bin/gatk --java-options -Xmx64G SetNmMdAndUqTags --TMP_DIR /tmp/Kun/nextNEOpi -R GRCh38.d1.vd1.fa -I /dev/stdin -O Patient35
248733 pts/0    Sl+    0:12 java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.co

I then used the following code to check whether the other PIDs are running:

if ps -p "$1" > /dev/null
then
   echo "$1 is running"
   # Do something knowing the PID exists, i.e. the process with PID $1 is running
fi

and found that 248584, 248585, and 248586 are running.
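
A small variation on that check, as a sketch only: besides confirming that a PID exists, it prints the process state and accumulated CPU time, which can help tell a busy process from one that is only waiting. The function name `show_pids` is made up here. In the state column, R means running, S sleeping, and D uninterruptible sleep (usually waiting on I/O).

```shell
#!/usr/bin/env bash

# Print PID, state, accumulated CPU time and command name for each PID given.
show_pids() {
    local pid
    for pid in "$@"; do
        # The trailing "=" after each column name suppresses the header line
        ps -p "$pid" -o pid=,stat=,time=,comm= || echo "$pid is not running"
    done
}
```

For the PIDs above this would be `show_pids 248584 248585 248586 248733`. Calling it twice a minute apart shows whether the CPU time of any process is still increasing.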

Weird... Please let me know if you need any more information. Thanks for your help!

riederd avatar riederd commented on June 12, 2024

Hmm...
Can you check whether /tmp is running out of space while the MarkDuplicates process is running?
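
One way to watch this, sketched under the assumption that the job runs under your own user (so `kill -0` can probe it) and that `df` reports the mount on a single line. The function names and the default log file name `tmp_usage.log` are placeholders.

```shell
#!/usr/bin/env bash

# Print the free space (in KiB) of the filesystem holding a directory.
free_kib() {
    # -P forces the one-line-per-filesystem POSIX format, -k uses 1024-byte blocks;
    # column 4 of the data line is the available space
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Append a timestamped reading to a log file every few seconds
# for as long as the given PID is alive.
# Arguments: directory, PID, interval in seconds (default 30), log file.
watch_free_space() {
    local dir="$1" pid="$2" interval="${3:-30}" log="${4:-tmp_usage.log}"
    while kill -0 "$pid" 2>/dev/null; do
        printf '%s %s KiB free\n' "$(date '+%F %T')" "$(free_kib "$dir")" >> "$log"
        sleep "$interval"
    done
}
```

With the values from the ps output above, this could be started in a second terminal as `watch_free_space /tmp/Kun/nextNEOpi 248584 30`. A free-space figure that drops toward zero before the process stalls would point at the temp directory.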

riederd avatar riederd commented on June 12, 2024

I think the problem could be related to a memory limit. Can you please post the contents of /data/kun/ChenHZ_lab/Neoantigens/patient353/T1/work/89/1f1771d1843dfa04c9ab2159038b5a/.error.log?

Try to reserve more memory in slurm for the process by setting something like:

withName:MarkDuplicates {
    cpus = 4
    memory = "96 GB"
}

in conf/process.config

KunFang93 avatar KunFang93 commented on June 12, 2024

Sorry for the late reply. I don't see .error.log in the folder:

(base) [kun@g1400png-ap01lp 1f1771d1843dfa04c9ab2159038b5a]$ less .
./              ../             .command.begin  .command.err    .command.log    .command.out    .command.run    .command.sh     .command.trace  .exitcode

OK, I will try it with the modified config file. Since we have found an alternative way to predict neoantigens for now, I will try your suggestion and report the results later in case others run into the same problem. Thanks again for your time and help!
