nf-core / circrna

circRNA quantification, differential expression analysis and miRNA target prediction of RNA-Seq data

Home Page: https://nf-co.re/circrna

License: MIT License

HTML 1.05% R 12.65% Shell 0.54% Python 4.06% Nextflow 81.70%

circrna circrna-prediction circrna-pipeline circular-rna rna-seq nf-core nextflow pipeline workflow bioinformatics

Introduction

nf-core/circrna is a bioinformatics pipeline to analyse total RNA sequencing data obtained from organisms with a reference genome and annotation. It takes a samplesheet and FASTQ files as input, performs quality control (QC), trimming, back-splice junction (BSJ) detection, annotation, quantification and miRNA target prediction of circular RNAs.

The pipeline is still under development, but the BSJ detection and quantification steps are already implemented and functional. The following features are planned to be implemented soon:

Isoform-level circRNA detection and quantification
circRNA-miRNA interaction analysis using SPONGE and spongEffects
Improved downstream analyses

If you want to contribute, feel free to create an issue or pull request on the GitHub repository or join the Slack channel.

Pipeline summary

Raw read QC (FastQC)
Adapter trimming (Trim Galore!)
BSJ detection
circRNA annotation
- Based on a GTF file
- Based on database files (if provided)
Extract circRNA sequences and build circular transcriptome
Merge circular transcriptome with linear transcriptome derived from provided GTF
Quantification of combined circular and linear transcriptome
- psirc-quant
miRNA binding affinity analysis (only if the mature parameter is provided)
- Normalizes miRNA expression (only if the mirna_expression parameter is provided)
- Binding site prediction
  - miRanda
  - TargetScan
- Perform majority vote on binding sites
- Compute correlations between miRNA and transcript expression levels (only if the mirna_expression parameter is provided)
Statistical tests (only if the phenotype parameter is provided)
- CircTest
MultiQC report (MultiQC)
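
The majority vote over binding sites in the summary above can be pictured with a short Python sketch. This is an illustration only, not the pipeline's implementation: the function name `majority_vote`, the tuple layout `(mirna, target, start, end)`, and the strict "more than half of the tools" threshold are all assumptions.

```python
from collections import defaultdict


def majority_vote(predictions):
    """Keep binding sites reported by more than half of the tools.

    `predictions` maps a tool name to a set of (mirna, target, start, end)
    tuples. The tuple layout is a hypothetical choice for this sketch.
    """
    n_tools = len(predictions)
    votes = defaultdict(int)
    for sites in predictions.values():
        # set() guards against a tool reporting the same site twice
        for site in set(sites):
            votes[site] += 1
    # Strict majority: with two tools, both must agree on a site
    return {site for site, count in votes.items() if count * 2 > n_tools}


predictions = {
    "miranda": {("miR-1", "circA", 10, 17), ("miR-2", "circA", 40, 47)},
    "targetscan": {("miR-1", "circA", 10, 17)},
}
print(sorted(majority_vote(predictions)))
```

With only miRanda and TargetScan in play, a strict majority is the same as the intersection of both tools' predictions; a looser threshold would be a one-line change.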

Usage

Note

If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

First, prepare a samplesheet with your input data that looks as follows:

sample,fastq_1,fastq_2
CONTROL,CONTROL_R1.fastq.gz,CONTROL_R2.fastq.gz
TREATMENT,TREATMENT_R1.fastq.gz,TREATMENT_R2.fastq.gz

Each row represents a FASTQ file (single-end) or a pair of FASTQ files (paired-end).
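
Before launching a run, it can help to check the samplesheet locally. The Python sketch below reads the three columns shown above and labels each row as single-end or paired-end. The helper `read_samplesheet` is hypothetical, and treating an empty `fastq_2` field as "single-end" is an assumption based on the description above rather than on the pipeline's own schema.

```python
import csv
import io


def read_samplesheet(handle):
    """Parse a samplesheet and label each row as single- or paired-end."""
    rows = []
    # The header is line 1, so the first data row is line 2
    for line_no, row in enumerate(csv.DictReader(handle), start=2):
        sample = (row.get("sample") or "").strip()
        fastq_1 = (row.get("fastq_1") or "").strip()
        fastq_2 = (row.get("fastq_2") or "").strip()
        if not sample or not fastq_1:
            raise ValueError(f"line {line_no}: 'sample' and 'fastq_1' are required")
        rows.append(
            {
                "sample": sample,
                "fastq_1": fastq_1,
                "fastq_2": fastq_2 or None,
                # Assumption: an empty fastq_2 means single-end data
                "layout": "paired-end" if fastq_2 else "single-end",
            }
        )
    return rows


example = io.StringIO(
    "sample,fastq_1,fastq_2\n"
    "CONTROL,CONTROL_R1.fastq.gz,CONTROL_R2.fastq.gz\n"
    "TREATMENT,TREATMENT_R1.fastq.gz,\n"
)
for entry in read_samplesheet(example):
    print(entry["sample"], entry["layout"])
```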

Now, you can run the pipeline using:

nextflow run nf-core/circrna \
   -profile <docker/singularity/.../institute> \
   --input samplesheet.csv \
   --outdir <OUTDIR>

Warning

Please provide pipeline parameters via the CLI or Nextflow -params-file option. Custom config files including those provided by the -c Nextflow option can be used to provide any configuration except for parameters; see docs.
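
As a rough illustration of the -params-file route, the Python sketch below renders parameters as JSON, one of the formats Nextflow accepts for that option. The helper name `render_params` and the value `results` for `outdir` are placeholders chosen for this example.

```python
import json


def render_params(**params):
    """Serialise pipeline parameters as JSON text suitable for -params-file."""
    return json.dumps(params, indent=2, sort_keys=True)


# Placeholder values; replace them with your own samplesheet and output directory
text = render_params(input="samplesheet.csv", outdir="results")
print(text)
```

Saving the printed text as, for example, params.json lets you launch the pipeline with `-params-file params.json` instead of repeating each `--parameter` on the command line.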

For more details and further functionality, please refer to the usage documentation and the parameter documentation.

Pipeline output

To see the results of an example test run with a full size dataset refer to the results tab on the nf-core website pipeline page. For more details about the output files and reports, please refer to the output documentation.

Credits

nf-core/circrna was originally written by Barry Digby. It was later refactored, extended and improved by Nico Trummer.

We thank the following people for their extensive assistance in the development of this pipeline (in alphabetical order):

Acknowledgements

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on the Slack #circrna channel (you can join with this invite).

Citations

nf-core/circrna: a portable workflow for the quantification, miRNA target prediction and differential expression analysis of circular RNAs.

Barry Digby, Stephen P. Finn, & Pilib Ó Broin

BMC Bioinformatics 24, 27 (2023) doi: 10.1186/s12859-022-05125-8

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

circrna's Issues

Implement circRNADisease

Test workflow not working

Hello,

I am unable to successfully reproduce the test workflow :
nextflow run nf-core/circrna -profile test,docker -r dev

Error executing process > 'FASTA (fust1_1:CIRCexplorer2)'

Caused by:
  Missing output file(s) `fasta/*` expected by process `FASTA (fust1_1:CIRCexplorer2)`

Command executed:

  ## FASTA sequences (bedtools does not like the extra annotation info - split will not work properly)
  cut -d$'      ' -f1-12 fust1_1.bed > bed12.tmp
  bedtools getfasta -fi chrI.fa -bed bed12.tmp -s -split -name > circ_seq.tmp
  ## clean fasta header
  grep -A 1 '>' circ_seq.tmp | cut -d: -f1,2,3 > circ_seq.fa && rm circ_seq.tmp
  ## output to dir
  mkdir -p fasta
  awk -F '>' '/^>/ {F=sprintf("fasta/%s.fa",$2); print > F;next;} {print >> F;}' < circ_seq.fa

Command exit status:
  0

Command output:
  (empty)

Command error:
  index file chrI.fa.fai not found, generating...

I have also tried the master branch:
nextflow run nf-core/circrna -profile test,docker -r master
This fails with a different error.

I have attached the log files for both attempts.
nextflow.dev.log
nextflow.master.log

Hardware: Desktop, 32 threads, 64 GB RAM
Executor: local
OS: Ubuntu 22 LTS
Nextflow: version 22.04.5.5708
Engine: Docker version 20.10.17, build 100c701
Image tag: https://github.com/nf-core/circrna [determined_pare] DSL1 - revision: cb41d06 [dev]

Implement circbank

Add `psirc quant` output to MultiQC via `kallisto` module

Description of feature

Kallisto module docs

Implement psirc for isoform detection

Documentation can be found here

Tasks

Set up environment
Implement process

Unexpected error [StackOverflowError]

Hi, is there anything I need to set up before I can run this pipeline?
I can run Nextflow with the rnaseq image, so it should not be a Nextflow installation problem.

After cloning the repository with git clone, I ran the test and got this error:

Linux CentOS 7

nextflow run circrna -profile test,docker --input circna_ss.csv --outdir result/
N E X T F L O W  ~  version 21.10.6
Launching `circrna/main.nf` [awesome_hypatia] - revision: 059337e827
WARNING: Could not load nf-core/config profiles: https://raw.githubusercontent.com/nf-core/configs/master/nfcore_custom.config
Unexpected error [StackOverflowError]

Deprecate the `module` parameter

Description of feature

This parameter can safely be replaced with checks if specific other parameters have been defined. Currently, it is only used for enabling/disabling mirna_prediction and differential_expression.
For mirna_prediction we can use the mature.fa as an indicator. For differential_expression we can base this on the phenotype parameter.
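
The proposed replacement logic can be sketched in a few lines of Python. This is only a model of the idea described above, not pipeline code: the function `enabled_steps` is hypothetical, and it assumes circRNA discovery always runs while the other two steps are switched on by the presence of the `mature` and `phenotype` parameters.

```python
def enabled_steps(params):
    """Derive the active workflow steps from which parameters are set."""
    steps = ["circrna_discovery"]  # assumed to run unconditionally
    if params.get("mature"):
        # A mature miRNA FASTA was provided, so miRNA prediction can run
        steps.append("mirna_prediction")
    if params.get("phenotype"):
        # A phenotype file was provided, so differential expression can run
        steps.append("differential_expression")
    return steps


print(enabled_steps({"mature": "mature.fa"}))
print(enabled_steps({"mature": "mature.fa", "phenotype": "phenotype.csv"}))
```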

Problem with star 2nd pass, Process exceeded running time limit - CIRCexplorer2

Hi, I am trying to run the pipeline with the command below, but I encounter the error "Could not find any nv files on this host!". It seems STAR writes its temporary files somewhere other than the work directory?

My command:
./nextflow run circrna -profile singularity -c mynf.config --input SS.csv --outdir SSout -work-dir work/ --module 'circrna_discovery' --tool_filter 2 --skip_trimming true --skip_fastqc true --fasta hg38.fasta --gtf gencode.v45.basic.annotation.gtf --save_reference

Execution cancelled -- Finishing pending tasks before exit
-[nf-core/circrna] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:STAR_2ND_PASS (SS)'

Caused by:
  Process exceeded running time limit (16h)

Command executed:

  STAR \
      --genomeDir star \
      --readFilesIn input1/SS_1.merged.fastq.gz input2/SS_2.merged.fastq.gz \
      --runThreadN 8 \
      --outFileNamePrefix SS. \
      --outSAMtype BAM Unsorted \
       \
      --outSAMattrRGline 'ID:SS'  'SM:SS'  \
      --chimOutType Junctions WithinBAM --outSAMunmapped Within --outFilterType BySJout --outReadsUnmapped None --readFilesCommand zcat --sjdbFileChrStartEnd dataset.SJ.out$
...
Command exit status:
  -
Command output:
        STAR --genomeDir star --readFilesIn input1/SS_1.merged.fastq.gz input2/SS_2.merged.fastq.gz --runThreadN 8 --outFileNamePrefix SS. $
        STAR version: 2.7.10a   compiled: 2022-01-14T18:50:00-05:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
  May 09 00:13:59 ..... started STAR run
  May 09 00:14:00 ..... loading genome
  May 09 00:14:43 ..... inserting junctions into the genome indices
  May 09 00:26:45 ..... started mapping

Command error:
  INFO:    Could not find any nv files on this host!

Work dir:
work/9d/685bc836c17ef132cdb98f3a558c50

Not all samples are used in analysis, resulting in pipeline failure

Description of the bug

Hello,

Thank you for the pipeline. I am having considerable trouble running it to completion. I tried it with both Docker and Singularity, on both the master and dev branches, and a different issue occurs each time. I am listing the one that is key for me to move forward.

In the dev branch

I have 6 samples, but after the FASTQC/TRIMGALORE steps only one sample is run through STAR 1st pass. After the run fails and I resume the pipeline, the STAR steps show a completely different sample.

Command used and terminal output

Command

nextflow run nf-core/circrna  \
--input 'samplesheet.csv' \
--phenotype 'phenotype.csv' \
--genome 'GRCm38' \
--outdir 'Analysis' \
--tool 'circexplorer2' \
--module 'circrna_discovery,mirna_prediction,differential_expression' \
--mature '/BI/GenomeDB/Mus_musculus/GRCm38.102/mature_mus_musculus.fa' \
--fasta '/BI/GenomeDB/Mus_musculus/GRCm38.102/Mus_musculus.GRCm38.102.fa' \
--gtf '/BI/GenomeDB/Mus_musculus/GRCm38.102/Mus_musculus.GRCm38.102.gtf' \
--hisat2 '/BI/GenomeDB/Mus_musculus/GRCm38.102/hisat2_cricRNA' \
--star '/BI/GenomeDB/Mus_musculus/GRCm38.102/STAR' \
--igenomes_base /BI/GenomeDB \
--species Grcm38 -resume -profile docker --bsj_reads 2 -r dev

Try1

[-        ] process > NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:STAR_GENOMEGENERATE                                                   -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:SEGEMEHL_INDEX                                                        -
[c7/37137c] process > NFCORE_CIRCRNA:CIRCRNA:FASTQC_TRIMGALORE:FASTQC (Sample-04)                                               [100%] 6 of 6, cached: 6 ✔
[bf/5da299] process > NFCORE_CIRCRNA:CIRCRNA:FASTQC_TRIMGALORE:TRIMGALORE (Sample-04)                                           [100%] 6 of 6, cached: 6 ✔
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:SEGEMEHL_ALIGN                                                     -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:SEGEMEHL_FILTER                                                    -
[d3/51cbba] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:STAR_1ST_PASS (Sample-03)                                        [100%] 1 of 1 ✔
[38/8a4c88] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:STAR_SJDB (star_sjdb)                                              [100%] 1 of 1 ✔
[7f/b54d3b] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:STAR_2ND_PASS (Sample-03)                                        [100%] 1 of 1 ✔
[02/6c4629] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCEXPLORER2_REF (Mus_musculus.GRCm38.102.gtf)                    [100%] 1 of 1, cached: 1 ✔
[2d/2cef78] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCEXPLORER2_PAR (Sample-03)                                    [100%] 1 of 1 ✔
[9a/848547] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCEXPLORER2_ANN (Sample-03)                                    [100%] 1 of 1 ✔
[85/958c3b] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCEXPLORER2_FLT (Sample-03)                                    [100%] 1 of 1 ✔
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCRNA_FINDER_FILTER                                              -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:FIND_CIRC_ALIGN                                                    -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:SAMTOOLS_INDEX                                                     -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:SAMTOOLS_VIEW                                                      -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:FIND_CIRC_ANCHORS                                                  -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:FIND_CIRC                                                          -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:FIND_CIRC_FILTER                                                   -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT_YML                                                      -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT                                                          -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT_FILTER                                                   -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_1ST_PASS                                                       -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_SJDB                                                           -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_2ND_PASS                                                       -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE1_1ST_PASS                                                 -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE1_SJDB                                                     -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE1_2ND_PASS                                                 -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE2_1ST_PASS                                                 -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE2_SJDB                                                     -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE2_2ND_PASS                                                 -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC                                                                -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_FILTER                                                         -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_REFERENCE                                                -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_ALIGN                                                    -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_PARSE                                                    -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_ANNOTATE                                                 -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_FILTER                                                   -
[22/84f47e] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:ANNOTATION (Sample-03:circexplorer2)                             [  0%] 0 of 1
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:FASTA                                                              -
[be/f37df8] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:COUNTS_SINGLE (circexplorer2)                                      [100%] 1 of 1 ✔
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:MIRNA_PREDICTION:TARGETSCAN_DATABASE                                                 -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:MIRNA_PREDICTION:TARGETSCAN                                                          -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:MIRNA_PREDICTION:MIRANDA                                                             -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:MIRNA_PREDICTION:MIRNA_TARGETS                                                       -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:HISAT2_ALIGN                                                 -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_SORT                        -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_INDEX                       -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_STATS    -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_FLAGSTAT -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_IDXSTATS -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:STRINGTIE_STRINGTIE                                          -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:STRINGTIE_PREPDE                                             -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:DESEQ2_DIFFERENTIAL_EXPRESSION                               -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:PARENT_GENE                                                  -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:PREPARE_CLR_TEST                                             -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:CIRCTEST                                                     -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CUSTOM_DUMPSOFTWAREVERSIONS                                                          -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:MULTIQC                                                                              -
-[nf-core/circrna] Pipeline completed with errors-
WARN: Killing running tasks (1)



Try2
[17/7a55ab] process > NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:HISAT2_BUILD (Mus_musculus.GRCm38.102.fa)                             [  0%] 0 of 1
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:STAR_GENOMEGENERATE                                                   -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:SEGEMEHL_INDEX                                                        -
[a1/cea211] process > NFCORE_CIRCRNA:CIRCRNA:FASTQC_TRIMGALORE:FASTQC (Sample-06)                                               [100%] 6 of 6, cached: 6 ✔
[68/c1e594] process > NFCORE_CIRCRNA:CIRCRNA:FASTQC_TRIMGALORE:TRIMGALORE (Sample-03)                                           [100%] 6 of 6, cached: 6 ✔
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:SEGEMEHL_ALIGN                                                     -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:SEGEMEHL_FILTER                                                    -
[49/33bd8f] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:STAR_1ST_PASS (Sample-02)                                        [100%] 1 of 1, cached: 1 ✔
[45/93808b] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:STAR_SJDB (star_sjdb)                                              [100%] 1 of 1, cached: 1 ✔
[e5/fd8186] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:STAR_2ND_PASS (Sample-02)                                        [100%] 1 of 1, cached: 1 ✔
[76/4c2152] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCEXPLORER2_REF (Mus_musculus.GRCm38.102.gtf)                    [100%] 1 of 1, cached: 1 ✔
[08/a8caeb] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCEXPLORER2_PAR (Sample-02)                                    [100%] 1 of 1, cached: 1 ✔
[76/4ca3bc] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCEXPLORER2_ANN (Sample-02)                                    [100%] 1 of 1, cached: 1 ✔
[52/c81cf8] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCEXPLORER2_FLT (Sample-02)                                    [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCRNA_FINDER_FILTER                                              -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:FIND_CIRC_ALIGN                                                    -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:SAMTOOLS_INDEX                                                     -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:SAMTOOLS_VIEW                                                      -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:FIND_CIRC_ANCHORS                                                  -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:FIND_CIRC                                                          -

Relevant files

No response

System information

nextflow version 23.04.1
Ubuntu 18.04.6 LTS (GNU/Linux 4.15.0-213-generic x86_64)
Container Docker
Hardware Local Linux

nf-core/circrna -r dev

Implement polyA-Benchmarking

Description of feature

Perform a parallel execution of CIRCRNA_DISCOVERY using polyA-enriched data to investigate the tendency of tools to find circRNAs in datasets where there should not be any. Results should be added to MultiQC.

Implement improved annotation of hits also in the DIFFERENTIAL_EXPRESSION:PARENT_GENE process

Recently, @nictru updated the annotation pipeline (with #95), improving the overall performance. This should also be applied to the DIFFERENTIAL_EXPRESSION:PARENT_GENE process

Wrong BWA index directory in references

Description of the bug

The CIRIquant step cannot find the BWA index files.
This happens regardless of whether the reference files are fetched from AWS or provided locally.

Command used and terminal output

nextflow run nf-core/circrna \
 -profile docker \
 --input test_samples.csv \
 --genome GRCh38 \
 --input_type fastq \
 -r 5e17f6cbbc74b2c3bc807d26662dd7f411759b33 \
 --module 'circrna_discovery, mirna_prediction' \
 --tool 'ciriquant' \
 --bwa "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/BWAIndex/" \
 --bowtie "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/BowtieIndex/" \
 --bowtie2 "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/Bowtie2Index/" \
 --star "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Sequence/STARIndex/" \
 --gtf "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Annotation/Genes/genes.gtf" \
 --bed12 "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Annotation/Genes/genes.bed" \
 --mature "/home/lab32/references/Homo_sapiens/NCBI/GRCh38/Annotation/SmallRNA/mature.fa"

Workflow execution completed unsuccessfully!
The exit status of the task that caused the workflow execution to fail was: 1.

The full error message was:

Error executing process > 'CIRIQUANT (SRR16316888)'

Caused by:
  Process `CIRIQUANT (SRR16316888)` terminated with an error exit status (1)

Command executed:

  CIRIquant \
      -t 16 \
      -1 SRR16316888_1.fastq.gz \
      -2 SRR16316888_2.fastq.gz \
      --config travis.yml \
      --no-gene \
      -o SRR16316888 \
      -p SRR16316888
  
  ## Apply Filtering
  cp SRR16316888/SRR16316888.gtf .
  
  ## extract counts (convert float/double to int [no loss of information])
  grep -v "#" SRR16316888.gtf | awk '{print $14}' | cut -d '.' -f1 > counts
  grep -v "#" SRR16316888.gtf | awk -v OFS="	" '{print $1,$4,$5,$7}' > SRR16316888.tmp
  paste SRR16316888.tmp counts > SRR16316888_unfilt.bed
  
  ## filter bsj_reads
  awk '{if($5 >= 0) print $0}' SRR16316888_unfilt.bed > SRR16316888_filt.bed
  grep -v '^$' SRR16316888_filt.bed > SRR16316888_ciriquant
  
  ## correct offset bp position
  awk -v OFS="	" '{$2-=1;print}' SRR16316888_ciriquant > SRR16316888_ciriquant.bed
  
  rm SRR16316888.gtf
  
  ## Re-work for Annotation
  awk -v OFS="	" '{print $1, $2, $3, $1":"$2"-"$3":"$4, $5, $4}' SRR16316888_ciriquant.bed > SRR16316888_ciriquant_circs.bed

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/opt/conda/envs/nf-core-circrna-1.0.0/bin/CIRIquant", line 8, in <module>
      sys.exit(main())
    File "/opt/conda/envs/nf-core-circrna-1.0.0/lib/python2.7/site-packages/CIRIquant/main.py", line 89, in main
      config = check_config(check_file(args.config_file))
    File "/opt/conda/envs/nf-core-circrna-1.0.0/lib/python2.7/site-packages/CIRIquant/utils.py", line 95, in check_config
      BWA_INDEX = os.path.splitext(check_file(config['reference']['bwa_index'] + '.bwt'))[0]
    File "/opt/conda/envs/nf-core-circrna-1.0.0/lib/python2.7/site-packages/CIRIquant/utils.py", line 49, in check_file
      raise ConfigError('File: {}, not found'.format(file_name))
  CIRIquant.utils.ConfigError: File: /home/lab32/Downloads/teste/results/reference_genome/BWAIndex/genome.fa.bwt, not found

Work dir:
  /home/lab32/Downloads/teste/work/2d/9d56bcc3e7d596e02df26bf69a5383

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

Relevant files

nextflow.log

System information

N E X T F L O W ~version 22.10.4, build 5836 (09-12-2022 09:58 UTC)
Desktop
local
docker
Ubuntu 22 LTS
nf-core/circrna v1.0.0

DSL2 modules

Description of feature

Categorization of the workflow at the process level with the corresponding modules needed to port to 'DSL2'. Once the modules have been created, I can place more shape on this in terms of subworkflows.

N.B: please checkout new branches for individual features and push to the DSL2 branch, not dev.

Input files

Currently, circRNA takes as input a samplesheet.csv file and a phenotype.csv file. Functions already exist to check these files; all that is needed is to place these in an input_check.nf local subworkflow.

I would like to incorporate strandedness like other nf-core workflows. Will check which circRNA quantification tools have a parameter denoting strandedness.

Pre-processing

The workflow takes as input fastq or bam files (which are converted to fastq using picard SamToFastq) and performs FastQC on the raw reads prior to trimming using BBDUK. The trimmed reads are then checked using FastQC again and placed in channels for downstream analyses.

FastQC
MultiQC
BBDUK
picard/SamToFastq (I don't care if we drop this functionality.)

circRNA Discovery

Several tools utilize the same aligner, so there will be duplicates here.

CIRIquant

bwa index
hisat build
ciriquant

CIRCexplorer2

STAR genomegenerate
STAR align (2 Pass mode)
circexplorer2 parse
circexplorer2 annotate

circRNA_finder

star genomegenerate
star align (2 Pass mode)
circRNA_Finder (postProcessStarAlignment.pl script)

DCC

DCC maps paired-end reads jointly and separately using STAR 2 pass mode. The goal is to generate chimeric.junction.out files from joint STAR mapping and individual read 1 and read 2 STAR mapping.

star genomegenerate
star align (2 Pass)
dcc

find_circ

bowtie2 build
bowtie2 align
find_circ find_anchors
find_circ find_circ

Mapsplice

bowtie build
mapsplice align
circexplorer2 parse
circexplorer2 annotate

Segemehl

segemehl align

Custom scripts to parse segemehl output, no need to create a module.

circRNA annotation

Customized Bash script to standardise the annotation outputs from the seven quantification tools.

circRNA FASTA sequence

Customized Bash script to generate the mature spliced sequence in FASTA format, and append the back-splice junction sequence for miRNA target prediction.

circRNA count matrix

Consolidate the circRNAs called by multiple tools on a per-sample basis, then generate the count matrix.
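
One way to picture this consolidation step is the Python sketch below. It is a sketch under stated assumptions, not the workflow's script: the nested input layout, the function name `count_matrix`, the `min_tools` threshold (modelled on the `--tool_filter` option seen in the commands elsewhere on this page), and the choice of taking the maximum count across tools are all illustrative.

```python
def count_matrix(calls, min_tools=1):
    """Build a circRNA-by-sample count matrix.

    `calls` has the hypothetical layout {sample: {tool: {circ_id: bsj_reads}}}.
    A circRNA is kept for a sample only if at least `min_tools` tools called it.
    """
    per_sample = {}
    for sample, by_tool in calls.items():
        support = {}
        for counts in by_tool.values():
            for circ_id, reads in counts.items():
                support.setdefault(circ_id, []).append(reads)
        per_sample[sample] = {
            # Illustrative choice: report the highest count among the tools
            circ_id: max(reads)
            for circ_id, reads in support.items()
            if len(reads) >= min_tools
        }
    samples = sorted(per_sample)
    circ_ids = sorted({c for kept in per_sample.values() for c in kept})
    # Rows are circRNAs, columns are samples; missing calls become 0
    matrix = [[per_sample[s].get(c, 0) for s in samples] for c in circ_ids]
    return circ_ids, samples, matrix


calls = {
    "S1": {
        "circexplorer2": {"chr1:100-200:+": 5, "chr1:300-400:-": 2},
        "ciriquant": {"chr1:100-200:+": 7},
    },
    "S2": {
        "circexplorer2": {"chr1:300-400:-": 3},
        "ciriquant": {"chr1:300-400:-": 4},
    },
}
circ_ids, samples, matrix = count_matrix(calls, min_tools=2)
for circ_id, row in zip(circ_ids, matrix):
    print(circ_id, dict(zip(samples, row)))
```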

miRNA target prediction

miRanda

miranda

TargetScan

targetscan. biocontainers #475

custom script to amalgamate the results from both tools.

Differential expression

hisat build
hisat align
stringtie

Custom R scripts for DESeq2 and CircTest, no need to create modules.

Add strandedness column to samplesheet

Description of feature

Discovery tools can handle strandedness information to improve their result quality. Currently the pipeline offers no way of specifying the strandedness.

The following changes need to be made:

Tasks

Update the samplesheet schema
Make sure the information gets passed through to the discovery tool
Find out which discovery tools can make use of the information
Add proper handling of the information in all tools supporting this
Update documentation
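
As a starting point for the schema change, the Python sketch below shows how a per-row strandedness value might be validated. Everything in it is an assumption for discussion: the column name `strandedness`, the allowed values (borrowed from the convention used by other nf-core workflows), the default, and the helper `parse_strandedness`.

```python
# Assumed vocabulary; the final schema may use different values
ALLOWED_STRANDEDNESS = {"unstranded", "forward", "reverse"}


def parse_strandedness(row, default="unstranded"):
    """Return a validated strandedness value for one samplesheet row."""
    value = (row.get("strandedness") or "").strip().lower() or default
    if value not in ALLOWED_STRANDEDNESS:
        allowed = ", ".join(sorted(ALLOWED_STRANDEDNESS))
        raise ValueError(f"invalid strandedness {value!r}; expected one of: {allowed}")
    return value


print(parse_strandedness({"sample": "CONTROL", "strandedness": "Forward"}))
print(parse_strandedness({"sample": "TREATMENT"}))
```

Falling back to a default keeps existing samplesheets without the new column valid, which may or may not be the behaviour the maintainers want.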

Add support for full length circRNA sequence detection

Description of feature

Currently, only the BSJs are detected and quantified, without knowing the exact sequences. Several tools, listed below, can obtain the full-length circRNA sequence from paired-end data. Unfortunately, none of these tools can process single-end data.

We should then create a parameter for switching between BSJ level and Isoform level discovery.

Tasks

Implement JCcirc #133 (0 of 6 tasks; labels: feature-request, first-timers-only)
Implement CIRI-full #134 (0 of 4 tasks; labels: feature-request, first-timers-only)
Implement circRNAfull #135 (0 of 3 tasks; labels: feature-request, first-timers-only)
Implement psirc for isoform detection #136 (1 of 2 tasks; labels: feature-request, first-timers-only)

BSJ reads filtering

Allow user to require circRNAs to have [int] reads spanning BSJ.

Pass as parameter to awk filtering steps / shell scripts in bin.
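
The filter itself is simple; the Python sketch below expresses the same idea as the awk steps. It assumes a tab-separated BED-like layout with the BSJ read count in the fifth column, as in the CIRIquant command shown in another issue on this page, and the function name `filter_bsj` is hypothetical.

```python
def filter_bsj(bed_lines, min_reads):
    """Keep BED-like lines whose fifth column (BSJ reads) is >= min_reads."""
    kept = []
    for line in bed_lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip empty lines, like the grep -v '^$' step
        fields = line.split("\t")
        # Assumed columns: chrom, start, end, strand, bsj_reads
        if int(fields[4]) >= min_reads:
            kept.append(line)
    return kept


lines = [
    "chr1\t100\t200\t+\t5",
    "chr1\t300\t400\t-\t1",
    "",
]
for line in filter_bsj(lines, min_reads=2):
    print(line)
```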

Timeout waiting for connection from pool

Description of the bug

Hi,
I get the error in the title when I run the command line below.
The error is this one: https://www.biostars.org/p/9590721/ but related to circrna.

What's wrong?

thanks in advance

Command used and terminal output

./nextflow run nf-core/circrna -profile docker --genome GRCh38 --input samples.csv --tool 'ciriquant' --module 'circrna_discovery' --outdir ./results_single --tool_filter 1 -r dev

Relevant files

.nextflow.log

System information

N E X T F L O W ~ version 23.10.1
OS: Rocky Linux 8
Container: Docker
Version of nf-core/circrna: revision: c29124f [dev]

conda recipes

nf-core/circrna modules that require package builds for the conda repository:

CIRIquant

PR: bioconda/bioconda-recipes#38029

Not working because argparse>=1.2.1 and scipy==1.2.2 are not found using defaults/conda-forge/bioconda.

argparse>=1.2.1 is available from this channel: pdrops::argparse==1.2.2 but I cannot figure out how to tell my build to incorporate this channel. I have added it to my local conda config, to no avail...

scipy==1.2.2 does not exist on conda.

I have tried both pypi skeletons and generic noarch python skeletons here, so I am not sure how to proceed.

circtools

PR: bioconda/bioconda-recipes#37786

posterity..

The author of DCC made a pypi package called circtools available at https://pypi.org/project/circtools/. This is convenient for me as it bundles DCC and CircTest together, both of which are used in the workflow.

grayskull pypi circtools
conda build recipes/circtools

Successfully builds, pushed to anaconda to test

conda config --set anaconda_upload yes
anaconda upload \
    /Users/bdigby/opt/anaconda3/conda-bld/osx-64/circtools-1.2.1-py39_0.tar.bz2

TODO: This package works as expected, but I need it to be hosted by bioconda and not b.digby
https://anaconda.org/b.digby/circtools

bioconda-utils build --docker --mulled-test --packages recipes/circtools fails, see below:

error log

(bioconda) bdigby@sr-loaner-c02ytamllvcf bioconda-recipes % bioconda-utils build --docker --mulled-test --packages circtools
14:40:40 BIOCONDA INFO Considering total of 1 recipes (circtools).
14:40:40 BIOCONDA INFO Processing 1 recipes (circtools).
14:40:41 BIOCONDA INFO Generating DAG
Loading Recipes: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 69.64it/s]
14:40:41 BIOCONDA INFO 1 recipes to build and test: 
circtools
14:40:41 BIOCONDA INFO Determining expected packages for recipes/circtools
Setting build platform. This is only useful when pretending to be on another platform, such as for rendering necessary dependencies on a non-native platform. I trust that you know what you're doing.
14:40:41 CONDA_BUILD WARNING Setting build platform. This is only useful when pretending to be on another platform, such as for rendering necessary dependencies on a non-native platform. I trust that you know what you're doing.
Updating build index: /Users/bdigby/opt/anaconda3/envs/bioconda/conda-bld

No numpy version specified in conda_build_config.yaml.  Falling back to default numpy value of 1.16
14:40:41 CONDA_BUILD WARNING No numpy version specified in conda_build_config.yaml.  Falling back to default numpy value of 1.16
Adding in variants from internal_defaults
14:40:41 CONDA_BUILD INFO Adding in variants from internal_defaults
Adding in variants from /Users/bdigby/opt/anaconda3/envs/bioconda/conda_build_config.yaml
14:40:41 CONDA_BUILD INFO Adding in variants from /Users/bdigby/opt/anaconda3/envs/bioconda/conda_build_config.yaml
Adding in variants from /Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/bioconda_utils/bioconda_utils-conda_build_config.yaml
14:40:41 CONDA_BUILD INFO Adding in variants from /Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/bioconda_utils/bioconda_utils-conda_build_config.yaml
defaults/noarch: 4.77MB [00:00, 30.2MB/s]                                                                                      | 0/9 [00:00<?, ?files/s]
defaults/linux: 33.5MB [00:00, 39.3MB/s]                                                                               | 1/9 [00:00<00:03,  2.61files/s]
bioconda/linux: 29.0MB [00:01, 16.4MB/s]▍                                                                              | 2/9 [00:01<00:07,  1.07s/files]
defaults/osx: 30.8MB [00:03, 10.0MB/s]█████████████▋                                                                   | 3/9 [00:03<00:06,  1.07s/files]
conda-forge/noarch: 69.2MB [00:04, 16.6MB/s]██████████████████▉                                                        | 4/9 [00:04<00:06,  1.20s/files]
bioconda/osx: 23.4MB [00:04, 5.89MB/s]MB/s]]
bioconda/noarch: 26.1MB [00:00, 86.9MB/s]█████████████████████████████████                                             | 5/9 [00:08<00:08,  2.09s/files]
conda-forge/linux: 215MB [00:10, 20.8MB/s]██████████████████████████████████████████████████████▌                      | 7/9 [00:09<00:02,  1.30s/files]
conda-forge/osx: 187MB [00:20, 9.59MB/s]███████████████████████████████████████████████████████████████████▊           | 8/9 [00:19<00:03,  3.83s/files]
Downloading: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:27<00:00,  3.09s/files]
Updating build index: /Users/bdigby/opt/anaconda3/envs/bioconda/conda-bld

Adding in variants from argument_variants
14:41:20 CONDA_BUILD INFO Adding in variants from argument_variants
Attempting to finalize metadata for circtools
14:41:20 CONDA_BUILD INFO Attempting to finalize metadata for circtools
bioconda/noarch                                      3.8MB @   4.0MB/s  1.1s
conda-forge/noarch                                  10.1MB @   4.4MB/s  2.6s
bioconda/osx-64                                      3.7MB @   1.3MB/s  2.9s
conda-forge/osx-64                                  24.7MB @   4.9MB/s  6.1s
Reloading output folder: /Users/bdigby/opt/anaconda3/envs/bioconda/conda-bld
Users/bdigby/opt/anaconda3/envs/bioconda/conda-b..  ??.?MB @  ??.?MB/s 0 failed  0.0s
Users/bdigby/opt/anaconda3/envs/bioconda/conda-b.. 127.0 B @ 296.0kB/s  0.0s
Transaction

  Prefix: /Users/bdigby/opt/anaconda3/envs/bioconda

  Nothing to do

Reloading output folder: /Users/bdigby/opt/anaconda3/envs/bioconda/conda-bld
Users/bdigby/opt/anaconda3/envs/bioconda/conda-b..  ??.?MB @  ??.?MB/s 0 failed  0.0s
Users/bdigby/opt/anaconda3/envs/bioconda/conda-b.. 127.0 B @   1.2MB/s  0.0s
Transaction

  Prefix: /Users/bdigby/opt/anaconda3/envs/bioconda

  Updating specs:

   - pip
   - python=3.9[build=*_cpython]


  Package              Version  Build               Channel                                                                          Size
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  Install:
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

  + bzip2                1.0.8  h0d85af4_4          conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/osx-64     159kB
  + ca-certificates  2022.9.24  h033912b_0          conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/osx-64     154kB
  + libffi               3.4.2  h0d85af4_5          conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/osx-64      51kB
  + libsqlite           3.39.4  ha978bb4_0          conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/osx-64     891kB
  + libzlib             1.2.13  hfd90126_4          conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/osx-64      66kB
  + ncurses                6.3  h96cf925_1          conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/osx-64     937kB
  + openssl              3.0.7  hfd90126_0          conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/osx-64       3MB
  + pip                   22.3  pyhd8ed1ab_0        conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/noarch       2MB
  + python              3.9.13  hf8d34f4_0_cpython  conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/osx-64      13MB
  + readline             8.1.2  h3899abd_0          conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/osx-64     272kB
  + setuptools          65.5.0  pyhd8ed1ab_0        conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/noarch     787kB
  + sqlite              3.39.4  h9ae0607_0          conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/osx-64     897kB
  + tk                  8.6.12  h5dbffcc_0          conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/osx-64       4MB
  + tzdata               2022f  h191b570_0          conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/noarch     121kB
  + wheel               0.37.1  pyhd8ed1ab_0        conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/noarch      32kB
  + xz                   5.2.6  h775f41a_0          conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/osx-64     238kB

  Summary:

  Install: 16 packages

  Total download: 26MB

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Traceback (most recent call last):
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/bin/bioconda-utils", line 10, in <module>
    sys.exit(main())
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/bioconda_utils/cli.py", line 971, in main
    bioconductor_skeleton, clean_cran_skeleton, autobump, bot
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/argh/dispatching.py", line 328, in dispatch_commands
    dispatch(parser, *args, **kwargs)
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/argh/dispatching.py", line 174, in dispatch
    for line in lines:
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/argh/dispatching.py", line 277, in _execute_command
    for line in result:
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/argh/dispatching.py", line 260, in _call
    result = function(*positional, **keywords)
  File "<boltons.funcutils.FunctionBuilder-5>", line 2, in build
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/bioconda_utils/cli.py", line 130, in wrapper
    func(*args, **kwargs)
  File "<boltons.funcutils.FunctionBuilder-4>", line 2, in build
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/bioconda_utils/cli.py", line 59, in wrapper
    func(*args, **kwargs)
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/bioconda_utils/cli.py", line 476, in build
    keep_old_work=keep_old_work)
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/bioconda_utils/build.py", line 320, in build_recipes
    pkg_paths = utils.get_package_paths(recipe, check_channels, force=force)
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/bioconda_utils/utils.py", line 1126, in get_package_paths
    platform, metas = _load_platform_metas(recipe, finalize=True)
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/bioconda_utils/utils.py", line 1035, in _load_platform_metas
    return platform, load_all_meta(recipe, config=config, finalize=finalize)
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/bioconda_utils/utils.py", line 447, in load_all_meta
    for non_finalized_meta in metas
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/bioconda_utils/utils.py", line 453, in <listcomp>
    bypass_env_check=False,
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/conda_build/api.py", line 52, in render
    bypass_env_check=bypass_env_check):
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/conda_build/metadata.py", line 2122, in get_output_metadata_set
    bypass_env_check=bypass_env_check)
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/conda_build/metadata.py", line 782, in finalize_outputs_pass
    permit_unsatisfiable_variants=permit_unsatisfiable_variants)
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/conda_build/render.py", line 548, in finalize_metadata
    exclude_pattern)
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/conda_build/render.py", line 409, in add_upstream_pins
    permit_unsatisfiable_variants, exclude_pattern)
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/conda_build/render.py", line 375, in _read_upstream_pin_files
    permit_unsatisfiable_variants=permit_unsatisfiable_variants)
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/conda_build/render.py", line 148, in get_env_dependencies
    channel_urls=tuple(m.config.channel_urls))
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/boa/cli/mambabuild.py", line 124, in mamba_get_install_actions
    solution = solver.solve_for_action(_specs, prefix)
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/boa/core/solver.py", line 242, in solve_for_action
    self.index + self.local_index,
  File "/Users/bdigby/opt/anaconda3/envs/bioconda/lib/python3.7/site-packages/boa/core/solver.py", line 79, in to_action
    entry = lookup_dict[get_url_from_channel(c)]
KeyError: 'https://conda.anaconda.org.-a24869bc-cf14-4239-84c7-1b52b5a11e95/conda-forge/osx-64'

find_circ

PR: bioconda/bioconda-recipes#37923

circrna_finder

PR: bioconda/bioconda-recipes#37922

targetscan

PR: bioconda/bioconda-recipes#37960

I cannot find a license file for targetscan, and it is denoted as being Copyright (c) The Whitehead Institute of Biomedical Research.

Is it appropriate to add this to bioconda? If not, how can I proceed to ensure I am following nf-core best practices?

DEA

Mulled container for differential exp scripts. PR: BioContainers/multi-package-containers#2382

Implement JCcirc

Documentation can be found here.

Implement options for creating contig files:

Tasks

Trinity
SPAdes
SOAPdenovo-Trans
Add parameter for deciding which of the above should be used

Then implement JCcirc:

Tasks

Create environment
Implement process

Deprecated mapsplice docker image: quay.io/biocontainers/mapsplice:2.2.1--py27h07887db_0

Description of the bug

When running nf-core/circrna with mapsplice, Docker showed a DEPRECATION NOTICE while pulling the mapsplice image: quay.io/biocontainers/mapsplice:2.2.1--py27h07887db_0

Command used and terminal output

Command used: nextflow run nf-core/circrna -r dev --input sample_sheet_for_circrna.csv --outdir circrna_output  -profile docker  --phenotype phenotype_for_circrna_diff_expr.csv --module 'circrna_discovery,mirna_prediction,differential_expression' --tool_filter 2 --tool 'ciriquant,circexplorer2,find_circ,circrna_finder,mapsplice,dcc,segemehl'  --genome GRCh37

Terminal output: 
-[nf-core/circrna] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE:ALIGN (MQ221024_007)'

Caused by:
  Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE:ALIGN (MQ221024_007)` terminated with an error exit status (125)


Command executed:

  gzip -d -f MQ221024_007_1_val_1.fq.gz
  gzip -d -f MQ221024_007_2_val_2.fq.gz
  
  mapsplice.py \
      -c chromosomes \
      -x genes \
      -1 MQ221024_007_1_val_1.fq \
      -2 MQ221024_007_2_val_2.fq \
      -p 12 \
      --bam \
      --gene-gtf genes.gtf \
      -o MQ221024_007 \
      --seglen 25 --min-intron 20 --max-intron 1000000 --min-map-len 40 --min-fusion-distance 200 --fusion-non-canonical
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE:ALIGN":
      mapsplice: v2.2.1
  END_VERSIONS

Command exit status:
  125

Command output:
  (empty)

Command error:
  Unable to find image 'quay.io/biocontainers/mapsplice:2.2.1--py27h07887db_0' locally
  2.2.1--py27h07887db_0: Pulling from biocontainers/mapsplice
  docker: [DEPRECATION NOTICE] Docker Image Format v1 and Docker Image manifest version 2, schema 1 support is disabled by default and will be removed in an upcoming release. Suggest the author of quay.io/biocontainers/mapsplice:2.2.1--py27h07887db_0 to upgrade the image to the OCI Format or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/.
  See 'docker run --help'.

Work dir:
  /home/ian/NGSDATA_FOR_NEXTFLOW/Dr.LaiCH_RNASeq/work/16/48bd004cc53d8f4b18f073458485db

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

 -- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting

 -- Check '.nextflow.log' file for details

Relevant files

No response

System information

No response

Problem in running circrna

Description of the bug

Nextflow is an excellent standardized pipeline tool and helps me a lot.

When I use circrna, it always reports this error:

Execution cancelled -- Finishing pending tasks before exit
WARN: Got an interrupted exception while taking agent result | java.lang.InterruptedException
ERROR ~ Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:HISAT2_EXTRACTSPLICESITES'

Caused by:
  Not a valid path value: 'null'

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

 -- Check '.nextflow.log' file for details
ERROR ~ Unexpected error [ClosedByInterruptException]

 -- Check '.nextflow.log' file for details
ERROR ~ Unexpected error [ClosedByInterruptException]

 -- Check '.nextflow.log' file for details

It doesn't seem to have read the file yet. But when I change the code to run eccdna, it works well.

Could you please tell me why?

Command used and terminal output

$nextflow run nf-core/circrna --input samplesheet_1.csv --outdir '/media/super/Hard_Disk_1/USING/circdna/HCMV/HCMV/fastq/result' --genome GRCh38 -profile docker

Relevant files

.nextflow.log

System information

N E X T F L O W ~ version 23.10.0
Hardware: Desktop
Executor: local
Container engine: Docker
OS: Ubuntu 22.04
version of circrna: revision: 18e580e [master]

Fix problems with outdated STAR indices in iGenomes

Description of the bug

When using STAR indices from iGenomes, some of the index versions are not compatible with the STAR version used by the pipeline. A workaround is to manually set star = null, which prevents the iGenomes index from being used and forces the pipeline to build its own index.

Command used and terminal output

No response

Relevant files

No response

System information

No response

To Do before release

✔️ Improve DEA.R differential expression script such that all possible comparisons are made within the response variable. i.e condition = c(A, B, C, D) will produce A vs B, A vs C, A vs D, B vs C, B vs D, C vs D. This will circumvent the need for response variable being named 'control'.

✔️ Make the phenotype.csv colname[1] = Sample_ID to match the input.csv file header (update the usage documentation; I don't think this is hard-coded in the nextflow or DEA.R script).

✔️ Omit grep -v "6mer" from TargetScan output, as 6mers are included in the miRanda output. Filter only by MFE, which in turn can be passed as a parameter to the workflow.

Fix software versions, the output csv file is not capturing versions for all tools.
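The first item in the list above (all pairwise comparisons within the response variable) can be illustrated with a short sketch. This is Python, not the DEA.R code itself, and the function name `pairwise_contrasts` is hypothetical. It only shows how the set of contrasts is enumerated.

```python
from itertools import combinations


def pairwise_contrasts(levels):
    """Return every unordered pair of distinct condition levels."""
    # De-duplicate while preserving first-seen order, since a phenotype
    # column normally repeats each level once per sample.
    unique = list(dict.fromkeys(levels))
    return list(combinations(unique, 2))


contrasts = pairwise_contrasts(["A", "B", "C", "D"])
for first, second in contrasts:
    print(f"{first} vs {second}")
```

For four levels this gives the six comparisons listed in the item (A vs B through C vs D). No level has to be named 'control'.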

mapsplice error when input fasta file contains multiple fields in the header

Description of the bug

When running nf-core/circrna with mapsplice as the circRNA detection tool, I encountered the following error from mapsplice:
'chr1 1' contains space
which is due to extra fields in the input fasta file

I replicated this issue by changing the nf-core test fasta file.

It usually looks like this

>chrI
gcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagc
ctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcct
aagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaa

When changed to this, the error occurs (see also below)

>chrI test
gcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagc
ctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcct
aagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaa

When downloading fasta files from repositories, there often are more fields in the header. For example:

Gencode (this file)

>chr1 1
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN

Ensembl (this file)

>1 dna_sm:chromosome chromosome:GRCh38:1:1:248956422:1 REF
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn

So my question is, should there be something to catch this possible error? Is this a problem in other pipelines?

If you think this is useful, I can take a look implementing it, maybe using something like
head GRCh38.p14.genome.fa | awk '/^>/ {print $1;next;} {print;}' > GRCh38.p14.genome_clean_header.fa
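As a rough illustration of the same idea, the sketch below keeps only the first whitespace-delimited field of each header line and copies sequence lines unchanged. It is a Python equivalent of the awk expression, not a pipeline module. The function name `clean_fasta_headers` is hypothetical, and the in-memory input stands in for a real genome file. Unlike the one-liner above, which pipes through head and therefore only sees the first ten lines, it processes the whole input.

```python
import io


def clean_fasta_headers(src, dst):
    """Copy FASTA text from src to dst, truncating each header line
    at its first whitespace character."""
    for line in src:
        if line.startswith(">"):
            # ">chr1 1" becomes ">chr1"; split() also drops the newline.
            dst.write(line.split()[0] + "\n")
        else:
            dst.write(line)


# Headers taken from the Gencode and Ensembl examples above,
# with shortened sequence lines.
src = io.StringIO(
    ">chr1 1\n"
    "NNNN\n"
    ">1 dna_sm:chromosome chromosome:GRCh38:1:1:248956422:1 REF\n"
    "nnnn\n"
)
dst = io.StringIO()
clean_fasta_headers(src, dst)
cleaned = dst.getvalue()
print(cleaned, end="")
```

Streaming line by line keeps memory use flat, which matters for genome-sized FASTA files. With real files, `src` and `dst` would be file handles opened for reading and writing.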

Two side notes (mostly for myself)

In this mapsplice_align module, bowtie_index is in the 'input' block but is never used. I assume this can be removed? And/or the specific mapsplice parameters -x and -c should be double-checked.
The link to the mapsplice documentation (mentioned in the error below) no longer works.

Command used and terminal output

nextflow run \
	nf-core/circrna \
	-r dev \
	-profile docker,arm,test \
	--fasta /Users/marieke/Documents/test_mapsplice/chrI.fa \
	--tool mapsplice

error

Execution cancelled -- Finishing pending tasks before exit
-[nf-core/circrna] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_ALIGN (fust1_1)'

Caused by:
  Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_ALIGN (fust1_1)` terminated with an error exit status (1)

Command executed:

  gzip -d -f fust1_1_1_val_1.fq.gz
  gzip -d -f fust1_1_2_val_2.fq.gz

  mapsplice.py \
      -c chromosomes \
      -x chrI \
      -1 fust1_1_1_val_1.fq \
      -2 fust1_1_2_val_2.fq \
      -p 2 \
      --bam \
      --gene-gtf chrI.gtf \
      -o fust1_1 \
      --seglen 25 --min-intron 20 --max-intron 1000000 --min-map-len 40 --min-fusion-distance 200 --fusion-non-canonical

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_ALIGN":
      mapsplice: v2.2.1
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:

  -----------------------------------------------
  [Tue Jan 30 15:13:14 2024] Beginning Mapsplice run (MapSplice v2.2.1)
  [Tue Jan 30 15:13:14 2024] Bin directory: /usr/local/bin/
  [Tue Jan 30 15:13:14 2024] Preparing output location fust1_1/
  [Tue Jan 30 15:13:14 2024] Checking files or directory: fust1_1_1_val_1.fq
  [Tue Jan 30 15:13:14 2024] Checking files or directory: fust1_1_2_val_2.fq
  [Tue Jan 30 15:13:14 2024] Checking files or directory: chromosomes/
  [Tue Jan 30 15:13:14 2024] Checking Bowtie index files
  [Tue Jan 30 15:13:14 2024] Building Bowtie index for reference sequence
  [Tue Jan 30 15:13:27 2024] Inspecting Bowtie index files
  [Tue Jan 30 15:13:28 2024] Checking reference sequence length
  [Tue Jan 30 15:13:28 2024] Checking consistency of Bowtie index and reference sequence
  Error: Reference name in Bowtie Index contains space:
  'chrI test' contains space
  [MapSplice Running Failed]
  Error: Checking consistency of Bowtie index and reference sequence failed
  Please check if Bowtie Index and Reference Sequence parameters are set correctly and they comply with MapSplice requirements
  Visit MapSplice 2.0 online manual for details:
  http://www.netlab.uky.edu/p/bioinfo/MapSplice2UserGuide#CommandLine

Work dir:
  /Users/marieke/Documents/test_mapsplice/work/62/b028adfa87c8ea30a882dfd956c599

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

 -- Check '.nextflow.log' file for details

Relevant files

chrI.fa.txt

System information

Nextflow version 23.10.1 build 5891
Mac Studio (M2) running Sonoma
local using Docker
nf-core/circrna dev 9999e23

Implement coding potential analysis

Description of feature

Checking the detected circRNAs for their coding potential could open an additional perspective on deciphering the function of circRNAs. This tool was suggested to me for this purpose.

ERROR ~ Argument of `file` function cannot be null

Description of the bug

I cannot run nf-core/circRNA. The code works on other computers and in a cluster.

Command used and terminal output

nextflow run nf-core/circRNA \
          -r dev -profile apptainer \
          --input data/samplesheet_circRNA.csv \
          --module circrna_discovery \
          --outdir /home/neuroim/Desktop/results_circRNA \
          --tool 'ciriquant','circexplorer2','find_circ','circrna_finder','dcc','segemehl' \
          --max_cpus 12 \
          --max_memory 64GB \
          -w /home/neuroim/Desktop/work_rnaseq \
          --genome GRCh38 \
          --save_reference false \
          --bowtie /data/index/bowtie/ \
          --bowtie2 /data/index/bowtie2/ \
          --bwa /data/index/bwa/ \
          --star /data/index/star/ \
          --segemehl /data/index/segemehl/ \
          --hisat2 /data/index/hisat2/ \
          --skip_fastqc

 N E X T F L O W   ~  version 24.04.2

NOTE: Your local project version looks outdated - a different revision is available in the remote repository [d5967c5273]
Launching `https://github.com/nf-core/circRNA` [intergalactic_ampere] DSL2 - revision: 5ddc061929 [dev]

WARN: Access to undefined parameter `monochromeLogs` -- Initialise it to a default value eg. `params.monochromeLogs = some_value`


------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/circrna vdev-g5ddc061
------------------------------------------------------
Core Nextflow options
  revision       : dev
  runName        : intergalactic_ampere
  containerEngine: apptainer
  launchDir      : /data/RNAseq_MS
  workDir        : /home/neuroim/Desktop/work_rnaseq
  projectDir     : /home/neuroim/.nextflow/assets/nf-core/circRNA
  userName       : neuroim
  profile        : apptainer
  configFiles    : 

Input/output options
  input          : data/samplesheet_circRNA.csv
  outdir         : /home/neuroim/Desktop/results_circRNA

Pipeline Options
  tool           : ciriquant,circexplorer2,find_circ,circrna_finder,dcc,segemehl

Reference genome options
  save_reference : false
  genome         : GRCh38
  fasta          : s3://ngi-igenomes/igenomes//Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa
  gtf            : s3://ngi-igenomes/igenomes//Homo_sapiens/NCBI/GRCh38/Annotation/Genes/genes.gtf
  mature         : s3://ngi-igenomes/igenomes//Homo_sapiens/NCBI/GRCh38/Annotation/SmallRNA/mature.fa
  bowtie         : /data/index/bowtie/
  bowtie2        : /data/index/bowtie2/
  bwa            : /data/index/bwa/
  hisat2         : /data/index/hisat2/
  segemehl       : /data/index/segemehl/
  star           : /data/index/star/

Read trimming options
  skip_fastqc    : true

Max job request options
  max_cpus       : 12
  max_memory     : 64GB

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use nf-core/circrna for your analysis please cite:

* The pipeline

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

* Software dependencies
  https://github.com/nf-core/circrna/blob/master/CITATIONS.md
------------------------------------------------------
ERROR ~ Argument of `file` function cannot be null

 -- Check script '/home/neuroim/.nextflow/assets/nf-core/circRNA/main.nf' at line: 62 or see '.nextflow.log' file for more details

Relevant files

No response

System information

Nextflow: 24.04.2.5914
Hardware: Desktop
Executor: local
Container: Docker, Singularity, Apptainer, Conda
OS: Linux
nf-core/circrna: dev

Implement CIRI-full

Documentation can be found here.

Tasks

Make sure outputs of BWA are available
Implement CIRI
Implement CIRI-AS
Implement CIRI-full

Implement circRNA database annotation functionalities

Description of feature

Some candidates:

Tasks

Implement circbase #139

feature-request first-timers-only
Implement circatlas #140

feature-request first-timers-only
Implement circbank #141

feature-request first-timers-only
Implement circRNADisease #142

feature-request first-timers-only

old version of /opt/conda/envs

Workflow execution completed unsuccessfully!
The exit status of the task that caused the workflow execution to fail was: 1.

The full error message was:

Error executing process > 'HISAT_ALIGN (eye_2)'

Caused by:
Process HISAT_ALIGN (eye_2) terminated with an error exit status (1)

Command executed:

hisat2 -p 16 --dta -q -x s_les_chr -1 FS6_S31_R1_001.fastq.gz -2 FS6_S31_R2_001.fastq.gz -t | samtools view -bS - | samtools sort --threads 16 -m 2G - > eye_2.bam

Command exit status:
1

Command output:
(empty)

Command error:
Warning: the current version of HISAT2 () is older than the version (2.0.0) used to build the index.
Users are strongly recommended to update HISAT2 to the latest version.
Error reading _rstarts[] array: 199929, 244512
Time loading forward index: 00:00:09
Overall time: 00:00:09
Error: Encountered internal HISAT2 exception (#1)
Command: /opt/conda/envs/nf-core-circrna-1.0.0/bin/hisat2-align-s --wrapper basic-0 -p 16 --dta -q -x s_les_chr -t --read-lengths 151,150,149,148,147,35,146,145,143,141,134,131,130,126,144,140,139,138,136,117,108,90,88 -1 /tmp/2818130.inpipe1 -2 /tmp/2818130.inpipe2
(ERR): hisat2-align exited with value 1
[main_samview] fail to read the header from "-".
samtools sort: failed to read header from "-"

Work dir:
/flash/MillerU/work/b0/ea0dc49dac2f0f89fa06090a6ad7f3

Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out

I ran this command:
nextflow run -r mirna_rework nf-core/circrna -resume -profile singularity,oist --input /flash/MillerU/sample_trial_circRNA.csv --input_type fastq --phenotype /flash/MillerU/phenotype_circRNA.csv --outdir results_circRNA_all_working --tool ciriquant,circexplorer2,circrna_finder,dcc --module 'circrna_discovery, differential_expression' --fasta ./s_les_chr.fasta --gtf /bucket/MillerU/Zifcakova/Dovetail_ika_genomes/Dovetail_annotated_v_10_04_2022/PO1788_Sepioteuthis_lessoniana_annotation_chr1.gtf --mature /bucket/MillerU/Zifcakova/databases/mature.fa --igenomes_ignore true --adapters /home/l/lucia-zifcakova/.nextflow/assets/nf-core/circrna/bin/adapters.fa --max_cpus 40 --genome false --star /flash/MillerU/work/6d/1cbd95ce8882b4fd2e5a13deeb715d/STARIndex --fasta_fai /flash/MillerU/work/33/4c3fd667f44d34b40596f42a4c1d70/s_les_chr.fasta.fai --bwa /flash/MillerU/work/ac/6c9686383b60b3c7c986576c6dfcab/BWAIndex

Process exceeded running time limit (16h)

Description of the bug

Hi again,
I've bumped into this error with mapsplice. Is there a way to increase the running time limit?
I've tried with the command line below, but the pipeline does not seem to respect the limit I set (--max_time '240.h').

thanks in advance,
tom

ERROR ~ Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_ALIGN (967)'

Caused by:
  Process exceeded running time limit (16h)

Command executed:

  gzip -d -f 967_1_val_1.fq.gz
  gzip -d -f 967_2_val_2.fq.gz

  mapsplice.py \
      -c chromosomes \
      -x gencode.v45.annotation \
      -1 967_1_val_1.fq \
      -2 967_2_val_2.fq \
      -p 12 \
      --bam \
      --gene-gtf gencode.v45.annotation.gtf \
      -o 967 \
      --seglen 25 --min-intron 20 --max-intron 1000000 --min-map-len 40 --min-fusion-distance 200 --fusion-non-canonical

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_ALIGN":
      mapsplice: v2.2.1
  END_VERSIONS

Command used and terminal output

./nextflow run nf-core/circrna -profile docker --input samples.csv --module 'circrna_discovery' --outdir ./results_single --tool_filter 1 -r dev --fasta /data/reference_data/hg38/gome/hg38_ucsc_filtered.fa --gtf /data/reference_data/hg38/annotations/gencode.v45.annotation.gtf --mature /data/reference_data/mirBaseT/human/mature_human.fa --tool 'ciriquant,circexplorer2,find_circ,circrna_finder,mapsplice,dcc,segemehl' --bsj_reads 2 --max_time '240.h'

Relevant files

nextflow.log

System information

No response

[nf-core/circrna] error: Your FASTQ files do not have the appropriate extension

Hi all,

Thanks so much for generating this useful pipeline!
I wanted to find circrnas in a different way, and I found your work. But when I use it, I encounter the following problems:

Here is my code:

nohup nextflow run nf-core/circrna \
-r 892b3136e7432221bd81f8c7cc0400ebe541b08e \
-profile singularity \
--genome 'GRCh37' \
--input "/scratch/c.c2050857/circrna/raw_data_gz/*.fastq.gz" \
--input_type 'fastq' \
--module 'circrna_discovery' \
--tool 'ciriquant, dcc, find_circ, circexplorer2' \
--outdir Results

My fastq.gz data:
Here I also have a question: is this pipeline only for fastq.gz data? Can I use fastq data?

My error:

Could you please take a look at this? Any advice would be appreciated. Thanks!

Kind regards,
Birong

get_software versions

add get_software_versions process to main.nf.

.collect() with MultiQC

Problems with scripts referenced using the ${workflow.projectDir} variable

Description of the bug

Apparently, when running the pipeline in the cloud (AWS Tower), scripts called via their absolute path (using ${workflow.projectDir}) are not available to Nextflow at runtime.

An example is:

python ${workflow.projectDir}/bin/circRNA_counts_matrix.py > matrix.txt

which results in

python: can't open file '/.nextflow/assets/nf-core/circrna/bin/circRNA_counts_matrix.py'
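
A common way around this, sketched here rather than taken from the pipeline's code, relies on Nextflow adding the project's `bin/` directory to `PATH` inside each task, so an executable script with a shebang line can be called by name instead of by absolute path. The demo below emulates that behaviour in plain shell; the temporary directory and the stand-in script `counts_matrix_demo.sh` are hypothetical.

```shell
# Stand-in for the pipeline's bin/ directory (hypothetical demo layout).
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/bin"

# A script needs a shebang line and the executable bit to be callable by name.
cat > "$demo_dir/bin/counts_matrix_demo.sh" <<'EOF'
#!/usr/bin/env bash
printf 'circRNA_ID\tsample_1\n'
EOF
chmod +x "$demo_dir/bin/counts_matrix_demo.sh"

# Nextflow puts the project's bin/ on PATH inside each task; emulate that here.
PATH="$demo_dir/bin:$PATH"

# Call the script by name, with no absolute path involved.
counts_matrix_demo.sh > "$demo_dir/matrix.txt"
cat "$demo_dir/matrix.txt"
```

Applied to the example above, the call would become `circRNA_counts_matrix.py > matrix.txt`, assuming the script starts with a Python shebang and is marked executable.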

Command used and terminal output

No response

Relevant files

No response

System information

AWS tower

Add bioconda recipe for `psirc`

Description of feature

Currently there is only a custom image available via dockerhub. We should find a cleaner solution, most likely via a bioconda recipe.

Release Preparation Checklist

Description of the bug

There are a couple of things to address before a first release:

Move all containers to be biocontainers
Migrate the code to follow guidelines on nf-core:
- Main.nf only calls a workflow - that can hold the logic to e.g. start several subworkflows
- Subworkflows for combining multiple steps together, e.g. most people split preprocessing steps from QC and main analysis steps (a bit pipeline dependent how to do that)

Most of the above isn't difficult to do, but it definitely has to be done before a first release. Also, I'd be inclined to get the open issues resolved as much as possible. If the conda issues persist, we can also ditch conda support, which is something we try to encourage anyway, as it's not exactly very reproducible compared to containers.

Command used and terminal output

Relevant files

System information

No response

Implement miRNA sponging analysis

Description of feature

As described in this paper

pipeline won't start without filling ref-genome bwa, bowtie, etc...

Check Documentation

I have checked the following places for your error:

[x] nf-core website: troubleshooting
[x] nf-core/circrna pipeline documentation

Description of the bug

The pipeline won't start running. I filled in the information to launch the pipeline online, but it seems that it won't start without paths to bwa, bowtie, bowtie2...

Steps to reproduce

Steps to reproduce the behaviour:

Command line: nextflow run nf-core/circrna -r dev -name circRNA_sles -work-dir ./circRNA_sles -resume -params-file nf-params.json
See error:
WARN: Access to undefined parameter name -- Initialise it to a default value eg. params.name = some_value
WARN: Access to undefined parameter bowtie -- Initialise it to a default value eg. params.bowtie = some_value
WARN: Access to undefined parameter bowtie2 -- Initialise it to a default value eg. params.bowtie2 = some_value
WARN: Access to undefined parameter bwa -- Initialise it to a default value eg. params.bwa = some_value
WARN: Access to undefined parameter hisat -- Initialise it to a default value eg. params.hisat = some_value
WARN: Access to undefined parameter star -- Initialise it to a default value eg. params.star = some_value
WARN: Access to undefined parameter segemehl -- Initialise it to a default value eg. params.segemehl = some_value
[- ] process > SOFTWARE_VERSIONS -
[- ] process > BWA_INDEX -
[- ] process > SAMTOOLS_INDEX -
[- ] process > HISAT2_INDEX -
[- ] process > STAR_INDEX -
[- ] process > BOWTIE_INDEX -
[- ] process > BOWTIE2_INDEX -
WARN: Access to undefined parameter circexplorer2_annotation -- Initialise it to a default value eg. params.circexplorer2_annotation = some_value
WARN: Singularity cache directory has not been defined -- Remote image will be stored in the path: /flash/MillerU/circRNA_sles2/singularity -- Use env variable NXF_SINGULARITY_CACHEDIR to specify a different location
Pulling Singularity image docker://barryd237/circrna:dev [cache /flash/MillerU/circRNA_sles2/singularity/barryd237-circrna-dev.img]
WARN: There's no process matching config selector: get_software_versions
-[nf-core/circrna] Pipeline completed with errors-
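
The Singularity warning in this output is probably not what stops the run, but it can be silenced by defining the cache directory it mentions. A minimal sketch, assuming a persistent and writable location; the directory name `nxf_singularity_cache` is a hypothetical choice.

```shell
# Hypothetical cache location; any persistent, writable directory works.
export NXF_SINGULARITY_CACHEDIR="${HOME:-/tmp}/nxf_singularity_cache"
mkdir -p "$NXF_SINGULARITY_CACHEDIR"
echo "Singularity images will be cached in: $NXF_SINGULARITY_CACHEDIR"
```

Setting the variable in the job script before the `nextflow run` line keeps pulled images out of the work directory, so they can be reused between runs.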

Expected behaviour

Start running the pipeline with my genome. I provided a soft repeat-masked, multi-sequence FASTA file. I also provided a GFF3 file, which I renamed to .gtf before passing it to the pipeline. The only reason I can think of for the pipeline not working is the GTF file. On the other hand, I tested the pipeline with iGenomes and it did not run either, with similar warnings.
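
The `.nextflow.log` in this report records a `java.nio.charset.MalformedInputException` thrown from `CsvSplitter.parseHeader`, which suggests that the samplesheet CSV, rather than the GTF, may contain bytes that are not valid in the encoding Nextflow reads it with (the log reports UTF-8). A quick check is sketched below. The demo file and its column names are hypothetical, and it deliberately contains one Latin-1 byte to mimic a spreadsheet export in the wrong encoding.

```shell
# Hypothetical samplesheet containing one Latin-1 byte (octal 351 = 0xE9).
samplesheet="$(mktemp)"
printf 'sample,read_1,read_2\ncaf\351,a_1.fq.gz,a_2.fq.gz\n' > "$samplesheet"

# iconv exits non-zero when the input is not valid UTF-8.
if iconv -f UTF-8 -t UTF-8 "$samplesheet" > /dev/null 2>&1; then
    encoding_status="valid UTF-8"
else
    encoding_status="invalid UTF-8"
fi
echo "$samplesheet: $encoding_status"
```

Pointing `samplesheet` at the real CSV gives the same verdict for the actual input. If it is reported as invalid, re-saving the file as plain UTF-8 text is the usual remedy.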

Log files

Have you provided the following extra information/files:
The command used to run the pipeline:

#!/bin/bash
#SBATCH --job-name=nf-core-circ
#SBATCH --partition=compute
#SBATCH --time=3-0
#SBATCH --mem=120G
#SBATCH --cpus-per-task=20
#SBATCH --mail-user=[email protected]
#SBATCH --mail-type=BEGIN,FAIL,END
#SBATCH --output=/flash/MillerU/nf-core-circ.out

ml bioinfo-ugrp-modules
ml DebianMed
ml singularity
ml Nextflow

nextflow run nf-core/circrna -r dev -name circRNA_sles2 -work-dir ./circRNA_sles2 -resume -params-file nf-params.json

The .nextflow.log file
Apr-05 15:21:54.801 [main] DEBUG nextflow.cli.Launcher - $> nextflow run nf-core/circrna -r dev -name circRNA_sles2 -work-dir ./circRNA_sles2 -resume -params-file nf-params.json
Apr-05 15:21:55.044 [main] INFO nextflow.cli.CmdRun - N E X T F L O W ~ version 21.10.6
Apr-05 15:21:56.349 [main] DEBUG nextflow.scm.AssetManager - Git config: /home/l/lucia-zifcakova/.nextflow/assets/nf-core/circrna/.git/config; branch: master; remote: origin; url: https://github.com/nf-core/circrna.git
Apr-05 15:21:56.361 [main] DEBUG nextflow.scm.AssetManager - Git config: /home/l/lucia-zifcakova/.nextflow/assets/nf-core/circrna/.git/config; branch: master; remote: origin; url: https://github.com/nf-core/circrna.git
Apr-05 15:21:56.739 [main] INFO nextflow.cli.CmdRun - Launching nf-core/circrna [circRNA_sles2] - revision: 5e17f6c [dev]
Apr-05 15:21:57.488 [main] DEBUG nextflow.config.ConfigBuilder - Found config base: /home/l/lucia-zifcakova/.nextflow/assets/nf-core/circrna/nextflow.config
Apr-05 15:21:57.488 [main] DEBUG nextflow.config.ConfigBuilder - Found config local: /flash/MillerU/nextflow.config
Apr-05 15:21:57.489 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /home/l/lucia-zifcakova/.nextflow/assets/nf-core/circrna/nextflow.config
Apr-05 15:21:57.489 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /flash/MillerU/nextflow.config
Apr-05 15:21:57.514 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: standard
Apr-05 15:21:57.971 [main] DEBUG nextflow.plugin.PluginsFacade - Using Default plugins manager
Apr-05 15:21:57.985 [main] INFO org.pf4j.DefaultPluginStatusProvider - Enabled plugins: []
Apr-05 15:21:57.987 [main] INFO org.pf4j.DefaultPluginStatusProvider - Disabled plugins: []
Apr-05 15:21:57.990 [main] INFO org.pf4j.DefaultPluginManager - PF4J version 3.4.1 in 'deployment' mode
Apr-05 15:21:58.095 [main] DEBUG nextflow.plugin.PluginsFacade - Using Default plugins manager
Apr-05 15:21:58.705 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: standard
Apr-05 15:21:58.792 [main] DEBUG nextflow.plugin.PluginsFacade - Setting up plugin manager > mode=prod; plugins-dir=/home/l/lucia-zifcakova/.nextflow/plugins
Apr-05 15:21:58.794 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins default=[]
Apr-05 15:21:58.799 [main] INFO org.pf4j.DefaultPluginStatusProvider - Enabled plugins: []
Apr-05 15:21:58.800 [main] INFO org.pf4j.DefaultPluginStatusProvider - Disabled plugins: []
Apr-05 15:21:58.802 [main] INFO org.pf4j.DefaultPluginManager - PF4J version 3.4.1 in 'deployment' mode
Apr-05 15:21:58.815 [main] INFO org.pf4j.AbstractPluginManager - No plugins
Apr-05 15:21:58.882 [main] DEBUG nextflow.Session - Session uuid: 0eb81af5-8ec3-475b-848a-68da4f2d4258
Apr-05 15:21:58.882 [main] DEBUG nextflow.Session - Run name: circRNA_sles2
Apr-05 15:21:58.883 [main] DEBUG nextflow.Session - Executor pool size: 20
Apr-05 15:21:58.908 [main] DEBUG nextflow.cli.CmdRun -
Version: 21.10.6 build 5661
Created: 21-12-2021 17:01 UTC (22-12-2021 02:01 JDT)
System: Linux 4.18.0-348.2.1.el8_5.x86_64
Runtime: Groovy 3.0.9 on OpenJDK 64-Bit Server VM 1.8.0_312-b07
Encoding: UTF-8 (ANSI_X3.4-1968)
Process: [email protected] [10.145.2.81]
CPUs: 20 - Mem: 120 GB (119.2 GB) - Swap: 8 GB (7.4 GB)
Apr-05 15:21:58.998 [main] DEBUG nextflow.Session - Work-dir: /flash/MillerU/circRNA_sles2 [lustre]
Apr-05 15:21:59.090 [main] DEBUG nextflow.executor.ExecutorFactory - Extension executors providers=[GoogleLifeSciencesExecutor, AwsBatchExecutor, IgExecutor]
Apr-05 15:21:59.107 [main] DEBUG nextflow.Session - Observer factory: DefaultObserverFactory
Apr-05 15:21:59.133 [main] DEBUG nextflow.Session - Observer factory: TowerFactory
Apr-05 15:21:59.261 [main] DEBUG nextflow.util.CustomThreadPool - Creating default thread pool > poolSize: 21; maxThreads: 1000
Apr-05 15:21:59.379 [main] DEBUG nextflow.Session - Session start invoked
Apr-05 15:21:59.386 [main] DEBUG nextflow.trace.TraceFileObserver - Flow starting -- trace file: /flash/MillerU/results_circRNA_sles2/pipeline_info/execution_trace_2022-04-05_15-21-58.txt
Apr-05 15:21:59.405 [main] DEBUG nextflow.Session - Using default localLib path: /home/l/lucia-zifcakova/.nextflow/assets/nf-core/circrna/lib
Apr-05 15:21:59.411 [main] DEBUG nextflow.Session - Adding to the classpath library: /home/l/lucia-zifcakova/.nextflow/assets/nf-core/circrna/lib
Apr-05 15:21:59.412 [main] DEBUG nextflow.Session - Adding to the classpath library: /home/l/lucia-zifcakova/.nextflow/assets/nf-core/circrna/lib/nfcore_external_java_deps.jar
Apr-05 15:22:02.515 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution
Apr-05 15:22:02.532 [main] INFO nextflow.Nextflow -

------------------------------------------------------
,--./,-.
 ___ __ __ __ ___ /,-..--~'
 |\ | |__ __ / / \ |__) |__ } {
 | \| | \__, \__/ | \ |___ \-.,--,
.,.,'
 nf-core/circrna v1.0.0
------------------------------------------------------
Apr-05 15:22:02.682 [main] WARN nextflow.script.ScriptBinding - Access to undefined parameter genomes -- Initialise it to a default value eg. params.genomes = some_value
Apr-05 15:22:02.749 [main] DEBUG nextflow.util.CustomThreadPool - Creating default thread pool > poolSize: 21; maxThreads: 1000
Apr-05 15:22:02.771 [Actor Thread 2] ERROR nextflow.extension.DataflowHelper - @unknown
java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:161)
at java.io.BufferedReader.readLine(BufferedReader.java:324)
at java.io.BufferedReader.readLine(BufferedReader.java:389)
at nextflow.splitter.CsvSplitter.parseHeader(CsvSplitter.groovy:136)
at nextflow.splitter.CsvSplitter.process(CsvSplitter.groovy:124)
at nextflow.splitter.CsvSplitter.process(CsvSplitter.groovy)
at nextflow.splitter.AbstractSplitter.apply(AbstractSplitter.groovy:155)
at nextflow.extension.SplitOp$_applySplittingOperator_closure1.doCall(SplitOp.groovy:190)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:274)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:38)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:139)
at nextflow.extension.DataflowHelper$_subscribeImpl_closure2.doCall(DataflowHelper.groovy:285)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:274)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
at groovy.lang.Closure.call(Closure.java:412)
at groovyx.gpars.dataflow.operator.DataflowOperatorActor.startTask(DataflowOperatorActor.java:120)
at groovyx.gpars.dataflow.operator.DataflowOperatorActor.onMessage(DataflowOperatorActor.java:108)
at groovyx.gpars.actor.impl.SDAClosure$1.call(SDAClosure.java:43)
at groovyx.gpars.actor.AbstractLoopingActor.runEnhancedWithoutRepliesOnMessages(AbstractLoopingActor.java:293)
at groovyx.gpars.actor.AbstractLoopingActor.access$400(AbstractLoopingActor.java:30)
at groovyx.gpars.actor.AbstractLoopingActor$1.handleMessage(AbstractLoopingActor.java:93)
at groovyx.gpars.util.AsyncMessagingCore.run(AsyncMessagingCore.java:132)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Apr-05 15:22:02.787 [Actor Thread 2] DEBUG nextflow.Session - Session aborted -- Cause: Input length = 1
Apr-05 15:22:02.795 [main] INFO nextflow.Nextflow - Core Nextflow options
revision : dev
runName : circRNA_sles2
containerEngine : singularity
container : barryd237/circrna:dev
launchDir : /flash/MillerU
workDir : /flash/MillerU/circRNA_sles2
projectDir : /home/l/lucia-zifcakova/.nextflow/assets/nf-core/circrna
userName : lucia-zifcakova
profile : standard
configFiles : /home/l/lucia-zifcakova/.nextflow/assets/nf-core/circrna/nextflow.config, /flash/MillerU/nextflow.config

Input/output options
input : /bucket/MillerU/Zifcakova/circ_rna_samples.csv
input_type : fastq
outdir : ./results_circRNA_sles2

Pipeline Options
tool : ciriquant,circexplorer2,find_circ,circrna_finder,mapsplice,dcc,segemehl
module : circrna_discovery,mirna_prediction

Reference genome files
fasta : /bucket/MillerU/Zifcakova/Dovetail_ika_genomes/Dovetail_annotated_v_3_01_2022/PO1788_Sepioteuthis_lessoniana_RepeatMasked.fasta
gtf : /bucket/MillerU/Zifcakova/Dovetail_ika_genomes/Dovetail_annotated_v_3_01_2022/PO1788_Sepioteuthis_lessoniana_RepeatMasked.gtf
mature : /bucket/MillerU/Zifcakova/databases/mature.fa
species : Sepiotheusis lessoniana
fasta_fai : /bucket/MillerU/Zifcakova/Dovetail_ika_genomes/Dovetail_annotated_v_3_01_2022/PO1788_Sepioteuthis_lessoniana_RepeatMasked.fasta.fai
igenomes_ignore : true

Read trimming & adapter removal
adapters :

STAR
chimScoreSeparation : 10

Generic options
max_multiqc_email_size : 25 MB

Max job request options
max_cpus : 20
max_memory : 500 GB
max_time : 3d 18h

Institutional config options
config_profile_description: The Okinawa Institute of Science and Technology Graduate University (OIST) HPC cluster profile provided by nf-core/configs.
config_profile_contact : OISTs Bioinformatics User Group [email protected]
config_profile_url : https://github.com/nf-core/configs/blob/master/docs/oist.md

------------------------------------------------------
Only displaying parameters that differ from defaults.
------------------------------------------------------
Apr-05 15:22:02.796 [main] WARN nextflow.script.ScriptBinding - Access to undefined parameter name -- Initialise it to a default value eg. params.name = some_value
Apr-05 15:22:02.797 [main] WARN nextflow.script.ScriptBinding - Access to undefined parameter bowtie -- Initialise it to a default value eg. params.bowtie = some_value
Apr-05 15:22:02.797 [main] WARN nextflow.script.ScriptBinding - Access to undefined parameter bowtie2 -- Initialise it to a default value eg. params.bowtie2 = some_value
Apr-05 15:22:02.797 [main] WARN nextflow.script.ScriptBinding - Access to undefined parameter bwa -- Initialise it to a default value eg. params.bwa = some_value
Apr-05 15:22:02.798 [main] WARN nextflow.script.ScriptBinding - Access to undefined parameter hisat -- Initialise it to a default value eg. params.hisat = some_value
Apr-05 15:22:02.798 [main] WARN nextflow.script.ScriptBinding - Access to undefined parameter star -- Initialise it to a default value eg. params.star = some_value
Apr-05 15:22:02.798 [main] WARN nextflow.script.ScriptBinding - Access to undefined parameter segemehl -- Initialise it to a default value eg. params.segemehl = some_value
Apr-05 15:22:02.818 [Actor Thread 2] DEBUG nextflow.Session - The following nodes are still active:
[operator] splitCsv
[operator] map
[operator] into

Apr-05 15:22:02.865 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:02.865 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:02.872 [main] DEBUG nextflow.executor.Executor - [warm up] executor > slurm
Apr-05 15:22:02.876 [main] DEBUG n.processor.TaskPollingMonitor - Creating task monitor for executor 'slurm' > capacity: 100; pollInterval: 5s; dumpInterval: 5m
Apr-05 15:22:02.880 [main] DEBUG n.executor.AbstractGridExecutor - Creating executor 'slurm' > queue-stat-interval: 1m
Apr-05 15:22:02.949 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:02.949 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:02.957 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_low matches labels process_low for process with name SAMTOOLS_INDEX
Apr-05 15:22:02.958 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:02.958 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:02.969 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name HISAT2_INDEX
Apr-05 15:22:02.969 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:02.969 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:02.978 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name STAR_INDEX
Apr-05 15:22:02.979 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:02.979 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:02.987 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name BOWTIE_INDEX
Apr-05 15:22:02.987 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:02.987 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:02.997 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name BOWTIE2_INDEX
Apr-05 15:22:02.998 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:02.999 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.010 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name SEGEMEHL_INDEX
Apr-05 15:22:03.011 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.011 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.018 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.018 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.059 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.060 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.069 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.070 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.072 [main] WARN nextflow.script.ScriptBinding - Access to undefined parameter circexplorer2_annotation -- Initialise it to a default value eg. params.circexplorer2_annotation = some_value
Apr-05 15:22:03.076 [Actor Thread 4] DEBUG n.splitter.AbstractTextSplitter - Splitter Fasta collector path: nextflow.splitter.TextFileCollector$CachePath(/flash/MillerU/circRNA_sles2/46/c92957289d11a17261b909334d3350/PO1788_Sepioteuthis_lessoniana_RepeatMasked.fasta, null)
Apr-05 15:22:03.089 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_medium matches labels process_medium for process with name BAM_TO_FASTQ
Apr-05 15:22:03.093 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.093 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.102 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_low matches labels process_low,py3 for process with name FASTQC_RAW
Apr-05 15:22:03.103 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:py3 matches labels process_low,py3 for process with name FASTQC_RAW
Apr-05 15:22:03.103 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.103 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.108 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_medium matches labels process_medium for process with name BBDUK
Apr-05 15:22:03.109 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.109 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.111 [Actor Thread 38] WARN nextflow.container.SingularityCache - Singularity cache directory has not been defined -- Remote image will be stored in the path: /flash/MillerU/circRNA_sles2/singularity -- Use env variable NXF_SINGULARITY_CACHEDIR to specify a different location
Apr-05 15:22:03.112 [Actor Thread 38] INFO nextflow.container.SingularityCache - Pulling Singularity image docker://barryd237/circrna:dev [cache /flash/MillerU/circRNA_sles2/singularity/barryd237-circrna-dev.img]
Apr-05 15:22:03.118 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_low matches labels process_low,py3 for process with name FASTQC_BBDUK
Apr-05 15:22:03.119 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:py3 matches labels process_low,py3 for process with name FASTQC_BBDUK
Apr-05 15:22:03.119 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.119 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.129 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name CIRIQUANT
Apr-05 15:22:03.131 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.131 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.145 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name STAR_1PASS
Apr-05 15:22:03.146 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.146 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.153 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.153 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.171 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name STAR_2PASS
Apr-05 15:22:03.172 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.172 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.182 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_medium matches labels process_medium for process with name CIRCEXPLORER2
Apr-05 15:22:03.183 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.183 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.192 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_medium matches labels process_medium for process with name CIRCRNA_FINDER
Apr-05 15:22:03.192 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.192 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.199 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name DCC_MATE1
Apr-05 15:22:03.200 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.201 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.207 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name DCC_MATE2
Apr-05 15:22:03.207 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.207 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.226 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_medium matches labels py3,process_medium for process with name DCC
Apr-05 15:22:03.226 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:py3 matches labels py3,process_medium for process with name DCC
Apr-05 15:22:03.227 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.227 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.235 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name FIND_ANCHORS
Apr-05 15:22:03.235 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.235 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.244 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name FIND_CIRC
Apr-05 15:22:03.245 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.245 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.251 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name MAPSPLICE_ALIGN
Apr-05 15:22:03.252 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.252 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.261 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_medium matches labels process_medium for process with name MAPSPLICE_PARSE
Apr-05 15:22:03.261 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.261 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.300 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name SEGEMEHL_ALIGN
Apr-05 15:22:03.301 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.301 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.313 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name ANNOTATION
Apr-05 15:22:03.314 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.314 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.318 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name FASTA
Apr-05 15:22:03.319 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.319 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.329 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.330 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.334 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.334 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.337 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.337 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.349 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_low matches labels process_low for process with name MIRNA_PREDICTION
Apr-05 15:22:03.350 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.350 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.355 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_low matches labels process_low for process with name MIRNA_TARGETS
Apr-05 15:22:03.356 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.356 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.361 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_high matches labels process_high for process with name HISAT_ALIGN
Apr-05 15:22:03.361 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.361 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.366 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_medium matches labels process_medium for process with name STRINGTIE
Apr-05 15:22:03.366 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.367 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.372 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_medium matches labels process_medium for process with name DEA
Apr-05 15:22:03.373 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.373 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.408 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:process_low matches labels py3,process_low for process with name MULTIQC
Apr-05 15:22:03.409 [main] DEBUG nextflow.script.ProcessConfig - Config settings withLabel:py3 matches labels py3,process_low for process with name MULTIQC
Apr-05 15:22:03.409 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Apr-05 15:22:03.409 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Apr-05 15:22:03.416 [main] DEBUG nextflow.Session - Workflow process names [dsl1]: BOWTIE2_INDEX, MULTIQC, COUNT_MATRIX_SINGLE, SJDB_FILE, MAPSPLICE_PARSE, FASTQC_BBDUK, TARGETSCAN_DATABASE, ANNOTATION, FILTER_GTF, BWA_INDEX, BBDUK, FASTA, HISAT2_INDEX, GENE_ANNOTATION, SEGEMEHL_ALIGN, FASTQC_RAW, SEGEMEHL_INDEX, DEA, COUNT_MATRIX_COMBINED, DCC_MATE2, MIRNA_PREDICTION, DCC, STAR_2PASS, DCC_MATE1, BOWTIE_INDEX, CIRIQUANT, FIND_CIRC, MAPSPLICE_ALIGN, STRINGTIE, STAR_INDEX, FIND_ANCHORS, MIRNA_TARGETS, CIRIQUANT_YML, CIRCEXPLORER2, CIRCRNA_FINDER, HISAT_ALIGN, STAR_1PASS, SOFTWARE_VERSIONS, MERGE_TOOLS, BAM_TO_FASTQ, SAMTOOLS_INDEX
Apr-05 15:22:03.429 [main] WARN nextflow.Session - There's no process matching config selector: get_software_versions
Apr-05 15:22:03.431 [main] DEBUG nextflow.script.ScriptRunner - > Await termination
Apr-05 15:22:03.431 [main] DEBUG nextflow.Session - Session await
Apr-05 15:22:03.431 [main] DEBUG nextflow.Session - Session await > all process finished
Apr-05 15:22:03.431 [main] DEBUG nextflow.Session - Session await > all barriers passed
Apr-05 15:22:03.534 [Actor Thread 45] DEBUG nextflow.sort.BigSort - Sort completed -- entries: 1; slices: 1; internal sort time: 0.001 s; external sort time: 0.021 s; total time: 0.022 s
Apr-05 15:22:03.535 [Actor Thread 45] DEBUG nextflow.file.FileCollector - >> temp file exists? false
Apr-05 15:22:03.536 [Actor Thread 45] DEBUG nextflow.file.FileCollector - Missed collect-file cache -- cause: java.nio.file.NoSuchFileException: /scratch/b0fa46d8336bb20794c7a1e468467e4d.collect-file
Apr-05 15:22:03.543 [Actor Thread 45] DEBUG nextflow.file.FileCollector - Saved collect-files list to: /scratch/b0fa46d8336bb20794c7a1e468467e4d.collect-file
Apr-05 15:22:03.554 [Actor Thread 45] DEBUG nextflow.file.FileCollector - Deleting file collector temp dir: /scratch/nxf-4044264932371346223
Apr-05 15:22:03.632 [main] INFO nextflow.Nextflow - -[nf-core/circrna] Pipeline completed with errors-
Apr-05 15:22:03.638 [main] DEBUG nextflow.trace.WorkflowStatsObserver - Workflow completed > WorkflowStats[succeededCount=0; failedCount=0; ignoredCount=0; cachedCount=0; pendingCount=0; submittedCount=0; runningCount=0; retriesCount=0; abortedCount=0; succeedDuration=0ms; failedDuration=0ms; cachedDuration=0ms;loadCpus=0; loadMemory=0; peakRunning=0; peakCpus=0; peakMemory=0; ]
Apr-05 15:22:03.639 [main] DEBUG nextflow.trace.TraceFileObserver - Flow completing -- flushing trace file
Apr-05 15:22:03.642 [main] DEBUG nextflow.trace.ReportObserver - Flow completing -- rendering html report
Apr-05 15:22:03.643 [main] DEBUG nextflow.trace.ReportObserver - Execution report summary data:
[]
Apr-05 15:22:05.392 [main] DEBUG nextflow.trace.TimelineObserver - Flow completing -- rendering html timeline
Apr-05 15:22:05.583 [main] DEBUG nextflow.CacheDB - Closing CacheDB done
Apr-05 15:22:05.636 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye

System

Hardware: HPC Deigo, OIST
Executor: slurm
OS: CentOS Linux
Version

Nextflow Installation

Version: 21.10.6

Container engine

Engine: Singularity
Version: 3.5.2
Image tag: docker://barryd237/circrna:dev [cache /flash/MillerU/circRNA_sles2/singularity/barryd237-circrna-dev.img

Additional context

print summary logs

Add the 'Print Summary' log output that is standard in all nf-core pipelines to main.nf.

Not a valid path value: 'null'

Description of the bug

Unable to complete any workflow except the "test" one.

Command used and terminal output

nextflow run nf-core/circrna -profile test_full,docker -r dev

Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCEXPLORER2_REFERENCE (null)'

Caused by:  Not a valid path value: 'null'

Relevant files

nextflow.log

System information

N E X T F L O W ~ version 22.10.4
Desktop
local
docker
Ubuntu 22 LTS
nf-core/circrna v1.0.0-g51abcba

Fatal INPUT FILE error, no valid exon lines in the GTF file: genes.gtf when using --genome GRCh38

Description of the bug

When running nf-core/circrna with --genome GRCh38, the error in the title occurred.
I checked the fasta.fa downloaded from AWS, and the chromosome names are not what I expected. The first 10 chromosome names:

$ ~/circrna_test/work/b0/e5217e456114a3a1f67f633ddf2687$ grep ">" fasta.fa

chr1gi:568336023LN:248956422rl:ChromosomeM5:6aef897c3d6ff0c78aff06ac189178ddAS:GRCh38
chr2gi:568336022LN:242193529rl:ChromosomeM5:f98db672eb0993dcfdabafe2a882905cAS:GRCh38
chr3gi:568336021LN:198295559rl:ChromosomeM5:76635a41ea913a405ded820447d067b0AS:GRCh38
chr4gi:568336020LN:190214555rl:ChromosomeM5:3210fecf1eb92d5489da4346b3fddc6eAS:GRCh38
chr5gi:568336019LN:181538259rl:ChromosomeM5:a811b3dc9fe66af729dc0dddf7fa4f13AS:GRCh38hm:47309185-49591369
chr6gi:568336018LN:170805979rl:ChromosomeM5:5691468a67c7e7a7b5f2a3a683792c29AS:GRCh38
chr7gi:568336017LN:159345973rl:ChromosomeM5:cc044cc2256a1141212660fb07b6171eAS:GRCh38
chr8gi:568336016LN:145138636rl:ChromosomeM5:c67955b5f7815a9a1edfaa15893d3616AS:GRCh38
chr9gi:568336015LN:138394717rl:ChromosomeM5:6c198acf68b5af7b9d676dfdd531b5deAS:GRCh38
chr10gi:568336014LN:133797422rl:ChromosomeM5:c0eeee7acfdaf31b770a509bdaa6e51aAS:GRCh38
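
The STAR message quoted in the title points at a possible difference in chromosome naming between the GTF and the FASTA. A minimal sketch of how that could be checked from the work directory follows. The function name `shared_chroms` is hypothetical, and the sketch assumes that the sequence name is the FASTA header text up to the first whitespace and that the GTF is tab-separated with the sequence name in column 1.

```shell
# Hypothetical helper: print the sequence names present in BOTH files.
# Assumes FASTA names end at the first whitespace of the header line
# and that GTF column 1 holds the sequence name.
shared_chroms() {
    fasta=$1
    gtf=$2
    tmp_fa=$(mktemp)
    tmp_gtf=$(mktemp)
    # FASTA side: header lines only, '>' stripped, first word kept.
    grep '^>' "$fasta" | sed 's/^>//' | awk '{print $1}' | LC_ALL=C sort -u > "$tmp_fa"
    # GTF side: skip comment lines, keep column 1.
    grep -v '^#' "$gtf" | cut -f1 | LC_ALL=C sort -u > "$tmp_gtf"
    # comm -12 prints only the lines common to both sorted lists.
    LC_ALL=C comm -12 "$tmp_fa" "$tmp_gtf"
    rm -f "$tmp_fa" "$tmp_gtf"
}
```

Running `shared_chroms fasta.fa genes.gtf | head` in the work directory would list the shared names; an empty result would mean the two files have no sequence name in common.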

Command used and terminal output

Command used: nextflow run nf-core/circrna -r dev --input sample_sheet_for_circrna.csv --outdir circrna_output  -profile docker  --phenotype phenotype_for_circrna_diff_expr.csv --module 'circrna_discovery,mirna_prediction,differential_expression' --tool_filter 2 --tool 'ciriquant,circexplorer2,find_circ,circrna_finder,mapsplice,dcc,segemehl' --genome GRCh38

Terminal output:
ERROR ~ Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:STAR_GENOMEGENERATE (fasta.fa)'

Caused by:
  Process `NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:STAR_GENOMEGENERATE (fasta.fa)` terminated with an error exit status (104)


Command executed:

  samtools faidx fasta.fa
  NUM_BASES=`gawk '{sum = sum + $2}END{if ((log(sum)/log(2))/2 - 1 > 14) {printf "%.0f", 14} else {printf "%.0f", (log(sum)/log(2))/2 - 1}}' fasta.fa.fai`
  
  mkdir star
  STAR \
      --runMode genomeGenerate \
      --genomeDir star/ \
      --genomeFastaFiles fasta.fa \
      --sjdbGTFfile genes.gtf \
      --runThreadN 24 \
      --genomeSAindexNbases $NUM_BASES \
      --limitGenomeGenerateRAM 154518822656 \
      --sjdbOverhang 100
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:STAR_GENOMEGENERATE":
      star: $(STAR --version | sed -e "s/STAR_//g")
      samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
      gawk: $(echo $(gawk --version 2>&1) | sed 's/^.*GNU Awk //; s/, .*$//')
  END_VERSIONS

Command exit status:
  104

Command output:
        STAR --runMode genomeGenerate --genomeDir star/ --genomeFastaFiles fasta.fa --sjdbGTFfile genes.gtf --runThreadN 24 --genomeSAindexNbases 14 --limitGenomeGenerateRAM 154518822656 --sjdbOverhang 100
        STAR version: 2.7.10a   compiled: 2022-01-14T18:50:00-05:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
  Jun 08 01:49:15 ..... started STAR run
  Jun 08 01:49:15 ... starting to generate Genome files
  Jun 08 01:49:53 ..... processing annotations GTF

Command error:
        STAR --runMode genomeGenerate --genomeDir star/ --genomeFastaFiles fasta.fa --sjdbGTFfile genes.gtf --runThreadN 24 --genomeSAindexNbases 14 --limitGenomeGenerateRAM 154518822656 --sjdbOverhang 100
        STAR version: 2.7.10a   compiled: 2022-01-14T18:50:00-05:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
  Jun 08 01:49:15 ..... started STAR run
  Jun 08 01:49:15 ... starting to generate Genome files
  Jun 08 01:49:53 ..... processing annotations GTF
  
  Fatal INPUT FILE error, no valid exon lines in the GTF file: genes.gtf
  Solution: check the formatting of the GTF file. One likely cause is the difference in chromosome naming between GTF and FASTA file.
  
  Jun 08 01:49:57 ...... FATAL ERROR, exiting

Work dir:
  /home/ian/NGSDATA_FOR_NEXTFLOW/Dr.LaiCH_RNASeq/work/b0/e5217e456114a3a1f67f633ddf2687

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`


 -- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting

 -- Check '.nextflow.log' file for details

Relevant files

No response

System information

Nextflow version : 24.04.2
Hardware: Linux desktop
Executor: local
Container engine: Docker
OS: CentOS 7
Version of nf-core/circrna: dev

Implement circbase

Issues generating, and pointing to, genome reference data lead to early crashes during configuration (w/ fix(?))

Description of the bug

Intro

Thank you for the effort of putting this together -- combining many tools seems in line with the consensus of the field and is a lot of work. Unfortunately, 60cbad7 does not work (on my system; below) outside of test profile. Based on other issues (#68 , #70 seems potentially related as well) I believe this is a general error in the construction of reference pointers in the configuration. For each scenario below I have uploaded the command and nextflow log in this issue where requested. This may no longer be actively maintained but I hope to have a working solution for others in my situation.

test_full

As detailed in #68, this results in a null assignment for a process path. This is difficult to debug as a user because the error happens before the work directory is populated with anything, suggesting it's a problem in the config setup.

user data (w/ remote igenomes)

As above, it appears that the igenomes S3 sync is unsuccessful and does not map a path to the process, resulting in null paths for every process and failure (though slightly different).

user data (w/ local igenomes)

with:

aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/UCSC/hg38/ /reference/hg38/

I synced the igenomes directory to a local path ($REF_PATH). Paths still come up as 'null' and the run breaks during configuration; however, genome.fa is now located and split.
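
To rule out an incomplete sync before launching, a quick pre-flight check of the local directory may help. This is only a sketch: the function name `check_reference_layout` is hypothetical, and the relative paths are the ones used in the hard-coded command later in this report, not a definitive list of what the pipeline needs.

```shell
# Hypothetical pre-flight check: report which expected reference paths
# are missing under a local igenomes directory.
# Returns 0 if everything is present, non-zero otherwise.
check_reference_layout() {
    ref=$1
    missing=0
    for rel in \
        Sequence/WholeGenomeFasta/genome.fa \
        Annotation/Genes/genes.gtf \
        Sequence/Bowtie2Index \
        Sequence/BowtieIndex \
        Sequence/BWAIndex/version0.6.0 \
        Sequence/STARIndex
    do
        if [ ! -e "$ref/$rel" ]; then
            echo "missing: $ref/$rel" >&2
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

For example, `check_reference_layout "$REF_PATH" && echo "reference layout looks complete"`. A passing check only shows that the files exist locally; it says nothing about whether the pipeline config maps them to the processes.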

user data (w/ hard-coded genome params)

(See command below)
New error: Cannot get property 'fasta' on null object with a stdout log suggesting to:
-- Check script './workflows/circrna.nf' at line: 48 or see '.nextflow.log' file for more details
The ternary param assignment in workflows/circrna.nf (starting at line 48) leads to empty param paths. This can be fixed by commenting out lines 48-55, which avoids the broken reassignment of genome params based on the igenomes object that 1) doesn't exist in this use case and 2) seemed to fail in the other cases anyway (?):

// Genome params
// params.fasta   = params.genome  ? params.genomes[ params.genome ].fasta ?: false : false
// params.gtf     = params.genome  ? params.genomes[ params.genome ].gtf ?: false : false
// params.bwa     = params.genome && params.tool.contains('ciriquant') ? params.genomes[ params.genome ].bwa ?: false : false
// params.star    = params.genome && ( params.tool.contains('circexplorer2') || params.tool.contains('dcc') || params.tool.contains('circrna_finder') ) ? params.genomes[ params.genome ].star ?: false : false
// params.bowtie  = params.genome && params.tool.contains('mapsplice') ? params.genomes[ params.genome ].bowtie ?: false : false
// params.bowtie2 = params.genome && params.tool.contains('find_circ') ? params.genomes[ params.genome ].bowtie2 ?: false : false
params.mature  = params.genome && params.module.contains('mirna_prediction') ? params.genomes[ params.genome ].mature ?: false : false
// params.species = params.genome  ? params.genomes[ params.genome ].species_id ?: false : false

conclusion

I am currently running a successful instance of the pipeline with hard-coded genome params that are not reassigned, thanks to the commented lines above.

I wanted to post this once I knew the pipeline was working. I'm certain there is a way to properly fix the param reassignment that I disabled above (e.g. have the ternary fall back to the existing param if false? skip if igenomes_ignore==TRUE? debug the igenomes config object generation?), but for now I just hacked this into functional shape and will update on the overall success of the run.

Command used and terminal output

# test_full
nextflow run $OUTPUT_PATH/nf-core-circrna_dev/dev \
	-profile test_full,singularity \
	--input "$OUTPUT_PATH/data/samplesheet.csv" \
	--outdir "$OUTPUT_PATH/data/results/" \
	--module "circrna_discovery" \
	--tool 'ciriquant,circexplorer2,find_circ,circrna_finder' \
	--bsj_reads 2
# errors (see testFull.nextflow.log):
ERROR nextflow.processor.TaskProcessor - Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:HISAT2_EXTRACTSPLICESITES'

Caused by:
  Not a valid path value: 'null'

#######################

# user data, remote igenome
nextflow run $OUTPUT_PATH/nf-core-circrna_dev/dev \
	-profile singularity \
	--input "$OUTPUT_PATH/data/samplesheet.csv" \
	--outdir "$OUTPUT_PATH/data/results/" \
	--genome "hg38" \
	--module "circrna_discovery" \
	--tool 'ciriquant,circexplorer2,find_circ,circrna_finder' \
	--bsj_reads 2
# errors (see iGenome.nextflow.log):
ERROR nextflow.processor.TaskProcessor - Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCEXPLORER2_REF'
# plus a bunch more 'null' paths

Caused by:
  Not a valid path value: 'null'

#######################

# user data, local igenome
nextflow run $OUTPUT_PATH/nf-core-circrna_dev/dev \
	-profile singularity \
	--input "$OUTPUT_PATH/data/samplesheet.csv" \
	--outdir "$OUTPUT_PATH/data/results/" \
	--genome "hg38" \
	--igenomes_base "$REF_PATH" \
	--module "circrna_discovery" \
	--tool 'ciriquant,circexplorer2,find_circ,circrna_finder' \
	--bsj_reads 2
# errors (see localGenome.nextflow.log):
ERROR nextflow.processor.TaskProcessor - Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_REFERENCE'
...
Sep-21 12:23:00.324 [Actor Thread 21] INFO  nextflow.Session - Execution cancelled -- Finishing pending tasks before exit
Sep-21 12:23:00.331 [Actor Thread 9] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=NFCORE_CIRCRNA:CIRCRNA:INPUT_CHECK:SAMPLESHEET_CHECK; work-dir=null
  error [java.lang.InterruptedException]: java.lang.InterruptedException
Sep-21 12:23:00.341 [Actor Thread 2] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:BOWTIE2_BUILD; work-dir=null
  error [java.lang.InterruptedException]: java.lang.InterruptedException
Sep-21 12:23:00.341 [Actor Thread 10] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:BWA_INDEX; work-dir=null
  error [java.lang.InterruptedException]: java.lang.InterruptedException
Sep-21 12:23:00.353 [Actor Thread 10] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:STAR_1ST_PASS; work-dir=null
  error [java.lang.InterruptedException]: java.lang.InterruptedException

#######################

# user data, hard coded 
nextflow run $OUTPUT_PATH/nf-core-circrna_dev/dev \
	-profile singularity \
	--input "$OUTPUT_PATH/data/samplesheet.csv" \
	--outdir "$OUTPUT_PATH/data/results/" \
	--genome "hg38" \
	--igenomes_ignore "true" \
	--fasta "$REF_PATH/Sequence/WholeGenomeFasta/genome.fa" \
	--bowtie2 "$REF_PATH/Sequence/Bowtie2Index/" \
	--bowtie "$REF_PATH/Sequence/BowtieIndex/" \
	--bwa "$REF_PATH/Sequence/BWAIndex/version0.6.0/" \
	--star "$REF_PATH/Sequence/STARIndex/" \
	--gtf "$REF_PATH/Annotation/Genes/genes.gtf" \
	--species "hsa" \
	--module "circrna_discovery" \
	--tool 'ciriquant,circexplorer2,find_circ,circrna_finder' \
	--bsj_reads 2
# errors (see hardGenome.nextflow.log):
nextflow.Session - Session aborted -- Cause: Cannot get property 'fasta' on null object

Relevant files

hardGenome.nextflow.log
iGenome.nextflow.log
localGenome.nextflow.log
testFull.nextflow.log

System information

nextflow: 23.04.3
hardware: HPC (3.10.0-1160.92.1.el7.x86_64)
executor: local (within slurm partition)
container engine: Singularity (pre-download performed as suggested)
OS: CentOS Linux
Version nf-core/circrna (commit): 60cbad7

Limited input length.

Description of the bug

From what I understood from the error messages, the maximum input length of this pipeline is 650 bases per read.

For reads longer than 650 bases, the sequence seems to be cropped to 650 bases while the quality string is not, so the quality string and the sequence end up with different lengths.
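
To find out which step introduces the mismatch, the raw and trimmed FASTQ files can be scanned for records whose sequence and quality lengths differ. A minimal sketch follows, assuming plain 4-line FASTQ records (no wrapped sequences); the function name `fastq_length_mismatches` is hypothetical.

```shell
# Hypothetical helper: read FASTQ from stdin and print every record whose
# sequence length differs from its quality-string length.
# Assumes strict 4-line records: header, sequence, '+', quality.
fastq_length_mismatches() {
    awk '
        NR % 4 == 1 { id = $0 }                 # header line
        NR % 4 == 2 { seq_len = length($0) }    # sequence line
        NR % 4 == 0 && length($0) != seq_len {  # quality line
            printf "%s\tseq=%d\tqual=%d\n", id, seq_len, length($0)
        }
    '
}
```

For example, `zcat input1/elegans_unselected_1_trimmed.fq.gz | fastq_length_mismatches | head` on the trimmed file from the log below, and the same on the raw input. If only the trimmed file shows mismatches, the truncation would be happening in the trimming step rather than in STAR.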

Command used and terminal output

nextflow run /nfs/data3/CIRCEST/pipeline -profile apptainer,cluster -params-file ./configs/params.yaml

PARAMS:
input: './samplesheet.csv'
outdir: './results/'
save_reference: true
save_intermediates:  true
hisat2_build_memory:  '200.GB'
genome:  'WBcel235'
star: null
module: 'circrna_discovery'


executor >  slurm (7)
[54/b6824f] process > NFCORE_CIRCRNA:CIRCRNA:INPUT_CHECK:SAMPLESHEET_CHECK (samplesheet.csv)                                     [100%] 1 of 1 ✔
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CAT_FASTQ                                                                           -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:BOWTIE_BUILD                                                         -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:BOWTIE2_BUILD                                                        -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:BWA_INDEX                                                            -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:HISAT2_EXTRACTSPLICESITES                                            -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:HISAT2_BUILD                                                         -
[5b/f8b873] process > NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:STAR_GENOMEGENERATE (genome.fa)                                      [100%] 1 of 1 ✔
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:PREPARE_GENOME:SEGEMEHL_INDEX                                                       -
[00/9b89ee] process > NFCORE_CIRCRNA:CIRCRNA:FASTQC_TRIMGALORE:FASTQC (elegans_unselected_1)                                     [100%] 1 of 1 ✔
[95/3c54c8] process > NFCORE_CIRCRNA:CIRCRNA:FASTQC_TRIMGALORE:TRIMGALORE (elegans_unselected_1)                                 [100%] 1 of 1 ✔
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:SEGEMEHL_ALIGN                                                    -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:SEGEMEHL_FILTER                                                   -
[17/6ccdf3] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:STAR_1ST_PASS (elegans_unselected_1)                              [100%] 2 of 2, failed: 2..
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:STAR_SJDB                                                         -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:STAR_2ND_PASS                                                     -
[39/0919bd] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCEXPLORER2_REF (genes.gtf)                                     [100%] 1 of 1 ✔
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCEXPLORER2_PAR                                                 -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCEXPLORER2_ANN                                                 -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCEXPLORER2_FLT                                                 -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRCRNA_FINDER_FILTER                                             -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:FIND_CIRC_ALIGN                                                   -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:SAMTOOLS_INDEX                                                    -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:SAMTOOLS_VIEW                                                     -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:FIND_CIRC_ANCHORS                                                 -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:FIND_CIRC                                                         -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:FIND_CIRC_FILTER                                                  -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT_YML                                                     -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT                                                         -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT_FILTER                                                  -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE1_1ST_PASS                                                -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE1_SJDB                                                    -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE1_2ND_PASS                                                -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE2_1ST_PASS                                                -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE2_SJDB                                                    -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE2_2ND_PASS                                                -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC                                                               -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_FILTER                                                        -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_REFERENCE                                               -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_ALIGN                                                   -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_PARSE                                                   -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_ANNOTATE                                                -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:MAPSPLICE_FILTER                                                  -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:COUNTS_SINGLE                                                     -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:REMOVE_HEADER                                                     -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:SPLIT_ANNOTATION                                                  -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:ANNOTATION                                                        -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CAT_ANNOTATION                                                    -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:SORT_ANNOTATION                                                   -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:FASTA                                                             -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:PSIRC_INDEX                                                       -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:PSIRC_QUANT                                                       -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:PSIRC_COMBINE                                                     -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:MIRNA_PREDICTION:TARGETSCAN_DATABASE                                                -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:MIRNA_PREDICTION:TARGETSCAN                                                         -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:MIRNA_PREDICTION:MIRANDA                                                            -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:MIRNA_PREDICTION:MIRNA_TARGETS                                                      -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:HISAT2_ALIGN                                                -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_SORT                       -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_INDEX                      -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_STATS   -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_FLAG... -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_IDXS... -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:STRINGTIE_STRINGTIE                                         -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:STRINGTIE_PREPDE                                            -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:DESEQ2_DIFFERENTIAL_EXPRESSION                              -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:PARENT_GENE                                                 -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:PREPARE_CLR_TEST                                            -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:DIFFERENTIAL_EXPRESSION:CIRCTEST                                                    -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:CUSTOM_DUMPSOFTWAREVERSIONS                                                         -
[-        ] process > NFCORE_CIRCRNA:CIRCRNA:MULTIQC                                                                             -
ERROR ~ Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:STAR_1ST_PASS (elegans_unselected_1)'

Caused by:
  Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:STAR_1ST_PASS (elegans_unselected_1)` terminated with an error exit status (104)

Command executed:

  STAR \
      --genomeDir star \
      --readFilesIn input1/elegans_unselected_1_trimmed.fq.gz  \
      --runThreadN 24 \
      --outFileNamePrefix elegans_unselected_1. \
      --outSAMtype BAM Unsorted \
       \
      --outSAMattrRGline 'ID:elegans_unselected_1'  'SM:elegans_unselected_1'  \
      --chimOutType Junctions WithinBAM --outSAMunmapped Within --outFilterType BySJout --outReadsUnmapped None --readFilesCommand zcat --alignSJDBoverhangMin 10 --chimJunctionOverhangMin 10 --chimSegmentMin 10



  if [ -f elegans_unselected_1.Unmapped.out.mate1 ]; then
      mv elegans_unselected_1.Unmapped.out.mate1 elegans_unselected_1.unmapped_1.fastq
      gzip elegans_unselected_1.unmapped_1.fastq
  fi
  if [ -f elegans_unselected_1.Unmapped.out.mate2 ]; then
      mv elegans_unselected_1.Unmapped.out.mate2 elegans_unselected_1.unmapped_2.fastq
      gzip elegans_unselected_1.unmapped_2.fastq
  fi

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:STAR_1ST_PASS":
      star: $(STAR --version | sed -e "s/STAR_//g")
      samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
      gawk: $(echo $(gawk --version 2>&1) | sed 's/^.*GNU Awk //; s/, .*$//')
  END_VERSIONS

Command exit status:
  104

Command output:
        STAR --genomeDir star --readFilesIn input1/elegans_unselected_1_trimmed.fq.gz --runThreadN 24 --outFileNamePrefix elegans_unselected_1. --outSAMtype BAM Unsorted --outSAMattrRGline ID:elegans_unselected_1 SM:elegans_unselected_1 --chimOutType Junctions WithinBAM --outSAMunmapped Within --outFilterType BySJout --outReadsUnmapped None --readFilesCommand zcat --alignSJDBoverhangMin 10 --chimJunctionOverhangMin 10 --chimSegmentMin 10
        STAR version: 2.7.9a   compiled: 2021-05-04T09:43:56-0400 vega:/home/dobin/data/STAR/STARcode/STAR.master/source
  Nov 23 14:00:32 ..... started STAR run
  Nov 23 14:00:32 ..... loading genome
  Nov 23 14:00:34 ..... started mapping

Command error:

  EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length
  @SRR19055922.2.1
  AGAATTGGCTCTAGAGAATGCAGATATCATTGAGGTCGAGACCAAAAAGCCTTACAAGACTAAAGAATAAGAAAAACTGTTTTCACAGCAATAACAGAATTGAAAAGATCCATGATTACGCACCTCACTGGTCTCGAGAATGTCATGCAAGAGCTTTCTCTGTCAAGATAACTTGAAAGAGGTTCCATTCCTCAGCTTTCGCTGGACTCAAGTGCTCAACATTCCAGCCAAATGCAACAAAATCGAAAACATCCACCAAGGCTTTGTCAACATGACTTCCCTCATCGATGTCAACTTGGGATGCAATCAAATCAGCATGGCAGCTGATACTTTCGCCAACGTTCAAGATGTCTCCAGAACTTGATTCTTGATAATAACTGCATGACTGAATTCCCAAGCAAAGCTGTGAGAAACATGAACAACTTGATTGCTCTCAAATATAACAAGATCAACGCCATTAGACAAACGACTTTGTTAACCTCTCCTCCCTCTCCATGCTCTTAATGGAAACATTTTCTTGGCTTTAAAGGAGGAGCCCTCCAGAACCATCCAAATCTTCATATCTGTATTTGAATCAGGAACAATCTGCAAACTCGACAACGGAGTCTTGGAGCAATCAAGCAACTCCTGAGGTTTCGATCTATCTTCAA

  SOLUTION: fix your fastq file

  Nov 23 14:00:36 ...... FATAL ERROR, exiting

Work dir:
  /nfs/data3/CIRCEST/runs/benchmarking/work/17/6ccdf33d1780d19e7b6bbe0973cdf6

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

 -- Check '.nextflow.log' file for details

-[nf-core/circrna] Pipeline completed with errors-
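
The STAR message above points at a FASTQ record whose quality string and sequence differ in length. A minimal sketch of how such records could be located before re-running the pipeline is shown below. It assumes plain four-line FASTQ records (no wrapped sequences), and the function names `find_length_mismatches` and `check_fastq` are illustrative, not part of the pipeline.

```python
import gzip
import io


def find_length_mismatches(handle, limit=10):
    """Return (record_number, read_id, seq_len, qual_len) for broken records.

    Assumes strict four-line FASTQ records: header, sequence, '+', quality.
    """
    mismatches = []
    record = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        seq = handle.readline().rstrip("\r\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\r\n")
        record += 1
        if len(seq) != len(qual):
            mismatches.append((record, header.strip(), len(seq), len(qual)))
            if len(mismatches) >= limit:
                break
    return mismatches


def check_fastq(path):
    """Open a plain or gzipped FASTQ file and report mismatching records."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return find_length_mismatches(handle)


# Tiny in-memory demonstration: the second record has 6 bases but 3 qualities.
demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\nIII\n")
demo_result = find_length_mismatches(demo)
print(demo_result)
```

Running `check_fastq` on the trimmed file named in the STAR command would show whether the problem is already present after trimming or only in the file STAR reads.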


### Relevant files


### System information

_No response_

Decide how to proceed with the differential expression subworkflow

Description of feature

The current implementation is not state of the art. In particular, the linear quantification is less than optimal; the results of the QUANTIFICATION subworkflow should be used here instead.

Tasks

Remove linear quantification in DIFFERENTIAL_EXPRESSION
Replace DESeq2 with Fishpond/Swish
Update circtest implementation

Implement circRNAfull

Documentation can be found here
The function extract_sequence appears to be most relevant

Tasks

Make sure the necessary outputs from STAR are available
Set up environment
Implement circRNAFull process

quantification tool sets

Include a parameter such as --tool_filter <union/intersection> that controls how calls are combined when multiple circRNA quantification tools have been selected.
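
The two proposed modes can be sketched as plain set operations. This is only an illustration of the idea: the function name `filter_by_tools`, the representation of a circRNA as a `(chrom, start, end, strand)` tuple, and the example calls are all assumptions, not existing pipeline code.

```python
def filter_by_tools(calls_per_tool, mode="union"):
    """Combine circRNA calls from several tools.

    calls_per_tool maps a tool name to a set of hashable circRNA keys.
    mode is "union" (called by any tool) or "intersection" (called by all).
    """
    sets = [set(calls) for calls in calls_per_tool.values()]
    if not sets:
        return set()
    if mode == "union":
        return set().union(*sets)
    if mode == "intersection":
        return set.intersection(*sets)
    raise ValueError(f"unknown mode: {mode!r}")


# Made-up example calls, keyed by (chrom, start, end, strand).
calls = {
    "ciriquant": {("chr1", 100, 500, "+"), ("chr2", 40, 900, "-")},
    "dcc": {("chr1", 100, 500, "+"), ("chr3", 7, 70, "+")},
}
union_calls = filter_by_tools(calls, "union")
shared_calls = filter_by_tools(calls, "intersection")
print(len(union_calls), len(shared_calls))
```

An intermediate option (for example "called by at least N tools") would fit the same interface, but that goes beyond what the issue asks for.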

Add support for RNase R effect correction

Description of feature

This should be an optional column in the samplesheet. If available, it should be used in detection tools like CIRI.

The following things need to be done:

Add definition to the input schema
Update samplesheet processing here and here
Pass data through the necessary channels to the detection tools and utilize if available
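
As a rough illustration of the "optional column" idea, the sketch below reads a samplesheet and normalises a missing or empty optional column to `None`, so downstream steps can simply test for its presence. The column name `rnase_r`, the other column names, and the values are hypothetical; the issue does not say what the column should hold (a flag, a path, or something else).

```python
import csv
import io


def read_samplesheet(handle, optional_column="rnase_r"):
    """Read samplesheet rows; a missing or empty optional column becomes None."""
    rows = []
    for row in csv.DictReader(handle):
        value = (row.get(optional_column) or "").strip()
        rows.append({**row, optional_column: value or None})
    return rows


# Hypothetical samplesheets: one with the optional column, one without it.
with_col = io.StringIO(
    "sample,fastq_1,rnase_r\n"
    "s1,s1.fq.gz,s1_rnase.fq.gz\n"
    "s2,s2.fq.gz,\n"
)
without_col = io.StringIO("sample,fastq_1\ns1,s1.fq.gz\n")

parsed_with = read_samplesheet(with_col)
parsed_without = read_samplesheet(without_col)
print([row["rnase_r"] for row in parsed_with])
print([row["rnase_r"] for row in parsed_without])
```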

STAR, CIRIquant, DCC errors during pipeline run

Description of the bug

Hiya,
thank you for the work on the pipeline!
Currently, when I run the pipeline on my own (paired-end) data, it fails and exits at a few steps. With the test profile (i.e. nextflow run nf-core/circrna -c ./hpc.config -profile test,singularity -r dev -ansi-log false -resume), however, it works fine and the pipeline completes.

The first issue concerns STAR. With the genome: GRCh37 parameter, the pipeline (as far as I understand) obtains the necessary files/indices from iGenomes. When it reaches the mapping step prior to DCC, it fails because of a genome/STAR version incompatibility (STAR output below). The image used for this step seems to contain STAR version 2.7.10a, whereas the genome was generated with 2.7.4a, so the image may need to be downgraded to an older STAR version? [*1]

Alternatively, I saw that I can provide my own fasta/gtf (and the required species) parameters, so I tried that with the files from Ensembl (https://grch37.ensembl.org/Homo_sapiens/Info/Index). This seemed to work fine, but DCC’s execution then fails with ValueError: invalid literal for int() with base 10: '4"' (more details below). From what I have found so far, the GTF doesn't get parsed correctly by the Circ_nonCirc_Exon_Match.py functions of DCC/circtools. Installing circtools myself and running circtools detect/DCC with the same files seems to work fine.

I ran into another error when adding ciriquant as a tool: it fails with CIRIquant.utils.PipelineError: Empty hisat2 bam generated, please re-run CIRIquant with -v and check the fastq and hisat2-index. Re-running this via bash .command.run results in the same error. If, on the other hand, I launch the singularity image myself and run the commands, i.e.

singularity exec --no-home --pid -B <path_to_folder>/nf-core-testing <path_to_folder>/nf-core-testing/tmp/depot.galaxyproject.org-singularity-ciriquant-1.1.2--pyhdfd78af_2.img bash <path_to_folder>/nf-core-testing/work/7b/c6590863cfa52ce00059592e7f0d89/.command.sh

it runs fine.

I have copied the errors to the box below. The command that produced the errors is: nextflow run nf-core/circrna -c ./hpc.config -params-file ./params.yaml -profile singularity -r dev -ansi-log false -resume. Do let me know if there is anything I can help with.

On a side note: a comment in the targetscan_format.sh script says Subset mature.fa according to the species provided by user to '--genome', but from briefly looking around I wasn't able to find where this happens in the pipeline.

[*1] I tried using a custom image with a downgraded STAR version and still get the same kind of error:

  EXITING because of FATAL ERROR: Genome version: 20201 is INCOMPATIBLE with running STAR version: 2.7.4a
  SOLUTION: please re-generate genome from scratch with running version of STAR, or with version: 2.7.4a
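
To see which version a prebuilt index reports before mapping, one could inspect the index directory directly. The sketch below assumes the index contains a `genomeParameters.txt` file with a whitespace-separated `versionGenome` entry; that layout is an assumption that should be checked against the index actually in use, and the demo lines are made up apart from the value 20201 taken from the error above.

```python
def read_genome_version(lines):
    """Return the value of the 'versionGenome' entry, or None if absent.

    Assumes 'key<whitespace>value' lines; lines starting with '#' are skipped.
    """
    for line in lines:
        if line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "versionGenome":
            return parts[1]
    return None


# Made-up demo content; only the value 20201 comes from the error message.
demo_lines = [
    "### STAR --runMode genomeGenerate\n",
    "versionGenome\t20201\n",
    "genomeChrBinNbits\t18\n",
]
genome_version = read_genome_version(demo_lines)
print(genome_version)

# On a real index (path is hypothetical):
# with open("STARIndex/genomeParameters.txt") as handle:
#     print(read_genome_version(handle))
```

If the reported value does not match what the STAR binary in the container accepts, regenerating the index with that binary, as the STAR message suggests, is the safer route than swapping images.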

Command used and terminal output

STAR

    EXITING because of FATAL ERROR: Genome version: 20201 is INCOMPATIBLE with running STAR version: 2.7.10a
    SOLUTION: please re-generate genome from scratch with running version of STAR, or with version: 2.7.4a

or the full output

    ERROR ~ Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE2_1ST_PASS (sample1)'
    
    Caused by:
      Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE2_1ST_PASS (sample1)` terminated with an error exit status (105)
    
    Command executed:
    
      STAR \
          --genomeDir STARIndex \
          --readFilesIn input1/sample1_2_val_2.fq.gz  \
          --runThreadN 12 \
          --outFileNamePrefix sample1_mate2. \
          --outSAMtype BAM Unsorted \
           \
          --outSAMattrRGline 'ID:sample1_mate2'  'SM:sample1_mate2'  \
          --chimOutType Junctions WithinBAM --outSAMunmapped Within --outFilterType BySJout --outReadsUnmapped None --readFilesCommand zcat --alignSJDBoverhangMin 10 --chimJunctionOverhangMin 10 --chimSegmentMin 10
    
    
    
      if [ -f sample1_mate2.Unmapped.out.mate1 ]; then
          mv sample1_mate2.Unmapped.out.mate1 sample1_mate2.unmapped_1.fastq
          gzip sample1_mate2.unmapped_1.fastq
      fi
      if [ -f sample1_mate2.Unmapped.out.mate2 ]; then
          mv sample1_mate2.Unmapped.out.mate2 sample1_mate2.unmapped_2.fastq
          gzip sample1_mate2.unmapped_2.fastq
      fi
    
      cat <<-END_VERSIONS > versions.yml
      "NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC_MATE2_1ST_PASS":
          star: $(STAR --version | sed -e "s/STAR_//g")
          samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
          gawk: $(echo $(gawk --version 2>&1) | sed 's/^.*GNU Awk //; s/, .*$//')
      END_VERSIONS
    
    Command exit status:
      105
    
    Command output:
            STAR --genomeDir STARIndex --readFilesIn input1/sample1_2_val_2.fq.gz --runThreadN 12 --outFileNamePrefix sample1_mate2. --outSAMtype BAM Unsorted --outSAMattrRGline ID:sample1_mate2 SM:sample1_mate2 --chimOutType Junctions WithinBAM --outSAMunmapped Within --outFilterType BySJout --outReadsUnmapped None --readFilesCommand zcat --alignSJDBoverhangMin 10 --chimJunctionOverhangMin 10 --chimSegmentMin 10
            STAR version: 2.7.10a   compiled: 2022-01-14T18:50:00-05:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
      Jan 18 13:24:45 ..... started STAR run
      Jan 18 13:24:45 ..... loading genome
    
    Command error:
      INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
      INFO:    Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
      INFO:    Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
    
      EXITING because of FATAL ERROR: Genome version: 20201 is INCOMPATIBLE with running STAR version: 2.7.10a
      SOLUTION: please re-generate genome from scratch with running version of STAR, or with version: 2.7.4a
    
      Jan 18 13:24:45 ...... FATAL ERROR, exiting

CIRIquant

    ERROR ~ Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT (sample1)'
    
    Caused by:
      Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT (sample1)` terminated with an error exit status (1)
    
    Command executed:
    
      CIRIquant \
          -t 36 \
          -1 sample1_1_val_1.fq.gz \
          -2 sample1_2_val_2.fq.gz \
          --config travis.yml \
          --no-gene \
          -o sample1 \
          -p sample1
    
      cat <<-END_VERSIONS > versions.yml
      "NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT":
          bwa: $(echo $(bwa 2>&1) | sed 's/^.*Version: //; s/Contact:.*$//')
          ciriquant : $(echo $(CIRIquant --version 2>&1) | sed 's/CIRIquant //g' )
          samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
          stringtie: $(stringtie --version 2>&1)
          hisat2: 2.1.0
      END_VERSIONS
    
    Command exit status:
      1
    
    Command output:
      (empty)
    
    Command error:
      INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
      INFO:    Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
      INFO:    Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
      [Thu 2024-01-18 13:38:50] [INFO ] Input reads: sample1_1_val_1.fq.gz,sample1_2_val_2.fq.gz
      [Thu 2024-01-18 13:38:50] [INFO ] Library type: unstranded
      [Thu 2024-01-18 13:38:50] [INFO ] Output directory: sample1, Output prefix: sample1
      [Thu 2024-01-18 13:38:50] [INFO ] Config: ciriquant Loaded
      [Thu 2024-01-18 13:38:50] [INFO ] 256 CPU cores availble, using 36
      [Thu 2024-01-18 13:38:50] [INFO ] Align RNA-seq reads to reference genome ..
      Traceback (most recent call last):
        File "/usr/local/bin/CIRIquant", line 10, in <module>
          sys.exit(main())
        File "/usr/local/lib/python2.7/site-packages/CIRIquant/main.py", line 155, in main
          hisat_bam = pipeline.align_genome(log_file, thread, reads, outdir, prefix)
        File "/usr/local/lib/python2.7/site-packages/CIRIquant/pipeline.py", line 52, in align_genome
          raise utils.PipelineError('Empty hisat2 bam generated, please re-run CIRIquant with -v and check the fastq and hisat2-index.')
      CIRIquant.utils.PipelineError: Empty hisat2 bam generated, please re-run CIRIquant with -v and check the fastq and hisat2-index.

DCC (own fasta/gtf)

    ERROR ~ Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC (sample1)'
    
    Caused by:
      Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC (sample1)` terminated with an error exit status (1)
    
    Command executed:
    
      sed -i 's/^chr//g' Homo_sapiens.GRCh37.87.gtf
    
      mkdir sample1 && mv sample1.Chimeric.out.junction sample1 && printf "sample1/sample1.Chimeric.out.junction" > samplesheet
      mkdir sample1_mate1 && mv sample1_mate1.Chimeric.out.junction sample1_mate1 && printf "sample1_mate1/sample1_mate1.Chimeric.out.junction" > mate1file
      mkdir sample1_mate2 && mv sample1_mate2.Chimeric.out.junction sample1_mate2 && printf "sample1_mate2/sample1_mate2.Chimeric.out.junction" > mate2file
    
      DCC @samplesheet -mt1 @mate1file -mt2 @mate2file -D -an Homo_sapiens.GRCh37.87.gtf -Pi -ss -F -M -Nr 1 1 -fg -A Homo_sapiens.GRCh37.dna.primary_assembly.fa -N -T 12
    
      awk '{print $6}' CircCoordinates >> strand
      paste CircRNACount strand | tail -n +2 | awk -v OFS=" " '{print $1,$2,$3,$5,$4}' >> 20096b003L2_Q001H283AC.txt
    
      cat <<-END_VERSIONS > versions.yml
      "NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:DCC":
          dcc: $(DCC --version)
      END_VERSIONS
    
    Command exit status:
      1
    
    Command output:
      Output folder ./ already exists, reusing
      DCC 0.5.0 started
      256 CPU cores available, using 12
      WARNING: non-stranded data, the strand of circRNAs guessed from the strand of host genes
      Please make sure that the read pairs have been mapped both, combined and on a per mate basis
      Collecting chimera information from mates-separate mapping
      Combining individual circRNA read counts
      Using files _tmp_DCC/tmp_circCount and _tmp_DCC/tmp_coordinates for filtering
      Filtering by read counts
      Remove ChrM
      Count CircSkip junctions
    
    Command error:
      INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
      INFO:    Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
      INFO:    Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
      Traceback (most recent call last):
        File "/usr/local/bin/DCC", line 10, in <module>
          sys.exit(main())
        File "/usr/local/lib/python3.10/site-packages/DCC/main.py", line 490, in main
          CircSkipfiles = findCircSkipJunction(output_coordinates, options.tmp_dir,
        File "/usr/local/lib/python3.10/site-packages/DCC/main.py", line 679, in findCircSkipJunction
          circStartAdjacentExons, circStartAdjacentExonsIv = CCEM.findcircAdjacent(circStartExons, Custom_exon_id2Iv,
        File "/usr/local/lib/python3.10/site-packages/DCC/Circ_nonCirc_Exon_Match.py", line 281, in findcircAdjacent
          interval = Custom_exon_id2Iv[self.getAdjacent(ids, start=start)]
        File "/usr/local/lib/python3.10/site-packages/DCC/Circ_nonCirc_Exon_Match.py", line 222, in getAdjacent
          exon_number = int(custom_exon_id.split(':')[1]) - 1
      ValueError: invalid literal for int() with base 10: '4"'
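
The failing expression from the traceback can be reproduced in isolation. In the sketch below only the expression `int(custom_exon_id.split(':')[1]) - 1` and the offending value `4"` come from the output above; the full id string is hypothetical. One plausible cause, not confirmed in this issue, is that a quoted attribute value such as `exon_number "4";` from the GTF ends up in the id with its closing quote still attached. The tolerant variant is an illustration, not a proposed patch to DCC.

```python
def exon_number_strict(custom_exon_id):
    """Same expression as in Circ_nonCirc_Exon_Match.py line 222 of the traceback."""
    return int(custom_exon_id.split(":")[1]) - 1


def exon_number_tolerant(custom_exon_id):
    """Illustrative variant that strips stray quotes/semicolons before parsing."""
    raw = custom_exon_id.split(":")[1]
    return int(raw.strip().strip('";')) - 1


bad_id = 'ENSG00000000001:4"'  # hypothetical id carrying a trailing quote

try:
    exon_number_strict(bad_id)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)

tolerant_value = exon_number_tolerant(bad_id)
print(strict_error)
print(tolerant_value)
```

Since the same GTF works with a separately installed circtools/DCC, comparing the GTF as seen inside the work directory (after the `sed -i` step) with the original file may show where the stray quote comes from.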

Relevant files

No response

System information

Nextflow Version: 23.10.0
Hardware: HPC/Cluster
Executor: Slurm
Container: Singularity
OS: Ubuntu
nf-core/circrna version: dev

Implement circatlas

CIRIquant test fails since commit 72dd514, PR #104

Description of the bug

I've been running into a CIRIquant error since #104, specifically since commit 72dd514.

It is reproducible by checking out that commit and running the test data:
git checkout 72dd514cc90d19797bf3869b9427d879677582f6
nextflow run circrna -profile test,docker,arm --tool ciriquant

Error:
Traceback (most recent call last):
  File "/usr/local/bin/CIRIquant", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/site-packages/CIRIquant/main.py", line 89, in main
    config = check_config(check_file(args.config_file))
  File "/usr/local/lib/python2.7/site-packages/CIRIquant/utils.py", line 107, in check_config
    raise ConfigError('Could not find hisat2 index with suffix: *.[1-8].ht2 or *.[1-8].ht2l, please check your configuration')
CIRIquant.utils.ConfigError: Could not find hisat2 index with suffix: *.[1-8].ht2 or *.[1-8].ht2l, please check your configuration

When reverting back to commit c4a3e8e (one commit earlier), this error does not occur, and the CIRCRNA_DISCOVERY:CIRIQUANT processes end successfully. (Another error occurs later in the pipeline but I think this is not important here.)
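
To check from inside the work directory whether the index files the error message asks for are actually staged, something like the following could be used. It mirrors the suffix patterns quoted in the error (`*.[1-8].ht2` or `*.[1-8].ht2l`) but is not CIRIquant's own check; the function name and the example file names are made up.

```python
import fnmatch


def has_hisat2_index(filenames, prefix):
    """True if any file matches <prefix>.[1-8].ht2 or <prefix>.[1-8].ht2l."""
    patterns = [f"{prefix}.[1-8].ht2", f"{prefix}.[1-8].ht2l"]
    return any(
        fnmatch.fnmatchcase(name, pattern)
        for name in filenames
        for pattern in patterns
    )


# Made-up directory listings for illustration.
staged_ok = ["genome.1.ht2", "genome.2.ht2", "genome.fa"]
staged_missing = ["genome.fa", "genome.fa.fai"]

index_found = has_hisat2_index(staged_ok, "genome")
index_missing = has_hisat2_index(staged_missing, "genome")
print(index_found, index_missing)

# In a real work directory one could pass os.listdir(index_dir) and the
# index prefix from the CIRIquant config file instead.
```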

@nictru, tagging you as you might have a better understanding of the changes :)

Command used and terminal output

No response

Relevant files

No response

System information

No response
