cfia-ncfad / nf-flu
This project is forked from peterk87/nf-flu
Influenza genome analysis Nextflow workflow
License: MIT License
Conda not actually being used/enabled when using -profile conda
nextflow run CFIA-NCFAD/nf-flu --input samplesheet.csv --platform nanopore -profile conda
ERROR ~ Error executing process > 'NF_FLU:NANOPORE:CHECK_SAMPLE_SHEET (1)'
Caused by:
Process `NF_FLU:NANOPORE:CHECK_SAMPLE_SHEET (1)` terminated with an error exit status (1)
Command executed:
check_sample_sheet.py samplesheet.csv nanopore samplesheet.fixed.csv
Command exit status:
1
Command output:
(empty)
Command error:
Traceback (most recent call last):
File "/home/pkruczkiewicz/.nextflow/assets/CFIA-NCFAD/nf-flu/bin/check_sample_sheet.py", line 6, in <module>
import typer
ModuleNotFoundError: No module named 'typer'
local
23.04.1
openjdk version "20.0.1" 2023-04-18
OpenJDK Runtime Environment (build 20.0.1+9)
OpenJDK 64-Bit Server VM (build 20.0.1+9, mixed mode, sharing)
local
Arch
Conda
nextflow.config is missing conda.enabled = true in the conda profile:
profiles {
conda {
params.enable_conda = true
conda.createTimeout = "120 min"
conda.enabled = true // <-- this, conda not enabled!!!
}
}
Custom sequences are not currently tested as part of CI. This is a big feature that needs testing.
The README is out of date: there is no test profile anymore. nf-flu can be run with either the test_illumina or the test_nanopore profile. The docs and README need to be updated to reflect this.
See related issue peterk87#18
Test profile is not working
nextflow run CFIA-NCFAD/nf-flu -profile test,docker
N E X T F L O W ~ version 23.04.3
Pulling CFIA-NCFAD/nf-flu ...
downloaded from https://github.com/CFIA-NCFAD/nf-flu.git
Unknown configuration profile: 'test'
3.3.5
local
23.04.3.5875
11.0.20.1
Desktop
Ubuntu 22.04
Docker
No response
Hello,
I have come across an edge case where the reference used as the top match for a sample contained a degenerate nucleotide causing the error described below. Essentially, clair3 changed the degenerate nucleotide to "N", which resulted in an error in bcftools as the reference allele no longer matched the reference sequence.
I believe I was able to mitigate this issue by making two changes:
1. In the clair3.nf module, I added the following flag to the clair3 command: --keep_iupac_bases True
2. In the BCF_FILTER process in bcftools.nf, I added the following flag to the bcftools norm command: -c w
The -c flag is the bcftools norm --check-ref option, where w sets it to "warn"; as far as I can tell, this should allow the degenerate nucleotide to exist without changing how the results are processed (see: bcftools norm).
In my testing, it has allowed me to successfully process the sample with the offending top match reference, though I would appreciate any feedback as to whether I have overlooked anything with my suggested parameter changes.
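To see in advance whether a reference could trigger this mismatch, one option is to scan it for degenerate IUPAC codes before variant calling. The sketch below is only an illustration and not part of nf-flu; the function name find_degenerate_bases is hypothetical, and the test sequence is the REF allele fragment from the bcftools error in this report.

```python
import re

# IUPAC nucleotide codes other than A, C, G, T and N denote ambiguous bases.
DEGENERATE_BASES = "RYSWKMBDHV"


def find_degenerate_bases(seq: str) -> list[tuple[int, str]]:
    """Return (1-based position, base) for each degenerate IUPAC base in seq."""
    pattern = re.compile(f"[{DEGENERATE_BASES}]", re.IGNORECASE)
    return [(m.start() + 1, m.group().upper()) for m in pattern.finditer(seq)]


# REF allele fragment copied from the bcftools error message (OQ234674.1).
ref_fragment = "TATTCAATAGCCTATATGCATCACCACAATTGGAAGGAYTTTCAGCAGAGTC"
hits = find_degenerate_bases(ref_fragment)
print(hits)
```

A non-empty result would flag the reference as one where Clair3 and bcftools may disagree on the REF allele unless the two flags above are set.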
nextflow nf-flu_v3.3.2/cfia-ncfad-nf-flu-3.3.2/workflow/main.nf --input IRVC20230720IH1_Part1_Complete_AF25.csv --platform nanopore --outdir IRVC20230720IH1_Part1_Complete_AF25_nf-flu_results --major_allele_fraction 0.25 -profile singularity,slurm -resume
ERROR ~ Error executing process > 'NF_FLU:NANOPORE:BCF_CONSENSUS (RV00831-22-IAV|3_PA|OQ234674.1)'
Caused by:
Process `NF_FLU:NANOPORE:BCF_CONSENSUS (RV00831-22-IAV|3_PA|OQ234674.1)` terminated with an error exit status (255)
Command executed:
bgzip -c RV00831-22-IAV.Segment_3_PA.OQ234674.1.no_frameshifts.vcf > RV00831-22-IAV.Segment_3_PA.OQ234674.1.no_frameshifts.vcf.gz
tabix RV00831-22-IAV.Segment_3_PA.OQ234674.1.no_frameshifts.vcf.gz
# get low coverage depth mask BED file by filtering for regions with less than 10X
zcat RV00831-22-IAV.Segment_3_PA.OQ234674.1.per-base.bed.gz | awk '$4<10' > low_cov.bed
bcftools consensus \
-f RV00831-22-IAV.Segment_3_PA.OQ234674.1.reference.fasta \
-m low_cov.bed \
RV00831-22-IAV.Segment_3_PA.OQ234674.1.no_frameshifts.vcf.gz > RV00831-22-IAV.Segment_3_PA.OQ234674.1.bcftools.consensus.fasta
sed -i -E "s/^>(.*)/>RV00831-22-IAV_3_PA/g" RV00831-22-IAV.Segment_3_PA.OQ234674.1.bcftools.consensus.fasta
cat <<-END_VERSIONS > versions.yml
"NF_FLU:NANOPORE:BCF_CONSENSUS":
bcftools: $(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*$//')
END_VERSIONS
Command exit status:
255
Command output:
(empty)
Command error:
Note: the --sample option not given, applying all records regardless of the genotype
The fasta sequence does not match the REF allele at OQ234674.1:1946:
REF .vcf: [TATTCAATAGCCTATATGCATCACCACAATTGGAAGGANTTTCAGCAGAGTC]
ALT .vcf: [T]
REF .fa : [TATTCAATAGCCTATATGCATCACCACAATTGGAAGGAYTTTCAGCAGAGTC]AAGAAAACTGCTCCTTATTGTTCAGGCTCTTAGGGACAAACTCGAACCTGGGACTTTTGATCTTGGGGGGCTATATGAAGCAATTGAGGAGTGCCTGATTAATGATCCCTGGGTTTTGCTCAATGCGTCTTGGTTCAACTCCTTCCTGACACATGCACTAAAATAGTTATAGCAGTGCTACTATTTGTTATCCGTACTGTCCAAAAAAGTA
Work dir:
/work/b6/73a925fff13d66411ea7d7776773c1
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
-- Check '.nextflow.log' file for details
v3.3.2; revision: 91a5d05f86
No response
23.04.1
openjdk version "20-internal" 2023-03-21
OpenJDK Runtime Environment (build 20-internal-adhoc..src)
OpenJDK 64-Bit Server VM (build 20-internal-adhoc..src, mixed mode, sharing)
HPC Cluster
No response
None
slurm-27454074_AfterClair3Change.txt
slurm-27454201_OriginalError.txt
Instead of assigning H/N subtypes based on the majority subtype among matches with high % identity and alignment length to sequences that have subtype information, the subtyping report shows N/A when user-specified sequences are the top matching sequences, even though an H and N subtype could easily be assigned.
There are also no total counts, proportions matching to certain subtypes, etc.
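A rough sketch of the majority-vote behaviour described above, assuming the subtype of each top match is available as a plain string and that user-specified sequences without subtype information show up as "N/A". The function name and the returned fields are hypothetical and do not reflect the actual report code.

```python
from collections import Counter


def majority_subtype(subtypes: list[str]) -> dict:
    """Summarise subtype calls from the top matches for one segment.

    Empty or "N/A" entries (e.g. user sequences without metadata) are
    ignored when picking the majority call but are counted in the total.
    """
    known = [s for s in subtypes if s and s != "N/A"]
    counts = Counter(known)
    if not counts:
        return {"subtype": "N/A", "total": len(subtypes), "counts": {}, "proportion": 0.0}
    subtype, n = counts.most_common(1)[0]
    return {
        "subtype": subtype,
        "total": len(subtypes),
        "counts": dict(counts),
        # Proportion of matches with a known subtype that agree with the call.
        "proportion": n / len(known),
    }


# Hypothetical H subtypes for the top HA matches of one sample; the best hit
# is a user-specified sequence with no subtype information.
summary = majority_subtype(["N/A", "H5", "H5", "H5", "H7"])
print(summary)
```

With this approach a user sequence as the top hit no longer forces N/A, and the counts and proportion give the totals that the report currently lacks.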
nextflow run CFIA-NCFAD/nf-flu --input samplesheet.csv --platform nanopore -profile docker --ref_db DB.fasta -resume -r 3.3.2
N/A
3.3.2
local
23.04.2
No response
desktop
No response
Docker
No response
Conda/Mamba process envs not being created.
nextflow run CFIA-NCFAD/nf-flu -r 3.3.0 -profile <conda/mamba> ...
"IRMA" (and other process tools) command not found
3.3.0
local
22.04.5, build 5708 (15-07-2022 16:09 UTC)
No response
desktop
No response
Conda
No response
Related to #53
See #53 (comment)
Users should be able to submit any sequences in FASTA format and the workflow should figure out if those sequences are valid input. The sequence names should NOT require any special formatting. Currently _Segment is required in the user-provided sequence name:
nf-flu/bin/get_blastn_report.py
Line 83 in bdc8942
But it is not necessary.
The full user sequence name should be preserved not dropped:
nf-flu/bin/get_blastn_report.py
Line 86 in bdc8942
The FASTA record description or comment should be preserved instead of stripped away/ignored:
Lines 27 to 29 in bdc8942
The code should be:
seqid, desc, seq = rec.id, rec.description, rec.seq
# replace non-word, non-digit, non-period or dash characters
new_seqid = re.sub(r'[^\w.\-]+', '_', seqid)
# remove leading and trailing underscores
new_seqid = re.sub(r'^_|_$', '', new_seqid)
# preserve seq description and document changes in FULL seq name
seq_name = f'{seqid}{" " + desc if desc else ""}'
new_seq_name = f'{new_seqid}{" " + desc if desc else ""}'
A subworkflow should be created to handle validation of user-specified sequences to ensure that they are valid input
-{seq index}
Variant calling results are already produced and could simply be tabulated with bcftools stats to compute the number of SNPs, MNPs and indels, combined with depth info.
Edlib global alignment would be more appropriate for genome to genome comparison than BLAST local alignment, which can easily show inaccurate mismatch or gap results for a gene segment if there are coverage issues (e.g. no coverage in the middle of a gene segment leading to 2 BLAST alignments).
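To illustrate why a global comparison is more informative here, the sketch below computes an end-to-end edit distance with plain dynamic programming. This is a dependency-free stand-in for what Edlib computes in its global mode, not Edlib itself, and the sequences are made-up examples.

```python
def edit_distance(a: str, b: str) -> int:
    """Global (end-to-end) edit distance between a and b via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a base from a
                            curr[j - 1] + 1,      # insert a base into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


# Made-up reference and a consensus missing the middle five bases,
# mimicking a coverage dropout in the middle of a gene segment.
reference = "ATGGC" + "GTACG" + "TTAGC"
consensus = "ATGGC" + "TTAGC"
distance = edit_distance(reference, consensus)
print(distance)
```

A local aligner could report the two flanks as separate perfect alignments, while the global distance accounts for the missing bases in a single number. The quadratic cost of this naive version is tolerable for segment-sized sequences, but Edlib would be far faster.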
Unable to create subtyping report due to NoDataError: empty CSV in parse_influenza_blast_results.py. One of the BLAST results files from the BLAST search against user ref seqs was empty, causing Polars to throw an exception. This type of thing should be handled more gracefully. The previous version 3.1.6 did not produce this error.
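One way to handle this more gracefully would be to set aside zero-byte BLAST result files before they reach the CSV parser. This is a minimal sketch using only the standard library; the function name split_empty_results and the file names are hypothetical, and how the real script should report skipped samples is left open.

```python
import tempfile
from pathlib import Path


def split_empty_results(paths: list[Path]) -> tuple[list[Path], list[Path]]:
    """Split BLAST tabular result files into (non-empty, empty) lists."""
    non_empty, empty = [], []
    for path in paths:
        # A zero-byte file means BLAST wrote no hits for that sample.
        (non_empty if path.stat().st_size > 0 else empty).append(path)
    return non_empty, empty


with tempfile.TemporaryDirectory() as tmp:
    hits = Path(tmp) / "SAMPLE-A.blastn.txt"
    hits.write_text("q1\ts1\t99.5\n")
    no_hits = Path(tmp) / "SAMPLE-B.blastn.txt"
    no_hits.touch()
    to_parse, skipped = split_empty_results([hits, no_hits])
    print("parse:", [p.name for p in to_parse])
    print("skip:", [p.name for p in skipped])
```

Only the files in the first list would then be passed to the parser, and the skipped samples could be listed in the report as having no BLAST hits.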
nextflow run CFIA-NCFAD/nf-flu -r 3.2.0 --input samplesheet.csv --platform nanopore -profile docker --ref_db DB.fasta
Error executing process > 'NF_FLU:NANOPORE:SUBTYPING_REPORT_BCF_CONSENSUS'
Caused by:
Process `NF_FLU:NANOPORE:SUBTYPING_REPORT_BCF_CONSENSUS` terminated with an error exit status (1)
Command executed:
parse_influenza_blast_results.py \
--threads 1 \
--flu-metadata genomeset.dat.gz \
--top 3 \
--excel-report iav-subtyping-report.xlsx \
--pident-threshold 0.85 \
SAMPLE-0239-2.blastn.txt SAMPLE-0239-1.blastn.txt SAMPLE-1370.blastn.txt SAMPLE-0238.blastn.txt SAMPLE-0233.blastn.txt SAMPLE-1096.blastn.txt SAMPLE-0052.blastn.txt
ln -s .command.log parse_influenza_blast_results.log
cat <<-END_VERSIONS > versions.yml
"NF_FLU:NANOPORE:SUBTYPING_REPORT_BCF_CONSENSUS":
python: $(python --version | sed 's/Python //g')
END_VERSIONS
Command exit status:
1
Command output:
(empty)
Command error:
│ │ │ ('mismatch', UInt16), │ │
│ │ │ ('gapopen', UInt16), │ │
│ │ │ ('qstart', UInt16), │ │
│ │ │ ('qend', UInt16), │ │
│ │ │ ('sstart', UInt16), │ │
│ │ │ ('send', UInt16), │ │
│ │ │ ... +6 │ │
│ │ ] │ │
│ │ dtypes = { │ │
│ │ │ 'qaccver': , │ │
│ │ │ 'saccver': , │ │
│ │ │ 'pident': , │ │
│ │ │ 'length': UInt16, │ │
│ │ │ 'mismatch': UInt16, │ │
│ │ │ 'gapopen': UInt16, │ │
│ │ │ 'qstart': UInt16, │ │
│ │ │ 'qend': UInt16, │ │
│ │ │ 'sstart': UInt16, │ │
│ │ │ 'send': UInt16, │ │
│ │ │ ... +6 │ │
│ │ } │ │
│ │ encoding = 'utf8' │ │
│ │ eol_char = '\n' │ │
│ │ has_header = False │ │
│ │ ignore_errors = False │ │
│ │ infer_schema_length = 100 │ │
│ │ k = 'stitle' │ │
│ │ low_memory = False │ │
│ │ missing_utf8_is_empty_string = False │ │
│ │ n_rows = None │ │
│ │ null_values = None │ │
│ │ processed_null_values = None │ │
│ │ quote_char = '"' │ │
│ │ rechunk = True │ │
│ │ row_count_name = None │ │
│ │ row_count_offset = 0 │ │
│ │ self = │ │
│ │ separator = '\t' │ │
│ │ skip_rows = 0 │ │
│ │ skip_rows_after_header = 0 │ │
│ │ source = 'SAMPLE-0239-2.blastn.txt' │ │
│ │ try_parse_dates = False │ │
│ │ v = │ │
│ │ with_column_names = .with_column_names at │ │
│ │ 0x7f8596a0d1b0> │ │
│ ╰──────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────╯
NoDataError: empty CSV
3.2.0
local
22.04.5, build 5708 (15-07-2022 16:09 UTC)
No response
local
No response
Docker
No response
Hello,
I have come across an edge case related to the sequencing of negative control samples.
On occasion, the read set for an NTC sample exceeds the minimum number of reads (default: 100) and is allowed to be processed by the pipeline. The issue is related to IRMA and how it handles samples with low numbers of reads.
The pipeline will crash when IRMA has an output as follows:
Loading config file 'irma_config.sh'
[2023-09-11 14:32:12] IRMA/FLU-minion started run 'GEN23-RTPCR-0829-NTC-3-1b'
[2023-09-11 14:32:12] IRMA/FLU-minion found 1306.5T free space, only needed ~6.0M
[2023-09-11 14:32:12] IRMA/FLU-minion pre-processed
[2023-09-11 14:32:12] IRMA/FLU-minion R1 started (253)
[2023-09-11 14:32:13] IRMA/FLU-minion R1 all-match with BLAT finished
[2023-09-11 14:32:13] IRMA/FLU-minion R1 consolidated & cleaned
[2023-09-11 14:32:13] IRMA/FLU-minion R1 aborted, no matches found
[2023-09-11 14:32:13] IRMA/FLU-minion converted back to fastq
[2023-09-11 14:32:13] IRMA/FLU-minion saved unmatched read patterns
[2023-09-11 14:32:13] IRMA/FLU-minion skipping final assembly, no reference files found
[2023-09-11 14:32:15] IRMA/FLU-minion moving project
[2023-09-11 14:32:15] IRMA/FLU-minion finished!
Essentially there are no influenza reads, so it doesn't generate any outputs, which appears to affect lines 38-42 in the irma.nf file:
IRMA $irma_module $reads $meta.id
if [ -d "${meta.id}/amended_consensus/" ]; then
cat ${meta.id}/amended_consensus/*.fa > ${meta.id}.irma.consensus.fasta
fi
Resulting in the following error that crashes the pipeline:
cat: can't open 'GEN23-RTPCR-0829-NTC-3-1b/amended_consensus/*.fa': No such file or directory
However, when IRMA encounters a sample with, I'm assuming, at least one RP or read, the outputs generated are enough to allow the pipeline to proceed:
Loading config file 'irma_config.sh'
[2023-09-11 14:32:18] IRMA/FLU-minion started run 'GEN23-RTPCR-0829-NTC-3-1a'
[2023-09-11 14:32:18] IRMA/FLU-minion found 1306.5T free space, only needed ~5.7M
[2023-09-11 14:32:18] IRMA/FLU-minion pre-processed
[2023-09-11 14:32:18] IRMA/FLU-minion R1 started (164)
[2023-09-11 14:32:18] IRMA/FLU-minion R1 all-match with BLAT finished
[2023-09-11 14:32:19] IRMA/FLU-minion R1 consolidated & cleaned
[2023-09-11 14:32:19] IRMA/FLU-minion R1 sorted using BLAT
[2023-09-11 14:32:19] IRMA/FLU-minion R1 aborted, found fewer than 3 RPs or 3 reads for all templates
[2023-09-11 14:32:19] IRMA/FLU-minion moving project
[2023-09-11 14:32:19] IRMA/FLU-minion finished!
For now, to mitigate this problem, I added errorStrategy = 'ignore' to the modules_nanopore.config:
withName: 'IRMA' {
// increased job time limit for IRMA process to accommodate large samples
time = '24h'
errorStrategy = 'ignore'
publishDir = [
[
path: { "${params.outdir}/irma"},
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
mode: params.publish_dir_mode
],
[
path: { "${params.outdir}/consensus/irma/" },
pattern: "*.irma.consensus.fasta",
mode: params.publish_dir_mode
]
]
}
However, I'm not sure if this is the most robust way around this particular issue.
Thank you very much for your support.
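An alternative to ignoring every IRMA error would be to guard the concatenation step itself, so that an amended_consensus directory without any .fa files is treated as "no consensus" rather than a failure. The module is a shell script, so the Python below only illustrates the guard logic; the function name is hypothetical, and it assumes (from the cat error above) that the directory exists but holds no .fa files.

```python
import tempfile
from pathlib import Path


def concat_consensus(sample_dir: Path, out_fasta: Path) -> bool:
    """Concatenate amended_consensus/*.fa into out_fasta if any exist.

    Returns True when a consensus FASTA was written, False otherwise.
    """
    consensus_dir = sample_dir / "amended_consensus"
    fasta_files = sorted(consensus_dir.glob("*.fa")) if consensus_dir.is_dir() else []
    if not fasta_files:
        # Directory missing or empty: nothing to write, but not an error.
        return False
    with out_fasta.open("w") as out:
        for fasta in fasta_files:
            out.write(fasta.read_text())
    return True


with tempfile.TemporaryDirectory() as tmp:
    sample_dir = Path(tmp) / "NTC-sample"
    (sample_dir / "amended_consensus").mkdir(parents=True)
    # Empty directory, as in the failing NTC run: no output, no exception.
    wrote_empty = concat_consensus(sample_dir, Path(tmp) / "NTC.irma.consensus.fasta")
    (sample_dir / "amended_consensus" / "A_HA.fa").write_text(">A_HA\nACGT\n")
    wrote_full = concat_consensus(sample_dir, Path(tmp) / "NTC.irma.consensus.fasta")
    print(wrote_empty, wrote_full)
```

Compared with errorStrategy = 'ignore', a guard like this would still let genuine IRMA failures stop the run.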
nextflow nf-flu_v3.3.4/cfia-ncfad-nf-flu-3.3.4/workflow/main.nf --input IRVC20230831IHN-1.csv --platform nanopore --outdir IRVC20230831IHN-1_nf-flu_results --major_allele_fraction 0.25 -profile singularity,slurm -resume
ERROR ~ Error executing process > 'NF_FLU:NANOPORE:IRMA (GEN23-RTPCR-0829-NTC-3-1b)'
Caused by:
Process `NF_FLU:NANOPORE:IRMA (GEN23-RTPCR-0829-NTC-3-1b)` terminated with an error exit status (1)
Command executed:
touch irma_config.sh
echo 'SINGLE_LOCAL_PROC=16' >> irma_config.sh
echo 'DOUBLE_LOCAL_PROC=8' >> irma_config.sh
# default tmp in current working directory instead of defaulting to /tmp
# which may be restricted in size on HPC clusters
echo 'ALLOW_TMP=1' >> irma_config.sh
echo 'TMP=$PWD' >> irma_config.sh
if [ true ]; then
echo 'DEL_TYPE="NNN"' >> irma_config.sh
echo 'ALIGN_PROG="BLAT"' >> irma_config.sh
fi
IRMA FLU-minion GEN23-RTPCR-0829-NTC-3-1b.merged.fastq.gz GEN23-RTPCR-0829-NTC-3-1b
if [ -d "GEN23-RTPCR-0829-NTC-3-1b/amended_consensus/" ]; then
cat GEN23-RTPCR-0829-NTC-3-1b/amended_consensus/*.fa > GEN23-RTPCR-0829-NTC-3-1b.irma.consensus.fasta
fi
ln -s .command.log GEN23-RTPCR-0829-NTC-3-1b.irma.log
cat <<-END_VERSIONS > versions.yml
"NF_FLU:NANOPORE:IRMA":
IRMA: $(IRMA | head -n1 | sed -E 's/^Iter.*IRMA\), v(\S+) .*/\1/')
END_VERSIONS
Command exit status:
1
Command output:
Loading config file 'irma_config.sh'
[2023-09-11 16:06:21] IRMA/FLU-minion started run 'GEN23-RTPCR-0829-NTC-3-1b'
[2023-09-11 16:06:21] IRMA/FLU-minion found 1305.9T free space, only needed ~6.0M
[2023-09-11 16:06:21] IRMA/FLU-minion pre-processed
[2023-09-11 16:06:21] IRMA/FLU-minion R1 started (253)
[2023-09-11 16:06:22] IRMA/FLU-minion R1 all-match with BLAT finished
[2023-09-11 16:06:23] IRMA/FLU-minion R1 consolidated & cleaned
[2023-09-11 16:06:23] IRMA/FLU-minion R1 aborted, no matches found
[2023-09-11 16:06:23] IRMA/FLU-minion converted back to fastq
[2023-09-11 16:06:23] IRMA/FLU-minion saved unmatched read patterns
[2023-09-11 16:06:23] IRMA/FLU-minion skipping final assembly, no reference files found
[2023-09-11 16:06:24] IRMA/FLU-minion moving project
[2023-09-11 16:06:24] IRMA/FLU-minion finished!
Command error:
cat: can't open 'GEN23-RTPCR-0829-NTC-3-1b/amended_consensus/*.fa': No such file or directory
3.3.4, revision: bda4dc7
No response
23.04.1
openjdk version "17.0.3-internal" 2022-04-19
OpenJDK Runtime Environment (build 17.0.3-internal+0-adhoc..src)
OpenJDK 64-Bit Server VM (build 17.0.3-internal+0-adhoc..src, mixed mode, sharing)
HPC Cluster
Distributor ID: CentOS Description: CentOS Linux release 7.9.2009 (Core) Release: 7.9.2009 Codename: Core
Singularity
On running our own flu samples on the latest nf-flu release, we're getting the following error:
ComputeError: ValueError: Remapping keys for map_dict could not be converted to
Utf8 without losing values in the conversion.
which seems to happen when querying the metadata file with polars.
I can't share the .gz files that we ran, but these same files ran successfully on previous nf-flu versions.
Your test samples run fine on the same versions.
nextflow run main.nf -config ~/conf/credentials.config -profile docker --input ~/samplesheets/test1.csv --max_memory 9.GB --max_cpus 6
Oct-26 13:01:28.571 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 9; name: NF_FLU:ILLUMINA:SUBTYPING_REPORT (1); status: COMPLETED; exit: 1; error: -; workDir: /nf-flu/work/95/e1a429083f7a0cca807a0a059be076]
Oct-26 13:01:28.571 [Task monitor] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
task: name=NF_FLU:ILLUMINA:SUBTYPING_REPORT (1); work-dir=/nf-flu/work/95/e1a429083f7a0cca807a0a059be076
error [nextflow.exception.ProcessFailedException]: Process `NF_FLU:ILLUMINA:SUBTYPING_REPORT (1)` terminated with an error exit status (1)
Oct-26 13:01:28.579 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'NF_FLU:ILLUMINA:SUBTYPING_REPORT (1)'
Caused by:
Process `NF_FLU:ILLUMINA:SUBTYPING_REPORT (1)` terminated with an error exit status (1)
Command executed:
parse_influenza_blast_results.py \
--flu-metadata 2023-06-14-NCBI-Viruses-Orthomyxoviridae_utf8-influenza.csv \
--top 3 \
--excel-report nf-flu-subtyping-report.xlsx \
--pident-threshold 0.85 \
--samplesheet samplesheet.fixed.csv \
442878.blastn.txt
ln -s .command.log parse_influenza_blast_results.log
cat <<-END_VERSIONS > versions.yml
"NF_FLU:ILLUMINA:SUBTYPING_REPORT":
python: $(python --version | sed 's/Python //g')
END_VERSIONS
Command exit status:
1
Command output:
(empty)
Command error:
│ │ │ ┆ ┆ .1 ┆ ┆ ┆ sapiens │ │
│ │ ┆ Missouri ┆ ┆ │ │ │
│ │ │ … ┆ … ┆ … ┆ … ┆ … ┆ … │ │
│ │ ┆ … ┆ … ┆ … │ │ │
│ │ │ 442878 ┆ 7 ┆ OQ462477 ┆ H1N1 ┆ … ┆ Homo │ │
│ │ ┆ USA: ┆ 2022-12-05 ┆ 2023-02-26 │ │ │
│ │ │ ┆ ┆ .1 ┆ ┆ ┆ sapiens │ │
│ │ ┆ Maryland ┆ ┆ │ │ │
│ │ │ 442878 ┆ 8 ┆ OX422577 ┆ nan ┆ … ┆ Homo │ │
│ │ ┆ United ┆ 2022-12-20 ┆ 2023-02-10 │ │ │
│ │ │ ┆ ┆ .1 ┆ ┆ ┆ sapiens │ │
│ │ ┆ Kingdom ┆ ┆ │ │ │
│ │ │ 442878 ┆ 8 ┆ OX436875 ┆ nan ┆ … ┆ Homo │ │
│ │ ┆ United ┆ 2023-01-01 ┆ 2023-02-21 │ │ │
│ │ │ ┆ ┆ .1 ┆ ┆ ┆ sapiens │ │
│ │ ┆ Kingdom ┆ ┆ │ │ │
│ │ │ 442878 ┆ 8 ┆ OX442452 ┆ nan ┆ … ┆ Homo │ │
│ │ ┆ United ┆ 2023-01-01 ┆ 2023-03-03 │ │ │
│ │ │ ┆ ┆ .1 ┆ ┆ ┆ sapiens │ │
│ │ ┆ Kingdom ┆ ┆ │ │ │
│ │ └────────┴────────────┴──────────┴──────────┴───┴─────────… │ │
│ ╰──────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /usr/local/lib/python3.10/site-packages/polars/lazyframe/frame.py:1606 in │
│ collect │
│ │
│ 1603 │ │ │ common_subplan_elimination, │
│ 1604 │ │ │ streaming, │
│ 1605 │ │ ) │
│ ❱ 1606 │ │ return wrap_df(ldf.collect()) │
│ 1607 │ │
│ 1608 │ def sink_parquet( │
│ 1609 │ │ self, │
│ │
│ ╭───────────────────────────────── locals ─────────────────────────────────╮ │
│ │ common_subplan_elimination = False │ │
│ │ ldf = <builtins.PyLazyFrame object at │ │
│ │ 0x7f12d0dd9530> │ │
│ │ no_optimization = True │ │
│ │ predicate_pushdown = False │ │
│ │ projection_pushdown = False │ │
│ │ self = <polars.LazyFrame object at 0x7F12D0D5ED10> │ │
│ │ simplify_expression = True │ │
│ │ slice_pushdown = False │ │
│ │ streaming = False │ │
│ │ type_coercion = True │ │
│ ╰──────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────╯
ComputeError: ValueError: Remapping keys for map_dict could not be converted to
Utf8 without losing values in the conversion.
v3.3.5 28183a8
local
23.10.0
openjdk version "18.0.2-ea" 2022-07-19
OpenJDK Runtime Environment (build 18.0.2-ea+9-Ubuntu-222.04)
OpenJDK 64-Bit Server VM (build 18.0.2-ea+9-Ubuntu-222.04, mixed mode, sharing)
Desktop
Ubuntu 22.04.2 LTS (GNU/Linux 5.15.90.1-microsoft-standard-WSL2 x86_64)
Docker
No response
IRMA assembly will silently fail when Illumina paired-end reads don't have "1:N:..."/"2:N:..." in their read headers. Reads straight off an Illumina sequencer will not have this problem, but reads extracted from NCBI SRA files with fasterq-dump, some modified/filtered reads and synthetic reads may have issues.
IRMA will log "Irregular header for fastQ read pairs" and output empty consensus sequences.
A process should be added to append "1:N:..."/"2:N:..." to forward/reverse reads if those reads do not have that text in their read headers. Basically these command-lines wrapped in a Nextflow process:
# if paired-end Illumina reads, check for 1:N... and 2:N
# (grep -c exits non-zero when the count is 0, so guard it with || true)
fwd_reads_N_count=$(zcat ${reads[0]} | grep -c "^@.* [12]:N:.*" || true)
rev_reads_N_count=$(zcat ${reads[1]} | grep -c "^@.* [12]:N:.*" || true)
if [[ $fwd_reads_N_count == 0 && $rev_reads_N_count == 0 ]]; then
# only edit header lines (every 4th line starting at line 1, GNU sed),
# since quality lines may also start with @
zcat ${reads[0]} | sed -r '1~4 s/^(@.*)/\1 1:N:0./' | pigz -ck > ${meta.id}_R1.fixed.fastq.gz
zcat ${reads[1]} | sed -r '1~4 s/^(@.*)/\1 2:N:0./' | pigz -ck > ${meta.id}_R2.fixed.fastq.gz
else
# reads okay, symlink?
:
fi
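The same header fix can be sketched in Python for clarity. This is an illustration under stated assumptions, not the proposed process: it assumes plain four-line FASTQ records, reuses the " 1:N:0." suffix from the commands above, and decides per read rather than per file. The name tag_headers is hypothetical.

```python
import re
from typing import Iterable, Iterator

# Header that already carries an Illumina-style " 1:N:..." or " 2:N:..." comment.
HAS_READ_TAG = re.compile(r"^@.* [12]:N:")


def tag_headers(lines: Iterable[str], read_number: int) -> Iterator[str]:
    """Yield FASTQ lines, appending ' <read_number>:N:0.' to untagged headers.

    Only every 4th line (the header of each record) is considered, so
    quality lines that happen to start with '@' are left untouched.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 == 0 and not HAS_READ_TAG.match(line):
            line = f"{line} {read_number}:N:0."
        yield line


# One untagged record whose quality string starts with '@'.
record = ["@read1", "ACGT", "+", "@III"]
fixed = list(tag_headers(record, 1))
print(fixed)
```

Working on record boundaries avoids rewriting quality lines, which is the main risk with a pattern that matches any line beginning with @.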
Hello,
On rare occasion we have sequencing runs comprising relatively few samples resulting in large input .fastq files (>2GB compressed) that evidently cause IRMA to fail. I have attempted to modify the "base.config" file to increase the "withLabel:process_high" parameter (which governs the IRMA module) to >32GB to mitigate this issue
e.g.,
However, the IRMA command executed doesn't appear to be influenced by changing parameters in the base.config:
e.g.,
Of course it is possible to down sample the reads, though it would be preferable if I didn't have to do that. I'm not sure if there are any other levers I can pull within the pipeline to overcome this issue. Any help would be appreciated.
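Note that the IRMA log in this report complains about space available on disk rather than memory, so raising the memory for process_high may not help. A quick way to check the free space of a candidate working or temporary directory is sketched below; which filesystem IRMA actually inspects and the exact unit behind its "M" figures are assumptions here. The IRMA command in a later report in this listing sets ALLOW_TMP=1 and TMP=$PWD in irma_config.sh, which suggests the temporary directory location matters.

```python
import shutil


def has_free_space(path: str, needed_mb: float) -> bool:
    """Return True if the filesystem holding `path` has at least needed_mb MiB free."""
    free_mb = shutil.disk_usage(path).free / (1024 * 1024)
    return free_mb >= needed_mb


# 38765.72 is the requirement IRMA reported in the log for this run.
enough_here = has_free_space(".", 38765.72)
print(enough_here)
```

Running the check against both /tmp and the Nextflow work directory on a compute node would show which location is the limiting one.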
sbatch -c 2 --mem=4GB -p OutbreakResponse --wrap="nextflow ${WORKFLOW_DIR} --input ${INPUT_SHEET} --platform ${PLATFORM} ${DATABASE} --outdir ${OUTDIR} -profile singularity,slurm -resume"
Note: platform = nanopore, and in this particular case, no user-defined database was used.
Oops... Pipeline execution stopped with the following message: Loading config file 'irma_config.sh'
[2023-05-15 10:16:54] IRMA/FLU-minion started run 'GEN23-0018-neat'
[2023-05-15 10:16:54] IRMA/FLU-minion ERROR: needed ~38765.72M to execute, but only 26447.94M available on disk
[2023-05-15 10:16:54] IRMA/FLU-minion ABORTED run: GEN23-0018-neat
[f7/e4e21a] NOTE: Process `NF_FLU:NANOPORE:IRMA (GEN23-0018-neat)` terminated with an error exit status (1) -- Execution is retried (1)
[ed/a49ef6] NOTE: Process `NF_FLU:NANOPORE:IRMA (GEN23-0018-neat)` terminated with an error exit status (1) -- Execution is retried (2)
Error executing process > 'NF_FLU:NANOPORE:IRMA (GEN23-0018-neat)'
Caused by:
Process `NF_FLU:NANOPORE:IRMA (GEN23-0018-neat)` terminated with an error exit status (1)
Command executed:
touch irma_config.sh
echo 'SINGLE_LOCAL_PROC=16' >> irma_config.sh
echo 'DOUBLE_LOCAL_PROC=8' >> irma_config.sh
if [ true ]; then
echo 'DEL_TYPE="NNN"' >> irma_config.sh
echo 'ALIGN_PROG="BLAT"' >> irma_config.sh
fi
IRMA FLU-minion GEN23-0018-neat.merged.fastq.gz GEN23-0018-neat
if [ -d "GEN23-0018-neat/amended_consensus/" ]; then
cat GEN23-0018-neat/amended_consensus/*.fa > GEN23-0018-neat.irma.consensus.fasta
fi
ln -s .command.log GEN23-0018-neat.irma.log
cat <<-END_VERSIONS > versions.yml
"NF_FLU:NANOPORE:IRMA":
IRMA: $(IRMA | head -n1 | sed -E 's/^Iter.*IRMA\), v(\S+) .*/\1/')
END_VERSIONS
Command exit status:
1
Command output:
Loading config file 'irma_config.sh'
[2023-05-15 10:16:54] IRMA/FLU-minion started run 'GEN23-0018-neat'
[2023-05-15 10:16:54] IRMA/FLU-minion ERROR: needed ~38765.72M to execute, but only 26447.94M available on disk
[2023-05-15 10:16:54] IRMA/FLU-minion ABORTED run: GEN23-0018-neat
Command wrapper:
Loading config file 'irma_config.sh'
[2023-05-15 10:16:54] IRMA/FLU-minion started run 'GEN23-0018-neat'
[2023-05-15 10:16:54] IRMA/FLU-minion ERROR: needed ~38765.72M to execute, but only 26447.94M available on disk
[2023-05-15 10:16:54] IRMA/FLU-minion ABORTED run: GEN23-0018-neat
Work dir:
/path/to/workdir/IRVC20230417IHN_analysis/20230417_samples_nf-flu_results/work/a5/67d3b6c885164cd99afcf1598e54ef
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
3.1.2; revision: 9473cbaed9
slurm
22.10.1
openjdk version "17.0.3-internal" 2022-04-19
OpenJDK Runtime Environment (build 17.0.3-internal+0-adhoc..src)
OpenJDK 64-Bit Server VM (build 17.0.3-internal+0-adhoc..src, mixed mode, sharing)
HPC Cluster
Distributor ID: CentOS Description: CentOS Linux release 7.9.2009 (Core) Release: 7.9.2009 Codename: Core
Singularity
No response
Hello,
When performing analysis on a run today, I got an error on the sub-typing report generation step near the end of the workflow. The error said "NoDataError: empty CSV" and this was a sample that did not amplify well at all, so I assume the issue is that there is poor quality data associated with the sample and this caused an issue for sub-typing report generation. When I repeated the analysis with v3.1.6, the workflow went to completion with no problem (I ran it exactly the same except with -r 3.1.6).
Thanks,
Mat
nextflow run CFIA-NCFAD/nf-flu -r 3.2.0 --input samplesheet_20230707_all.csv --platform nanopore -profile docker --ref_db /Zarls/users/Mat/2021-22_AIV-Outbreak_WGS-DB/230630_AIV-Outbreak-DB.fasta --outdir results_all_final
Workflow execution completed unsuccessfully!
The exit status of the task that caused the workflow execution to fail was: 1.
The full error message was:
Error executing process > 'NF_FLU:NANOPORE:SUBTYPING_REPORT_BCF_CONSENSUS'
Caused by:
Process `NF_FLU:NANOPORE:SUBTYPING_REPORT_BCF_CONSENSUS` terminated with an error exit status (1)
Command executed:
parse_influenza_blast_results.py \
--threads 1 \
--flu-metadata genomeset.dat.gz \
--top 3 \
--excel-report iav-subtyping-report.xlsx \
--pident-threshold 0.85 \
WIN-AH-2023-FAV-0239-2-OS.blastn.txt WIN-AH-2023-FAV-0239-1-OS.blastn.txt WIN-AH-2022-FAV-1370-5-1ce4dpi-rpt.blastn.txt WIN-AH-2023-FAV-0238-OS.blastn.txt WIN-AH-2023-FAV-0233-1ce2dpi-rpt.blastn.txt WIN-AH-2022-FAV-1096-14-1ce4dpi.blastn.txt WIN-AH-2023-OTH-0052-6-OS.blastn.txt
ln -s .command.log parse_influenza_blast_results.log
cat <<-END_VERSIONS > versions.yml
"NF_FLU:NANOPORE:SUBTYPING_REPORT_BCF_CONSENSUS":
python: $(python --version | sed 's/Python //g')
END_VERSIONS
Command exit status:
1
Command output:
(empty)
Command error:
│ │ │ ('mismatch', UInt16), │ │
│ │ │ ('gapopen', UInt16), │ │
│ │ │ ('qstart', UInt16), │ │
│ │ │ ('qend', UInt16), │ │
│ │ │ ('sstart', UInt16), │ │
│ │ │ ('send', UInt16), │ │
│ │ │ ... +6 │ │
│ │ ] │ │
│ │ dtypes = { │ │
│ │ │ 'qaccver': , │ │
│ │ │ 'saccver': , │ │
│ │ │ 'pident': , │ │
│ │ │ 'length': UInt16, │ │
│ │ │ 'mismatch': UInt16, │ │
│ │ │ 'gapopen': UInt16, │ │
│ │ │ 'qstart': UInt16, │ │
│ │ │ 'qend': UInt16, │ │
│ │ │ 'sstart': UInt16, │ │
│ │ │ 'send': UInt16, │ │
│ │ │ ... +6 │ │
│ │ } │ │
│ │ encoding = 'utf8' │ │
│ │ eol_char = '\n' │ │
│ │ has_header = False │ │
│ │ ignore_errors = False │ │
│ │ infer_schema_length = 100 │ │
│ │ k = 'stitle' │ │
│ │ low_memory = False │ │
│ │ missing_utf8_is_empty_string = False │ │
│ │ n_rows = None │ │
│ │ null_values = None │ │
│ │ processed_null_values = None │ │
│ │ quote_char = '"' │ │
│ │ rechunk = True │ │
│ │ row_count_name = None │ │
│ │ row_count_offset = 0 │ │
│ │ self = │ │
│ │ separator = '\t' │ │
│ │ skip_rows = 0 │ │
│ │ skip_rows_after_header = 0 │ │
│ │ source = 'WIN-AH-2023-FAV-0239-2-OS.blastn.txt' │ │
│ │ try_parse_dates = False │ │
│ │ v = │ │
│ │ with_column_names = .with_column_names at │ │
│ │ 0x7f8596a0d1b0> │ │
│ ╰──────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────╯
NoDataError: empty CSV
Work dir:
/home/CSCScience.ca/mfisher/Desktop/Temp/2023-07-07-AIV-Diagnostic-Nanopore-Rapid96-Miso-Run-657/work/59/ef44f96032897f47586da09bdf1881
Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
v3.2.0
local
22.04.5
No response
Desktop
Ubuntu 18.04.6 LTS
Docker
No response
Hi all,
First, thanks for all the work being done on this pipeline.
I'm having an issue running the pipeline at the SUBTYPING REPORT step where it throws an "Illegal instruction" error.
The same issue happens when using either Docker (24.0.6) or Podman (3.4.4), I haven't tried other containers yet.
I suspect it may be related to a hardware compatibility issue, but I thought I'd post here to see if anyone has come across this as well. The server I am running on is an older Dell T7500 with a 6-core Intel Xeon X5650. (Note: this processor does not have AVX support, which I think may be causing the issue.)
This issue happens with any samples I've run so far, either FluA or FluB.
Thanks in advance!
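To confirm the AVX suspicion on a Linux host, the CPU flags in /proc/cpuinfo can be inspected. The sketch below parses cpuinfo-style text; the sample string is made up for illustration and is not taken from the reporter's machine. One possible cause, not confirmed in this report, is that the Polars build used by the report script was compiled with AVX instructions; Polars also publishes a polars-lts-cpu build aimed at older CPUs.

```python
def cpu_has_flag(cpuinfo_text: str, flag: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists `flag`."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and flag in value.split():
            return True
    return False


# On Linux (x86), the real file can be checked with:
#   cpu_has_flag(open("/proc/cpuinfo").read(), "avx")
# Made-up sample resembling a CPU without AVX.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2 popcnt\n"
print(cpu_has_flag(sample, "avx"))
```

Exit status 132 with "Illegal instruction" would be consistent with a binary using an instruction the CPU lacks, so a negative result here would support the hardware explanation.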
nextflow run CFIA-NCFAD/nf-flu --input test_samplesheet_ab.csv --platform illumina --outdir testruns/test_a -profile podman
ERROR ~ Error executing process > 'NF_FLU:ILLUMINA:SUBTYPING_REPORT (1)'
Caused by:
Process `NF_FLU:ILLUMINA:SUBTYPING_REPORT (1)` terminated with an error exit status (132)
Command executed:
parse_influenza_blast_results.py \
--flu-metadata 41415333-influenza.csv \
--top 3 \
--excel-report nf-flu-subtyping-report.xlsx \
--pident-threshold 0.85 \
--samplesheet samplesheet.fixed.csv \
FluB-pB-040523-MM00001U-Qc.blastn.txt
ln -s .command.log parse_influenza_blast_results.log
cat <<-END_VERSIONS > versions.yml
"NF_FLU:ILLUMINA:SUBTYPING_REPORT":
python: $(python --version | sed 's/Python //g')
END_VERSIONS
Command exit status:
132
Command output:
(empty)
Command error:
.command.sh: line 8: 23 Illegal instruction (core dumped) parse_influenza_blast_results.py --flu-metadata 41415333-influenza.csv --top 3 --excel-report nf-flu-subtyping-report.xlsx --pident-threshold 0.85 --samplesheet samplesheet.fixed.csv FluB-pB-040523-MM00001U-Qc.blastn.txt
Work dir:
/home/vitalite-dev/nf-flu/work/ea/b95127acda6a0a5e626cc44dc3b0a2
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
-- Check '.nextflow.log' file for details
Workflow 3.3.4, revision: bda4dc7
local
23.04.3
openjdk version "11.0.20.1" 2023-08-24
OpenJDK Runtime Environment (build 11.0.20.1+1-post-Ubuntu-0ubuntu122.04)
OpenJDK 64-Bit Server VM (build 11.0.20.1+1-post-Ubuntu-0ubuntu122.04, mixed mode, sharing)
Dell T7500
Ubuntu 22.04
Podman
Data at https://ftp.ncbi.nih.gov/genomes/INFLUENZA/ is out of date. New sequences since 2020-10-13 are available elsewhere on NCBI, e.g. https://ftp.ncbi.nlm.nih.gov/genomes/Viruses/AllNuclMetadata/
nf-flu should be using an updated DB of IAV and IBV seqs from NCBI.
Hi!
I noticed that the check_sample_sheet.py is not offering compatibility with cloud storage paths for the samples (such as az:// for azure or s3:// for AWS buckets). We simply changed line 30:
if p.startswith("http") or p.startswith("ftp"):
to:
if p.startswith("http") or p.startswith("ftp") or p.startswith("az://") or p.startswith("s3://") or p.startswith("gs://"):
and it seems to access the samples successfully.
Thank you!
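The chained checks can be written more compactly, since str.startswith accepts a tuple of prefixes. This is a small sketch of the same test, not the actual code in check_sample_sheet.py; the names REMOTE_PREFIXES and is_remote_path are hypothetical.

```python
# Prefixes treated as remote; the cloud ones are those proposed above.
REMOTE_PREFIXES = ("http", "ftp", "az://", "s3://", "gs://")


def is_remote_path(p: str) -> bool:
    """Return True if p looks like a remote or cloud storage location."""
    return p.startswith(REMOTE_PREFIXES)


print(is_remote_path("s3://bucket/sample_R1.fastq.gz"))
print(is_remote_path("/data/sample_R1.fastq.gz"))
```

Keeping the prefixes in one tuple makes it easier to add further storage schemes later without lengthening the condition.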
Hi @peterk87,
We would like to ask whether the r1041_e82_400bps_sup_g615 model is compatible with the --clair3_variant_model option? Thank you very much!
Best regards,
Eddie
-
-
v3.1.3
No response
21.10.4.5656
No response
Desktop
No response
Docker
No response
Implement dehosting of reads prior to analysis with Kraken2 with a few common host indexes that could be downloaded from a fast place, e.g. human T2T, chicken, pig. Maybe an index with all 3?
Related to #49
Clair3 does not call 16 bp at each end of a reference sequence (HKU-BAL/Clair3#257)
Supplement Clair3 calls with Bcftools mpileup/call?
Hello,
I was wondering about the feasibility of emitting the concatenated, re-named .fastq files into a subdirectory (e.g., fastq_files) within the nf-flu results directory. This would facilitate downstream processes such as uploading to various repositories (e.g., IRIDA).
Thank you for your consideration.
BLAST_BLASTN process.
nextflow run main.nf \
-profile conda \
--use_mamba \
--ncbi_influenza_fasta ${HOME}/code/nf-flu/test_input/db/influenza.fna.gz \
--ncbi_influenza_metadata ${HOME}/code/nf-flu/test_input/db/genomeset.dat.gz \
--input ${HOME}/code/nf-flu/test_input/samplesheet_illumina.csv \
--platform illumina \
--outdir ${HOME}/code/nf-flu/test_output/illumina \
-with-trace ${HOME}/code/nf-flu/trace_illumina.tsv \
-with-report ${HOME}/code/nf-flu/report_illumina.html \
-work-dir ${HOME}/scratch/work-nf-flu
executor > slurm (5)
[aa/85af98] process > NF_FLU:ILLUMINA:GUNZIP_NCBI_FLU_FASTA (influenza.fna.gz) [100%] 1 of 1 ✔
[99/85ad49] process > NF_FLU:ILLUMINA:BLAST_MAKEBLASTDB (influenza.fna) [100%] 1 of 1 ✔
[88/56f970] process > NF_FLU:ILLUMINA:CHECK_SAMPLE_SHEET (1) [100%] 1 of 1 ✔
[- ] process > NF_FLU:ILLUMINA:CAT_FASTQ -
[09/c5a1d9] process > NF_FLU:ILLUMINA:IRMA (ERR3338653) [100%] 1 of 1 ✔
[94/039244] process > NF_FLU:ILLUMINA:BLAST_BLASTN (ERR3338653) [100%] 1 of 1, failed: 1 ✘
[- ] process > NF_FLU:ILLUMINA:SUBTYPING_REPORT -
Oops... Pipeline execution stopped with the following message: BLAST engine error: Warning: Sequence contains no data Warning: Sequence contains no data Warning: Sequence contains no data Warning: Sequence contains no data Warning: Sequence contains no data Warning: Sequence contains no data Warning: Sequence contains no data Warning: Sequence contains no data
Error executing process > 'NF_FLU:ILLUMINA:BLAST_BLASTN (ERR3338653)'
Caused by:
Process `NF_FLU:ILLUMINA:BLAST_BLASTN (ERR3338653)` terminated with an error exit status (3)
Command executed:
DB=`find -L ./ -name "*.ndb" | sed 's/.ndb//'`
blastn \
-num_threads 4 \
-db $DB \
-query ERR3338653.irma.consensus.fasta \
-outfmt "6 qaccver saccver pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen qcovs stitle" -num_alignments 1000000 -evalue 1e-6 \
-out ERR3338653.blastn.txt
cat <<-END_VERSIONS > versions.yml
"NF_FLU:ILLUMINA:BLAST_BLASTN":
blast: $(blastn -version 2>&1 | sed 's/^.*blastn: //; s/ .*$//')
END_VERSIONS
Command exit status:
3
Command output:
(empty)
Command error:
BLAST engine error: Warning: Sequence contains no data Warning: Sequence contains no data Warning: Sequence contains no data Warning: Sequence contains no data Warning: Sequence contains no data Warning: Sequence contains no data Warning: Sequence contains no data Warning: Sequence contains no data
Work dir:
/home/dfornika/scratch/work-nf-flu/94/0392446a487ae16a1cda1728cc211b
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
commit f0eb199
slurm
21.10.4
openjdk version "11.0.18" 2023-01-17 LTS
OpenJDK Runtime Environment (Red_Hat-11.0.18.0.10-2.el8_7) (build 11.0.18+10-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-11.0.18.0.10-2.el8_7) (build 11.0.18+10-LTS, mixed mode, sharing)
HPC cluster
RHEL 8
Conda
The contents of the irma/ERR3338653.irma.consensus.fasta output file are:
>ERR3338653_1
>ERR3338653_2
>ERR3338653_3
>ERR3338653_4
>ERR3338653_5
>ERR3338653_6
>ERR3338653_7
>ERR3338653_8
See also ERR3338653.irma.log
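The consensus file above has eight headers and no sequence lines, which is consistent with BLAST reporting "Sequence contains no data" eight times. A minimal sketch of a pre-BLAST check that lists such empty records is below. The function name `empty_fasta_records` is hypothetical, and this is not how nf-flu currently handles the case.

```python
from typing import List, Optional


def empty_fasta_records(fasta_text: str) -> List[str]:
    """Return the IDs of FASTA records that have a header but no sequence."""
    empty: List[str] = []
    header: Optional[str] = None
    seq_len = 0
    for raw in fasta_text.splitlines():
        line = raw.strip()
        if line.startswith(">"):
            # Close out the previous record before starting a new one.
            if header is not None and seq_len == 0:
                empty.append(header)
            fields = line[1:].split()
            header = fields[0] if fields else ""
            seq_len = 0
        elif line:
            seq_len += len(line)
    # Handle the final record in the file.
    if header is not None and seq_len == 0:
        empty.append(header)
    return empty


print(empty_fasta_records(">ERR3338653_1\n>ERR3338653_2\n"))
```

A workflow could use a check like this to skip BLAST for a sample, or to fail with a clearer message, when every record is empty.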
Hello,
First of all, thank you very much for your efforts to continue to enhance the nf-flu workflow. It is truly appreciated.
I have been testing each release against Illumina and Nanopore datasets for both influenza A and B.
I encountered an error during the subtyping report generation step (logs attached) running nf-flu against an Illumina-based influenza B dataset. I haven't been able to reproduce this error running release 3.3.0 on any other dataset. I'm inclined to think it is related to parsing the specific metadata associated with the BLAST results for this sample.
I'll be sure to update this thread with any new information.
Thank you in advance.
EDIT: It definitely appears to be sample-specific. Trying to isolate offending samples for further investigation.
nextflow nf-flu_v3.3.0/cfia-ncfad-nf-flu-3.3.0/workflow/main.nf --input IRVC20230711_SK_Illumina_fluB_validation_setup.csv --platform illumina --outdir IRVC20230711_SK_Illumina_fluB_validation_setup_nf-flu_results -profile singularity,slurm
[cd/928ea9] process > NF_FLU:ILLUMINA:SUBTYPING_R... [ 50%] 1 of 2, failed: 1...
[- ] process > NF_FLU:ILLUMINA:SOFTWARE_VE... -
[d2/0af9b2] NOTE: Process `NF_FLU:ILLUMINA:SUBTYPING_REPORT` terminated with an error exit status (1) -- Execution is retried (1)
ERROR ~ Error executing process > 'NF_FLU:ILLUMINA:SUBTYPING_REPORT'
Caused by:
Process `NF_FLU:ILLUMINA:SUBTYPING_REPORT` terminated with an error exit status (1)
Command executed:
parse_influenza_blast_results.py \
--flu-metadata 41415333-influenza.csv \
--top 3 \
--excel-report nf-flu-subtyping-report.xlsx \
--pident-threshold 0.85 \
sample36.blastn.txt sample46.blastn.txt sample38.blastn.txt sample28.blastn.txt sample30.blastn.txt sample44.blastn.txt sample26.blastn.txt sample6.blastn.txt sample32.blastn.txt sample40.blastn.txt sample14.blastn.txt sample10.blastn.txt sample22.blastn.txt sample18.blastn.txt sample24.blastn.txt sample42.blastn.txt sample16.blastn.txt sample34.blastn.txt sample8.blastn.txt sample20.blastn.txt sample4.blastn.txt sample2.blastn.txt sample12.blastn.txt
ln -s .command.log parse_influenza_blast_results.log
cat <<-END_VERSIONS > versions.yml
"NF_FLU:ILLUMINA:SUBTYPING_REPORT":
python: $(python --version | sed 's/Python //g')
END_VERSIONS
Command exit status:
1
Command output:
(empty)
Command error:
│ │ ┆ null ┆ null ┆ Sequence │ │ │
│ │ │ ┆ ┆ ┆ ┆ ┆ │ │
│ │ ┆ ┆ ┆ 7703 from │ │ │
│ │ │ ┆ ┆ ┆ ┆ ┆ │ │
│ │ ┆ ┆ ┆ Patent │ │ │
│ │ │ ┆ ┆ ┆ ┆ ┆ │ │
│ │ ┆ ┆ ┆ WO2007… │ │ │
│ │ │ sample6_6 ┆ GN357980.1 ┆ 94.737 ┆ 57 ┆ … ┆ null │ │
│ │ ┆ null ┆ null ┆ Sequence │ │ │
│ │ │ ┆ ┆ ┆ ┆ ┆ │ │
│ │ ┆ ┆ ┆ 7744 from │ │ │
│ │ │ ┆ ┆ ┆ ┆ ┆ │ │
│ │ ┆ ┆ ┆ Patent │ │ │
│ │ │ ┆ ┆ ┆ ┆ ┆ │ │
│ │ ┆ ┆ ┆ WO2007… │ │ │
│ │ │ sample6_6 ┆ GN357979.1 ┆ 93.333 ┆ 60 ┆ … ┆ null │ │
│ │ ┆ null ┆ null ┆ Sequence │ │ │
│ │ │ ┆ ┆ ┆ ┆ ┆ │ │
│ │ ┆ ┆ ┆ 7743 from │ │ │
│ │ │ ┆ ┆ ┆ ┆ ┆ │ │
│ │ ┆ ┆ ┆ Patent │ │ │
│ │ │ ┆ ┆ ┆ ┆ ┆ │ │
│ │ ┆ ┆ ┆ WO2007… │ │ │
│ │ └───────────┴────────────┴────────┴────────┴───┴─────… │ │
│ │ df_type_counts = shape: (0, 3) │ │
│ │ ┌──────────┬────────┬────────┐ │ │
│ │ │ Genotype ┆ counts ┆ N_type │ │ │
│ │ │ --- ┆ --- ┆ --- │ │ │
│ │ │ str ┆ u32 ┆ str │ │ │
│ │ ╞══════════╪════════╪════════╡ │ │
│ │ └──────────┴────────┴────────┘ │ │
│ │ h_or_n = 'N' │ │
│ │ is_iav = True │ │
│ │ reg_h_or_n_type = '[Nn]' │ │
│ │ seg = '6' │ │
│ │ type_counts = shape: (3, 2) │ │
│ │ ┌──────────┬────────┐ │ │
│ │ │ Genotype ┆ counts │ │ │
│ │ │ --- ┆ --- │ │ │
│ │ │ str ┆ u32 │ │ │
│ │ ╞══════════╪════════╡ │ │
│ │ │ B ┆ 86 │ │ │
│ │ │ Victoria ┆ 64 │ │ │
│ │ │ Yamagata ┆ 33 │ │ │
│ │ └──────────┴────────┘ │ │
│ │ type_name = 'N_type' │ │
│ │ type_to_count = [] │ │
│ ╰──────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────╯
IndexError: list index out of range
Work dir:
validation_results/nf-flu_v3.3.0/Illumina_FluB/IRVC20230711_SK_Illumina_fluB_validation_setup_nf-flu_results/work/cd/928ea9d2849072e0835391fa054a0c
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
-- Check '.nextflow.log' file for details
v3.3.0; revision: 91a5d05f86
Slurm
23.04.1
openjdk version "20-internal" 2023-03-21
OpenJDK Runtime Environment (build 20-internal-adhoc..src)
OpenJDK 64-Bit Server VM (build 20-internal-adhoc..src, mixed mode, sharing)
HPC Cluster
No response
Singularity
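The locals dump above shows `type_to_count = []` right before `IndexError: list index out of range`, with `df_type_counts` empty, apparently because none of the genotype values (B, Victoria, Yamagata) match the `'[Nn]'` pattern. One possible guard is to return no subtype when the list is empty instead of indexing into it. The sketch below assumes `type_to_count` is a list of (type, count) tuples. Both that layout and the function name `top_type` are hypothetical, and this is not the actual code from parse_influenza_blast_results.py.

```python
from typing import List, Optional, Tuple


def top_type(type_to_count: List[Tuple[str, int]]) -> Optional[str]:
    """Return the most frequent type, or None when there are no candidates."""
    if not type_to_count:
        # No matching H/N types, e.g. an influenza B segment checked for N types.
        return None
    # Pick the entry with the highest count and return its type name.
    return max(type_to_count, key=lambda tc: tc[1])[0]


print(top_type([]))
print(top_type([("N1", 5), ("N2", 9)]))
```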
Pipeline unable to resolve mixed fluA and fluB run. I performed an analysis where the majority of the samples for the run were fluB confirmed and a handful were fluA confirmed. The subtyping report returned all samples as fluB.
nextflow run CFIA-NCFAD/nf-flu -profile singularity,slurm --platform nanopore --input samplesheet_test.csv -r 3.3.2
Subtyping report file gives subtype for all samples as fluB even when fluA samples are present.
v3.3.2 and v3.3.3
slurm
22.04.3
No response
cluster
No response
Singularity
No response
Hi there,
I just installed the updated version of this pipeline and tried re-analyzing a previously successful batch. I got the following error:
Oops... Pipeline execution stopped with the following message: <run_path>/work/c1/6c9d454ee21fddbc49551f39ee88d3/.command.sh: line 14: IRMA: command not found
ERROR ~ Error executing process > 'NF_FLU:NANOPORE:IRMA (sample1)'
Caused by:
Process `NF_FLU:NANOPORE:IRMA (sample1)` terminated with an error exit status (127)
Command executed:
touch irma_config.sh
echo 'SINGLE_LOCAL_PROC=8' >> irma_config.sh
echo 'DOUBLE_LOCAL_PROC=4' >> irma_config.sh
# default tmp in current working directory instead of defaulting to /tmp
# which may be restricted in size on HPC clusters
echo 'ALLOW_TMP=1' >> irma_config.sh
echo 'TMP=$PWD' >> irma_config.sh
if [ true ]; then
echo 'DEL_TYPE="NNN"' >> irma_config.sh
echo 'ALIGN_PROG="BLAT"' >> irma_config.sh
fi
IRMA FLU-minion sample1.merged.fastq.gz sample1
if [ -d "sample1/amended_consensus/" ]; then
cat sample1/amended_consensus/*.fa > sample1.irma.consensus.fasta
fi
ln -s .command.log sample1.irma.log
cat <<-END_VERSIONS > versions.yml
"NF_FLU:NANOPORE:IRMA":
IRMA: $(IRMA | head -n1 | sed -E 's/^Iter.*IRMA\), v(\S+) .*/\1/')
END_VERSIONS
Command exit status:
127
Command output:
(empty)
Command error:
.command.sh: line 14: IRMA: command not found
Work dir:
<run_path>/work/c1/6c9d454ee21fddbc49551f39ee88d3
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
-- Check '.nextflow.log' file for details
I'd appreciate any help you can provide! Thank you.