This repository contains a Nextflow workflow for analysing bacterial genomes.
If no reference is provided, the reads are assembled de novo with flye and the assembly is polished with medaka. If a reference is provided, the reads are aligned to it with mini_align and variants are called with medaka. The workflow also has optional extras: it can run prokka to annotate the resulting consensus sequence, or ResFinder to check it against a database of antimicrobial resistance genes.
The workflow uses Nextflow to manage compute and software resources, so Nextflow must be installed before attempting to run the workflow.
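A quick way to confirm the prerequisite is met is to check that the nextflow executable is on the PATH. The snippet below is a minimal sketch rather than part of the workflow; the NEXTFLOW_FOUND variable is a name chosen here for illustration.

```shell
# Check whether the nextflow executable is available on the PATH.
if command -v nextflow >/dev/null 2>&1; then
    # Print the installed version for reference.
    nextflow -version
    NEXTFLOW_FOUND=yes   # illustrative variable, not used by the workflow
else
    echo "nextflow not found on PATH; install it before running the workflow" >&2
    NEXTFLOW_FOUND=no
fi
```

If the check reports that nextflow is missing, install it before continuing.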
The workflow can currently be run using either Docker (default) or Singularity (-profile singularity) to provide isolation of the required software. Both methods are automated out-of-the-box provided either Docker or Singularity is installed.
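As an illustration of choosing between the two options, the sketch below picks a profile according to which container runtime is found on the PATH, then prints the resulting command instead of running it. The PROFILE_ARGS and CMD variables are names assumed here for the example; only the -profile singularity and --help options come from this document.

```shell
# Choose a profile based on the container runtime that is installed.
if command -v docker >/dev/null 2>&1; then
    PROFILE_ARGS=""                      # Docker is the default, no flag needed
elif command -v singularity >/dev/null 2>&1; then
    PROFILE_ARGS="-profile singularity"  # switch to the Singularity profile
else
    echo "Neither Docker nor Singularity was found on PATH" >&2
    PROFILE_ARGS=""
fi

# Build the command; the profile flag is added only when it is non-empty.
CMD="nextflow run epi2me-labs/wf-bacterial-genomes${PROFILE_ARGS:+ $PROFILE_ARGS} --help"

# Print the command for review rather than executing it.
echo "$CMD"
```

Printing the command first makes it easy to review before copying it into a terminal.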
It is not necessary to clone or download the git repository in order to run the workflow. For more information on running EPI2ME Labs workflows, visit our website.
Workflow options
To obtain the workflow, having installed Nextflow, users can run:
nextflow run epi2me-labs/wf-bacterial-genomes --help
to see the options for the workflow.
Workflow outputs
The primary outputs of the workflow include:
- a FASTA consensus sequence scaffolded from a provided reference sequence,
- a VCF file containing variants in the sample compared to the reference (if provided),
- an HTML report document detailing QC metrics and the primary findings of the workflow,
- (optionally) an annotation of the consensus sequence using prokka,
- (optionally) a per-sample ResFinder output directory with various results.