The workflows developed in the framework of this project are based on pipeline-v5
of the MGnify resource.
This branch is a child of the pipeline_5.1 branch, which contains all CWL descriptions of the MGnify pipeline version 5.1.
The following comes from the original repository and describes how to obtain the required databases.
This repository contains all CWL descriptions of the MGnify pipeline version 5.0.
For thorough documentation, see the pipeline's Read the Docs pages.
We kindly recommend using the MGnify resource for data processing.
If you want to run the pipeline locally, we recommend using our pre-built Docker containers.
- python3 [v 3.6+]
- docker [v 19.+] or singularity
- cwltool [v 3.+] or toil [v 4.2+]
- hdd for databases ~133G
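A quick way to confirm that the required commands are installed is sketched below. It is only an illustration: the helper name `check_tools` is hypothetical and not part of the pipeline, and it checks only that each command is on `PATH`, not that its version is recent enough.

```shell
# check_tools: hypothetical helper, not part of the pipeline.
# Prints every command that is missing from PATH and returns
# non-zero if at least one of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_tools python3 docker cwltool` covers one container runtime and one CWL runner; swap in `singularity` or `toil-cwl-runner` if those are what you use.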
All the tools are containerized.
Unfortunately, antiSMASH and InterProScan containers are very big. We provide two options:
- Pre-install these tools. The instructions on how to set up the environment are here.
- Use containers. First, uncomment the hints in InterProScan-v5.cwl and antismash_v4.cwl. Then pre-pull the containers from https://hub.docker.com/u/microbiomeinformatics:
docker pull microbiomeinformatics/pipeline-v5.interproscan:v5.36-75.0
docker pull microbiomeinformatics/pipeline-v5.antismash:v4.2.0
git clone https://github.com/EBI-Metagenomics/pipeline-v5.git
cd pipeline-v5
You can download the databases for the EOSC-Life GOs workflow by running the download_dbs.sh script.
If one or more of them are already on your system, create a symbolic link to each of them in the ref-dbs folder instead.
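The linking step can be sketched as follows. This is a minimal illustration under assumptions: the helper `link_db` is hypothetical, and it assumes each database is expected as an entry directly inside ref-dbs under its own name. Check download_dbs.sh for the names the workflow actually expects.

```shell
# link_db: hypothetical helper, not part of the repository.
# Creates a symbolic link inside the reference-database folder that
# points at a database already present elsewhere on the system.
link_db() {
    src="$1"                  # existing database (file or directory)
    dest_dir="${2:-ref-dbs}"  # folder the databases are read from
    if [ ! -e "$src" ]; then
        echo "no such database: $src" >&2
        return 1
    fi
    # Resolve to an absolute path so the link stays valid even though
    # it is created in a different folder.
    src_abs="$(cd "$(dirname "$src")" && pwd)/$(basename "$src")"
    mkdir -p "$dest_dir"
    # -s creates a symbolic link; it keeps the name of the source entry.
    ln -s "$src_abs" "$dest_dir/$(basename "$src")"
}
```

Called from the repository root as `link_db /path/to/existing/db`, this would create ref-dbs/db pointing at the existing copy; the path here is a placeholder.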
- Activate the conda environment.
- Edit the gos_wf.yml file to set the parameter values of your choice.
- If you are working on an HPC system with Singularity, enable Singularity.
- Run:
  ./run_wf.sh -n false -n osd-short -d short-test-case -f test_input/wgs-paired-SRR1620013_1.fastq.gz -r test_input/wgs-paired-SRR1620013_2.fastq.gz
If you are using Docker, we strongly recommend that you do not install it through snap.
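One way to check how an existing Docker was installed is to look at where the docker executable lives, since on Ubuntu snap exposes its commands under /snap/bin. The sketch below is only a heuristic, and the function name `is_snap_path` is made up for illustration.

```shell
# is_snap_path: hypothetical helper. Succeeds when the given
# executable path lies under /snap/, where snap places its commands
# on Ubuntu; other distributions may use a different location.
is_snap_path() {
    case "$1" in
        /snap/*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Look up docker on PATH; the variable stays empty if it is not installed.
docker_path="$(command -v docker 2>/dev/null || true)"
if [ -n "$docker_path" ] && is_snap_path "$docker_path"; then
    echo "docker appears to be a snap install: $docker_path"
fi
```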
RuntimeError: slurm currently does not support shared caching, because it does not support cleaning up a worker after the last job finishes. Set the --disableCaching flag if you want to use this batch system.
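If the workflow is launched with Toil on a slurm cluster, the flag named in the error message can be added to the runner options. The sketch below is illustrative: the helper `toil_batch_args` is hypothetical, and how run_wf.sh passes options on to Toil is not covered here, so treat the example command as an assumption.

```shell
# toil_batch_args: hypothetical helper that prints Toil options for a
# given batch system, adding --disableCaching only for slurm.
toil_batch_args() {
    batch_system="$1"
    args="--batchSystem $batch_system"
    if [ "$batch_system" = "slurm" ]; then
        args="$args --disableCaching"
    fi
    printf '%s\n' "$args"
}

# Example (workflow.cwl and inputs.yml are placeholder file names):
# toil-cwl-runner $(toil_batch_args slurm) workflow.cwl inputs.yml
```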