kinow / pipeline-v5

This project is forked from emo-bon/metagoflow.

This repository contains all CWL descriptions of the MGnify pipeline version 5.0 and their implementation for the Marine Genomic Observatories oriented pipeline, developed in the framework of an EOSC-Life funded project

Home Page: https://www.ebi.ac.uk/metagenomics/

License: Apache License 2.0

Shell 4.89% Python 40.22% Perl 6.37% Dockerfile 4.70% Common Workflow Language 43.83%

pipeline-v5's Introduction

A workflow for marine Genomic Observatories data analysis

An EOSC-Life project

The workflows developed in the framework of this project are based on pipeline-v5 of the MGnify resource.

This branch is a child of the pipeline_5.1 branch that contains all CWL descriptions of the MGnify pipeline version 5.1.

The following comes from the initial repo and describes how to get the databases required.


pipeline-v5

This repository contains all CWL descriptions of the MGnify pipeline version 5.0.

Documentation

For thorough documentation, see the project's Read the Docs pages.


We recommend using the MGnify resource for data processing.

If you want to run the pipeline locally, we recommend using our pre-built Docker containers.

Requirements to run the pipeline

  • python3 [v 3.6+]

  • docker [v 19.+] or singularity

  • cwltool [v 3.+] or toil [v 4.2+]

  • disk space for databases: ~133 GB
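
As a quick sanity check before installing anything, the requirements above can be sketched as a small shell script. This is only an illustration: have, have_any and check_requirements are hypothetical helpers, not part of the pipeline, and toil-cwl-runner is assumed to be the executable that a Toil installation with CWL support provides. The script only checks that the tools are on PATH; it does not verify the version numbers.

```shell
#!/usr/bin/env bash
# Hypothetical helper: true if the given command is on PATH.
have() { command -v "$1" >/dev/null 2>&1; }

# Hypothetical helper: true if at least one of the given commands is on PATH.
have_any() {
  local tool
  for tool in "$@"; do
    have "$tool" && return 0
  done
  return 1
}

# Report every requirement group that is not satisfied.
# Versions (python 3.6+, docker 19+, cwltool 3+, toil 4.2+) are NOT checked here.
check_requirements() {
  local status=0
  have python3 || { echo "missing: python3 (3.6+)"; status=1; }
  have_any docker singularity || { echo "missing: docker (19+) or singularity"; status=1; }
  # Assumption: Toil's CWL runner is installed as 'toil-cwl-runner'.
  have_any cwltool toil-cwl-runner || { echo "missing: cwltool (3+) or toil (4.2+)"; status=1; }
  return "$status"
}

check_requirements || echo "some requirements are not satisfied"
```

Free disk space for the databases can be inspected separately, for example with df -h on the target directory.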

Docker

All the tools are containerized.

Unfortunately, antiSMASH and InterProScan containers are very big. We provide two options:

  1. Pre-install these tools, following the project's environment setup instructions.

  2. Use containers. First, uncomment the hints in InterProScan-v5.cwl and antismash_v4.cwl, then pre-pull the containers from https://hub.docker.com/u/microbiomeinformatics:

docker pull microbiomeinformatics/pipeline-v5.interproscan:v5.36-75.0
docker pull microbiomeinformatics/pipeline-v5.antismash:v4.2.0

Installation

Create conda environment

Get the EOSC-Life marine GOs workflow

git clone https://github.com/EBI-Metagenomics/pipeline-v5.git 
cd pipeline-v5

Download necessary dbs

You can download the databases for the EOSC-Life GOs workflow by running the download_dbs.sh script. If one or more of them are already on your system, create a symbolic link pointing at the ref-dbs folder instead of downloading them again.
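
The symbolic-link step might look like the sketch below. It follows one reading of the instruction, namely that the workflow looks up each database by directory name inside ref-dbs; that layout is an assumption, so check it against download_dbs.sh before relying on it. link_db is a hypothetical helper and the path in the usage comment is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: link an existing database directory into ref-dbs.
#   $1  path to a database directory that is already on this system
#   $2  reference database folder (defaults to ./ref-dbs)
link_db() {
  local existing="$1" ref_dir="${2:-ref-dbs}"
  mkdir -p "$ref_dir"
  # Resolve to an absolute path so the link stays valid from any working directory.
  ln -s "$(cd "$existing" && pwd)" "$ref_dir/$(basename "$existing")"
}

# Usage (placeholder path, replace with the real location of your copy):
#   link_db /path/to/existing/database
```

The link keeps the directory name of the existing copy, so a database stored under a different name than the workflow expects would need to be linked under the expected name by hand.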

How to run

  • activate the conda env

  • edit the gos_wf.yml file to set the parameter values of your choice

  • If you are working on an HPC system with Singularity, enable Singularity

  • run

./run_wf.sh -n false -n osd-short -d short-test-case -f test_input/wgs-paired-SRR1620013_1.fastq.gz -r test_input/wgs-paired-SRR1620013_2.fastq.gz

If you are using Docker, it is strongly recommended not to install it through snap.

RuntimeError: slurm currently does not support shared caching, because it does not support cleaning up a worker after the last job finishes. Set the --disableCaching flag if you want to use this batch system.
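
The error above comes from Toil and names its own remedy. A minimal sketch of how the extra flag could be selected per batch system is shown here; toil_batch_args is a hypothetical helper, and how run_wf.sh forwards options to Toil is not described in this README, so treat the wiring as an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Toil options for a given batch system.
# On Slurm, caching is disabled, as the RuntimeError message asks.
toil_batch_args() {
  case "$1" in
    slurm) echo "--batchSystem slurm --disableCaching" ;;
    *)     echo "--batchSystem $1" ;;
  esac
}

# Usage (workflow.cwl is a placeholder for the workflow description file):
#   toil-cwl-runner $(toil_batch_args slurm) workflow.cwl gos_wf.yml
```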

pipeline-v5's People

Contributors

gdemoro, hariszaf, katesakharova, kinow, mb1069, mberacochea, mr-c, mscheremetjew, steninidak, vkale1
