pipecov's Introduction

PiPeCOV

This project aims to help fill the knowledge gap about Covid-19 (SARS-CoV-2) in Brazil and to contribute to global knowledge about the pathogen.

The scientific community and the governments of other countries are pursuing similar efforts. The United Kingdom recently established a research network for genomic studies of SARS-CoV-2 (Covid-19) with a contribution of GBP 20 million (https://www.gov.uk/government/news/uk-launches-whole-genome-sequence-alliance-to-map-spread-of-coronavirus).

The PiPeCOV pipeline handles quality assessment, assembly, and annotation of SARS-CoV-2 genomes sequenced on Illumina platforms (amplicon and mNGS). At the end of a run you obtain the assembled and annotated SARS-CoV-2 genomes.

PiPeCOV is free to use for non-commercial users under a GPLv3 license.

The PiPeCOV workflow: Screenshot

All the steps and commands used in the pipeline for quality assessment and mapping, assembly, annotation, and phylogenetic assignment of lineages of SARS-CoV-2 genomes are encapsulated in Docker images and containers. Users only need Docker installed on their machine and do not have to install the individual tools used in the pipeline.

The Docker images used in the pipeline can be found at https://hub.docker.com/u/itvds

PiPeCOV must be downloaded from this repo: https://github.com/alvesrco/covid19_itvds

All Dockerfiles and the pipeline repository are developed by the Covid19 Project Network @ ITVDS.

Sample file

Samples must be in .fastq.gz format, with file names following the pattern: EC114_S15_R1_001.fastq.gz
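A minimal sketch of how file names could be checked against this pattern before starting a run. The reading of the pattern as sample ID, `_S<number>`, `_R1`/`_R2`, `_001.fastq.gz` is an assumption drawn from the single example above, and the helper names `is_valid_sample_name` and `mate_of` are hypothetical; they are not part of PiPeCOV.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeed if the file name looks like
# <sample>_S<number>_R<1|2>_001.fastq.gz (assumed layout, see text).
is_valid_sample_name() {
    local name re
    name=$(basename "$1")
    re='^[A-Za-z0-9-]+_S[0-9]+_R[12]_001\.fastq\.gz$'
    [[ $name =~ $re ]]
}

# Hypothetical helper: given an R1 file name, print the expected R2 mate.
mate_of() {
    printf '%s\n' "${1/_R1_001.fastq.gz/_R2_001.fastq.gz}"
}

# Check one conforming and one non-conforming name.
for f in EC114_S15_R1_001.fastq.gz EC114_R1.fastq; do
    if is_valid_sample_name "$f"; then
        echo "ok: $f (mate: $(mate_of "$f"))"
    else
        echo "unexpected name: $f"
    fi
done
```

If your sample IDs contain other characters, the regular expression would need to be widened accordingly.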

How To

: Quality Assessment :

$ ./qc_docker.sh -i illumina -1 SAMPLE_R1.fastq -2 SAMPLE_R2.fastq -a adapters.txt -q 20 -l 50 -o output_qc -t 24

Parameters

  • -i illumina [Sequencing platform]
  • -q 20 [Minimum PHRED quality for trimming and filtering. Default: 20]
  • -l 50 [Minimum length of reads after trimming. Default: 50]
  • -t 24 [Number of threads to be used. Default: 1]

Input

  • -1 SAMPLE_R1.fastq [Forward reads in the original raw format]
  • -2 SAMPLE_R2.fastq [Reverse reads in the original raw format]
  • -a adapters.txt [File with the adapter sequences to be removed]

Output

  • -o output_qc [Folder where the results will be saved. Default: “output”]
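When several samples need the same treatment, the call above can be generated per read pair. The following is a dry-run sketch only: it prints the commands instead of running them. The directory name `raw_reads`, the per-sample output folder naming, and the helper `build_qc_command` are assumptions for illustration; whether `qc_docker.sh` accepts compressed input or paths outside the working directory is not stated here and should be checked first.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print one qc_docker.sh call using the flags and
# values from the example above. Nothing is executed.
build_qc_command() {
    local r1=$1 r2=$2 sample=$3
    printf './qc_docker.sh -i illumina -1 %s -2 %s -a adapters.txt -q 20 -l 50 -o output_qc_%s -t 24\n' \
        "$r1" "$r2" "$sample"
}

# RAW_DIR is an assumed location for the raw read files.
RAW_DIR=${RAW_DIR:-raw_reads}

for r1 in "$RAW_DIR"/*_R1_001.fastq.gz; do
    [ -e "$r1" ] || continue    # the glob matched nothing
    r2=${r1/_R1_001.fastq.gz/_R2_001.fastq.gz}
    if [ ! -e "$r2" ]; then
        echo "skipping $r1: missing mate $r2" >&2
        continue
    fi
    sample=$(basename "$r1" _R1_001.fastq.gz)
    build_qc_command "$r1" "$r2" "$sample"
done
```

Reviewing the printed commands before piping them to a shell keeps a mistyped path from launching a long container run.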

: Genome Assembly, Annotation and Phylogenetic Assignment :

$ ./assembly_docker.sh -i illumina -1 output_qc/SRR11587600_good.pair1.truncated -2 output_qc/SRR11587600_good.pair2.truncated -r sars-cov-2_MN908947.fasta -k 31 -m 2 -l 100 -c 10 -o output_assembly -t 24 -s illumina_rtpcr

Parameters

  • -i illumina [Sequencing platform]
  • -k 31 [Size of the k-mer in the decontamination step. Default: 31]
  • -m 2 [Maximum number of mismatches accepted in k-mers. Default: 2]
  • -l 100 [Minimum contig size. Default: 100]
  • -c 10 [Minimum contig coverage. Default: 10]
  • -s SAMPLE_NAME [Sample name. Default: “sample”]
  • -t 24 [Number of threads to be used. Default: 1]
  • -g 80 [Maximum memory, in gigabytes, to use in the decontamination step. Default: 80]

Input

  • -1 SAMPLE_good_R1.fastq [Forward reads after quality treatment]
  • -2 SAMPLE_good_R2.fastq [Reverse reads after quality treatment]
  • -r reference.fasta [FASTA file with the reference(s) to be used]

Output

  • -o output_assembly [Folder where the results will be saved. Default: “output”]
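Because the assembly step consumes the files written by the quality assessment step, a quick pre-flight check can catch a missing or empty input before a long run starts. This is a sketch under the assumption that the QC outputs are named as in the example command above (`<prefix>_good.pair1.truncated` and `<prefix>_good.pair2.truncated`); the function `qc_outputs_ready` is hypothetical and not part of PiPeCOV.

```bash
#!/usr/bin/env bash

# Hypothetical pre-flight check: both trimmed read files must exist and
# be non-empty. File naming follows the example command above.
qc_outputs_ready() {
    local qc_dir=$1 prefix=$2 f
    for f in "$qc_dir/${prefix}_good.pair1.truncated" \
             "$qc_dir/${prefix}_good.pair2.truncated"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            return 1
        fi
    done
    return 0
}

if qc_outputs_ready output_qc SRR11587600; then
    echo "inputs present; ready to run assembly_docker.sh"
else
    echo "run the quality assessment step first"
fi
```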

pipecov's People

Contributors

alvesrco, fabriciooliveirasilva, kpadovani, reinator, robertoxavierjr

pipecov's Issues

Error in the Rscript

Hi Ronnie, how are you?

Is the assembly pipeline working normally?
I ask because I deployed it at Fiocruz and it was failing with an error. While debugging the script, I traced the problem to line 175 of the script containing the functions that generate the consensus (https://github.com/alvesrco/covid19_itvds/blob/master/dockerfiles/R/wgs_functions.R). The line reads:
"poor_cov<-which(colSums(cm)<0);"

Looking at the original script (https://github.com/proychou/ViralWGS/blob/master/wgs_functions.R), I noticed that the value there is '10' rather than '0':
"poor_cov<-which(colSums(cm)<10);"

Was this change intentional? And did it not cause an error for you?

Best regards,
Daniel Moreira
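To make the difference between the two thresholds concrete, here is a small sketch of the same column-sum filter on a toy matrix. It assumes that `cm` holds non-negative per-position counts (rows are bases, columns are positions); the numbers and the helper `low_coverage_columns` are made up for illustration. It does not settle which value the pipeline authors intended.

```bash
#!/usr/bin/env bash

# Hypothetical helper mirroring which(colSums(cm) < threshold):
# $1 = threshold; reads a whitespace-separated matrix on stdin and prints
# the 1-based indices of columns whose sum is below the threshold.
low_coverage_columns() {
    awk -v min="$1" '
        { for (i = 1; i <= NF; i++) sum[i] += $i; if (NF > n) n = NF }
        END {
            out = ""
            for (i = 1; i <= n; i++)
                if (sum[i] < min) out = out (out == "" ? "" : " ") i
            print out
        }'
}

# Toy count matrix (rows: A, C, G, T; columns: positions 1 to 4).
matrix='12 0 3 40
0 0 2 1
1 0 0 0
0 0 1 2'

echo "threshold 0:  [$(printf '%s\n' "$matrix" | low_coverage_columns 0)]"
echo "threshold 10: [$(printf '%s\n' "$matrix" | low_coverage_columns 10)]"
```

With non-negative counts, no column sum can be below 0, so the first call selects nothing, while a threshold of 10 flags the positions with fewer than 10 supporting reads.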

cp: can't stat 'usher_lineage_report.csv': No such file or directory error during assembly

Hi, I'm experiencing the error given below during the assembly process

Using UShER as inference engine.
****
Pangolin running in usher mode.
****
Maximum ambiguity allowed is 0.3.
****
Query file:	/output/home/thermite/setu_test/pipecov/11-pangolin_lineages/../test1_closedgap.fasta
****
Data files found:
usher_pb:	/opt/conda/envs/pangolin/lib/python3.10/site-packages/pangolin_data/data/lineageTree.pb
****
****
Output file written to: /output/home/thermite/setu_test/pipecov/11-pangolin_lineages/lineage_report.csv
****
Output alignment written to: /output/home/thermite/setu_test/pipecov/11-pangolin_lineages/alignment.fasta
cp: can't stat 'usher_lineage_report.csv': No such file or directory

Any help is greatly appreciated!
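The log above reports the output being written to `lineage_report.csv`, while the failing `cp` looks for `usher_lineage_report.csv`. That suggests, without proving it, a file name mismatch between the installed Pangolin version and what the script expects. A possible workaround is to look for either name before copying. The sketch below only locates the report; the helper `first_existing` and the default directory argument are assumptions for illustration and not part of PiPeCOV.

```bash
#!/usr/bin/env bash

# Hypothetical guard: print the first path that exists among the
# candidates, or fail if none does.
first_existing() {
    local f
    for f in "$@"; do
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Directory name taken from the log above; override with the first argument.
report_dir=${1:-11-pangolin_lineages}

if report=$(first_existing "$report_dir/usher_lineage_report.csv" \
                           "$report_dir/lineage_report.csv"); then
    echo "lineage report found: $report"
else
    echo "no lineage report in $report_dir" >&2
fi
```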
