ngsfiles

NGS-related files for testing.

Usage

workspace="path/to/workspace" && cd ${workspace}
git clone https://github.com/yh549848/ngsfiles.git

Sources

For your reference: generation procedure

Tools used

Download and subsample FASTQ

wget -P tmp -i scripts/uri_fastq.txt 
find tmp/*.fastq.gz | xargs scripts/subsample_fastq.sh
rm tmp/SRR896743_1.fastq.gz tmp/SRR896743_2.fastq.gz tmp/SRR896663_1.fastq.gz tmp/SRR896663_2.fastq.gz
mv tmp/*.*.fastq.gz assets/FASTQ
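
The contents of scripts/subsample_fastq.sh are not shown here. As a rough sketch of one way such a step could work, the hypothetical helper below keeps the first N reads of a gzipped FASTQ file. The function name, argument order, and default read count are assumptions for illustration, not the repository's actual script. Taking the head of the file is not a random sample, but applying it with the same N to both mates of a pair keeps them in sync.

```shell
# Hypothetical helper (not the repository's script): keep the first N reads
# of a gzipped FASTQ file.
# Usage: subsample_fastq_head IN.fastq.gz OUT.fastq.gz [N]
subsample_fastq_head() (
    in=$1
    out=$2
    n=${3:-10000}   # number of reads to keep; the default is an assumption
    # Each FASTQ record spans exactly four lines.
    gzip -cd "$in" | head -n "$((n * 4))" | gzip -c > "$out"
)
```

For example, `subsample_fastq_head reads_1.fastq.gz reads_1.sub.fastq.gz 5000` would write the first 5000 records; both file names are placeholders.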

Download and subsample GTF

wget ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_32/gencode.v32.basic.annotation.gtf.gz -P tmp
gunzip tmp/gencode.v32.basic.annotation.gtf.gz
./scripts/subsample_gtf.sh tmp/gencode.v32.basic.annotation.gtf
mv tmp/*.chr*.gtf assets/GTF
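
The `*.chr*.gtf` output names suggest that the GTF is split or filtered by chromosome, although scripts/subsample_gtf.sh itself is not shown. A minimal sketch of that kind of filter, assuming a standard tab-separated GTF whose first column is the sequence name, could look like this. The helper name is hypothetical.

```shell
# Hypothetical helper (not the repository's script): print the header
# comments plus every record located on one chromosome.
# Usage: filter_gtf_by_chr CHR IN.gtf > OUT.gtf
filter_gtf_by_chr() {
    # Column 1 of a GTF record is the sequence name; compare it exactly so
    # that "chr2" does not also match "chr22".
    awk -F '\t' -v chr="$1" '/^#/ || $1 == chr' "$2"
}
```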

Convert GTF to GFF

for f in $(find assets/GTF -name "*.gtf"); do gffread "${f}" -o "${f/.gtf/.gff}"; done
mv assets/GTF/*.gff assets/GFF
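
A note on the renaming in the loop above: `${f/.gtf/.gff}` is a bash pattern substitution that replaces the first `.gtf` found anywhere in the path, which is fine for the file names used here. If a directory name could also contain `.gtf`, stripping the suffix instead is a safer variant. The helper below is a hypothetical illustration of that alternative.

```shell
# Hypothetical helper: swap a trailing .gtf extension for .gff.
# "${1%.gtf}" removes .gtf only when it is the suffix of the path.
gtf_to_gff_name() {
    printf '%s\n' "${1%.gtf}.gff"
}
```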

Align to the reference and extract records located on the specified chromosome

wget ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_32/GRCh38.primary_assembly.genome.fa.gz -P tmp
gunzip tmp/GRCh38.primary_assembly.genome.fa.gz
qsub -V scripts/build_idx_star.sh tmp/GRCh38.primary_assembly.genome.fa
find assets/FASTQ/*.fastq.gz | sort | xargs qsub -V -t 1-4:2 scripts/align_star_pe.sh
cd tmp && cp --parents */*bam ../assets/BAM && cd ${workspace}
find assets/BAM/*/Aligned.sortedByCoord.out.bam | sort | xargs -I {} samtools index {}
find assets/BAM/*/Aligned.sortedByCoord.out.bam | sort | xargs -I {} scripts/subsample_bam.sh {}
find assets/BAM/*/Aligned.sortedByCoord.out*.bam | sort | xargs -I {} samtools index {}
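
The `-t 1-4:2` option in the alignment step above requests array task IDs 1 and 3. Assuming a Grid Engine scheduler, each task sees its ID in `SGE_TASK_ID`, and the sorted FASTQ list arrives as the script's arguments. How scripts/align_star_pe.sh actually uses these is not shown; the sketch below is one plausible way for a task to pick its own pair of mate files, with a hypothetical helper name.

```shell
# Hypothetical helper: print the FASTQ pair that belongs to one array task.
# Usage: pick_pair TASK_ID FILE...
pick_pair() (
    id=$1
    shift                   # drop TASK_ID, leaving the sorted file list
    shift "$((id - 1))"     # skip the files that belong to earlier tasks
    printf '%s %s\n' "$1" "$2"
)
```

Inside a job script this would be called as `pick_pair "$SGE_TASK_ID" "$@"`, so task 1 gets the first two files and task 3 gets the third and fourth.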

Quantify with RSEM

gffread -w tmp/gencode.v32.basic.transcripts_with_attr.fa -g tmp/GRCh38.primary_assembly.genome.fa tmp/gencode.v32.basic.annotation.gtf
./scripts/strip_attributes_fasta.py tmp/gencode.v32.basic.transcripts_with_attr.fa > tmp/gencode.v32.basic.transcripts.fa
qsub -V scripts/build_idx_rsem_with_ebseq_ngvector.sh tmp/gencode.v32.basic.transcripts.fa
find assets/BAM/*/Aligned.sortedByCoord.out.bam | sort | xargs qsub -V -t 1-2 scripts/quant_rsem.sh
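
`gffread -w` writes one FASTA record per transcript, and its header lines can carry extra fields after the transcript ID. The contents of scripts/strip_attributes_fasta.py are not shown; assuming its job is to reduce each header to the ID alone, an equivalent shell sketch might be the following. The function name is hypothetical.

```shell
# Hypothetical stand-in for the header-stripping step: on header lines only,
# drop everything from the first whitespace onward, leaving ">ID".
# Reads the named files, or standard input when no file is given.
strip_fasta_attributes() {
    sed '/^>/ s/[[:space:]].*$//' "$@"
}
```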

Quantify with StringTie

find assets/BAM/*/Aligned.sortedByCoord.out.bam | sort | xargs qsub -V -t 1-2 scripts/quant_stringtie.sh
find assets/StringTie/*/*.gtf | xargs -I {} gzip {}

Quantify with kallisto

find assets/FASTQ/*.fastq.gz | sort | xargs qsub -V -t 1-4:2 scripts/quant_kallisto.sh

Quantify with htseq-count

find assets/BAM/*/Aligned.sortedByCoord.out.bam | sort | xargs qsub -V -t 1-2 scripts/quant_htseq.sh 

Convert BAM to BED

for f in $(find assets/BAM -name "*.bam"); do bedtools bamtobed -i "${f}" > "${f/.bam/.bed}"; done
cd assets/BAM; cp --parents */*.bed ../BED/ && rm */*.bed; cd ${workspace}

Clean up workspace

rm -r tmp/*
