- Analysis_STAR.Rmd - RNA-seq analysis pipeline for STAR counts. Prerequisites:
  - A path to data folder. This folder should have 3 subfolders:
    - 02_STAR-align - gzipped count files with .tab extension outputted by STAR aligner
    - results - folder where the results will be stored
    - data - Must have sample_annotation.csv file, example below
- enrichR_analysis.Rmd - Analyze gene lists using enrichR. Analyze all genes, and up- and downregulated genes separately. Uses DEGs.xlsx produced by Analysis*.Rmd.
- enrichR_plot.Rmd - barplot of selected enrichment results, similar to Example. WIP
- GSEA.Rmd - GSEA analysis using MSigDB.
- Pathview.Rmd - visualization of top KEGG pathways. Uses DEGs.xlsx produced by Analysis*.Rmd. Example
- Analysis_featurecounts.Rmd - RNA-seq analysis pipeline for featureCounts counts. Prerequisites:
  - A path to data folder. This folder should have 3 subfolders:
    - 03_featureCount - gzipped count files outputted by featureCounts
    - results - folder where the results will be stored
    - data - Must have sample_annotation.csv file. The annotation file should have a "Sample" column with sample names, and any other annotation columns. Include a "Group" column containing the covariate of interest. Example:
# Sample,Group
VLI10_AA_S61_L006_R1_001.txt.gz,AA
VLI10_AA_S61_L007_R1_001.txt.gz,AA
VLI10_AA_S61_L008_R1_001.txt.gz,AA
VLI11_C_S62_L006_R1_001.txt.gz,C
VLI11_C_S62_L007_R1_001.txt.gz,C
VLI11_C_S62_L008_R1_001.txt.gz,C
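
Checking the annotation file before running the analysis can catch a missing column early. The sketch below is in Python purely for illustration (the pipeline itself is R Markdown). The helper name `check_annotation` is hypothetical and not part of this repository, and it assumes the header line reads exactly `Sample,Group` with no leading comment marker.

```python
import csv
import io


def check_annotation(handle):
    """Read an annotation CSV and return {sample: group}.

    Hypothetical helper, not part of the repository. Raises ValueError if
    the "Sample" or "Group" column is missing or a sample name repeats.
    """
    reader = csv.DictReader(handle)
    columns = reader.fieldnames or []
    for required in ("Sample", "Group"):
        if required not in columns:
            raise ValueError(f'missing required column "{required}"')
    groups = {}
    for row in reader:
        sample = row["Sample"]
        if sample in groups:
            raise ValueError(f"duplicated sample name: {sample}")
        groups[sample] = row["Group"]
    return groups


# Two rows taken from the example annotation above.
example = io.StringIO(
    "Sample,Group\n"
    "VLI10_AA_S61_L006_R1_001.txt.gz,AA\n"
    "VLI11_C_S62_L006_R1_001.txt.gz,C\n"
)
annotation = check_annotation(example)
print(annotation)
```

Any additional annotation columns are ignored by this check, in line with the description above that only "Sample" and "Group" are required.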
- Figure_clusterProfiler_nes.Rmd - Takes the results of edgeR analysis from an Excel file, performs GO and KEGG GSEA and plots the results as horizontal barplots, sorted by normalized enrichment score (NES). Example
- Figure_clusterProfiler_asis.Rmd - Takes the results of edgeR analysis from an Excel file, performs GO and KEGG GSEA and plots the results as horizontal barplots, sorted by p-value, as they come out of the enrichment analysis.
- Figure_heatmap.Rmd - make heatmap for selected genes. Uses TMP.xlsx produced by Analysis*.Rmd and a custom signature of gene names
- calcTPM.R - a function to calculate TPMs from gene counts
- utils.R - helper functions
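
For reference, TPM is commonly computed by dividing each gene's count by its length and then scaling the resulting rates so that they sum to one million. The Python sketch below shows that standard definition only. It is not a translation of calcTPM.R, whose arguments and choice of gene length (for example, effective length) are not described here, and the names `calc_tpm`, `counts`, and `lengths_bp` are illustrative.

```python
def calc_tpm(counts, lengths_bp):
    """Transcripts per million for one sample (standard definition).

    counts     -- read counts per gene
    lengths_bp -- gene lengths in base pairs, same order as counts
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Reads per base: normalize each gene's count by its length.
    rates = [count / length for count, length in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]
    # Scale so that the values sum to one million.
    return [rate / total * 1e6 for rate in rates]


# Toy input: three genes with made-up counts and lengths.
tpm = calc_tpm([100, 200, 300], [1000, 2000, 1000])
print(tpm)
```

In this toy input the first two genes have the same count-to-length ratio, so they receive the same TPM even though their raw counts differ.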
Scripts for running RNA-seq preprocessing steps on a cluster using the PBS job submission system. The subread-featurecounts scripts are in the dcaf/ngs.rna-seq repository
- submit00_fastqc.sh - FastQC on raw FASTQ files
- MultiQC commands to summarize QC reports generated by TrimGalore and STAR
multiqc --filename multiqc_01_trimmed.html --outdir multiqc_01_trimmed 01_trimmed/
multiqc --filename multiqc_02_STAR-align.html --outdir multiqc_02_STAR-align 02_STAR-align/
- submit01_trimgalore.sh - Adapter trimming using TrimGalore
- submit02_STAR-index.sh - Index the genome for the STAR aligner
- submit02_STAR.sh - Align samples using STAR. Requires the input01_toStarAlign.list text file with the list of input files: each line contains comma-separated file name(s), and a space separates the first-read files from the second-read files
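
The line format of input01_toStarAlign.list can be illustrated with a small parser. This is a Python sketch of the format as described above, not code taken from submit02_STAR.sh. The function name `parse_star_input_line` and the FASTQ file names are made up for the example.

```python
def parse_star_input_line(line):
    """Split one list line into (read1_files, read2_files).

    Commas separate files of the same read; a space separates the
    first-read files from the second-read files. For single-end data
    the second list is empty.
    """
    fields = line.split()
    if not 1 <= len(fields) <= 2:
        raise ValueError(f"expected 1 or 2 space-separated fields: {line!r}")
    read1 = fields[0].split(",")
    read2 = fields[1].split(",") if len(fields) == 2 else []
    if read2 and len(read1) != len(read2):
        raise ValueError("read 1 and read 2 file counts differ")
    return read1, read2


# Hypothetical paired-end sample sequenced on two lanes.
line = (
    "s1_L006_R1.fastq.gz,s1_L007_R1.fastq.gz "
    "s1_L006_R2.fastq.gz,s1_L007_R2.fastq.gz"
)
read1, read2 = parse_star_input_line(line)
print(read1)
print(read2)
```

A line with a single field, such as one file name for single-end data, returns an empty second list.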
CaSpER pipeline detecting CNVs from RNA-seq data
Dedicated repository with detailed instructions: mdozmorov/CaSpER_pipeline
- submit05_BAFExtract-index.sh - indexing the genome for BAFExtract
- submit05_BAFExtract.sh - BAFExtract run
- DESeq results to pathways in 60 Seconds with the fgsea package, https://stephenturner.github.io/deseq-to-fgsea/