This project forked from xiaowangcn/genedmrs

GeneDMRs is an R package to detect the differentially methylated regions based on genes, gene body, CpG islands and gene body interacted with CpG island features.

License: GNU Lesser General Public License v3.0

GeneDMRs

Gene-based differentially methylated regions analysis

Getting Started

Description

GeneDMRs is an R package to detect the differentially methylated regions based on genes (DMG), gene body (DMP, DME, DMI), CpG islands and gene body interacted with CpG island features (e.g., DMG/DMP/DME/DMI_CpG island and DMG/DMP/DME/DMI_CpG island shore).

Dependencies

Install the dependencies below, including an annotation database for enrichment analysis that matches the species, e.g., "org.Mm.eg.db" for mouse:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install(c("devtools", "clusterProfiler", "corrplot", "dplyr", "ffbase", "genomation",
                       "pheatmap", "plotrix", "qqman", "RCircos", "VennDiagram", "org.Mm.eg.db"))

Installation

source("https://install-github.me/xiaowangCN/GeneDMRs")

or

library("devtools")

install_github("xiaowangCN/GeneDMRs")

User manual

See the GeneDMRs.pdf file

Sample data

Before running the quick examples or the step-by-step examples, the user can download the sample data, or the whole "/methdata" folder, for testing (https://github.com/xiaowangCN/GeneDMRs/tree/master/methdata). In the "/methdata" folder, the "1_1.gz", "1_2.gz" and "1_3.gz" files are from the control group, while the "2_1.gz" and "2_2.gz" files are from the case group. "refseq.bed.txt" and "cpgi.bed.txt" are BED files downloaded or copied from UCSC (http://genome.ucsc.edu/cgi-bin/hgTables) after setting the "genome" (e.g., Human, Mouse, Cow, Pig), "assembly", "group" (e.g., Genes and Gene Predictions, Regulation), "track" (e.g., CpG Islands) and "output format" (i.e., BED - browser extensible data) options. "refseq.bed.txt" provides the reference genes and "cpgi.bed.txt" provides the CpG islands.

The user only needs to give one path to the GeneDMRs package, e.g., paths = paste(system.file(package = "GeneDMRs"), "/methdata", sep=""), which points to the "/methdata" folder under the package system path. If the folder was downloaded to the desktop, use that location as the path instead:

inputmethfile <- Methfile_read(paths = "C:/Users/Desktop/methdata", suffix = ".gz")
inputrefseqfile <- Bedfile_read(paths = "C:/Users/Desktop/methdata", bedfile = "refseq", suffix = ".txt", feature = FALSE)
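
The choice between the installed package folder and a downloaded folder could be wrapped in a small helper that also checks the folder before any file is read. This is a minimal sketch in base R only: methdata_path and list_methfiles are hypothetical names that are not part of GeneDMRs, and the sketch assumes the sample data sit in a "methdata" folder inside the installed package.

```r
# Hypothetical helper, not part of GeneDMRs: pick the folder holding the sample data.
methdata_path <- function(user_dir = NULL) {
  if (!is.null(user_dir)) {
    # A folder the user downloaded, e.g. "C:/Users/Desktop/methdata"
    return(normalizePath(user_dir, winslash = "/", mustWork = FALSE))
  }
  # Assumed location inside the installed package; returns "" if it is not found
  system.file("methdata", package = "GeneDMRs")
}

# Hypothetical helper: list the methylation files in a folder by suffix,
# stopping early with a clear message if the folder does not exist.
list_methfiles <- function(path, suffix = ".gz") {
  if (!nzchar(path) || !dir.exists(path)) {
    stop("Sample data folder not found: ", path)
  }
  # Escape the leading dot of the suffix and anchor it at the end of the name
  list.files(path, pattern = paste0("\\", suffix, "$"))
}
```

The value returned by methdata_path() can then be passed as the paths argument in the calls above.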

Examples

1. Get all differentially methylated genes (DMGs) quickly

allDMGs <- Quick_GeneDMRs(paths = paste(system.file(package = "GeneDMRs"), "/methdata", sep=""))

Or, for a case-control design, the user can specify arbitrary file names for the case group and the control group separately. Here paths = "C:/Users/GeneDMRs/methdata" is mainly used for reading the bedfile, i.e., inputrefseqfile <- Bedfile_read(paths = "C:/Users/GeneDMRs/methdata"). For example:

controls <- c("C:/Users/GeneDMRs/methdata/1_1.gz", "C:/Users/GeneDMRs/methdata/1_2.gz", "C:/Users/GeneDMRs/methdata/1_3.gz")
cases <- c("C:/Users/GeneDMRs/methdata/2_1.gz", "C:/Users/GeneDMRs/methdata/2_2.gz")
allDMGs <- Quick_GeneDMRs(paths = "C:/Users/GeneDMRs/methdata", control_paths = controls, case_paths = cases)
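
The control and case vectors can also be derived from the file names instead of being typed by hand, which avoids listing a file twice. The sketch below uses base R only and assumes the naming convention of the sample data, where the number before the underscore is the group ("1" for control, "2" for case); methdir, files and group are illustrative variable names.

```r
# Sample file names follow "<group>_<replicate>.gz" in the sample data
methdir <- "C:/Users/GeneDMRs/methdata"
files <- c("1_1.gz", "1_2.gz", "1_3.gz", "2_1.gz", "2_2.gz")

# Group label = text before the underscore ("1" = control, "2" = case)
group <- sub("_.*$", "", files)

controls <- file.path(methdir, files[group == "1"])
cases <- file.path(methdir, files[group == "2"])

# Guard against the same file appearing twice across the two groups
stopifnot(!anyDuplicated(c(controls, cases)))
```

With a real folder, files could come from list.files(methdir, pattern = "\\.gz$") rather than a typed vector.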

2. Get all differentially methylated cytosine sites (DMCs) quickly

allDMCs <- Quick_DMCs(paths = paste(system.file(package = "GeneDMRs"), "/methdata", sep=""))

Or, for a case-control design, the user can specify arbitrary file names for the case group and the control group separately, such as:

controls <- c("C:/Users/GeneDMRs/methdata/1_1.gz", "C:/Users/GeneDMRs/methdata/1_2.gz", "C:/Users/GeneDMRs/methdata/1_3.gz")
cases <- c("C:/Users/GeneDMRs/methdata/2_1.gz", "C:/Users/GeneDMRs/methdata/2_2.gz")
allDMCs <- Quick_DMCs(control_paths = controls, case_paths = cases)

3. Get all DMGs step by step

# read the methylation file #
inputmethfile <- Methfile_read(paths = paste(system.file(package = "GeneDMRs"), "/methdata", sep=""), suffix = ".gz")

# or if it is a case-control design #
controls <- c("C:/Users/GeneDMRs/methdata/1_1.gz", "C:/Users/GeneDMRs/methdata/1_2.gz", "C:/Users/GeneDMRs/methdata/1_3.gz")
cases <- c("C:/Users/GeneDMRs/methdata/2_1.gz", "C:/Users/GeneDMRs/methdata/2_2.gz")
inputmethfile <- Methfile_read(control_paths = controls, case_paths = cases)

# quality control #
inputmethfile_QC <- Methfile_QC(inputmethfile)

# read the bedfile #
inputrefseqfile <- Bedfile_read(paths = paste(system.file(package = "GeneDMRs"), "/methdata", sep=""), bedfile = "refseq", suffix = ".txt", feature = FALSE)
  
# methylation mean #
regiongeneall <- Methmean_region(inputmethfile_QC, inputrefseqfile, chrnum = "all")
  
# statistical test #
regiongeneall_Qvalue <- Logic_regression(regiongeneall)
  
# significant filter #
regiongeneall_significant <- Significant_filter(regiongeneall_Qvalue)
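
To make the last two steps more concrete, the idea of a per-region logistic regression followed by multiple-testing correction and a significance filter can be illustrated with base R alone. This is an independent sketch, not the internal code of Logic_regression or Significant_filter: the read counts are made-up toy numbers, the Benjamini-Hochberg adjustment and the 0.01 cutoff are assumptions chosen for illustration, and region_pvalue is a hypothetical helper.

```r
# Toy read counts (made-up numbers) for two regions in five samples
group <- factor(c("control", "control", "control", "case", "case"),
                levels = c("control", "case"))
regions <- list(
  regionA = data.frame(meth = c(30, 28, 35, 60, 55), unmeth = c(70, 72, 65, 40, 45)),
  regionB = data.frame(meth = c(50, 48, 52, 51, 49), unmeth = c(50, 52, 48, 49, 51))
)

# Hypothetical helper: logistic regression of methylated vs. unmethylated
# counts on group, returning the p-value of the case-vs-control term
region_pvalue <- function(counts, group) {
  fit <- glm(cbind(meth, unmeth) ~ group, family = binomial, data = counts)
  coef(summary(fit))["groupcase", "Pr(>|z|)"]
}

# One p-value per region, adjusted across regions (BH is an assumption here)
pvalues <- vapply(regions, region_pvalue, numeric(1), group = group)
qvalues <- p.adjust(pvalues, method = "BH")

# Keep regions below an illustrative cutoff
significant <- names(qvalues)[qvalues < 0.01]
```

In this toy data, regionA has clearly higher methylation in the case samples, while regionB has the same overall methylation level in both groups.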

4. Get all DMGs step by step while fitting other environmental factors as covariates

Note: Each input file is treated as an individual sample here, so there are no replicates. The input files "1_1.gz", "1_2.gz", "1_3.gz", "2_1.gz" and "2_2.gz" in the "methdata" folder need to be renamed to "1_1.gz", "2_1.gz", "3_1.gz", "4_1.gz" and "5_1.gz", respectively.

# define the covariates #
covariateinfo <- data.frame(Group = c("Control","Control","Control","Case","Case"), Diet = c("High","High","Low","High","Low"), Timepoint = c("Week1","Week2","Week1","Week2","Week2"))

# read the methylation file #
inputmethfile <- Methfile_read(paths = paste(system.file(package = "GeneDMRs"), "/methdata", sep=""), suffix = ".gz")

# quality control #
inputmethfile_QC <- Methfile_QC(inputmethfile)

# read the bedfile #
inputrefseqfile <- Bedfile_read(paths = paste(system.file(package = "GeneDMRs"), "/methdata", sep=""), bedfile = "refseq", suffix = ".txt", feature = FALSE)
  
# methylation mean #
regiongeneall <- Methmean_region(inputmethfile_QC, inputrefseqfile, chrnum = "all")
  
# statistical test #
regiongeneall_Qvalue <- Logic_regression(regiongeneall, covariates = covariateinfo)
  
# significant filter #
regiongeneall_significant <- Significant_filter(regiongeneall_Qvalue)
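
The renaming described in the note for this example needs some care: renaming in place in list order would overwrite files, because "1_2.gz" becomes "2_1.gz" while the original "2_1.gz" still exists. One way around this is to copy the files into a fresh folder under their new names. The sketch below uses base R and placeholder files in a temporary folder so that it runs on its own; src_dir and dst_dir are illustrative names, and the old-to-new mapping is assumed to follow the order given in the note.

```r
# Stand-in folders for the demonstration; replace with the real methdata folder
src_dir <- file.path(tempdir(), "methdata_demo")
dst_dir <- file.path(tempdir(), "methdata_individual")
dir.create(src_dir, showWarnings = FALSE)
dir.create(dst_dir, showWarnings = FALSE)

old_names <- c("1_1.gz", "1_2.gz", "1_3.gz", "2_1.gz", "2_2.gz")
new_names <- c("1_1.gz", "2_1.gz", "3_1.gz", "4_1.gz", "5_1.gz")

# Placeholder files whose content records the original name (demo only)
for (f in old_names) writeLines(f, file.path(src_dir, f))

# Copy into a separate folder so no source file is overwritten on the way
copied <- file.copy(file.path(src_dir, old_names),
                    file.path(dst_dir, new_names), overwrite = TRUE)
stopifnot(all(copied))

# The covariate table must have one row per sample, in the same order
covariateinfo <- data.frame(
  Group = c("Control", "Control", "Control", "Case", "Case"),
  Diet = c("High", "High", "Low", "High", "Low"),
  Timepoint = c("Week1", "Week2", "Week1", "Week2", "Week2")
)
stopifnot(nrow(covariateinfo) == length(new_names))
```

The new folder can then be used as the paths argument when reading the methylation files.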

5. Use gene body features

# read the bedfile by Bedfile_read(feature = TRUE) to get gene body information #
inputgenebodyfile <- Bedfile_read(paths = paste(system.file(package = "GeneDMRs"), "/methdata", sep=""), bedfile = "refseq", suffix = ".txt", feature = TRUE, featurewrite = FALSE)
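
For background on where gene body features such as exons and introns come from, the following sketch derives them from one 12-column BED record using the standard blockSizes and blockStarts fields. It assumes the refseq file is in 12-column BED format, uses a made-up record ("NM_demo"), and is independent of how Bedfile_read computes its features internally.

```r
# One made-up BED12 record: chrom, chromStart, chromEnd, name, score, strand,
# thickStart, thickEnd, itemRgb, blockCount, blockSizes, blockStarts
bed_line <- "chr1\t1000\t5000\tNM_demo\t0\t+\t1200\t4800\t0\t3\t500,400,600,\t0,1500,3400,"
fields <- strsplit(bed_line, "\t", fixed = TRUE)[[1]]

chrom_start <- as.integer(fields[2])
block_sizes <- as.integer(strsplit(fields[11], ",", fixed = TRUE)[[1]])
block_starts <- as.integer(strsplit(fields[12], ",", fixed = TRUE)[[1]])

# Exons: block starts are relative to chromStart (0-based, half-open intervals)
exons <- data.frame(start = chrom_start + block_starts)
exons$end <- exons$start + block_sizes

# Introns: the gaps between consecutive exons
n <- nrow(exons)
introns <- data.frame(start = exons$end[-n], end = exons$start[-1])
```

For this record the exons are 1000-1500, 2500-2900 and 4400-5000, and the introns are 1500-2500 and 2900-4400.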

Author

Xiao Wang, Dan Hao, Haja N. Kadarmideen.

Maintainer

Xiao Wang.

Reference

Xiao Wang, Dan Hao and Haja N. Kadarmideen. GeneDMRs: an R package for Gene-based Differentially Methylated Regions analysis. F1000Research 2019, 8(ISCB Comm J):1299 (slides). (https://doi.org/10.7490/f1000research.1117223.1)

Xiao Wang, Dan Hao and Haja N. Kadarmideen. GeneDMRs: an R package for Gene-based Differentially Methylated Regions analysis. bioRxiv 2020, 04.11.037168. (https://doi.org/10.1101/2020.04.11.037168)

Xiao Wang, Dan Hao and Haja N. Kadarmideen. GeneDMRs: an R package for Gene-based Differentially Methylated Regions analysis. Journal of Computational Biology 2021, 28(3), 304-316. (https://doi.org/10.1089/cmb.2020.0081)
