
This project forked from poisonalien/maftools


:cancer: Summarize, Analyze and Visualize MAF files from TCGA or in house studies.

License: Other


maftools - An R package to summarize, analyze and visualize MAF files.



Introduction.

With advances in cancer genomics, Mutation Annotation Format (MAF) is being widely accepted and used to store detected variants. The Cancer Genome Atlas (TCGA) project has sequenced over 30 different cancers, with a sample size of over 200 for each cancer type. The resulting genetic variant data is stored in Mutation Annotation Format. This package attempts to summarize, analyze, annotate and visualize MAF files efficiently, whether they come from TCGA sources or from in-house studies, as long as the data is in MAF format. maftools can also handle the ICGC Simple Somatic Mutation format.

maftools is on bioRxiv.

Please cite the following if you find this tool useful.

Mayakonda, A. and H.P. Koeffler, Maftools: Efficient analysis, visualization and summarization of MAF files from large-scale cohort based cancer studies. bioRxiv, 2016. doi: http://dx.doi.org/10.1101/052662

MAF field requirements.

MAF files contain many fields, ranging from chromosome names to COSMIC annotations. However, most of the analyses in maftools use only the following fields. Please stick to the MAF specification for best results.

  • Mandatory fields: Hugo_Symbol, Chromosome, Start_Position, End_Position, Reference_Allele, Tumor_Seq_Allele2, Variant_Classification, Variant_Type and Tumor_Sample_Barcode.
  • Recommended optional fields: non-MAF-specific fields containing VAF and amino acid change information. The complete specification of MAF files can be found on the NCI TCGA page.

NOTE: If you have variants stored as VCFs or in a MAF-like tab-separated format, convert them to MAF using vcf2maf/maf2maf. Merge the MAFs from all samples into a single MAF before processing with maftools.
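The field check and the merge step described above can be sketched in base R. This is only an illustration: `missing_maf_fields()`, `merge_mafs_simple()` and `make_toy()` are hypothetical helpers, not maftools functions, the toy rows are invented, and the column names assume the capitalization used in the MAF specification (for example `End_Position`).

```r
# Mandatory MAF columns, as listed in the field requirements.
mandatory_fields <- c(
  "Hugo_Symbol", "Chromosome", "Start_Position", "End_Position",
  "Reference_Allele", "Tumor_Seq_Allele2", "Variant_Classification",
  "Variant_Type", "Tumor_Sample_Barcode"
)

# Hypothetical helper: names of mandatory columns absent from a data frame.
missing_maf_fields <- function(maf_df) {
  setdiff(mandatory_fields, colnames(maf_df))
}

# Hypothetical helper: stack per-sample MAF tables into a single table,
# refusing to merge if any table lacks a mandatory column.
merge_mafs_simple <- function(maf_list) {
  for (i in seq_along(maf_list)) {
    missing <- missing_maf_fields(maf_list[[i]])
    if (length(missing) > 0) {
      stop("MAF ", i, " is missing: ", paste(missing, collapse = ", "))
    }
  }
  # Keep only the columns shared by every table so rbind() succeeds.
  shared <- Reduce(intersect, lapply(maf_list, colnames))
  do.call(rbind, lapply(maf_list, function(x) x[, shared, drop = FALSE]))
}

# Toy one-row tables for illustration only (values are made up).
make_toy <- function(sample_id, gene) {
  data.frame(
    Hugo_Symbol = gene, Chromosome = "1", Start_Position = 100,
    End_Position = 100, Reference_Allele = "C", Tumor_Seq_Allele2 = "T",
    Variant_Classification = "Missense_Mutation", Variant_Type = "SNP",
    Tumor_Sample_Barcode = sample_id, stringsAsFactors = FALSE
  )
}

merged <- merge_mafs_simple(list(make_toy("S1", "GENE_A"),
                                 make_toy("S2", "GENE_B")))
```

Failing early on a missing column keeps a malformed per-sample file from silently shrinking the merged table.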

Vignette and a case study.

Complete documentation of maftools, using TCGA LAML [1] as a case study, can be found in the vignette.

Usage.

Simple usage: read the MAF file using read.maf and pass the resulting MAF object to any of the functions for plotting, analysis or set operations.

What maftools can do.
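A minimal sketch of that workflow, assuming maftools is installed and that the example TCGA LAML file bundled with the package (`tcga_laml.maf.gz` under `extdata`) is present in your version. Argument names may differ between releases, so check the vignette for your installed version.

```r
# Run only if maftools is available; otherwise print a note and skip.
if (requireNamespace("maftools", quietly = TRUE)) {
  # Path to the example MAF shipped with the package (assumed to exist).
  laml_path <- system.file("extdata", "tcga_laml.maf.gz", package = "maftools")

  # Read the MAF file into a MAF object.
  laml <- maftools::read.maf(maf = laml_path)

  # Pass the object to summary and plotting functions.
  print(maftools::getSampleSummary(laml))
  maftools::plotmafSummary(maf = laml)
  maftools::oncoplot(maf = laml, top = 10)
} else {
  message("maftools is not installed; see the Installation section.")
}
```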

  1. Analysis
  • Detect cancer driver genes based on positional clustering of variants [2].
  • Detect mutually exclusive sets of genes [3].
  • Compare two MAF files (cohorts) to detect differentially mutated genes.
  • Add Pfam domains and summarize.
  • Extract mutational signatures and compare them to validated signatures [4].
  • APOBEC enrichment score estimation [5].
  • Tumor heterogeneity and MATH (Mutant-Allele Tumor Heterogeneity) score estimation [6].
  • Read and summarize GISTIC results.
  • Pan-cancer analysis/comparison.
  • Survival analysis.
  • Compare mutation load against all 33 TCGA cohorts.
  2. Rich Visualizations
  • Make oncoplots [7].
  • Make lollipop plots.
  • Map variants on copy number (CBS) segments.
  • Forest plots.
  • Plot transitions and transversions.
  • Plot MAF summary.
  • CoOncoplots.
  • Genecloud.
  • Rainfall plots and change point detection.
  3. Annotation
  • Annotate variants locally using Oncotator API.
  • Convert Annovar annotations into MAF.
  • Convert ICGC simple somatic mutation format into MAF.
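To make one of the analyses listed above concrete, here is a base R sketch of the MATH score as defined by Mroz and Rocco: 100 times the median absolute deviation (MAD, scaled by 1.4826) of a tumor's variant allele fractions, divided by their median. `math_score()` is a hypothetical helper written for illustration and is not the maftools implementation; the VAF vectors are toy values.

```r
# Hypothetical helper: MATH score from a numeric vector of variant
# allele fractions (VAFs) for a single tumor.
math_score <- function(vaf) {
  vaf <- vaf[!is.na(vaf)]
  if (length(vaf) == 0) stop("no usable VAF values")
  # stats::mad() applies the 1.4826 scaling constant by default.
  100 * stats::mad(vaf) / stats::median(vaf)
}

# Toy VAFs for illustration only.
narrow <- c(0.26, 0.28, 0.30, 0.32, 0.34)  # tightly clustered VAFs
broad  <- c(0.10, 0.20, 0.30, 0.40, 0.50)  # widely spread VAFs

# A wider VAF distribution yields a higher score.
math_score(narrow) < math_score(broad)
```

Both toy vectors share the same median, so the difference in score comes entirely from the spread of the VAFs.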

Installation:

Easy way: Install from Bioconductor.

## biocLite() has been superseded by BiocManager (Bioconductor 3.8 and later).
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("maftools")

Install from GitHub for updated features (some of the functions available there may not yet be on the Bioconductor release branch).

# Install Bioconductor dependencies.
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("ComplexHeatmap", "VariantAnnotation", "Biostrings"))

# Install maftools from the GitHub repository.
library("devtools")
install_github(repo = "PoisonAlien/maftools")

For full documentation, please refer to the vignette.

References.

  1. Cancer Genome Atlas Research Network, Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. N Engl J Med, 2013. 368(22): p. 2059-74.
  2. Tamborero, D., A. Gonzalez-Perez, and N. Lopez-Bigas, OncodriveCLUST: exploiting the positional clustering of somatic mutations to identify cancer genes. Bioinformatics, 2013. 29(18): p. 2238-44.
  3. Leiserson, M.D., Wu, H.T., Vandin, F. & Raphael, B.J. CoMEt: a statistical approach to identify combinations of mutually exclusive alterations in cancer. Genome Biol 16, 160 (2015).
  4. Alexandrov, L.B., et al., Signatures of mutational processes in human cancer. Nature, 2013. 500(7463): p. 415-21.
  5. Roberts SA, Lawrence MS, Klimczak LJ, et al. An APOBEC Cytidine Deaminase Mutagenesis Pattern is Widespread in Human Cancers. Nature genetics. 2013;45(9):970-976. doi:10.1038/ng.2702.
  6. Mroz, E.A. & Rocco, J.W. MATH, a novel measure of intratumor genetic heterogeneity, is high in poor-outcome classes of head and neck squamous cell carcinoma. Oral Oncol 49, 211-5 (2013).
  7. Gu, Z., Eils, R. & Schlesner, M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics (2016).
