This project forked from klugerlab/t-sne-heatmaps

Beta version of 1D t-SNE heatmaps to visualize expression patterns of hundreds of genes simultaneously in scRNA-seq

t-SNE Heatmaps

Introduction

When exploring a scRNA-seq dataset, it is common to visualize the cells with t-SNE and color them by the expression of genes that are markers for known cell types. The expression patterns of these marker genes suggest which clusters correspond to which known cell types. However, marker genes can only be visualized one at a time, and only so many plots can fit on a page to be compared side by side. We present t-SNE Heatmaps, which use 1D t-SNE to address this constraint, allowing the expression patterns of hundreds of genes to be visualized simultaneously.

Example

In 1d_tsne_heatmaps_tutorial.R we demonstrate t-SNE heatmaps using the scRNA-seq data from Macosko et al. (2015) [1].

The tutorial produces a heatmap of the result (the screenshot is not reproduced here). An interactive version is also available and, thanks to heatmaply, allows for more detailed exploration by zooming, panning, and more.

How it works

Linderman and Steinerberger (2017) [2] prove that t-SNE faithfully embeds well-clustered data, independent of the dimension. Real-life data is not well clustered, but in our experience 2D embeddings (in particular, their clusters) tend to be consistent with 1D embeddings. 1D embeddings are much more compact, however, which lets us build the heatmap visualization. To build a t-SNE heatmap, we do the following:

  1. Compute 1D t-SNE of the cells. With FIt-SNE [3], this can be done for millions of cells in a very short amount of time.
  2. Discretize the 1D t-SNE embedding into 100 bins.
  3. Represent each gene by the sum of its expression in the cells contained in each bin. Each gene is now a vector in R^100.
  4. Visualize these vectors in heatmap format (i.e. each row is a gene and each column is a bin) using heatmaply.
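
Steps 2 and 3 above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the code from the tutorial script: the toy expression matrix, the simulated 1D embedding, the equal-width bins, and the helper name `bin_expression` are all hypothetical and chosen only to keep the example self-contained.

```r
# Hypothetical toy data: 20 genes (rows) by 500 cells (columns).
set.seed(1)
n_cells <- 500
n_genes <- 20
expr <- matrix(rpois(n_genes * n_cells, lambda = 2), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Stand-in for step 1. In practice this vector would hold the 1D t-SNE
# coordinate of each cell (e.g. from FIt-SNE); it is simulated here so that
# the example needs no extra packages.
embedding_1d <- rnorm(n_cells)

# Hypothetical helper, not part of the repository.
bin_expression <- function(expr, embedding_1d, n_bins = 100) {
  stopifnot(ncol(expr) == length(embedding_1d))
  # Step 2: split the range of the embedding into n_bins equal-width bins
  # (equal width is an assumption) and assign each cell a bin in 1..n_bins.
  breaks <- seq(min(embedding_1d), max(embedding_1d), length.out = n_bins + 1)
  bin_id <- findInterval(embedding_1d, breaks,
                         rightmost.closed = TRUE, all.inside = TRUE)
  # Step 3: sum each gene's expression over the cells in each bin.
  # rowsum() groups rows, so work on the cells-by-genes transpose.
  sums <- rowsum(t(expr), group = bin_id)
  # Bins that contain no cells keep a sum of zero.
  binned <- matrix(0, nrow = nrow(expr), ncol = n_bins,
                   dimnames = list(rownames(expr), NULL))
  binned[, as.integer(rownames(sums))] <- t(sums)
  binned
}

binned <- bin_expression(expr, embedding_1d)
dim(binned)  # one row per gene, one column per bin

# Step 4 (not run here, requires the heatmaply package):
# heatmaply::heatmaply(binned, Colv = FALSE)
```

Passing `Colv = FALSE` in the commented-out call is an assumption about how one would keep the bins in their embedding order; the tutorial script may configure the heatmap differently.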

The representation of the genes by their binned expression pattern in 1D t-SNE induces a distance function on the genes. In particular, we can perform hierarchical clustering on the genes using this new distance function, which often gives more meaningful results than Euclidean distance on the original data. This distance function also allows us to find genes whose expression pattern is associated with genes of interest. The user provides a list of genes of interest, and the algorithm then "enriches" this set with genes that have a similar expression pattern in the t-SNE.
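
The clustering and "enrichment" ideas can be sketched as below. The README does not specify which distance is used, so this sketch assumes plain Euclidean distance between binned vectors; the toy matrix and the helper `enrich_genes` are hypothetical and only illustrate the idea of adding the genes closest to a user-supplied set.

```r
set.seed(2)
n_bins <- 100

# Hypothetical binned matrix (genes x bins): genes A1..A5 peak near bin 25,
# genes B1..B5 peak near bin 75, each with some uniform noise.
peak <- function(center) exp(-((seq_len(n_bins) - center) / 8)^2)
binned <- rbind(
  t(replicate(5, 10 * peak(25) + runif(n_bins))),
  t(replicate(5, 10 * peak(75) + runif(n_bins)))
)
rownames(binned) <- c(paste0("A", 1:5), paste0("B", 1:5))

# Hierarchical clustering of genes on the binned representation
# (Euclidean distance and average linkage are assumptions).
gene_dist <- dist(binned)
tree <- hclust(gene_dist, method = "average")
clusters <- cutree(tree, k = 2)

# Hypothetical helper: return the n_add genes whose binned pattern is
# closest to any of the genes of interest.
enrich_genes <- function(binned, genes_of_interest, n_add = 2) {
  d <- as.matrix(dist(binned))
  candidates <- setdiff(rownames(binned), genes_of_interest)
  # Distance from each candidate to its nearest gene of interest.
  nearest <- apply(d[candidates, genes_of_interest, drop = FALSE], 1, min)
  names(sort(nearest))[seq_len(min(n_add, length(nearest)))]
}

enriched <- enrich_genes(binned, "A1", n_add = 2)
enriched  # genes whose pattern is most similar to A1
```

In this toy setup the genes returned for `"A1"` come from the group that peaks in the same region of the embedding, which is the behaviour the paragraph above describes.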

If you have any questions, please feel free to contact George Linderman. Also, if you find t-SNE Heatmaps to be useful, please cite Linderman et al. 2017 [3].

References

  1. Macosko, E. Z., et al. (2015). Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell, 161(5), 1202-1214.
  2. Linderman, G. C., & Steinerberger, S. (2017). Clustering with t-SNE, Provably. arXiv:1706.02582
  3. Linderman, G. C., Rachh, M., Hoskins, J. G., Steinerberger, S., & Kluger, Y. (2017). Efficient Algorithms for t-distributed Stochastic Neighborhood Embedding. arXiv:1712.09005
