This project forked from pair-code/understanding-umap

Understanding the theory behind UMAP

Home Page: https://pair-code.github.io/understanding-umap

License: Apache License 2.0

JavaScript 51.12% CSS 2.50% HTML 46.38%

Understanding UMAP

Dimensionality reduction is a powerful tool for machine learning practitioners to visualize and understand large, high-dimensional datasets. One of the most widely used techniques for visualization is t-SNE, but its performance suffers with large datasets, and using it correctly can be challenging.

UMAP is a new technique by McInnes et al. that offers a number of advantages over t-SNE, most notably increased speed and better preservation of the data's global structure. In this article, we'll take a look at the theory behind UMAP in order to better understand how the algorithm works, how to use it effectively, and how its performance compares with t-SNE.

yarn
yarn dev

Publishing to GitHub Pages

yarn pub

To develop figures individually

yarn dev:cech
yarn dev:hyperparameters
yarn dev:mammoth-umap
yarn dev:mammoth-tsne
yarn dev:supplement
yarn dev:toy
yarn dev:toy_comparison

Data preprocessing

For the mammoth figures, the raw 3D data was downsampled to 50,000 points before being projected with UMAP / t-SNE. These 50,000 points were then randomly subsampled to 10,000 points in order to minimize the payload size.
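The random subsampling step above can be sketched as follows. This is a minimal illustration, not the repository's preprocessing code: the function name `subsample` and its parameters are hypothetical, and the partial Fisher-Yates shuffle is one common way to draw points uniformly without replacement.

```javascript
// Hypothetical helper (not from the repository): pick k items uniformly at
// random, without replacement, using a partial Fisher-Yates shuffle.
function subsample(points, k, random = Math.random) {
  const copy = points.slice(); // leave the caller's array untouched
  const n = copy.length;
  const count = Math.min(k, n);
  for (let i = 0; i < count; i++) {
    // Choose a random index from the not-yet-selected tail [i, n).
    const j = i + Math.floor(random() * (n - i));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy.slice(0, count);
}

// Usage: reduce 50,000 stand-in points to 10,000.
const allPoints = Array.from({ length: 50000 }, (_, i) => [i, i, i]);
const sampled = subsample(allPoints, 10000);
console.log(sampled.length);
```

Only the first `k` positions are shuffled, so the cost scales with the sample size rather than with a full shuffle of the input.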

Understanding UMAP uses a few tricks to make the data payloads for some of the interactive figures small enough to download in a reasonable time. The mammoth figures use a 10-bit encoding scheme to compress the 10,000 data points into a significantly smaller payload. The hyperparameters and toy_comparison figures precompute UMAP embeddings for all of their different combinations, then use the same 10-bit encoding scheme to compress the data.

yarn preprocess:hyperparameters
yarn preprocess:mammoth
yarn preprocess:toy_comparison
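As a rough illustration of what a 10-bit encoding scheme can look like, the sketch below quantizes each coordinate to an integer in 0..1023 and packs those integers into a byte array. It is written under assumptions: the source does not describe the exact layout the project uses, so the function names (`quantize`, `pack`, `unpack`, `dequantize`), the min/max normalization, and the most-significant-bit-first packing are all hypothetical choices.

```javascript
// Hypothetical sketch of a 10-bit encoding; not the repository's actual format.
const BITS = 10;
const MAX_CODE = (1 << BITS) - 1; // 1023

// Map floats onto integers in [0, 1023] using the data's own min/max range.
function quantize(values) {
  let min = Infinity;
  let max = -Infinity;
  for (const v of values) {
    if (v < min) min = v;
    if (v > max) max = v;
  }
  const range = max - min || 1; // avoid dividing by zero for constant input
  const codes = Array.from(values, (v) => Math.round(((v - min) / range) * MAX_CODE));
  return { codes, min, max };
}

// Pack 10-bit codes into bytes, most significant bit first.
function pack(codes) {
  const bytes = new Uint8Array(Math.ceil((codes.length * BITS) / 8));
  let bitPos = 0;
  for (const code of codes) {
    for (let b = BITS - 1; b >= 0; b--) {
      if ((code >> b) & 1) bytes[bitPos >> 3] |= 0x80 >> (bitPos & 7);
      bitPos++;
    }
  }
  return bytes;
}

// Read `count` 10-bit codes back out of the packed bytes.
function unpack(bytes, count) {
  const codes = new Array(count);
  let bitPos = 0;
  for (let i = 0; i < count; i++) {
    let code = 0;
    for (let b = 0; b < BITS; b++) {
      const bit = (bytes[bitPos >> 3] >> (7 - (bitPos & 7))) & 1;
      code = (code << 1) | bit;
      bitPos++;
    }
    codes[i] = code;
  }
  return codes;
}

// Map codes back to approximate floats in the original range.
function dequantize(codes, min, max) {
  return codes.map((c) => min + (c / MAX_CODE) * (max - min));
}

// Usage: round-trip a few coordinates.
const { codes, min, max } = quantize([-1.5, 0, 0.25, 3.0]);
const packed = pack(codes);
const restored = dequantize(unpack(packed, codes.length), min, max);
console.log(packed.length, restored);
```

With this layout each coordinate costs 10 bits instead of the 32 or 64 bits of a raw float, at the price of a quantization error of at most half a step (the value range divided by 1023, halved).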

Contributors

1wheel, cannoneyed, willettk
