Comments (5)
That's a great idea for making the comparison clear. It certainly raises the priority on getting the custom initialisation done!
from umap.
great! it works!
the only hurdle i had to get across: t-SNE tends towards larger embeddings than umap. for example, t-SNE with default parameters embeds MNIST into a +/-35 range while UMAP is closer to +/-15. so initializing with the raw output of t-SNE produces some minor artifacts. once i scale the t-SNE output closer to the UMAP output it works better.
some gifs for your trouble:
https://www.dropbox.com/s/wbv73swh6qlrexg/mnist-all-init-0.8.gif?dl=0
this one is UMAP with min_dist=0.8 and initialized with a t-SNE embedding scaled by 0.4.
https://www.dropbox.com/s/cufrbjsbm79a3kh/mnist-all-init.gif?dl=0
this one is UMAP with default parameters also initialized with a t-SNE embedding scaled by 0.4.
the final embeddings of both are normalized uniformly across both axes to fill the screen with a little padding.
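a rough sketch of the rescaling step described above, in case it helps anyone else. the helper name `rescale_embedding` and the target extent of 15 are made up for illustration (the 15 comes from the approximate UMAP range quoted above, not from anything in the library), and the random array just stands in for a real t-SNE output:

```python
import numpy as np


def rescale_embedding(embedding, target_extent=15.0):
    """Uniformly scale an embedding so its largest |coordinate| equals target_extent.

    Hypothetical helper: one scale factor is applied to every axis,
    so the shape of the layout is preserved.
    """
    embedding = np.asarray(embedding, dtype=np.float64)
    max_abs = np.abs(embedding).max()
    if max_abs == 0:
        # Degenerate case: every point sits at the origin, nothing to scale.
        return embedding.copy()
    return embedding * (target_extent / max_abs)


# Stand-in for a t-SNE embedding that spans roughly +/-35 (assumed data).
rng = np.random.default_rng(0)
tsne_like = rng.uniform(-35, 35, size=(1000, 2))

# Bring it down to roughly the +/-15 range before using it as an init.
init = rescale_embedding(tsne_like, target_extent=15.0)
```

for an embedding spanning +/-35 this gives a factor of about 0.43, which is in the same ballpark as the fixed 0.4 used for the gifs.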
That's a great idea that I hadn't considered. It is something along these lines that I was hoping to use to make an "update" procedure, but what you are proposing here is the easy concrete way to move toward that.
As to whether the embedding would be preserved at all -- as long as it is "close" in objective function space to the final embedding then it will be preserved. That's really what the spectral embedding is doing: it provides a good starting point and ensures a degree of consistency over multiple runs (up to rotation/reflection).
Awesome! For these tests I will use the spectral initialization to encourage consistency between runs. But I was also considering what it would look like to initialize with a t-SNE embedding, and then it would be much easier to see how the embedding changes after running UMAP.
Provisional support turned out to be straightforward. Of course it may be a little glitchy if I didn't catch all the ways it can go wrong, but you should now be able to pass a numpy array of initial positions to the init parameter and have it work from there (in current master).
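As a minimal sketch of what that usage might look like, assuming the `umap-learn` package is installed: the helper names `check_init` and `embed_with_init` are hypothetical wrappers written for this example and are not part of the library. Only `umap.UMAP`, its `init` and `n_components` parameters, and `fit_transform` are library calls.

```python
import numpy as np


def check_init(data, init_positions):
    """Validate an initial layout: one row of coordinates per sample (hypothetical helper)."""
    init_positions = np.asarray(init_positions, dtype=np.float32)
    if init_positions.ndim != 2 or init_positions.shape[0] != data.shape[0]:
        raise ValueError(
            f"init must have shape (n_samples, n_components); "
            f"got {init_positions.shape} for {data.shape[0]} samples"
        )
    return init_positions


def embed_with_init(data, init_positions, **umap_kwargs):
    """Run UMAP starting from the given positions instead of the default init."""
    init_positions = check_init(data, init_positions)
    import umap  # umap-learn; imported here so the validation works without it

    reducer = umap.UMAP(
        n_components=init_positions.shape[1],  # must match the init's width
        init=init_positions,                   # numpy array of starting positions
        **umap_kwargs,
    )
    return reducer.fit_transform(data)
```

Something like `embed_with_init(X, scaled_tsne, min_dist=0.8)` would then start the optimisation from a rescaled t-SNE layout, where `X` and `scaled_tsne` are placeholders for your own data and initial embedding.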