Comments (6)

pavlin-policar commented on May 11, 2024

Almost all the problems I've encountered while maintaining openTSNE have been pynndescent-related. As such, I would be very hesitant to add anything that ties openTSNE even closer to pynndescent.

However, it should be fairly simple to achieve what you want. I've made it so that the neighbors parameter can be an instance of openTSNE.nearest_neighbors.KNNIndex, so you can just copy over the NNDescent class and remove the check that disallows custom metrics. You can probably just change check_metric to return True and I'd guess that would do it.

from opentsne.

jgraving commented on May 11, 2024

Sorry, I wasn't clear. I'm not suggesting you tie openTSNE closer to pynndescent. Passing a callable metric is a standard feature in many packages. scikit-learn also allows it for many of their algorithms, including their implementation of TSNE. Adding a check for whether the metric is callable is fairly straightforward. If you're short on time/motivation, then I can make a PR once I have time to look through the code. You're of course free to reject it, but I think this is a reasonable feature to have for a t-SNE implementation and would be potentially useful to quite a few people.
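As a rough illustration of the kind of check described above, here is a minimal sketch. The names `VALID_METRICS` and `check_metric` and the listed metric strings are hypothetical placeholders for illustration, not openTSNE's actual code.

```python
# Hypothetical set of built-in metric names; a real library would define its own.
VALID_METRICS = {"euclidean", "manhattan", "cosine"}


def check_metric(metric, valid_metrics=VALID_METRICS):
    """Accept any callable, or a string naming a known built-in metric."""
    if callable(metric):
        # User-defined distance function: pass it through untouched.
        return metric
    if isinstance(metric, str) and metric in valid_metrics:
        return metric
    raise ValueError(
        f"Unsupported metric {metric!r}; expected a callable or one of "
        f"{sorted(valid_metrics)}"
    )


def my_distance(x, y):
    # Toy user-defined metric: absolute difference of two scalars.
    return abs(x - y)


print(check_metric("euclidean"))
print(check_metric(my_distance) is my_distance)
```

The callable branch comes first so that a user-defined function never has to match a whitelist of names.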

I'm trying to use openTSNE to apply these methods for analyzing animal behavior:
https://royalsocietypublishing.org/doi/10.1098/rsif.2014.0672
as an example analysis for my recently developed package:
https://github.com/jgraving/behavelet
Applying t-SNE to these data requires using KL-divergence as a distance metric between normalized time-frequency vectors. If accomplishing that requires users to subclass and modify your code to get it working, then I will likely just find something else, as I'd like it to be straightforward for people to use.
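For concreteness, a KL-divergence function of the kind mentioned above might look like the following sketch. It assumes each input vector is non-negative and sums to 1. The `eps` floor for zero entries in `q` is an assumption made here for numerical safety, not something taken from the linked paper or package.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(p || q) between two normalized, non-negative vectors."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    # Entries where p is zero contribute nothing to the sum.
    mask = p > 0
    # eps is an assumed floor that avoids division by zero when q has zeros.
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))


p = np.array([0.5, 0.5, 0.0])
q = np.array([0.25, 0.25, 0.5])

print(kl_divergence(p, p))  # identical vectors
print(kl_divergence(p, q))  # not equal to kl_divergence(q, p)
```

Note that KL divergence is asymmetric, so swapping the arguments generally changes the result.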

Again, I appreciate the work you've done on this. I'm not trying to be pushy. Just wanted to let you know that openTSNE would be more widely useful if it had this feature.

pavlin-policar commented on May 11, 2024

Hmm, I'm all for custom distance metrics, but the "numba-compiled callable" threw me off a little bit. I'm not very familiar with numba, so is it possible to call numba-compiled functions from regular old Python code? What I'm getting at is that openTSNE supports both exact neighbor search via scikit-learn and approximate neighbor search via pynndescent, and I'd like to keep some consistency between the two. My concern was that if the function is numba-compiled, it wouldn't be possible to use that callable in the scikit-learn BallTree.

Putting all numba-related things aside, adding custom distance metrics was definitely on my to-do list, but I've been and will likely be short on time for a while, so if you're willing to open a PR, I'd be more than happy to take a look. The changes should be fairly minimal anyhow.

jgraving commented on May 11, 2024

Ah, I see. Sorry, I assumed you were familiar with numba as it's one of the main dependencies for pynndescent. numba is a library that JIT-compiles pure Python/NumPy functions to make them faster, so a numba-compiled function works just like any other Python function in practice. scikit-learn's BallTree accepts user-defined functions, which includes numba-compiled functions. Here is an example:

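from sklearn.neighbors import BallTree
from numba import njit
import numpy as np

# L1 (Manhattan) distance, JIT-compiled by numba
@njit(fastmath=True)
def l1(x, y):
    return np.sum(np.abs(x - y))

# Build the tree with the compiled function as a user-defined metric
tree = BallTree(np.random.normal(size=(1000, 10)), metric=l1)
# Query the 5 nearest neighbors of 100 new points
distances, indices = tree.query(np.random.normal(size=(100, 10)), k=5)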

The only issue I can see is that if the function isn't numba-compiled and it's passed to pynndescent, then pynndescent will throw an uninformative error message from numba. Whether or how to deal with this might require some thought, or we could just make it clear in the docstring and leave this for pynndescent to deal with. I assume if the user is savvy enough to pass a custom metric, then they'd expect that error messages are a possibility.

I'll take a look at the code and submit a PR with the changes.

pavlin-policar commented on May 11, 2024

> The only issue I can see is if the function isn't numba compiled and it's passed to pynndescent then pynndescent will throw an uninformative error message from numba.

I would hope that numba has some kind of way to check if a function is compiled. Then if it isn't, we could do that within the pynndescent wrapper to avoid potential errors. A decorator is just a function, after all.

After a bit of searching, I found it should be fairly straightforward to check whether a function is an instance of the Dispatcher type and, if it isn't, to just call numba.njit()(f) on it.

pavlin-policar commented on May 11, 2024

Fixed via #94.
