dataplayer12 / fly-lsh

85.0 85.0 28.0 219 KB

An implementation of efficient LSH inspired by fruit fly brain

License: MIT License

Python 100.00%
locality-sensitive-hashing machine-learning-algorithms nearest-neighbor-search retrieval

fly-lsh's People

Contributors

dataplayer12

fly-lsh's Issues

Blog directory

Why is there a blog directory with duplicated files? Please consolidate the files and remove the duplicates.

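Don't understand the code

elif self.name == 'CIFAR10':
    # Iterate over the five pickled training batches.
    for batch_num in [1, 2, 3, 4, 5]:
        filename = self.name + '/train_batch_' + str(batch_num) + '.p'
        with open(filename, mode='rb') as f:
            features, labels = pickle.load(f)
        # Yield the batch in slices of at most batch_size samples.
        for begin in range(0, len(features), batch_size):
            end = min(begin + batch_size, len(features))
            yield features[begin:end], labels[begin:end]

What does this code mean, and how can it process the downloaded CIFAR files?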
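
The quoted loop can be read as a generator that opens each pickled training batch and yields it in slices of at most batch_size rows. Below is a minimal sketch of that behavior, assuming each `.p` file holds a `(features, labels)` tuple, as the unpacking in the snippet implies. The function name `iter_batches` and the toy files are hypothetical and not part of fly-lsh. How the raw CIFAR download is converted into this layout is not shown in the snippet.

```python
import os
import pickle
import tempfile

import numpy as np


def iter_batches(directory, batch_nums, batch_size):
    """Yield (features, labels) slices from pickled batch files.

    Assumes every file contains a (features, labels) tuple.
    """
    for batch_num in batch_nums:
        filename = os.path.join(directory, 'train_batch_' + str(batch_num) + '.p')
        with open(filename, mode='rb') as f:
            features, labels = pickle.load(f)
        # Walk through the file in steps of batch_size; the last slice may be shorter.
        for begin in range(0, len(features), batch_size):
            end = min(begin + batch_size, len(features))
            yield features[begin:end], labels[begin:end]


# Toy stand-in for the preprocessed files: two files with 5 samples of 4 features each.
tmpdir = tempfile.mkdtemp()
for n in (1, 2):
    toy_features = np.arange(5 * 4).reshape(5, 4) + 100 * n
    toy_labels = np.arange(5)
    with open(os.path.join(tmpdir, 'train_batch_%d.p' % n), 'wb') as f:
        pickle.dump((toy_features, toy_labels), f)

batch_shapes = [x.shape for x, _ in iter_batches(tmpdir, [1, 2], batch_size=2)]
print(batch_shapes)
```

With 5 samples per file and a batch size of 2, each file contributes two full slices and one shorter final slice.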

Trouble understanding the code

Hello author! I found that the code you provided uses only the distance between images as labels, so findmap does not compute the mAP normally used for image retrieval. I also tested on the CIFAR10 dataset: I extracted the hash values and saved them to a CSV file. Computed with the usual mAP method, the retrieval accuracy was only 10%.
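
For reference, a class-label-based mAP over Hamming-ranked hash codes, which is presumably what "the general map calculation method" refers to, can be sketched as follows. This is an assumption about the intended metric, not a reproduction of fly-lsh's findmap. The function `mean_average_precision` and the toy codes are hypothetical.

```python
import numpy as np


def mean_average_precision(query_codes, query_labels, db_codes, db_labels):
    """Label-based mAP: rank the database by Hamming distance to each query."""
    aps = []
    for code, label in zip(query_codes, query_labels):
        # Hamming distance from this query to every database code.
        dists = np.count_nonzero(db_codes != code, axis=1)
        order = np.argsort(dists, kind='stable')
        # A retrieved item counts as relevant if it shares the query's class label.
        relevant = (db_labels[order] == label).astype(float)
        if relevant.sum() == 0:
            continue  # skip queries with no relevant database items
        precision_at_k = np.cumsum(relevant) / np.arange(1, len(relevant) + 1)
        aps.append((precision_at_k * relevant).sum() / relevant.sum())
    return float(np.mean(aps))


# Toy data: two well-separated classes of 4-bit codes.
db_codes = np.array([[0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 1], [1, 1, 1, 0]])
db_labels = np.array([0, 0, 1, 1])
query_codes = np.array([[0, 0, 0, 0], [1, 1, 1, 1]])
query_labels = np.array([0, 1])

score = mean_average_precision(query_codes, query_labels, db_codes, db_labels)
print(score)
```

Under this metric, a ranking that carries no label information would be expected to score roughly the fraction of relevant items, which is about 10% for ten balanced classes.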

How to construct a nearest neighbor network efficiently?

Dear author! I am trying to use DenseFly to find the k nearest neighbors of each item in order to construct a nearest neighbor network. Do you have any suggestions for constructing this network efficiently? Or should I simply call the query function of the LSH class iteratively? Thank you so much!
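
One alternative to querying item by item is to compute all pairwise Hamming distances between the hash codes in a single matrix operation and keep the k smallest per row. Below is a minimal sketch, assuming the binary codes are already available as a NumPy array. The helper `knn_graph_from_hashes` is hypothetical and not part of the fly-lsh LSH class. The full n x n distance matrix limits this approach to moderate n.

```python
import numpy as np


def knn_graph_from_hashes(codes, k):
    """Return an (n, k) array with the indices of each row's k nearest rows."""
    codes = np.asarray(codes, dtype=np.int64)
    # For 0/1 vectors: hamming(a, b) = |a| + |b| - 2 * a.b
    ones = codes.sum(axis=1)
    dists = ones[:, None] + ones[None, :] - 2 * (codes @ codes.T)
    # Push the diagonal beyond the maximum possible distance to exclude self-matches.
    np.fill_diagonal(dists, codes.shape[1] + 1)
    return np.argsort(dists, axis=1, kind='stable')[:, :k]


toy_codes = np.array([
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
])
neighbors = knn_graph_from_hashes(toy_codes, k=1)
print(neighbors[:, 0])
```

Each row of `neighbors` lists the edge targets for one node, so the array can be fed directly into an adjacency-list or edge-list representation.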

m and k seem to be swapped?

According to the definition of WTAHash in the paper, FlyHash should set the maximum in every block of length k to 1 and all other values in the block to 0. FlyHash should therefore produce a binary vector of length mk in which exactly m values are 1.

In the flylsh implementation, however, the code appears to do the opposite: the block length is m (hash_length), because the code takes the top hash_length elements in each row of the activations.

yindices=all_activations.argsort(axis=1)[:,-hash_length:]

Am I misunderstanding something? If I am right, the problem is not severe; it could be fixed simply by swapping the parameter names.
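
The two readings can be compared side by side. In the sketch below, which assumes a dense activation matrix, `blockwise_wta` follows the block-wise definition described in this issue and `global_topk` mirrors the quoted argsort line. Both function names are hypothetical and neither is taken from fly-lsh.

```python
import numpy as np


def blockwise_wta(activations, k):
    """Set the maximum of every length-k block to 1 and everything else to 0."""
    n, width = activations.shape
    m = width // k  # number of blocks; columns beyond m * k are ignored
    blocks = activations[:, :m * k].reshape(n, m, k)
    winners = blocks.argmax(axis=2)[:, :, None]
    out = np.zeros((n, m, k), dtype=bool)
    np.put_along_axis(out, winners, True, axis=2)
    return out.reshape(n, m * k)


def global_topk(activations, hash_length):
    """Set the hash_length largest entries of each whole row to 1."""
    yindices = activations.argsort(axis=1)[:, -hash_length:]
    out = np.zeros(activations.shape, dtype=bool)
    np.put_along_axis(out, yindices, True, axis=1)
    return out


acts = np.array([[0.9, 0.8, 0.1, 0.2]])
wta = blockwise_wta(acts, k=2)    # one winner per block of 2
topk = global_topk(acts, 2)       # two winners across the whole row
print(wta.astype(int), topk.astype(int))
```

Both outputs contain the same number of ones here, but they are placed differently: the block-wise version picks one winner per block, while the row-wise version can pick several winners from the same block.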

Can DenseFly process large-scale, sparse datasets?

Dear author, I tried to apply the DenseFly algorithm to a large and very sparse dataset with 21612 rows x 28065 columns and a density of about 0.017. My goal is to hash the rows and query their nearest neighbors efficiently. I noticed that DenseFly first projects the data into a high-dimensional space (self.hash) and then reduces the hash to a low-dimensional space (self.lowd_hashes). Does that mean I have to set the embedding_size variable larger than 28065 (or reduce the number of columns to fewer than embedding_size)?
Meanwhile, as far as I know, projecting the data into a high-dimensional space first presupposes that the data matrix is dense, so that the elements become more distinct in a sparse matrix in the high-dimensional space. Does that mean DenseFly is not appropriate for processing sparse datasets?
Thank you very much.
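
As a way to experiment with this question on a small scale, the sketch below applies a generic sparse binary random projection followed by a row-wise top-k step to a toy matrix with the same density. It is an illustration under stated assumptions, not fly-lsh's implementation: `project_and_hash`, its parameters, and the toy sizes are hypothetical, and it does not settle whether fly-lsh itself constrains embedding_size.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the 21612 x 28065 matrix, with the same density of about 0.017.
n_rows, n_cols, density = 50, 400, 0.017
mask = rng.random((n_rows, n_cols)) < density
data = rng.random((n_rows, n_cols)) * mask


def project_and_hash(data, embedding_size, sampling_ratio, hash_length, rng):
    """Random binary projection followed by a row-wise top-k binarization."""
    # Each output unit samples a random subset of the input columns.
    weights = rng.random((data.shape[1], embedding_size)) < sampling_ratio
    activations = data @ weights
    top = np.argsort(activations, axis=1)[:, -hash_length:]
    hashes = np.zeros(activations.shape, dtype=bool)
    np.put_along_axis(hashes, top, True, axis=1)
    return hashes


# The output width is chosen independently of the input width in this sketch.
small = project_and_hash(data, 64, 0.1, 8, rng)
large = project_and_hash(data, 800, 0.1, 8, rng)
print(small.shape, large.shape)
```

In this generic formulation the matrix product is defined for an output width either below or above the number of input columns. With very sparse rows, many activations may tie at zero, in which case the selected indices are partly arbitrary, so it may be worth inspecting the activations before relying on the resulting hashes.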
