I downloaded the source code and read it today. But I still have some difficulty under

Improve hash algorithm description in README (spotify/sparkey)

spkrka commented on July 25, 2024

There is some description of it in the readme, but I agree, it could be explained better and in more detail. I'll try to do that if I find the time.


liangdong commented on July 25, 2024

Aha, I think I now understand how the hash algorithm works. Am I right? Thank you~

1, To insert a k/v pair:
2, calculate hash(key)
3, calculate hash(key) % capacity to pick the starting slot (the table is a linear-probing hash table with num_of_entry * 1.3 slots)
4, if that slot is unused, put the entry there.
5, otherwise, step forward one slot at a time to find another empty slot.
6, as README.md says: "As soon as we reach a slot with a smaller displacement than our own, we shift the following slots up until the first empty slot one step and insert our own element."

If the slot we step onto has a smaller displacement than our own, we save that slot's hash and address, put our k/v into the slot, and continue stepping to the next slot with the saved hash and address.

According to the algorithm, we can stop stepping as soon as we find a slot whose displacement is smaller than ours, because our entry cannot be placed any further along than that position.

I guess the goal is to reduce the displacement of every key, so that the displacements are balanced. Am I right?
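The steps above can be sketched in a few lines of Python. This is only an illustration of the rule quoted from README.md, not Sparkey's C implementation: the in-memory list of slots and the names `insert` and `displacement` are hypothetical, and it assumes the table always has at least one empty slot (which the 1.3x capacity is meant to ensure).

```python
EMPTY = None  # marker for an unused slot (illustrative choice)


def displacement(slot_index, key_hash, capacity):
    """Distance from a key's home slot to the slot it actually occupies."""
    return (slot_index - key_hash % capacity) % capacity


def insert(table, key_hash, address):
    """Insert (key_hash, address), shifting entries as the quoted rule describes.

    Assumes the table has at least one empty slot; otherwise this would loop.
    """
    capacity = len(table)
    pos = key_hash % capacity
    our_disp = 0
    # Step forward until we hit an empty slot or an entry that is
    # less displaced than we currently are.
    while table[pos] is not EMPTY:
        if displacement(pos, table[pos][0], capacity) < our_disp:
            break
        pos = (pos + 1) % capacity
        our_disp += 1
    # Find the first empty slot at or after pos ...
    end = pos
    while table[end] is not EMPTY:
        end = (end + 1) % capacity
    # ... and shift everything in between up by one step.
    while end != pos:
        prev = (end - 1) % capacity
        table[end] = table[prev]
        end = prev
    table[pos] = (key_hash, address)


table = [EMPTY] * 8
for h in (0, 8, 1, 16):  # home slots: 0, 0, 1, 0
    insert(table, h, address=100 + h)

print([entry[0] if entry else None for entry in table])
# → [0, 8, 16, 1, None, None, None, None]
```

In this small run the last key (hash 16) lands in slot 2 and pushes hash 1 to slot 3, so both end up with displacement 2. Appending at the first free slot instead would have left hash 16 with displacement 3.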


spkrka commented on July 25, 2024

Yes, that's correct. The reordering of hash entries is to balance the displacements, by minimizing the maximum. This does nothing to improve the average lookup, but the worst case lookup will be better. And by knowing the maximum displacement for all of the hash table, we know when to abort on a key-miss.
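To illustrate the key-miss point, here is a hedged Python sketch of a lookup that gives up once it has probed past the table-wide maximum displacement. The function name `lookup`, the tuple layout, and the hand-built table are assumptions for illustration only; a real lookup would also have to compare the stored key, since two keys can share a hash.

```python
def lookup(table, key_hash, max_displacement):
    """Return the stored address for key_hash, or None on a miss.

    Assumes slots are never cleared once filled, so an empty slot
    also proves that the key is absent.
    """
    capacity = len(table)
    home = key_hash % capacity
    for d in range(max_displacement + 1):
        entry = table[(home + d) % capacity]
        if entry is None:
            return None  # hit a gap: the key cannot be further along
        if entry[0] == key_hash:
            return entry[1]
    # No entry in the table is displaced further than max_displacement,
    # so there is no point in probing any more slots.
    return None


# Hand-built table of (hash, address) pairs with capacity 8.
# The largest displacement of any entry here is 2.
table = [(0, 100), (8, 108), (16, 116), (1, 101), None, None, None, None]

print(lookup(table, 1, max_displacement=2))   # → 101
print(lookup(table, 24, max_displacement=2))  # → None
```

The miss for hash 24 stops after three probes (slots 0 to 2) even though slot 3 is also occupied, which is the early abort described above.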


liangdong commented on July 25, 2024

Thank you very, very much, you are so enthusiastic ^ ^. And the algorithm is so ingenious too ~


rohansingh commented on July 25, 2024

Renamed to better reflect actual issue.


nresare commented on July 25, 2024

Given that persistent data will be read in block-sized chunks (something like 512 bytes), the likelihood of a displaced hash entry resulting in more than one read request is very small. I think that hash entry reordering might have a complexity cost that is higher than the performance benefit, but now that we have it we might as well keep it :)


spkrka commented on July 25, 2024

It's actually the same complexity cost compared to just adding at the nearest free slot. There are some extra memory writes, but it's private memory that should be in cache already, so that should not be a big cost.

The average displacement (which you can get by running "sparkey info" on a file) is really low since we have 30% extra hash capacity; I think it evaluates to slightly less than 2. I've seen maximum displacements up at around 50 slots, which would mean 800 bytes away - but then again, that's the extreme worst case.
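As a rough back-of-the-envelope check of these numbers, the Python sketch below counts how many read blocks a probe sequence can touch. It assumes 16 bytes per slot (implied by 50 slots being 800 bytes) and the approximate 512-byte block size mentioned earlier in the thread; `blocks_touched` is a hypothetical helper, not part of Sparkey.

```python
SLOT_BYTES = 800 // 50  # 16 bytes per slot, implied by the figures above
BLOCK_BYTES = 512       # rough block size mentioned earlier in the thread


def blocks_touched(start_offset, displacement):
    """Number of blocks covered when probing from the home slot at
    start_offset (in bytes) through a slot `displacement` steps away."""
    first = start_offset // BLOCK_BYTES
    last_byte = start_offset + (displacement + 1) * SLOT_BYTES - 1
    return last_byte // BLOCK_BYTES - first + 1


print(blocks_touched(0, 2))    # → 1  (typical case, home slot at block start)
print(blocks_touched(496, 2))  # → 2  (home slot is the last one in its block)
print(blocks_touched(0, 50))   # → 2  (extreme case of about 50 slots)
```

Under these assumptions a typical lookup stays within one block unless the home slot sits right at a block boundary, and even the extreme case spans only two.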


liangdong commented on July 25, 2024

Thanks again for sharing, it helps a lot.

From: Kristofer Karlsson [mailto:[email protected]]
Sent: September 3, 2013, 21:26
To: spotify/sparkey
Cc: Liang,Dong(Client-RD)
Subject: Re: [sparkey] Improve hash algorithm description in README (#4)



spkrka commented on July 25, 2024

I added a more visual example of the hash algorithm now - if you think it helps explain it I could close the issue.


liangdong commented on July 25, 2024

Hi, I have read through the example. Your description is easy to understand, and I think it may help other people who are interested in your project, like me. Thanks again for your enthusiastic reply :)

