GithubHelp home page GithubHelp logo

Ask about hash expansion about nudb HOT 7 OPEN

KeeProMise avatar KeeProMise commented on August 16, 2024
Ask about hash expansion

from nudb.

Comments (7)

vinniefalco avatar vinniefalco commented on August 16, 2024

I'm not sure I understand the question. NuDB only works with SSD drives, and the performance is near the theroetical maximum for what is possible with a single write queue. Are you asking me if the overall performance of the library is affected by the performance of the SSD drive? Obviously yes, so I am maybe not understanding the question fully.

from nudb.

KeeProMise avatar KeeProMise commented on August 16, 2024

I'm not sure I understand the question. NuDB only works with SSD drives, and the performance is near the theroetical maximum for what is possible with a single write queue. Are you asking me if the overall performance of the library is affected by the performance of the SSD drive? Obviously yes, so I am maybe not understanding the question fully.

I know that you use linear hash to solve hash table expansion. Do you directly use the native linear hash expansion algorithm? In my understanding, the expansion of the native linear hash table will cause the data in some slots to be moved to the new slot. I think this process will involve random io. Have you optimized this process?

from nudb.

vinniefalco avatar vinniefalco commented on August 16, 2024

Do you directly use the native linear hash expansion algorithm?

No. The principle is the same, but since NuDB is block/bucket based, every expansion of the hash table adds an entire block of slots and not just one. NuDB is very fast. You should run your own benchmarks to confirm that the performance meets your requirements.

There's a benchmark program that compares NuDB to other key/value stores:
https://github.com/CPPAlliance/NuDB/tree/master/bench

from nudb.

KeeProMise avatar KeeProMise commented on August 16, 2024

Do you directly use the native linear hash expansion algorithm?

No. The principle is the same, but since NuDB is block/bucket based, every expansion of the hash table adds an entire block of slots and not just one. NuDB is very fast. You should run your own benchmarks to confirm that the performance meets your requirements.

There's a benchmark program that compares NuDB to other key/value stores: https://github.com/CPPAlliance/NuDB/tree/master/bench

Thank you for your reply, I would also like to ask, is your key a fixed length? How many bits do you map the key to, and what is the hash algorithm for the key? What is the approximate number of keys you support? What is the approximate conflict rate?

from nudb.

vinniefalco avatar vinniefalco commented on August 16, 2024

is your key a fixed length?

Yes. You set the key size when the database is created.

How many bits do you map the key to

64 bits

what is the hash algorithm for the key

The default algorithm uses xxhasher which is fine for general purpose use. But this can be changed, if you can take advantage of how your keys are distributed.

What is the approximate number of keys you support

Because of the birthday problem, you can insert 2^32 (over 4 billion) keys before performance will begin to degrade. But since a database can be trivially split up (by dividing the key space by a power of two) the number of keys is essentially unlimited. NuDB is in production use with databases that are over 13 terabytes in size.

What is the approximate conflict rate?

The default settings aim for 50% occupancy in buckets. Less than 1% of buckets require two blocks on disk ("spills").

Remember that each bucket is designed to hold many keys not just one. And the cost for I/O is much higher than the cost for in-memory lookup.

from nudb.

KeeProMise avatar KeeProMise commented on August 16, 2024

Thank you very much for your patient answer. At present, I want to design a kv database, the key is a string, and the value is a list, which is used to store some integer values. Currently, I need to support adding kv operations and adding numbers to a list of a key. And want to get the list of a key as quickly as possible, do you mind? At present, my idea is to use linear hash to solve the expansion of the hash table when adding keys, but adding an integer value to a key to the list may cause the value of a key to be distributed in multiple disk blocks, which requires multiple disk io to obtain the complete value, so the read performance is poor.

from nudb.

vinniefalco avatar vinniefalco commented on August 16, 2024

NuDB is not going to work well for you, because you intend to change the value of already-inserted entries. Modification and deletion of existing values is not possible in NuDB.

from nudb.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.