
Simple, Non-DOS-resistant seed (ahash, CLOSED)

emilk commented on June 3, 2024
Simple, Non-DOS-resistant seed

from ahash.

Comments (14)

emilk commented on June 3, 2024

Thanks for your replies @tkaitchuck !

This can be done. See my comment here: #123 (comment)

The problem you are encountering is that the Builder does not implement Default in such a case. However you can still use it.

What I would like is for ahash::RandomState::default() and ahash::RandomState::new to work even without enabling either runtime-rng or compile-time-rng.

One of the great things about ahash is that ahash::HashMap is a drop-in replacement for std::collections::HashMap, but that is not true without runtime-rng or compile-time-rng.

My library could use RandomState::generate_with, but that means replacing every #[derive(Default)] on anything containing an ahash::HashMap. I don't want to do that. If I could instead just use ahash::HashMap::default() (without runtime-rng or compile-time-rng), then my users could still opt in to DOS-resistance by enabling runtime-rng themselves.
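
To illustrate why the missing Default impl matters for #[derive(Default)], here is a minimal sketch using only the standard library. `FixedState` and `Registry` are hypothetical names for illustration and are not part of ahash; the point is only that the derive compiles as soon as the hasher builder itself implements Default.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasher;

/// Hypothetical stand-in for a fixed-seed hasher builder (not an ahash type).
/// Every `DefaultHasher::new()` starts from the same state, so no RNG is needed.
#[derive(Clone, Copy, Debug, Default)]
pub struct FixedState;

impl BuildHasher for FixedState {
    type Hasher = DefaultHasher;

    fn build_hasher(&self) -> DefaultHasher {
        DefaultHasher::new()
    }
}

/// `HashMap<K, V, S>` implements `Default` only when `S: Default`.
/// Because `FixedState: Default`, this derive needs no hand-written impl.
#[derive(Default)]
pub struct Registry {
    pub names: HashMap<String, u32, FixedState>,
}

fn main() {
    let mut registry = Registry::default();
    registry.names.insert("answer".to_string(), 42);
    assert_eq!(registry.names.get("answer"), Some(&42));
}
```

If the builder did not implement Default, the derive on `Registry` would fail to compile, which is the situation described above when neither rng feature is enabled.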

rklaehn commented on June 3, 2024

Sorry, I believe it is with_seeds (I made a typo) which is completely deterministic. with_seed (singular) will use random numbers.

You can look into the source to see that it just takes the four numbers as the seed without any processing.

Yes, I checked. Just found the difference a bit surprising. I would expect a method that is called with_seed to also be completely deterministic. That is usually what you want if you pass in an explicit seed.

I eventually expose an option to provide fixed seeds.

Thanks a lot. Going with the solution above for now, I guess...

repi commented on June 3, 2024

We would like to have this as well. We compile WebAssembly modules with ahash for use outside of the browser; they need minimal dependencies and fast compile times, and DOS-resistant hash seeds are not important at all.

notgull commented on June 3, 2024

I don't see the issue with compile-time-rng. At worst, it slightly increases compile times, in exchange for preventing worst-case quadratic behavior in certain data structures. Especially in webapps that use packages like winit, I doubt that the three relatively small packages that this feature adds are truly impactful.

tkaitchuck commented on June 3, 2024

I want to be able to use ahash without enabling either runtime-rng or compile-time-rng.

This can be done. See my comment here: #123 (comment)

compile-time-rng has the downside that it pulls in extra dependencies and increases compile times.

compile-time-rng is a build time dependency only. So there should be no deployment cost.

I would suggest that ahash can be used without both runtime-rng and compile-time-rng by just using a fixed seed.

This is already the case. The problem you are encountering is that the Builder does not implement Default in such a case. However you can still use it.

notgull commented on June 3, 2024

@emilk Here's the solution I use in my own projects.

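use ahash::RandomState;

// SEED1..SEED4 are u64 constants that this snippet does not define.
struct HashMap<K, V> {
    inner: hashbrown::HashMap<K, V, RandomState>,
}

impl<K, V> Default for HashMap<K, V> {
    fn default() -> Self {
        // `has_random` is a custom cfg flag (not defined in this snippet).
        // The path is spelled out because a bare `HashMap` here names the wrapper.
        #[cfg(has_random)]
        let inner = hashbrown::HashMap::with_hasher(RandomState::default());
        #[cfg(not(has_random))]
        let inner = hashbrown::HashMap::with_hasher(RandomState::generate_with(SEED1, SEED2, SEED3, SEED4));

        Self { inner }
    }
}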

I find it to be a robust solution that allows for a drop-in replacement.

rklaehn commented on June 3, 2024

To add to this discussion: Sometimes you want to explicitly have deterministic behaviour. E.g. when writing tests. In particular, with property based tests it is really desirable that the same seed produces the same results.

As far as I can see there is no straightforward way to get a fully deterministic ahash HashMap. Having that would be very useful. I have a case where I am using the fnv crate instead of ahash, despite our code base already depending on ahash, just because I want strictly deterministic behaviour...

schungx commented on June 3, 2024

On this issue, I have recently encountered a different problem.

The core of the prob is: ahash is too popular.

My crate would pull in ahash with default-features=false but some other dependency down the tree would also pull in ahash with std.

Since cargo would merge features, I end up always having std.

RandomState::with_seed works fine, but that problem took quite a while to diagnose.

rklaehn commented on June 3, 2024

Sorry if I am dense, but how do I generate a fully deterministic ahash hashtable using RandomState::with_seed? Got a bit lost...

notgull commented on June 3, 2024

To add to this discussion: Sometimes you want to explicitly have deterministic behaviour. E.g. when writing tests. In particular, with property based tests it is really desirable that the same seed produces the same results.

At this point you should probably expose with_seed in some way, e.g. have an alternate constructor that initializes with with_seed rather than default.

My crate would pull in ahash with default-features=false but some other dependency down the tree would also pull in ahash with std.

You should open a PR with these crates.

Sorry if I am dense, but how do I generate a fully deterministic ahash hashtable using RandomState::with_seed? Got a bit lost...

Use the with_hasher method on HashMap.
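
A minimal sketch of that suggestion, using only the standard library so that it runs without ahash. `FixedState` and `build_map` are assumptions made for illustration; with ahash, one would presumably pass a seeded RandomState to with_hasher in the same position.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

/// Stand-in deterministic builder (std only, not an ahash type).
/// It creates every hasher via `DefaultHasher::default()`, with no random keys.
type FixedState = BuildHasherDefault<DefaultHasher>;

/// Builds a map through `with_hasher` instead of `HashMap::new()`.
fn build_map(words: &[&str]) -> HashMap<String, usize, FixedState> {
    let mut map = HashMap::with_hasher(FixedState::default());
    for (index, word) in words.iter().enumerate() {
        map.insert(word.to_string(), index);
    }
    map
}

fn main() {
    let words = ["seed", "hash", "map", "rng"];
    let a = build_map(&words);
    let b = build_map(&words);

    // Same hasher state and same insertions: both maps iterate in the same order.
    let order_a: Vec<_> = a.keys().cloned().collect();
    let order_b: Vec<_> = b.keys().cloned().collect();
    assert_eq!(order_a, order_b);
}
```

This is the property that matters for reproducible tests: nothing about the map's layout depends on a per-process random value.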

rklaehn commented on June 3, 2024

Sorry if I am dense, but how do I generate a fully deterministic ahash hashtable using RandomState::with_seed?
Got a bit lost...

Use the with_hasher method on HashMap.

So this is what I came up with to make a type of AHashMap that is always deterministic. Still feels a bit like walking a minefield since seemingly innocent methods like RandomState::with_seed are introducing nondeterminism...

So with_seed mixes in randomness, but with_seeds does not?

    use ahash::RandomState;
    use std::hash::BuildHasher;

    /// Wrapper whose Default always uses the same fixed seeds.
    pub struct DeterministicHasher(RandomState);

    impl Default for DeterministicHasher {
        fn default() -> Self {
            Self(RandomState::with_seeds(0, 0, 0, 0))
        }
    }

    impl BuildHasher for DeterministicHasher {
        type Hasher = ahash::AHasher;

        fn build_hasher(&self) -> Self::Hasher {
            self.0.build_hasher()
        }
    }

    type DetAHashMap<K, V> = ahash::AHashMap<K, V, DeterministicHasher>;

schungx commented on June 3, 2024

Sorry, I believe it is with_seeds (I made a typo) which is completely deterministic. with_seed (singular) will use random numbers.

You can look into the source to see that it just takes the four numbers as the seed without any processing.

I eventually expose an option to provide fixed seeds.

tkaitchuck commented on June 3, 2024

Both with_seed and with_seeds will produce identical hashers when provided with identical seeds. Either of these can be used for a deterministic hash table.

generate_with and new both use randomness and will create different hashers each time.
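
The "identical seeds give identical hashers" property can be sketched with the standard library alone. `SeededState` below is a hypothetical type that only models the behaviour described above; it is not ahash's RandomState, and feeding the seeds in as a prefix is an illustrative choice, not a DOS-resistant keyed hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{BuildHasher, Hasher};

/// Hypothetical seeded builder (std only); models "same seeds, same hasher".
#[derive(Clone, Copy, Debug)]
pub struct SeededState {
    seeds: [u64; 2],
}

impl SeededState {
    /// No randomness is mixed in: the seeds alone decide the hasher state.
    pub const fn with_seeds(k0: u64, k1: u64) -> Self {
        Self { seeds: [k0, k1] }
    }
}

impl BuildHasher for SeededState {
    type Hasher = DefaultHasher;

    fn build_hasher(&self) -> DefaultHasher {
        // Write the seeds first so they influence every hash computed afterwards.
        let mut hasher = DefaultHasher::new();
        for seed in self.seeds {
            hasher.write_u64(seed);
        }
        hasher
    }
}

fn main() {
    let a = SeededState::with_seeds(1, 2);
    let b = SeededState::with_seeds(1, 2);

    // Two independently constructed builders agree on every key.
    assert_eq!(a.hash_one("key"), b.hash_one("key"));
}
```

A constructor that mixed in a counter or a memory address would break this equality, and that is the distinction drawn above between the seeded constructors and generate_with or new.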

schungx commented on June 3, 2024

It is probably better to mention these in the documentation as it takes a while reading through the code to figure out. Especially the differences between with_seed and with_seeds (I guess with_seeds is "lower level") are not entirely apparent.

Some form of table would be helpful, such as...

| Constructor | Dynamically random? | Consistent cross compiles? | Seeds |
| --- | --- | --- | --- |
| new | Y | | random |
| generate_with | Y | | u64 x 4 |
| with_seed | N | N | u64 + compile-time random number |
| with_seeds | N | Y | u64 x 4 |
