tessil / array-hash

C++ implementation of a fast and memory efficient hash map and hash set specialized for strings

License: MIT License

CMake 2.24% C++ 97.76%
c-plus-plus cpp data-structures hash-map hash-table header-only

array-hash's Issues

array_set::deserialize() creates an undesired new array_set(0)

I have been experimenting with the new (de)serialization code for this container; thanks for putting it in. (I have actually been using Boost to serialize it, which could be smoother, but nonetheless.)

I have a class that holds one of these sets, and I do not want deserialize() to return a new, separate object.
At the moment the code is as follows:

template<class Deserializer>
static array_set deserialize(Deserializer& deserializer, bool hash_compatible = false) {
    array_set set(0);
    set.m_ht.deserialize(deserializer, hash_compatible);

    return set;
}

I added the following which seems to work for my purposes (and changed the currently existing static version to be called static_deserialize()):

template<class Deserializer>
void deserialize(Deserializer& deserializer, bool hash_compatible = false) {
    assert(m_ht.size() == 0);  // only meant to be called on an empty set
    m_ht.deserialize(deserializer, hash_compatible);
}

There may be different or better ways to achieve this, but I would be pleased if there were at least some way to avoid the allocation made by the current static version. Regards... -k
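
To make the two shapes discussed above concrete, here is a minimal sketch that puts a factory-style and an in-place deserialize side by side. Everything in it is hypothetical: `toy_set`, `deserialize_into`, `vector_deserializer` and its `read_size()`/`read_string()` methods are illustrative names, not part of tsl::array_set or its Deserializer interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a string set; NOT tsl::array_set.
class toy_set {
public:
    // Factory form: builds a fresh object and returns it by value.
    template <class Deserializer>
    static toy_set deserialize(Deserializer& d) {
        toy_set set;
        set.deserialize_into(d);
        return set;
    }

    // In-place form: fills an existing object, which must be empty.
    template <class Deserializer>
    void deserialize_into(Deserializer& d) {
        assert(m_items.empty());
        const std::size_t n = d.read_size();
        m_items.reserve(n);
        for (std::size_t i = 0; i < n; ++i) {
            m_items.push_back(d.read_string());
        }
    }

    std::size_t size() const { return m_items.size(); }

    bool contains(const std::string& key) const {
        return std::find(m_items.begin(), m_items.end(), key) != m_items.end();
    }

private:
    std::vector<std::string> m_items;
};

// Hypothetical deserializer that replays strings held in memory.
class vector_deserializer {
public:
    explicit vector_deserializer(std::vector<std::string> data)
        : m_data(std::move(data)), m_pos(0) {}

    std::size_t read_size() { return m_data.size(); }
    std::string read_string() { return m_data[m_pos++]; }

private:
    std::vector<std::string> m_data;
    std::size_t m_pos;
};
```

With a factory-only interface, an owning class would typically move-assign the returned object into its member; the in-place form skips that extra temporary, at the cost of requiring the target to be empty.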

CRC32 as a hash function option?

CRC32 is nice for speed when we can use _mm_crc32_u*. I was curious if benchmarking had been done with a function like the following:

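#include <nmmintrin.h>  /* SSE4.2 intrinsics: _mm_crc32_u8/u16/u32/u64 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static inline size_t CRCHash(const char *s, size_t len) {
	uint32_t h = 5183;
	size_t curr = len;  /* bytes still to hash; size_t avoids truncating long keys */
	/* Load each chunk with memcpy instead of casting the pointer, so
	   unaligned keys are read without undefined behavior. */
	while (curr >= 8) {
		uint64_t chunk;
		memcpy(&chunk, s + len - curr, 8);
		h = (uint32_t)_mm_crc32_u64((uint64_t)h, chunk);
		curr -= 8;
	}
	if (curr >= 4) {
		uint32_t bits;
		memcpy(&bits, s + len - curr, 4);
		h = _mm_crc32_u32(h, bits);
		curr -= 4;
	}
	if (curr >= 2) {
		uint16_t bits;
		memcpy(&bits, s + len - curr, 2);
		h = _mm_crc32_u16(h, bits);
		curr -= 2;
	}
	if (curr >= 1) {
		h = _mm_crc32_u8(h, (unsigned char)s[len - 1]);
	}
	return (size_t)h;
}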
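
For benchmarking or testing a function like this, a portable reference can be handy. The sketch below is a bit-at-a-time software CRC32C (Castagnoli polynomial, reflected form 0x82F63B78), which is the checksum the SSE4.2 crc32 instruction computes. On a little-endian machine, feeding the bytes one at a time should give the same value as the chunked intrinsic calls, though that equivalence is an assumption worth verifying on the target platform. The names `crc32c_byte`, `crc32c_update` and `crc_hash_reference` are made up for this example; the seed 5183 is taken from the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One CRC32C update step for a single byte, with no initial or final
// inversion (the caller chooses the starting value).
inline std::uint32_t crc32c_byte(std::uint32_t crc, unsigned char byte) {
    crc ^= byte;
    for (int bit = 0; bit < 8; ++bit) {
        // If the low bit is set, shift and fold in the reflected polynomial.
        crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

// Feed a whole buffer through the byte-wise update.
inline std::uint32_t crc32c_update(std::uint32_t crc, const char* data,
                                   std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        crc = crc32c_byte(crc, static_cast<unsigned char>(data[i]));
    }
    return crc;
}

// Portable reference hash: same seed as the snippet above, slow but
// usable anywhere, for cross-checking a hardware version.
inline std::size_t crc_hash_reference(const char* s, std::size_t len) {
    return static_cast<std::size_t>(crc32c_update(5183u, s, len));
}
```

The reference is far slower than the intrinsic path, so it only serves as a correctness baseline, not as a benchmark candidate.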

vcpkg?

Hi!
Saw your awesome libs on vcpkg. Would it be possible to include this as well? :)

Custom allocators

It looks like a significant share of the run time can be spent in malloc and realloc. It would be awesome to be able to pass in custom allocators to speed those up.
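
As an illustration of what a custom allocator could look like, here is a minimal bump-pointer arena wrapped in a standard-style allocator and used with std::vector. This is only a sketch of the general technique: `arena` and `arena_allocator` are hypothetical names, and nothing here reflects an allocator parameter in array-hash itself.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical fixed-capacity arena: allocation just bumps an offset,
// and memory is released all at once when the arena is destroyed.
class arena {
public:
    explicit arena(std::size_t capacity) : m_buf(capacity), m_used(0) {}

    void* allocate(std::size_t bytes, std::size_t align) {
        // Round the current offset up to the requested alignment.
        const std::size_t start = (m_used + align - 1) / align * align;
        if (start + bytes > m_buf.size()) throw std::bad_alloc();
        m_used = start + bytes;
        return m_buf.data() + start;
    }

    std::size_t used() const { return m_used; }

private:
    std::vector<unsigned char> m_buf;
    std::size_t m_used;
};

// Minimal allocator in the standard library's shape, backed by an arena.
template <class T>
class arena_allocator {
public:
    using value_type = T;

    explicit arena_allocator(arena& a) noexcept : m_arena(&a) {}

    template <class U>
    arena_allocator(const arena_allocator<U>& other) noexcept
        : m_arena(other.resource()) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(m_arena->allocate(n * sizeof(T), alignof(T)));
    }

    // Individual frees are no-ops; the arena reclaims everything at once.
    void deallocate(T*, std::size_t) noexcept {}

    arena* resource() const noexcept { return m_arena; }

    template <class U>
    bool operator==(const arena_allocator<U>& o) const noexcept {
        return m_arena == o.resource();
    }
    template <class U>
    bool operator!=(const arena_allocator<U>& o) const noexcept {
        return m_arena != o.resource();
    }

private:
    arena* m_arena;
};
```

Because `deallocate` does nothing, memory from reallocation during growth is not reused, so an arena like this trades memory for speed and suits short-lived containers best.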

array_map<char, bool> fails to compile

Code:

tsl::array_map<char, bool> m;
m["abc"] = false;

Compiler error:

array-hash/include/tsl/array_hash.h:1230:14: error: non-const lvalue reference to type 'bool' cannot bind to a temporary of type 'std::vector<bool>::reference' (aka '__bit_reference<std::vector<bool>>')
      return this->m_values[it_find.first.value()];

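The cause is that std::vector<bool> is specialized: its operator[] returns a proxy object (std::vector<bool>::reference) rather than a bool&, so the returned value cannot bind to the non-const bool& return type.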
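
The mismatch can be reproduced with the standard library alone. The sketch below checks the reference types at compile time and shows two possible workarounds on the user side: storing a one-byte integer, or wrapping the bool in a small struct. `bool_value` and `value_at` are hypothetical names used only for this illustration; `value_at` mimics an accessor that returns T& from a std::vector<T>.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// std::vector<bool> hands out a proxy object, not a real bool&.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool>::reference is a proxy type");
// Other element types give an ordinary lvalue reference.
static_assert(std::is_same<std::vector<std::uint8_t>::reference,
                           std::uint8_t&>::value,
              "vector<uint8_t>::reference is a plain reference");

// Hypothetical wrapper so the element type is no longer exactly bool.
struct bool_value {
    bool value;
};
static_assert(std::is_same<std::vector<bool_value>::reference,
                           bool_value&>::value,
              "wrapping bool avoids the specialization");

// Stand-in for an accessor that returns T& from a std::vector<T>.
// Instantiating it with T = bool would fail to compile, for the same
// reason as the error reported above.
template <class T>
T& value_at(std::vector<T>& values, std::size_t i) {
    return values[i];
}
```

Either workaround keeps one byte per value, so the memory cost is the same as a plain bool outside of std::vector<bool>.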
