- she/her
Professional C++ programmer, wannabe writer, amateur game designer, Pokémon fan
Find out more at my personal website.
Port of the xxhash library to C++17.
License: BSD 2-Clause "Simplified" License
Hi,
Apologies in advance for the vague description. I am not really sure what's going on so I'll just do my best to describe my usage scenario:
Somewhere in my program I am building a std::vector<uint64_t>:
std::vector<uint64_t> hashes;
std::transform(symbols.begin() + i - size, symbols.begin() + i, std::back_inserter(hashes), [](const Symbol& sm) { return sm.CalculatedHashValue; });
hashes.push_back(s.HashValue);
auto hash = xxh::xxhash<64>(hashes);
fmt::print("hash(");
for (auto h : hashes) { fmt::print("{} ", h); }
fmt::print(") = {}\n", hash);
s.CalculatedHashValue = hash;
This outputs:
hash(123 456 0) = 14523173615704738576
I wanted to debug this result, so in main.cpp I did:
std::array<uint64_t, 3> arr { 123, 456, 0 };
std::cout << "array hash = " << xxh::xxhash<64>(arr, 0) << std::endl;
This outputs:
array hash = 13763445824703203362
As you can see there's an inconsistency even though the values are the same.
Next, I added another test in my main.cpp:
std::vector<uint64_t> vec { 123, 456, 0 };
std::cout << "vector hash = " << xxh::xxhash<64>(vec) << std::endl;
And now the weird part. After adding the above two lines, I get:
hash(123 456 0 ) = 13763445824703203362
array hash = 13763445824703203362
vector hash = 13763445824703203362
Now the hashes are consistent as they should be. Could this be a bug?
EDIT:
The problem seems to go away if I manually define the endianness before including xxhash.hpp:
#define XXH_CPU_LITTLE_ENDIAN 1
Best,
Bogdan
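Since the mismatch disappears once XXH_CPU_LITTLE_ENDIAN is defined by hand, one way to narrow this down is to confirm that both containers really expose the same bytes and to check which byte order the host has at run time. The sketch below uses only the standard library; `host_is_little_endian` and `same_bytes` are hypothetical helpers written for illustration and are not part of xxhash_cpp.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: reports the byte order the program actually runs with.
inline bool host_is_little_endian() {
    const std::uint32_t probe = 1;
    unsigned char lowest_address_byte = 0;
    std::memcpy(&lowest_address_byte, &probe, 1);
    return lowest_address_byte == 1;
}

// Hypothetical helper: true when two contiguous containers hold identical bytes.
template <typename A, typename B>
bool same_bytes(const A& a, const B& b) {
    const std::size_t size_a = a.size() * sizeof(typename A::value_type);
    const std::size_t size_b = b.size() * sizeof(typename B::value_type);
    return size_a == size_b && std::memcmp(a.data(), b.data(), size_a) == 0;
}
```

If `same_bytes(arr, vec)` holds for the array and the vector from the report, the inputs are byte-identical and any difference in the hash has to come from how the header is configured in each translation unit.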
Hi,
I am using xxhash.hpp in my static lib, where it needs to be included more or less globally in order to have access to the exposed hash types. Linkage is broken unless the following functions are made static:
942c942
< static void accumulate(uint64_t* XXH_RESTRICT acc, const uint8_t* XXH_RESTRICT input, const uint8_t* XXH_RESTRICT secret, size_t nbStripes, acc_width accWidth)
---
> void accumulate(uint64_t* XXH_RESTRICT acc, const uint8_t* XXH_RESTRICT input, const uint8_t* XXH_RESTRICT secret, size_t nbStripes, acc_width accWidth)
951c951
< static void hash_long_internal_loop(uint64_t* XXH_RESTRICT acc, const uint8_t* XXH_RESTRICT input, size_t len, const uint8_t* XXH_RESTRICT secret, size_t secretSize, acc_width accWidth)
---
> void hash_long_internal_loop(uint64_t* XXH_RESTRICT acc, const uint8_t* XXH_RESTRICT input, size_t len, const uint8_t* XXH_RESTRICT secret, size_t secretSize, acc_width accWidth)
973c973
< static uint64_t mix_2_accs(const uint64_t* XXH_RESTRICT acc, const uint8_t* XXH_RESTRICT secret)
---
> uint64_t mix_2_accs(const uint64_t* XXH_RESTRICT acc, const uint8_t* XXH_RESTRICT secret)
978c978
< static uint64_t merge_accs(const uint64_t* XXH_RESTRICT acc, const uint8_t* XXH_RESTRICT secret, uint64_t start)
---
> uint64_t merge_accs(const uint64_t* XXH_RESTRICT acc, const uint8_t* XXH_RESTRICT secret, uint64_t start)
990c990
< static void init_custom_secret(uint8_t* customSecret, uint64_t seed64)
---
> void init_custom_secret(uint8_t* customSecret, uint64_t seed64)
1128c1128
< static uint64_t mix_16b(const uint8_t* XXH_RESTRICT input, const uint8_t* XXH_RESTRICT secret, hash64_t seed64)
---
> uint64_t mix_16b(const uint8_t* XXH_RESTRICT input, const uint8_t* XXH_RESTRICT secret, hash64_t seed64)
1135c1135
< static uint128_t mix_32b(hash128_t acc, const uint8_t* input_1, const uint8_t* input_2, const uint8_t* secret, hash64_t seed)
---
> uint128_t mix_32b(hash128_t acc, const uint8_t* input_1, const uint8_t* input_2, const uint8_t* secret, hash64_t seed)
This is with gcc 9.2.
Best,
Bogdan
Compiling the current source throws a couple of warnings (g++-6).
I suggest building with -Werror on CI and taking care of the warnings.
This version is slightly behind the head version of xxHash. Are you planning to update?
The most recent release of xxhash is 0.8.0
Do you have plans to update xxhash_cpp to that version?
Changes to be incorporated are as far as I see:
I would suggest to create a tag 0.7.3 from the currently latest commit on master, i.e. 6246966 , before proceeding with implementing 0.8.0 changes.
It seems the makefile in the project root is a copy of the real makefile in the subfolder. Issuing make in the root folder results in an error as the files are not found.
In fixing this I'd suggest reorganizing the folder structure. A common way to use 3rd party libraries is by including them as a git submodule and adding their include folder to -I. With the current structure this will bring in Catch and a couple of source files, which may be a problem.
From what I understand it is enough to include xxhash.hpp into an application and it will work. None of the other files are required for consumers. Is this correct?
If so, I suggest creating a folder include in the root containing only xxhash.hpp and a folder test (or tests) containing the rest. The top-level makefile will just include/redirect/... (I'm not familiar with Makefiles TBH) the tests makefile.
Thank you :)
While the readme states that g++-5 would be supported it does not work:
xxhash.hpp:625:3: error: array must be initialized with a brace-enclosed initializer
xxhash.hpp:625:3: error: too many initializers for ‘std::array<long unsigned int, 4ul>’
It seems to be a problem with brace elision, so adding extra braces should work.
I'd suggest adding the "supported compilers" from the readme to CI.
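A minimal sketch of the extra-braces form, assuming the failing line initializes a std::array of four 64-bit values. The function name and the values are hypothetical; only the brace pattern matters.

```cpp
#include <array>
#include <cstdint>

// Hypothetical values; the point is the extra pair of braces.
// The outer braces initialize std::array itself, the inner braces initialize
// the C array it wraps, so the initializer does not rely on brace elision.
inline std::array<std::uint64_t, 4> make_demo_state() {
    std::array<std::uint64_t, 4> state = {{1, 2, 3, 4}};
    return state;
}
```

The double-brace form is also accepted by newer compilers, so it should not need a version check.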
In the example usage the constructor parameter (hash) is missing:
xxh::hash_state_t<64> hash_stream(hash)
Describe the bug
I was compiling with the x86 version of MSVC, before applying the change in #15. xxhash.hpp was included in multiple source files and I got the following linker error:
b.obj : error LNK2005: "void __cdecl xxh::intrin::prefetch(void const *)" (?prefetch@intrin@xxh@@YAXPBX@Z) already defined in a.obj
a.exe : fatal error LNK1169: one or more multiply defined symbols found
To Reproduce
The sample code & reproduction steps are documented in this gist.
Expected behavior
The code should build successfully.
Desktop (please complete the following information):
Additional context
#15 fixes this issue for x86 MSVC. However, this issue also arises when XXH_NO_PREFETCH is used, or when a compiler other than MSVC or g++ (or clang++, which defines __GNUC__ and so masquerades as g++) is used. It seems to me that all the prefetch implementations should be marked as inline to denote that there may be multiple definitions of the function in different translation units.
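To illustrate the suggestion, here is a sketch of a header-style prefetch wrapper where every branch is marked inline. The `demo` namespace and `sum_with_prefetch` are hypothetical and this is not the library's code; `__builtin_prefetch` is the GCC/clang builtin, and the fallback branch is a no-op.

```cpp
#include <cstddef>
#include <cstdint>

namespace demo {

// Every definition is `inline`, so the header can be included in several
// translation units without "multiply defined symbol" linker errors.
#if defined(__GNUC__)  // also defined by clang
inline void prefetch(const void* ptr) { __builtin_prefetch(ptr); }
#else
inline void prefetch(const void*) {}  // no-op fallback for other compilers
#endif

// Hypothetical caller: hints the next element while summing the current one.
inline std::uint64_t sum_with_prefetch(const std::uint64_t* data, std::size_t count) {
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (i + 1 < count) {
            prefetch(&data[i + 1]);
        }
        total += data[i];
    }
    return total;
}

}  // namespace demo
```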
Hi,
I'm attempting to implement xxhash_cpp in a project, where unsurprisingly, I need to generate the xxhash64 for a file and check it against a known hash string. But I've run into the problem that my output hash is reversed.
Output from QuickHash: 9FD684E536C4C0B9
Output from xxhsum: 9fd684e536c4c0b9
Output from my implementation B9C0C436E584D69F
This seems like an endianness problem, but I can't see any option for setting a specific endianness.
What would be the best strategy to do an equality check for the known hash string?
hash64_t object and compare with the hash object created from the file
My current implementation:
std::string filename = "/my_input_file";
std::ifstream filestream(filename, std::ifstream::in | std::istream::binary);
xxh::hash_state64_t hash_stream;
std::vector<uint8_t> contents((std::istreambuf_iterator<char>(filestream)), std::istreambuf_iterator<char>());
hash_stream.update(contents);
xxh::hash64_t final_hash = hash_stream.digest();
std::cout << byte_print(final_hash) <<std::endl;
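The reversed output is what you would see if byte_print dumps the object's bytes in memory order on a little-endian machine. One possible approach, assuming hash64_t is a plain 64-bit unsigned integer, is to format and parse the numeric value instead of its bytes. `to_hex64` and `from_hex64` below are hypothetical helpers that use only the standard library.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats the numeric value with the most significant
// digit first, independent of the host byte order.
inline std::string to_hex64(std::uint64_t value) {
    char buffer[17] = {};
    std::snprintf(buffer, sizeof buffer, "%016llx",
                  static_cast<unsigned long long>(value));
    return std::string(buffer);
}

// Hypothetical helper: parses a hex string (upper or lower case) into a number.
inline std::uint64_t from_hex64(const std::string& text) {
    return std::stoull(text, nullptr, 16);
}
```

With these, the known string can be compared numerically, for example `from_hex64("9FD684E536C4C0B9") == final_hash`, which sidesteps the case difference between QuickHash and xxhsum as well.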
Hello,
When trying to use this library on macOS with clang on arm64, I get the following error:
/Applications/Xcode_14.2.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/14.0.0/include/immintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
The issue seems to be caused simply by importing immintrin.h.
Is there a known workaround for this problem?
Best
Peter
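One possible direction, not confirmed as the project's fix, is to include the x86 intrinsics header only when the compiler targets x86. A sketch using the compilers' predefined architecture macros follows; `DEMO_HAS_X86_INTRINSICS` and `demo_has_x86_intrinsics` are hypothetical names.

```cpp
// Hypothetical macro name; the architecture macros are the compilers' own
// (__x86_64__/__i386__ for GCC and clang, _M_X64/_M_IX86 for MSVC).
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#  include <immintrin.h>
#  define DEMO_HAS_X86_INTRINSICS 1
#else
#  define DEMO_HAS_X86_INTRINSICS 0
#endif

// Lets ordinary code branch on the result without more preprocessor checks.
constexpr bool demo_has_x86_intrinsics() { return DEMO_HAS_X86_INTRINSICS == 1; }
```

On arm64 the include is skipped entirely, so any code path that uses x86 intrinsics would need to sit behind the same guard.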
How do I build it?
Some minimal (example) usage instructions would be great too
Comments for issue #17 say that from_canonical() was added, but after extracting the 0.8.1 tarball fgrep cannot find any such function.
Would be great :) :
See https://xxhash.com/
At least my GCC 7.3.0 environment needs an #include <cstring> to resolve memcpy.
See also https://en.cppreference.com/w/cpp/string/byte/memcpy
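For illustration, a small sketch of a memcpy-based load that includes <cstring> directly rather than relying on another header to pull it in. `load_u64` is a hypothetical helper, not a function from xxhash.hpp.

```cpp
#include <cstdint>
#include <cstring>  // declares std::memcpy; not guaranteed to arrive transitively

// Hypothetical helper: reads 8 bytes into a 64-bit integer in host byte order.
// memcpy avoids the alignment and aliasing problems of a pointer cast.
inline std::uint64_t load_u64(const unsigned char* bytes) {
    std::uint64_t value = 0;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```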
I'm on a project with C++14 requirement. Is it possible to adjust the implementation to be C++14 compatible?
From what I have seen, only if constexpr is used from C++17. In one case (swap32/64) it can be replaced by a simple overload instead of 2 names, and the other can be solved by extracting a subfunction (potentially merging code from here and [here](https://github.com/RedSpah/xxhash_cpp/blob/92cf55f21d341520137e4a7eb155290d390bdbff/xxhash/xxhash.hpp#L589), although I'm not sure).
This would make it available to a wider audience especially as this seems the only C++ implementation. Thanks for that! 👍
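A sketch of the overload idea for the byte-swap case, assuming the goal is simply to pick the 32-bit or 64-bit swap from the argument type without if constexpr. `byte_swap` is a hypothetical name; the code is plain shifts and masks and also compiles as C++14.

```cpp
#include <cstdint>

// 32-bit overload: reverses the four bytes of x.
constexpr std::uint32_t byte_swap(std::uint32_t x) {
    return ((x << 24) & 0xff000000u) |
           ((x << 8)  & 0x00ff0000u) |
           ((x >> 8)  & 0x0000ff00u) |
           ((x >> 24) & 0x000000ffu);
}

// 64-bit overload: reverses the eight bytes of x.
constexpr std::uint64_t byte_swap(std::uint64_t x) {
    return ((x << 56) & 0xff00000000000000ULL) |
           ((x << 40) & 0x00ff000000000000ULL) |
           ((x << 24) & 0x0000ff0000000000ULL) |
           ((x << 8)  & 0x000000ff00000000ULL) |
           ((x >> 8)  & 0x00000000ff000000ULL) |
           ((x >> 24) & 0x0000000000ff0000ULL) |
           ((x >> 40) & 0x000000000000ff00ULL) |
           ((x >> 56) & 0x00000000000000ffULL);
}
```

Callers should pass a value that already has the exact type (for example `std::uint64_t{...}`), since a literal of another integer type can make the overload call ambiguous.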
Compiling this function, which worked with version 0.7.3, fails with 0.8.1. to_string() is not given here, but what it does is reasonably obvious. Note the error is not in this function, but in the header:
std::string
xxhash3(std::string_view data)
{
#define checksum_bits 128
static_assert(checksum_bits == 64 || checksum_bits == 128);
xxh::hash3_state_t<checksum_bits> state;
state.update(data.data(), data.size());
// convert checksum to canonical byte order
xxh::canonical_t<checksum_bits> const canonical{state.digest()};
auto const hash{canonical.get_hash()};
#if checksum_bits == 128
return to_string(hash.low64, hash.high64);
#else
return to_string(hash);
#endif
}
g++ 8.3 in C++17 mode:
xxhash/0.8.1/include/xxhash.hpp: In function ‘xxh::uint128_t xxh::intrin::bit_ops::mult64to128(uint64_t, uint64_t)’:
xxhash/0.8.1/include/xxhash.hpp:290:10: error: request for member ‘low64’ in ‘r128’, which is of non-class type ‘__int128 unsigned’
r128.low64 = (uint64_t)(product);
^~~~~
xxhash/0.8.1/include/xxhash.hpp:291:10: error: request for member ‘high64’ in ‘r128’, which is of non-class type ‘__int128 unsigned’
r128.high64 = (uint64_t)(product >> 64);
^~~~~~
xxhash/0.8.1/include/xxhash.hpp:292:12: error: could not convert ‘r128’ from ‘__int128 unsigned’ to ‘xxh::uint128_t’ {aka ‘xxh::typedefs::uint128_t’}
return r128;
^~~~
xxhash/0.8.1/include/xxhash.hpp: At global scope:
xxhash/0.8.1/include/xxhash.hpp:915:3: error: explicit template specialization cannot have a storage class
static inline uint_t<32> avalanche<32>(uint_t<32> hash) {
^~~~~~
xxhash/0.8.1/include/xxhash.hpp:925:3: error: explicit template specialization cannot have a storage class
static inline uint_t<64> avalanche<64>(uint_t<64> hash) {
^~~~~~