fastfilter / xor_singleheader
Header-only binary fuse and xor filter library
License: Apache License 2.0
For extra performance, this loop can be unrolled...
for (size_t i = 0; i < size; i++) {
  uint64_t key = keys[i];
  xor_hashes_t hs = xor8_get_h0_h1_h2(key, filter);
  sets[hs.h0].xormask ^= hs.h;
  sets[hs.h0].count++;
  sets[hs.h1].xormask ^= hs.h;
  sets[hs.h1].count++;
  sets[hs.h2].xormask ^= hs.h;
  sets[hs.h2].count++;
}
Do the computations first and then do the memory accesses.
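A minimal sketch of what "computations first, memory accesses second" could look like, unrolled four ways. The types and the hash function here (toy_set_t, toy_hashes_t, toy_get_hashes) are stand-ins invented for illustration, not the library's definitions, and whether the unrolled form is actually faster would need to be measured.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in types: NOT the library's definitions. */
typedef struct { uint64_t xormask; uint32_t count; } toy_set_t;
typedef struct { uint64_t h; uint32_t h0, h1, h2; } toy_hashes_t;

/* Hypothetical hash: one slot in each of three blocks (block_length > 0). */
static toy_hashes_t toy_get_hashes(uint64_t key, uint32_t block_length) {
  toy_hashes_t hs;
  uint64_t h = key * UINT64_C(0x9E3779B97F4A7C15);
  h ^= h >> 32;
  hs.h = h;
  hs.h0 = (uint32_t)(h % block_length);
  hs.h1 = (uint32_t)((h >> 21) % block_length) + block_length;
  hs.h2 = (uint32_t)((h >> 42) % block_length) + 2 * block_length;
  return hs;
}

/* The three read-modify-write memory accesses for one key. */
static void toy_apply(toy_set_t *sets, toy_hashes_t hs) {
  sets[hs.h0].xormask ^= hs.h; sets[hs.h0].count++;
  sets[hs.h1].xormask ^= hs.h; sets[hs.h1].count++;
  sets[hs.h2].xormask ^= hs.h; sets[hs.h2].count++;
}

/* Reference version: one key per iteration. */
static void toy_fill_simple(toy_set_t *sets, const uint64_t *keys,
                            size_t size, uint32_t block_length) {
  for (size_t i = 0; i < size; i++) {
    toy_apply(sets, toy_get_hashes(keys[i], block_length));
  }
}

/* Unrolled version: hash four keys first, then touch memory. */
static void toy_fill_unrolled(toy_set_t *sets, const uint64_t *keys,
                              size_t size, uint32_t block_length) {
  size_t i = 0;
  for (; i + 4 <= size; i += 4) {
    toy_hashes_t a = toy_get_hashes(keys[i], block_length);
    toy_hashes_t b = toy_get_hashes(keys[i + 1], block_length);
    toy_hashes_t c = toy_get_hashes(keys[i + 2], block_length);
    toy_hashes_t d = toy_get_hashes(keys[i + 3], block_length);
    toy_apply(sets, a);
    toy_apply(sets, b);
    toy_apply(sets, c);
    toy_apply(sets, d);
  }
  for (; i < size; i++) { /* leftover keys when size is not a multiple of 4 */
    toy_apply(sets, toy_get_hashes(keys[i], block_length));
  }
}
```

Because xor and increment are order-independent, both versions leave the sets array in the same state.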
While implementing Fuse8 in rust, I had the following error:
Benchmarking fuse8_populate: Collecting 100 samples in estimated 74.312 s (100 iterations)
thread 'main' panicked at 'index out of bounds: the len is 10000001 but the index is 10000001', /home/prataprc/myworld/devrs/xorfilter/src/fuse8.rs:345:23
stack backtrace:
0: 0x563ed8b922c0 - std::backtrace_rs::backtrace::libunwind::trace::hdcf4f90f85129e83
at /rustc/5c029265465301fe9cb3960ce2a5da6c99b8dcf2/library/std/src/../../backtrace/src/backtrace/libunwind.rs:90:5
...
18: 0x563ed89cd17d - <alloc::vec::Vec<T,A> as core::ops::index::Index<I>>::index::h92de800ab79df56c
at /home/prataprc/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/vec/mod.rs:2427:9
19: 0x563ed89cd17d - xorfilter::fuse8::Fuse8<H>::build_keys::h29347249efad0362
at /home/prataprc/myworld/devrs/xorfilter/src/fuse8.rs:345:23
20: 0x563ed89d44d8 - xorfilter::fuse8::Fuse8<H>::build::hbcd0e61df247a22f
at /home/prataprc/myworld/devrs/xorfilter/src/fuse8.rs:294:9
21: 0x563ed89d44d8 - fuse8_bench::bench_fuse8_populate::{{closure}}::{{closure}}::hd59d7c6b8d7c2c10
at /home/prataprc/myworld/devrs/xorfilter/benches/fuse8_bench.rs:58:13
22: 0x563ed89d44d8 - criterion::Bencher<M>::iter::h64d37255ed521f5a
I got this while benchmarking the Fuse8 filter. I have not been able to reproduce it, since I generate the keys randomly.
The question is: after startPos[segment_index]++; can the value at segment_index go beyond size+1 (where size is the number of keys)?
xor_singleheader/include/binaryfusefilter.h
Line 261 in 9c68073
PS: In my benchmark code I am now printing the seed value, so that if it happens again I should be able to reproduce it.
Hello - super interesting work on the binary fuse filters. I'm enjoying them a lot. I noticed the readme says:
The construction of a binary fuse filter is fast but it needs a fair amount of temporary memory: plan for about 24 bytes of memory per set entry. It is possible to construct a binary fuse filter with almost no temporary memory, but the construction is then somewhat slower.
I'm exploring the application of binary fuse filters in search engines (using them for n-gram set lookups). I was previously investigating moving the memory allocations in the population code to mmapped memory as a means of reducing the physical memory required to populate filters, but hearing that it may be possible to accept a slower construction time in exchange for almost no temporary memory is extremely interesting to me.
I am curious if anyone has had more in-depth thoughts about how this would be approached, or tried an implementation of this? If not I will likely try my hand at it, just figured I'd ask in case anyone already had and I might be spared a few hours :)
For large inputs (that exceed the CPU cache), we should be more careful with cache and memory usage during the construction. It should be possible to gain 10% to 20% in performance.
In the paper, I read that the fingerprint function maps each possible value from the universe to a word value. I wanted to see how that is implemented here and found:
xor_singleheader/include/binaryfusefilter.h
Lines 394 to 396 in 177cf03
What is the reasoning for this? I am a bit confused at how we arrived at this function.
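The referenced lines are not reproduced here, so the following is only a guess at the general idea, not the library's code. One common way to turn a 64-bit hash into a short fingerprint is to xor-fold the high half onto the low half before truncating, so that every bit of the hash can influence the fingerprint. The name toy_fingerprint8 is made up for this sketch.

```c
#include <stdint.h>

/* Hypothetical 8-bit fingerprint: fold the upper 32 bits onto the lower
 * 32 bits, then keep the lowest byte. */
static inline uint8_t toy_fingerprint8(uint64_t hash) {
  return (uint8_t)(hash ^ (hash >> 32));
}
```

If the hash is already well mixed, truncation alone would give a similar distribution; the fold is a cheap safeguard when the low bits have also been used to pick array positions.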
Thanks!
Hello,
I've built Erlang bindings for the binary fuse filter: https://github.com/mpope9/efuse_filter and things look very positive performance-wise when benchmarking. I was curious how the binary fuse filter's false positive rate compares to that of xor16.
Porting this code to Go ought to be quite easy.
It is not necessary to port everything; porting the xor8_populate and xor8_allocate functions would be a good start.
The C++ wrapper example should include binaryfusefilter.h instead.
Line 77 in 1f7e18b
The first parameter of AddAll() cannot be const, because it may be edited by function binary_fuse_sort_and_remove_dup().
Lines 90 to 92 in 1f7e18b
The member variable name was already changed from fingerprints to Fingerprints in commit b8268c9.
Line 100 in 1f7e18b
Has there been profiling done to estimate the probability of filter construction failure? I'm not an expert so I'd thought to ask :)
If I allocate enough space at the beginning, can I add elements dynamically? I don't see an API for dynamically adding elements
Both are quite rare use cases in my view. Xor maps would have a get method, instead of a contain method, to retrieve the stored data. Supporting these features would probably require rewriting most of the code in the form of pre-processor statements. The advantage is that it would reduce code duplication. It would be harder to read and debug, but running the pre-processor could be a separate build phase, which generates the source code for xor_8, xor_16, xor_32, as well as xor_map_8, xor_map_16, xor_map_32, xor_map_64.
Do I understand your work correctly that it is not possible to combine multiple filters into one?
We have a partitioned vector that is prepared in multiple threads. We currently build Bloom filters and "and" them together at the end.
Is something similar possible with the xor or binary fuse filters?
Line 91 in 914c73a
return binary_fuse8_populate(data + start, end - start, &filter);
xor_singleheader/include/binaryfusefilter.h
Lines 20 to 30 in 1f7e18b
In this function, why does j start from 0?
For example, if keys[0] < keys[1], this function will assign the value of keys[1] to keys[0], and keys[0] will no longer be available.
Why return j+1;?
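The referenced lines are not reproduced here, so this is not the library's code. It is a minimal sketch of an in-place sort-and-deduplicate routine where the first element is always kept: the write cursor j starts at 1 and the function returns j, the number of unique keys. The names toy_cmp_u64 and toy_sort_and_remove_dup are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison for qsort that avoids overflow on subtraction. */
static int toy_cmp_u64(const void *a, const void *b) {
  uint64_t x = *(const uint64_t *)a;
  uint64_t y = *(const uint64_t *)b;
  return (x > y) - (x < y);
}

/* Sorts keys in place, removes duplicates, and returns the unique count.
 * The unique keys end up in keys[0 .. result-1]. */
static size_t toy_sort_and_remove_dup(uint64_t *keys, size_t length) {
  if (length == 0) { return 0; }
  qsort(keys, length, sizeof(uint64_t), toy_cmp_u64);
  size_t j = 1; /* keys[0] is always kept, so writing starts at index 1 */
  for (size_t i = 1; i < length; i++) {
    if (keys[i] != keys[i - 1]) { /* first occurrence of a new value */
      keys[j] = keys[i];
      j++;
    }
  }
  return j;
}
```

Since j never exceeds i, the write to keys[j] cannot clobber a value that has not been read yet. A variant that starts with j = 0, writes keys[j] = keys[i], and returns j+1 would overwrite the smallest key, which is the concern raised above.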
Should be easily portable to Swift.
I've noticed with very small input sizes (0, 1, 10, 100) that binary_fuse8_allocate sometimes relies on some questionable behavior.
For example, if called with size=0, filter->Fingerprints is ultimately allocated with filter->ArrayLength = 786432 (768 KiB), which seems high given a filter size of zero.
With size=1, sizeFactor ends up being INFINITY, which makes:
xor_singleheader/include/binaryfusefilter.h
Line 153 in f190e5a
result in effectively:
uint32_t capacity = (uint32_t)(round((double)0 * INFINITY));
https://godbolt.org/z/7vqY9ofY1
or more simply:
uint32_t capacity = (uint32_t)((double)INFINITY);
I think this cast may be undefined behavior, as in GCC this results in -1 while in clang it results in an undefined value:
https://godbolt.org/z/daxcT7n5K
All this to say, I think very small input sizes (specifically 0, 1, 10, 100) may fail during binary_fuse8_allocate in the worst case scenario and, in the best case scenario, allocate a perhaps larger set of fingerprints than needed.
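In C, converting a floating-point value to an integer type is undefined when the truncated value cannot be represented in that type, which covers both INFINITY and NaN. A minimal sketch of a guarded conversion follows; toy_double_to_u32_saturating is a hypothetical helper, not something the library provides, and saturating is only one possible policy (returning an error is another).

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: rounds a double to the nearest uint32_t,
 * mapping NaN and non-positive values to 0 and clamping large values. */
static uint32_t toy_double_to_u32_saturating(double x) {
  if (!(x > 0.0)) { return 0; }                 /* NaN, zero, negatives */
  if (x >= 4294967295.0) { return UINT32_MAX; } /* includes +INFINITY */
  return (uint32_t)(x + 0.5);                   /* in range: round half up */
}
```

With a guard like this, an expression such as 0 * INFINITY (which is NaN) yields 0 instead of a compiler-dependent value.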
This is not an issue, but a question. Can I do the following query to the Xor filter: "does the given element appear definitely less than the threshold number of times"? Counting Bloom filter seems to be a traditional approach for that, but I wonder what is the current state-of-the-art library that allows that out of the box? I looked at cuckoofilter library and this operation is not supported (even though, I presume, it is possible: filter-tutorial). It would be great if Xor Filter allowed that. Otherwise, what would you recommend, maybe something from fastfilter_cpp? If that matters, I plan to put up to 3 billion queries to the filter and search for elements that were added less than ~10 times.
Thanks!
Much of the code should be easily parallelizable three-way.
Seems that the class name is Xor8, but the actual content is binary fuse filter.
Currently the user is responsible to ensure that there are no duplicated keys. We should handle this for the user (it is relatively easy to do efficiently, without even sorting).
Very interesting work.
I have a production use case where I want to use filters across different systems. Is there any support for serializing these filters, or any serialization spec? Pointers to any implementations you are aware of would be helpful.
Let's say I have 1 million items. After the filter is built, it is saved to disk. Later I have some more items. Can I reload the saved filter and incrementally add items without rebuilding the filter from scratch?
As far as I know, bloom filter can do that.
See discussion in #20
I have two C++ files that both include the binaryfusefilter.h header, and they are built together with CMake.
When I compile my project, there are errors like:
/usr/bin/ld: ../lib/libPSI.a(PsiSender.cpp.o): in function `binary_fuse_mulhi(unsigned long, unsigned long)':
PsiSender.cpp:(.text._Z17binary_fuse_mulhimm+0x0): multiple definition of `binary_fuse_mulhi(unsigned long, unsigned long)'; ../lib/libPSI.a(PsiReceiver.cpp.o):PsiReceiver.cpp:(.text._Z17binary_fuse_mulhimm+0x0): first defined here
/usr/bin/ld: ../lib/libPSI.a(PsiSender.cpp.o): in function `binary_fuse_max(double, double)':
PsiSender.cpp:(.text._Z15binary_fuse_maxdd+0x0): multiple definition of `binary_fuse_max(double, double)'; ../lib/libPSI.a(PsiReceiver.cpp.o):PsiReceiver.cpp:(.text._Z15binary_fuse_maxdd+0x0): first defined here
/usr/bin/ld: ../lib/libPSI.a(PsiSender.cpp.o): in function `binary_fuse8_populate(unsigned long const*, unsigned int, binary_fuse8_s*)':
PsiSender.cpp:(.text._Z21binary_fuse8_populatePKmjP14binary_fuse8_s+0x0): multiple definition of `binary_fuse8_populate(unsigned long const*, unsigned int, binary_fuse8_s*)'; ../lib/libPSI.a(PsiReceiver.cpp.o):PsiReceiver.cpp:(.text._Z21binary_fuse8_populatePKmjP14binary_fuse8_s+0x0): first defined here
/usr/bin/ld: ../lib/libPSI.a(PsiSender.cpp.o): in function `binary_fuse16_populate(unsigned long const*, unsigned int, binary_fuse16_s*)':
PsiSender.cpp:(.text._Z22binary_fuse16_populatePKmjP15binary_fuse16_s+0x0): multiple definition of `binary_fuse16_populate(unsigned long const*, unsigned int, binary_fuse16_s*)'; ../lib/libPSI.a(PsiReceiver.cpp.o):PsiReceiver.cpp:(.text._Z22binary_fuse16_populatePKmjP15binary_fuse16_s+0x0): first defined here
collect2: error: ld returned 1 exit status
I know this can happen when two files include the same header. But I looked at binaryfusefilter.h, and it has include guards like:
#ifndef BINARYFUSEFILTER_H
#define BINARYFUSEFILTER_H
So why does this error still happen? How can I avoid it?
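Include guards only stop a header from being pasted twice into the same translation unit. Each .cpp file is compiled separately, so each one still gets its own copy of every function defined in the header, and the linker then sees two definitions with external linkage. A minimal sketch of a header whose functions are safe to include from several files is below; TOY_HEADER_H and toy_max are made-up names, and whether a given version of binaryfusefilter.h already marks its functions this way is not stated here.

```c
#ifndef TOY_HEADER_H
#define TOY_HEADER_H

/* `static inline` gives each translation unit its own private copy, so
 * the linker never sees two external definitions of the same symbol.
 * A plain `double toy_max(double, double) { ... }` in a header would
 * produce the "multiple definition" error when included from two files. */
static inline double toy_max(double a, double b) {
  return a < b ? b : a;
}

#endif /* TOY_HEADER_H */
```

If the header cannot be changed, another option is to include it from only one source file and expose small wrapper functions to the rest of the project.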
Hello! Love the library.
I don't feel like the README talks about the dangers of duplicate keys enough (the infinite loop). Not sure what the phrasing should be, but "You should ensure that you have no duplicated keys." doesn't convey the gravity of the situation.
Hi Daniel,
Thanks for making your fuse work available.
In your paper, "Binary Fuse Filters: Fast and Smaller Than Xor Filters", you say:
"The false positives may be later pruned after checking against the actual set."
Apologies if there is an obvious answer, but how would this pruning be achieved in a fuse8 filter?
Thanks.
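One reading of that sentence is that pruning happens outside the filter: the filter cheaply rejects most absent keys, and the few keys it accepts are then checked against the real set. A minimal sketch under that assumption follows. The filter is passed in as a callback so the example stays self-contained; toy_filter_fn, toy_always_maybe, and the sorted-array lookup are stand-ins, not library API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a filter query such as a "contain" call. */
typedef bool (*toy_filter_fn)(uint64_t key, const void *filter);

/* Deliberately useless filter that always answers "maybe present". */
static bool toy_always_maybe(uint64_t key, const void *filter) {
  (void)key;
  (void)filter;
  return true;
}

/* Exact membership: binary search over the sorted array of real keys. */
static bool toy_exact_contains(const uint64_t *sorted, size_t n, uint64_t key) {
  size_t lo = 0, hi = n;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (sorted[mid] < key) { lo = mid + 1; } else { hi = mid; }
  }
  return lo < n && sorted[lo] == key;
}

/* Filter first, then prune false positives with the exact check. */
static bool toy_contains_pruned(uint64_t key, toy_filter_fn maybe,
                                const void *filter,
                                const uint64_t *sorted, size_t n) {
  if (!maybe(key, filter)) { return false; } /* definite miss: skip lookup */
  return toy_exact_contains(sorted, n, key); /* confirm or prune */
}
```

In practice the exact set would typically live somewhere slower (disk, a remote store), which is why rejecting most misses in the filter pays off.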
Hi @lemire, I was working on distributed Bloom filters.
Is it possible to merge these filters the way we merge regular bit-based Bloom filters, by ORing (|) the bytes?
As you said, the struct of the filter is very simple and can be saved to disk. So I tried to copy a filter to another filter named filter_clone like this:
uint64_t Seed_clone = filter.Seed;
uint32_t SegmentLength_clone = filter.SegmentLength;
uint32_t SegmentLengthMask_clone = filter.SegmentLengthMask;
uint32_t SegmentCount_clone = filter.SegmentCount;
uint32_t SegmentCountLength_clone = filter.SegmentCountLength;
uint32_t ArrayLenth_clone = filter.ArrayLength;
uint8_t *Fingerprints_clone;
binary_fuse8_t filter_clone;
filter_clone.Seed = Seed_clone;
filter_clone.SegmentLength = SegmentLength_clone;
filter_clone.SegmentLengthMask = SegmentLengthMask_clone;
filter_clone.SegmentCount = SegmentCount_clone;
filter_clone.SegmentCountLength = SegmentCountLength_clone;
filter_clone.ArrayLength = ArrayLenth_clone;
memcpy(filter_clone.Fingerprints, filter.Fingerprints, sizeof(*filter.Fingerprints));
However, a segmentation fault occurred. I think my way of cloning filter.Fingerprints is wrong. How can I do that?
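In the snippet above, filter_clone.Fingerprints is never pointed at any memory before memcpy writes through it, and sizeof(*filter.Fingerprints) is the size of a single uint8_t rather than the whole array. A minimal sketch of a deep copy follows, using a stand-in struct that mirrors only the fields named in the question (the real binary_fuse8_t may have other members) and assuming ArrayLength is the number of one-byte fingerprints.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in mirroring the fields used in the question; NOT the real type. */
typedef struct {
  uint64_t Seed;
  uint32_t SegmentLength;
  uint32_t SegmentLengthMask;
  uint32_t SegmentCount;
  uint32_t SegmentCountLength;
  uint32_t ArrayLength;
  uint8_t *Fingerprints;
} toy_fuse8_t;

/* Deep-copies src into dst. Returns false if allocation fails.
 * The caller owns dst->Fingerprints and must free() it. */
static bool toy_fuse8_clone(const toy_fuse8_t *src, toy_fuse8_t *dst) {
  *dst = *src;              /* copies all scalar fields in one go */
  dst->Fingerprints = NULL; /* never share the source's buffer */
  size_t nbytes = (size_t)src->ArrayLength * sizeof(uint8_t);
  if (nbytes == 0) { return true; }
  dst->Fingerprints = (uint8_t *)malloc(nbytes);
  if (dst->Fingerprints == NULL) { return false; }
  memcpy(dst->Fingerprints, src->Fingerprints, nbytes);
  return true;
}
```

The same two pieces (the scalar fields plus ArrayLength bytes of fingerprints) are what would need to be written out to save a filter to disk.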
Hi,
thank you for your great work.
I wonder if there is a way to get the index of a hash once I have found it in the set.
For example, I have a table like the one below:
7C4A8D09CA3762AF61E59520943DC26494F8941B:23174662
F7C3BC1D808E04732ADF679965CCC34CA7AE3441:7671364
B1B3773A05C0ED0176787A4F1574FF0075F7521E:3810555
5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8:3645804
3D4F2BF07DC1BE38B20CD6E46949A1071F9D0E3D:3093220
7C222FB2927D828AF22F592134E8932480637C0D:2889079
6367C48DD193D56EA7B0BAAD25B19455E529F5EE:2834058
20EABE5D64B0E216796E834F52D61FD0B70332FC:2484157
E38AD214943DAAD1D64C102FAEC29DE4AFE9DA3D:2401761
8CB2237D0679CA88DB6464EAC60DA96345513964:2333232
01B307ACBA4F54F55AAFC33BB06BBBF6CA803E9A:2224432
601F1889667EFAEBB33B8C12572835DA3F027F78:2194818
C984AED014AEC7623A54F0591DA07A85FD4B762D:1942768
EE8D8728F435FD550F83852AABAB5234CE1DA528:1593388
7110EDA4D09E062AA5E4A390B0A572AC0D2C0220:1256907
B80A9AED8AF17118E51D4D0C2D7872AE26E2109E:1141300
B0399D2029F64D445BD131FFAA399A42D2F8E7DC:1081655
40BD001563085FC35165329EA1FF5C5ECBDBBEEF:1023001
AB87D24BDC7452E55738DEB5F868E1F16DEA5ACE:980209
AF8978B1797B72ACFFF9595A5A2A373EC3D9106D:968625
When I query the xor8 filter for the hash 7110EDA4D09E062AA5E4A390B0A572AC0D2C0220, it reports that the hash is in the set.
How can I get the index of this hash (index=15)?
thanks.
Should be easily portable to Rust.
It is not necessary to port everything; porting the xor8_populate and xor8_allocate functions would be a good start.
From the paper, I understood that the filter requires a segment length that is a power of two; however, the following segment:
xor_singleheader/include/binaryfusefilter.h
Lines 439 to 441 in 177cf03
I don't really understand much of the BinaryFuse data structure yet, but while reading through the source I noted that the reverseOrder array is not cleared completely.
Here it is initialized: https://github.com/FastFilter/xor_singleheader/blob/master/include/binaryfusefilter.h#L208
Notice that it has a length of size+1.
Here it is cleared: https://github.com/FastFilter/xor_singleheader/blob/master/include/binaryfusefilter.h#L326
Notice that it is only cleared up to size.
If the intention is to clear the whole array, the memset should cover size+1 elements.
I'm currently working on Python bindings for this C implementation of the xor filter. I was wondering if there will be any support for strings and floats in the future.
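Since the populate functions shown elsewhere on this page take 64-bit integer keys, one workaround that needs no library change is to map each string or float to a uint64_t before building and before querying. A minimal sketch follows, assuming 64-bit FNV-1a for strings and the raw IEEE 754 bit pattern for doubles. Both helper names are made up, and a stronger hash may be preferable for adversarial inputs.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 64-bit FNV-1a hash of a NUL-terminated string. */
static uint64_t toy_key_from_string(const char *s) {
  uint64_t h = UINT64_C(0xcbf29ce484222325); /* FNV offset basis */
  for (const unsigned char *p = (const unsigned char *)s; *p != 0; p++) {
    h ^= *p;
    h *= UINT64_C(0x100000001b3);            /* FNV prime */
  }
  return h;
}

/* Hypothetical helper: reuse the bit pattern of a double as the key.
 * Assumes sizeof(double) == 8. Note that 0.0 and -0.0 map to different
 * keys, as do different NaN payloads. */
static uint64_t toy_key_from_double(double d) {
  uint64_t bits;
  memcpy(&bits, &d, sizeof bits);
  return bits;
}
```

Two distinct strings can hash to the same 64-bit key, so the resulting keys should still be deduplicated before building a filter.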