GithubHelp home page GithubHelp logo

cuckoofilter's Introduction

Cuckoo Filter

Cuckoo filter is a Bloom filter replacement for approximated set-membership queries. While Bloom filters are well-known space-efficient data structures to serve queries like "if item x is in a set?", they do not support deletion. Their variances to enable deletion (like counting Bloom filters) usually require much more space.

Cuckoo filters provide the flexibility to add and remove items dynamically. A cuckoo filter is based on cuckoo hashing (and therefore named as cuckoo filter). It is essentially a cuckoo hash table storing each key's fingerprint. Cuckoo hash tables can be highly compact, thus a cuckoo filter could use less space than conventional Bloom filters, for applications that require low false positive rates (< 3%). For details about the algorithm and citations please use:

"Cuckoo Filter: Practically Better Than Bloom" in proceedings of ACM CoNEXT 2014 by Bin Fan, Dave Andersen and Michael Kaminsky

This is Node.js binding for the amazing cuckoofilter C++ library.

Install

$ npm install cuckoofilter-native

Usage

Here is a simple example for the basic usage of cuckoo filter.

const CuckooFilter = require('cuckoofilter-native')

const filter = new CuckooFilter(1024)

filter.add('hello world') // returns the filter
filter.contain('hello world') // returns true
filter.size // returns 1
filter.delete('hello world') // returns the filter
filter.size // now returns 0, as we have deleted

API

A cuckoo filter supports following operations:

  • add(item): insert an item to the filter
  • contain(item): return if item is already in the filter. Note that this method may return false positive results like Bloom filters
  • delete(item): delete the given item from the filter. Note that to use this method, it must be ensured that this item is in the filter (e.g., based on records on external storage); otherwise, a false item may be deleted.
  • size: return the total number of items currently in the filter
  • bytes: return the filter size in bytes

item must be a string, but Buffer could be supported in the future. Feel free to send a PR.

Benchmarks

This library is fast:

cuckoofilter-native add x 1,097,031 ops/sec ±2.12% (82 runs sampled)
cuckoofilter-native contain x 3,286,894 ops/sec ±2.08% (84 runs sampled)
cuckoo-filter add x 2,267 ops/sec ±13.26% (16 runs sampled)
cuckoo-filter contains x 31,680 ops/sec ±4.85% (70 runs sampled)
bloomfilter add x 773,499 ops/sec ±2.38% (86 runs sampled)
bloomfilter contains x 1,866,273 ops/sec ±2.20% (85 runs sampled)

License

Apache-2.0

cuckoofilter's People

Contributors

amallia avatar apc999 avatar arnib avatar binfan999 avatar dave-andersen avatar dnbaker avatar florianjacob avatar jbapple-cloudera avatar mcollina avatar mkaminsky avatar mkuraj avatar nicknaso avatar onaips avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.