GithubHelp home page GithubHelp logo

jahash's Introduction

Just Another Hash Library

Introduction

This library is based on splendid uthash library and is inspired by its design principles. However, jahash tries to follow *BSD sys/queue.h and sys/tree.h interfaces. Moreover, it tries to adopt more for custom hash and compare functions, allocators, blooming filters and so on and so forth. Furthermore, jahash tries to provide really thread safe access model based on custom rwlock implementation. Another goal of jahash is to reduce overhead per element. The current version uses just extra 8 bytes per hash element (uthash uses about 56 bytes per element, on the contrary).

Uthash deriving

jahash is based on algorithms from uthash hash table. That means that the primitive operations are all well tested and the performance of those operations is optimized as well. Moreover, uthash comes with the extensive API that is documented and supported by jahash.

Performance benchmark

Lookup benchmark with hit probability 50%. Hardware: AMD Opteron(tm) Processor 6344

graphics

Design principles

You need to define HASH_ENTRY element in your target structure, then you need to declare head in top-level structure by adding HASH_HEAD(name, type) somewhere in it, for example:

struct node {
	...
	HASH_ENTRY(node) hh;
}
struct top {
	...
	HASH_HEAD(, node) head;
}

Then you need to generate operations functions for your defined type. In a trivial case you should just write something like this:

/* 
 * This generates a static global variable named _hash_ops_node_hh {...}
 * and a set of auxiliary functions named _hash_op_node_hh_hash and
 * _hash_op_node_hh_cmp
 */
HASH_GENERATE_STR(node, hh);
...
/* Somewhere in the code */
void init(struct top *top) 
{
	struct node *new, *found, search;
	HASH_INIT(&top->head, node, hh);
	...
	/* Add elements */
	HASH_ADD(&head, node, hh, new);
	
	/* Search for an element */
	search.key = "test";
	found = HASH_FIND_ELT(&top->head, node, hh, &search);
	
	/* Or even simpler */
	found = HASH_FIND(&top->head, node, hh, "test");
}

This defines all standard methods (with no locking, no blooming and using strcmp as the compare routine). You can also define more complex definitions for discovering all features available. TODO describe advanced interface.

Hashing principles

jahash is designed to be able to use custom compare and hashing routines. This is extremely useful if you have, for instance, a case-insensitive hash. In that case you can just define the caseless compare and hash ops and you are done. Another useful appliance is using of keyed hash functions, such as siphash for creating secure hash tables.

TODO examples.

Locking principles

Whilst uthash requires to have external locking, jahash offers an advanced locking techniques based on individual hash bucket read/write locking. However, an additional rwlock is used for the global hash table to allow resize operations to be implemented in a safe matter. If your hash table is mostly immutable you can use very effective locking primitives to reach the maximum performance of hash tables.

TODO describe the exact use cases.

Memory management

TODO describe custom memory management

Built-in operations

TODO implement and document

Performance comparision

String keys, 1 million of insertions:

  • UTHash:

CPU

% time ./a.out < ~/local/words-random.txt
./a.out < ~/local/words-random.txt  0,37s user 0,11s system 99% cpu 0,482 total

Memory

 total heap usage: 2,000,014 allocs, 12 frees, 85,540,622 bytes allocated
  • jahash:

CPU

% time ./a.out < ~/local/words-random.txt 
./a.out < ~/local/words-random.txt  0,32s user 0,07s system 98% cpu 0,391 total

Memory

total heap usage: 2,000,013 allocs, 12 frees, 47,637,454 bytes allocated

The size of tested structure contains 16 bytes of payload leading to approximately 70Mb overhead for UTHash and 31Mb overhead for jahash.

Here are dragons

jahash's People

Contributors

vstakhov avatar

Watchers

Dave Täht avatar James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.