GithubHelp home page GithubHelp logo

w8r / avl Goto Github PK

View Code? Open in Web Editor NEW
111.0 4.0 32.0 453 KB

:eyeglasses: Fast AVL tree for Node and browser

Home Page: https://npm.runkit.com/avl

License: MIT License

JavaScript 100.00%
data-structure tree binary-search-tree algorithms avl-tree balanced-search-trees

avl's Introduction

AVL tree npm version

AVL-tree: fast(non-recursive) and simple(< 500 lines of code)

AVL-tree

Operation Average Worst case
Space O(n) O(n)
Search O(log n) O(log n)
Insert O(log n) O(log n)
Delete O(log n) O(log n)

Install

npm i -S avl
import AVLTree from 'avl';
const tree = new AVLTree();

Or get it from CDN

<script src="https://unpkg.com/avl"></script>
<script>
  var tree = new AVLTree();
  ...
</script>

Or use the compiled version 'dist/avl.js'.

Try it in your browser

API

  • new AVLTree([comparator], [noDuplicates:Boolean]), where compare is optional comparison function
  • tree.insert(key:any, [data:any]) - Insert item
  • tree.remove(key:any) - Remove item
  • tree.find(key):Node|Null - Return node by its key
  • tree.at(index:Number):Node|Null - Return node by its index in sorted order of keys
  • tree.contains(key):Boolean - Whether a node with the given key is in the tree
  • tree.forEach(function(node) {...}):Tree In-order traversal
  • tree.range(lo, high, function(node) {} [, context]):Tree - Walks the range of keys in order. Stops, if the visitor function returns a non-zero value.
  • tree.keys():Array<key> - Returns the array of keys in order
  • tree.values():Array<*> - Returns the array of data fields in order
  • tree.pop():Node - Removes smallest node
  • tree.popMax():Node - Removes highest node
  • tree.min():key - Returns min key
  • tree.max():key - Returns max key
  • tree.minNode():Node - Returns the node with smallest key
  • tree.maxNode():Node - Returns the node with highest key
  • tree.prev(node):Node - Predecessor node
  • tree.next(node):Node - Successor node
  • tree.load(keys:Array<*>, [values:Array<*>], presort = false):Tree - Bulk-load items
  • tree.destroy():Tree, tree.clear():Tree - Empty the tree

Comparator

function(a:key,b:key):Number - Comparator function between two keys, it returns

  • 0 if the keys are equal
  • <0 if a < b
  • >0 if a > b

The comparator function is extremely important, in case of errors you might end up with a wrongly constructed tree or would not be able to retrieve your items. It is crucial to test the return values of your comparator(a,b) and comparator(b,a) to make sure it's working correctly, otherwise you may have bugs that are very unpredictable and hard to catch.

Duplicate keys

By default, tree allows duplicate keys. You can disable that by passing true as a second parameter to the tree constructor. In that case if you would try to instert an item with the key, that is already present in the tree, it will not be inserted. However, the default behavior allows for duplicate keys, cause there are cases where you cannot predict that the keys would be unique (example: overlapping points in 2D).

Example

import Tree from 'avl';

const t = new Tree();
t.insert(5);
t.insert(-10);
t.insert(0);
t.insert(33);
t.insert(2);

console.log(t.keys()); // [-10, 0, 2, 5, 33]
console.log(t.size);   // 5
console.log(t.min());  // -10
console.log(t.max());  // 33

t.remove(0);
console.log(t.size);   // 4

Custom comparator (reverse sort)

import Tree from 'avl';

const t = new Tree((a, b) => b - a);
t.insert(5);
t.insert(-10);
t.insert(0);
t.insert(33);
t.insert(2);

console.log(t.keys()); // [33, 5, 2, 0, -10]

Bulk insert

import Tree from 'avl';

const t = new Tree();
t.load([3,2,-10,20], ['C', 'B', 'A', 'D'], true);
console.log(t.keys());   // [-10, 2, 3, 20]
console.log(t.values()); // ['A', 'B', 'C', 'D']

Benchmarks

npm run benchmark
Insert (x1000)
Bintrees x 3,742 ops/sec ±0.89% (90 runs sampled)
Functional red black tree x 1,880 ops/sec ±4.02% (78 runs sampled)
Google Closure library AVL x 622 ops/sec ±4.22% (81 runs sampled)
AVL (current) x 6,151 ops/sec ±8.50% (72 runs sampled)
- Fastest is AVL (current)

Random read (x1000)
Bintrees x 7,371 ops/sec ±2.69% (83 runs sampled)
Functional red black tree x 13,010 ops/sec ±2.93% (83 runs sampled)
Google Closure library AVL x 27.63 ops/sec ±1.04% (49 runs sampled)
AVL (current) x 12,921 ops/sec ±1.83% (86 runs sampled)
- Fastest is AVL (current)

Remove (x1000)
Bintrees x 106,837 ops/sec ±0.74% (86 runs sampled)
Functional red black tree x 25,368 ops/sec ±0.89% (88 runs sampled)
Google Closure library AVL x 31,719 ops/sec ±1.21% (89 runs sampled)
AVL (current) x 108,131 ops/sec ±0.70% (88 runs sampled)
- Fastest is AVL (current)

Adding google closure library to the benchmark is, of course, unfair, cause the node.js version of it is not optimised by the compiler, but in this case it plays the role of straight-forward robust implementation.

Develop

npm i
npm t
npm run build

License

The MIT License (MIT)

Copyright (c) 2017 Alexander Milevski [email protected]

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

avl's People

Contributors

deniscarriere avatar dependabot[bot] avatar eugene-chernyshenko avatar ouuan avatar quadramadery avatar scg82 avatar spirinvladimir avatar w8r avatar wcauchois avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

avl's Issues

Promise-based avl

Hi there,

I'm currently adapting a fork of your avl module to work with an asynchronous comparator function.
I'm doing this because I have a use case where my comparator needs to perform a database query (and the data is too large to be pre-fetched before building the tree).

Basically, avl-promise works the same, except that the insert, remove, contains, range, load and find methods all return Promises instead of immediate values -- because they all depend on the comparator function.

I don't think it would be helpful to just add async versions of those methods to your module, so that's why I'm creating a new one.

I just wanted to give you a heads up that a module very similar to your own will appear on NPM and thank you for having written avl 👍 😄

All the best

André

runkit example

I tried to run an example at

https://npm.runkit.com/avl

but I don't know the correct syntax. Could you help me figure out how to fix this so it runs?

Here's the code I'm trying to run:

var avl = require("avl");

const t = new avl.AVLTree();

t.insert(5);
t.insert(-10);
t.insert(0);
t.insert(33);
t.insert(2);

console.log(t.keys()); // [-10, 0, 2, 5, 33]
console.log(t.size);   // 5
console.log(t.min());  // -10
console.log(t.max());  // -33

t.remove(0);
console.log(t.size);   // 4

I get:

TypeError: avl.AVLTree is not a constructor
in this notebook — line 3

Bulk load is recursive

The readme stats "non-recursive" but the load function is clearly recursive.
We must state that.

feature proposal: iterable interfaces

Using iterables to walk over a large amount of items/data has two advantages:

  • In contrast to computing a full array all items, items can be read from the tree on the fly. This prevents unnecessary graph walks.
  • Iterables are well-supported within ECMAScript and libs: String, Array, Map, Set etc. are iterable.

Examples of how iterables would be useful for this lib:

for (const key of tree.keys()) {
  await userRequestedAKey()
  console.log(key)
}

for (const node of tree.range('A', 'O')) {
  console.log(node.data)
}

suggested optimization for range()

When I was writing find_ge() and find_gt() (see #34) I noticed that range() is essentially a copy of forEach() with a terminating case at the end of the range.

That is, range() will iterate all nodes from the start of the dataset to the low key.

This seems like more work than range() needs to do. range() could avoid going down a node.left if node.key is already large enough that all the nodes in the left sub-branch are guaranteed to have keys less than low

See #34 for some suggested code that could also be used in range()

Here I will take a stab at applying the same logic to range() but I haven't tested this range()...

  /**
   * Walk key range from `low` to `high` inclusive. 
   * Stops if `fn` returns a true value.
   * Walk starts (first callback) at first node >= `low`.
   * Walk ends (no callback) at first node > `high`.
   * If tree was created with !noDuplicates and tree contains 
   * more than one node with the `low` or `high`,
   * range() stops/starts at the first such node.
   * @param{Key}    low
   * @param{Key}    high
   * @param{Function} fn
   * @param{*?}     ctx
   * @return {SplayTree}
   */
  AVLTree.prototype.range = function range (low, high, fn, ctx) {
      var this$1 = this;

    var Q = [];
    var compare = this._comparator;
    var node = this._root, cmp;

    while (Q.length !== 0 || node) {
      if (node) {
        Q.push(node);
// CHANGED CODE STARTS HERE
        /* if ( this._noDuplicates) node.key >  node.left subtree nodes
         * if (!this._noDuplicates) node.key >= node.left subtree nodes
         */
        cmp = compare(node.key, low);
        if (this._noDuplicates && cmp <= 0)
          /* node.key <= low, so node.left subtree
           * cannot contain the node we want to start at.
           * if node.key === low, we can still skip node.left
           * because tree contains no duplicates.
           */
          node = null;
        else if (!this._noDuplicates && cmp < 0)
          /* node.key < low, so node.left subtree
           * cannot contain the node we want to start at.
           * if node.key === low, we must still examine node.left
           * because tree might contain duplicates and we
           * must start at first node matching low.
           */
          node = null;
        else
// CHANGED CODE ENDS HERE
          node = node.left;
      } else {
        node = Q.pop();
        cmp = compare(node.key, high);
        if (cmp > 0) {
          break;
        } else if (compare(node.key, low) >= 0) {
          if (fn.call(ctx, node)) { return this$1; } // stop if smth is returned
        }
        node = node.right;
      }
    }
    return this;
  };

suggested new API funcs (with code): find_ge() and find_gt()

Currently, find() lets us fetch the node EXACTLY equal to a key.

Currently there is no easy way to fetch the node >= or > key. range() kind of does it, but only works for the >= case, and, much more importantly, requires the user to provide a key value for "end" that is "infinity" or "past the end of any possible key," which is not always easy to create depending on the application.

Even worse, when noDuplicates is false, find() essentially returns a random node when the tree contains more than one node with the same key (whichever node happens to be higher up in the tree), which is unlikely to be what any user wants in any application.

To fix all 3 of these problems, let me offer you 2 new API funcs that you could drop in:

  /**
   * Return the first node whose key is >= `key`.
   * Return null if tree is empty or all node keys are < `key`.
   * If tree was created with !noDuplicates and tree contains 
   * more than one node with the first key >= `key`, 
   * returns the first such node.
   * @param{Key}    key
   * @return {?Node}
   */
  AVLTree.prototype.find_ge = function find_ge(key) {

    var Q = [];
    var compare = this._comparator;
    var node = this._root, cmp;

    while (Q.length !== 0 || node) {
      if (node) {
        Q.push(node);
        /* if ( this._noDuplicates) node.key >  node.left subtree nodes
         * if (!this._noDuplicates) node.key >= node.left subtree nodes
         */
        cmp = compare(node.key, key);
        if (this._noDuplicates && cmp <= 0)
          /* node.key <= key, so node.left subtree
           * cannot contain the node we want to return.
           * if node.key === key, we can still skip node.left
           * because tree contains no duplicates.
           */
          node = null;
        else if (!this._noDuplicates && cmp < 0)
          /* node.key < key, so node.left subtree
           * cannot contain the node we want to return.
           * if node.key === key, we must still examine node.left
           * because tree might contain duplicates and we
           * must return first node matching key.
           */
          node = null;
        else
          node = node.left;
      } else {
        node = Q.pop();
        if (compare(node.key, key) >= 0) {
          return node;
        }
        node = node.right;
      }
    }
    return null;
  };

  /**
   * Return the first node whose key is > `key`.
   * Return null if tree is empty or all node keys are <= `key`.
   * If tree was created with !noDuplicates and tree contains 
   * more than one node with the first key > `key`, 
   * returns the first such node.
   * @param{Key}    key
   * @return {?Node}
   */
  AVLTree.prototype.find_gt = function find_gt(key) {

    var Q = [];
    var compare = this._comparator;
    var node = this._root, cmp;

    while (Q.length !== 0 || node) {
      if (node) {
        Q.push(node);
        /* if ( this._noDuplicates) node.key >  node.left subtree nodes
         * if (!this._noDuplicates) node.key >= node.left subtree nodes
         */
        cmp = compare(node.key, key);
        if (cmp <= 0)
          /* node.key <= key, so node.left subtree
           * cannot contain the node we want to return.
           * if node.key === key, we can still skip node.left
           * regardless of this._noDuplicates.
           */
          node = null;
        else
          node = node.left;
      } else {
        node = Q.pop();
        if (compare(node.key, key) >= 0) {
          return node;
        }
        node = node.right;
      }
    }
    return null;
  };

NOTE I have tested find_ge() somewhat and have not yet tested find_gt() so be sure to check them over...

It would be nice to also include find_lt() and find_le() which return the HIGHEST node satisfying the condition and, if noDuplicates is false, return the last node in a possible group of nodes that have the highest key.

Worth refactoring to ES5?

Might be worth looking into simply refactoring this repo to pure ES5, that way you could eliminate the entire Rollup build process (nothing against Rollup), but this repo doesn't seem very large and I think there's more lines of code in the Rollup config then actual lines of source code.

A good example of simple ES5 Class based library is rbush (pretty sure you know this one already 😄).

isEmpty() doesnt seem to work

I'm using the tree and a few times i have been checking for is empty and I have been getting true when there is one node still in there so i have to do tree.size >= 1 instead. If we can get the isEmpty function to work all the better because i hate hard coding anything. Cheers mate!

suggestion: let the remove operation preserve node identity

in C++ STL's term, don't invalidate iterators on removal of elements, except those pointing to the removed node.

(if the nodes returned in various APIs are to be regarded as "iterators", with the assumption that users don't tamper with it, which is common in the javascript world)

the modification is rather straightforward. in remove function, instead of assigning the data and key of the "lifted node" to the "deleted node", maintain the left, right, parentreferences to/from the "lifted node" and cut the "deleted node"'s outward references (but preserve the key and data).

within my knowledge of the currently implemented APIs, the remove operation is the only one that breaks this. (more care may be needed if split, join are to be added)

this allows iterator usages to work (there might not be much, but i really have code using iterators the iterator way). without this property, though workaround exists, one has to find by .key, prev/next, store .key again and again, that's hurts both readability (+code size) and overall performance.

also returning the node deleted makes more sense (may be inserted again with the fictional insertNode API).

there might be performance degradation due to more operations to do. but since the remove operation is the fastest operation here, it's nice to have this property with a little price. or maybe removeGracefully?

Typescript definition? :)

@w8r Willing to host a Typescript definition built-in this repo or want me to push one to DefinitelyTyped?

Shouldn't be too complicated of a Type definition.

split/join functionality

I might be wrong, but i'm missing split/join functionality.

Unluckily i need to code a B+ Tree, and the leaf needs to be ordered somehow efficiently.

I jumped into associative arrays, but even if it's ordered you don't know anything about the "real" index, where it's inserted.

Tree crashes on insert

Error: TypeError: Cannot read property 'balanceFactor' of null
Line: > 607 | if (parent.right.balanceFactor === 1) rotateRight(parent.right);

Here is my code:

  const compareKeys = (lhs, rhs) =>
  {
    lhs = Array.isArray(lhs) ? lhs : lhs.key;
    rhs = Array.isArray(rhs) ? rhs : rhs.key;

    const lhsLength = lhs.length;
    const rhsLength = rhs.length;

    const length = Math.min(lhsLength, rhsLength);

    for (let i = 0; i < length; ++i)
    {
      if (lhs[i] < rhs[i])
      {
        return -1;
      }

      if (lhs[i] > rhs[i])
      {
        return -1;
      }
    }

    return lhsLength - rhsLength;
  };

  let cache = new Tree(compareKeys);

  cache.insert([1, 2]);
  cache.insert([2, 2]);
  cache.insert([2, 3]);

Tree is not a function

I am seeing an error
TypeError: Tree is not a function
I am using it in react after importing it.
import Tree from 'avl';

Merging @foxglove/avl upstream?

I just released the @foxglove/avl package to unblock waiting on some of the open PRs to this repo to merge, implementations of new functions such as findLessThan[OrEqual], findGreaterThan[OrEqual], cleaning up the naming convention to look more like ES6 containers such as Map<K, V> and Set<K>, and porting to TypeScript. Is there interest in merging this code into this repo and the avl package? If so, I can deprecate the @foxglove/avl package and direct users here. I'm also happy to continue maintaining @foxglove/avl if you'd prefer to keep the existing codebase.

Large node_modules bundle (216K)

For a (<500 lines of code) library this is pretty large repo once installed in the node_modules (216K).

$ du -h node_modules/avl
4.0K    avl/bench
 16K    avl/build
 96K    avl/dist
 24K    avl/src
 48K    avl/tests
216K    avl

One thing which helps prevent this is to include files in the package.json to exclude some of the unwanted files.

Or refactoring to pure ES5 would make the repo roughly ~24KB

{
  "name": "avl",
  "version": "1.3.0",
  "author": "Alexander Milevski <[email protected]>",
  "license": "MIT",
  "description": "AVL tree",
  "main": "dist/avl.js",
  "jsnext:main": "index",
  "module": "src/index",
  "types": "src/index.d.ts",
  "files": [
    "src",
    "dist"
  ],
  "repository": {
    "type": "git",
    "url": "https://github.com/w8r/avl.git"
  },

Add removeNode

The remove function behaviour is unknown when there are duplicates.
So it would be better to have a function to remove a specific node.

some suggested documentation enhancements for range() and noDuplicates

Thank you for your awesome library...so nice to have something simple and short.

Here are some improvements to add to the documentation, BOTH in the source code and under API in README.md:

range() does not clearly explain the boundary conditions > vs >= at start and end....
OLD
   * Walk key range from `low` to `high`. Stops if `fn` returns a value.
NEW
   * Walk key range from `low` to `high` inclusive. 
   * Stops if `fn` returns a true value.
   * Walk starts (first callback) at first node >= `low`.
   * Walk ends (no callback) at first node > `high`.
   * If tree was created with !noDuplicates and tree contains 
   * more than one node with equal keys >= `low` or > `high`,
   * range() stops/starts at the first such node.

The following functions have a major caveat when noDuplicates is false:

find()
OLD
   * Find node by key
NEW
   * Find node by key.
   * WARNING: If tree was created with !noDuplicates and tree contains 
   * more than one node with `key`, returns a RANDOM node with `key`.
   * See find_ge or find_gt if duplicate order is important to you.

(I will submit the code for find_ge() and find_gt() in a separate GitHub issue)

insert()
OLD
   * Insert a node into the tree
NEW
   * Insert a node into the tree.
   * If tree was created with noDuplicates and tree already contains 
   * one or more nodes with `key`, does not insert anything and returns null.
   * Otherwise returns newly inserted node.
   * If tree was created with !noDuplicates and tree already contains 
   * one or more nodes with `key`, inserts new node before existing nodes.

remove()
OLD
   * Removes the node from the tree. If not found, returns null.
NEW
   * Removes the node from the tree. If not found, returns null.
   * WARNING: If tree was created with !noDuplicates and tree contains 
   * more than one node with `key`, removes a RANDOM node with `key`
   * and leaves the other nodes with key in the tree.

Support pop of the maximum

Just the same as pop() but for the maximum.

How can we call this function?
It would be cool to have popMin and popMax

find(key) not working

I've tried the following code taken from the docs:
const t = new AVLTree();
t.load([3,2,-10,20], ['C', 'B', 'A', 'D']);

On top of it I added that piece of code at the end:
[-10,20,2,3].map(x => t.find(x))

result : [{…}, {…}, null, null]

I'm pretty positive that's not the wanted output. Am I doing something wrong? Or is it a real problem?

Either way, I can implement such search function on my own, but I'm just curious to know why it's not working

better function naming at API for "pop" & "popMax"

Dear Alexander,

During my last update from your branch I've found new function "popMax" for remove rightest node.
So now we have fully completed API for removing rightest and leftest nodes. All possible 2 direction at 2d AVL tree are covered.
I guess name "pop" function has some common with function from JavaScript Array prototype:

  1. same name
  2. return removed value/item
  3. remove item from data-structure's boundary

How to get better understanding what is "popMax" function. Looks like it has same functionality as "pop" but with "Max". What about exactly "Max"? Is it correct logic that "pop" function behavior means "popMin".

I personally fill a gap into semantic size around "pop" and "popMax" functions.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.