mozman / bintrees

Outdated binary tree package, please switch to sortedcontainers (UNMAINTAINED)

License: Other

Python 88.48% C 11.22% Makefile 0.29%

bintrees's Introduction

Binary Tree Package

Bintrees Development Stopped

Use sortedcontainers instead: https://pypi.python.org/pypi/sortedcontainers

See also the PyCon 2016 presentation: https://www.youtube.com/watch?v=7z2Ki44Vs4E

Advantages:

  • pure Python, no Cython/C dependencies
  • faster
  • active development
  • more & better testing/profiling

Abstract

This package provides Binary-, RedBlack- and AVL-Trees written in Python and Cython/C.

These classes are much slower than the built-in dict class, but all iterators/generators yield data in sorted key order. Trees can be used as a drop-in replacement for dicts in most cases.

Source of Algorithms

AVL- and RBTree algorithms taken from Julienne Walker: http://eternallyconfuzzled.com/jsw_home.aspx

Trees written in Python

  • BinaryTree -- unbalanced binary tree
  • AVLTree -- balanced AVL-Tree
  • RBTree -- balanced Red-Black-Tree

Trees written with C-Functions and Cython as wrapper

  • FastBinaryTree -- unbalanced binary tree
  • FastAVLTree -- balanced AVL-Tree
  • FastRBTree -- balanced Red-Black-Tree

All trees provide the same API; the pickle protocol is supported.

Cython-Trees have C-structs as tree-nodes and C-functions for low level operations:

  • insert
  • remove
  • get_value
  • min_item
  • max_item
  • prev_item
  • succ_item
  • floor_item
  • ceiling_item

Constructor

  • Tree() -> new empty tree;
  • Tree(mapping) -> new tree initialized from a mapping (requires only an items() method)
  • Tree(seq) -> new tree initialized from seq [(k1, v1), (k2, v2), ... (kn, vn)]
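
The three constructor forms can be read as one rule: a mapping is consumed through its items() method, and anything else is treated as a sequence of (key, value) pairs. Below is a minimal stdlib-only sketch of that dispatch. The helper name `iter_pairs` and the class `OnlyItems` are hypothetical and not part of bintrees; the real constructor may differ in detail.

```python
def iter_pairs(source=None):
    """Return an iterator over (key, value) pairs.

    Hypothetical helper for illustration only; not part of the bintrees API.
    Accepts None (empty), a mapping-like object with items(), or a sequence
    of (key, value) pairs.
    """
    if source is None:
        return iter(())
    items = getattr(source, "items", None)
    if callable(items):
        # Mapping-like: only an items() method is required.
        return iter(items())
    # Otherwise assume an iterable of (key, value) pairs.
    return iter(source)


class OnlyItems:
    """Mapping-like object that offers nothing but items()."""

    def items(self):
        return [(2, "b"), (1, "a")]


pairs_from_mapping = sorted(iter_pairs(OnlyItems()))
pairs_from_seq = sorted(iter_pairs([(3, "c"), (1, "a")]))
pairs_from_nothing = list(iter_pairs())
```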

Methods

  • __contains__(k) -> True if T has a key k, else False, O(log(n))
  • __delitem__(y) <==> del T[y], del T[s:e], O(log(n))
  • __getitem__(y) <==> T[y], T[s:e], O(log(n))
  • __iter__() <==> iter(T)
  • __len__() <==> len(T), O(1)
  • __max__() <==> max(T), get max item (k,v) of T, O(log(n))
  • __min__() <==> min(T), get min item (k,v) of T, O(log(n))
  • __and__(other) <==> T & other, intersection
  • __or__(other) <==> T | other, union
  • __sub__(other) <==> T - other, difference
  • __xor__(other) <==> T ^ other, symmetric_difference
  • __repr__() <==> repr(T)
  • __setitem__(k, v) <==> T[k] = v, O(log(n))
  • __copy__() -> shallow copy support, copy.copy(T)
  • __deepcopy__() -> deep copy support, copy.deepcopy(T)
  • clear() -> None, remove all items from T, O(n)
  • copy() -> a shallow copy of T, O(n*log(n))
  • discard(k) -> None, remove k from T, if k is present, O(log(n))
  • get(k[,d]) -> T[k] if k in T, else d, O(log(n))
  • is_empty() -> True if len(T) == 0, O(1)
  • items([reverse]) -> generator for (k, v) items of T, O(n)
  • keys([reverse]) -> generator for keys of T, O(n)
  • values([reverse]) -> generator for values of T, O(n)
  • pop(k[,d]) -> v, remove specified key and return the corresponding value, O(log(n))
  • pop_item() -> (k, v), remove and return some (key, value) pair as a 2-tuple, O(log(n)) (synonym popitem() exists)
  • set_default(k[,d]) -> value, T.get(k, d), also set T[k]=d if k not in T, O(log(n)) (synonym setdefault() exists)
  • update(E) -> None. Update T from dict/iterable E, O(E*log(n))
  • foreach(f, [order]) -> visit all nodes of tree (0 = 'inorder', -1 = 'preorder' or +1 = 'postorder') and call f(k, v) for each node, O(n)
  • iter_items(s, e[, reverse]) -> generator for (k, v) items of T for s <= key < e, O(n)
  • remove_items(keys) -> None, remove items by keys, O(n)
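
The order codes accepted by foreach() can be illustrated with a small traversal sketch. The `Node` type and the standalone `foreach` function here are hypothetical stand-ins, not the bintrees classes; only the meaning of the order argument (0 = inorder, -1 = preorder, +1 = postorder) is taken from the list above.

```python
from collections import namedtuple

# Hypothetical node type for illustration; bintrees uses its own node classes.
Node = namedtuple("Node", "key value left right")


def foreach(root, func, order=0):
    """Call func(key, value) per node; order: 0 inorder, -1 preorder, +1 postorder."""

    def visit(node):
        if node is None:
            return
        if order == -1:
            func(node.key, node.value)  # preorder: node before its children
        visit(node.left)
        if order == 0:
            func(node.key, node.value)  # inorder: node between its children
        visit(node.right)
        if order == +1:
            func(node.key, node.value)  # postorder: node after its children

    visit(root)


#     2
#    / \
#   1   3
root = Node(2, "b", Node(1, "a", None, None), Node(3, "c", None, None))


def collect(order):
    seen = []
    foreach(root, lambda k, v: seen.append(k), order)
    return seen


inorder = collect(0)     # [1, 2, 3]
preorder = collect(-1)   # [2, 1, 3]
postorder = collect(+1)  # [1, 3, 2]
```

Only the inorder traversal visits keys in sorted order.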

slicing by keys

  • item_slice(s, e[, reverse]) -> generator for (k, v) items of T for s <= key < e, O(n), synonym for iter_items(...)
  • key_slice(s, e[, reverse]) -> generator for keys of T for s <= key < e, O(n)
  • value_slice(s, e[, reverse]) -> generator for values of T for s <= key < e, O(n)
  • T[s:e] -> TreeSlice object, with keys in range s <= key < e, O(n)
  • del T[s:e] -> remove items by key slicing, for s <= key < e, O(n)

start/end parameter:

  • if 's' is None or T[:e] is used, the TreeSlice/iterator starts with the value of min_key();
  • if 'e' is None or T[s:] is used, the TreeSlice/iterator ends with the value of max_key();
  • T[:] is a TreeSlice which represents the whole tree;

The step argument of the regular slicing syntax T[s:e:step] is silently ignored.
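
The half-open range s <= key < e and the meaning of a missing bound can be sketched with the stdlib bisect module over a sorted list of keys. This only illustrates the slicing semantics described above. It is not how bintrees walks its trees, and the function name `key_slice` is reused here purely for readability.

```python
from bisect import bisect_left


def key_slice(sorted_keys, s=None, e=None):
    """Return the keys k with s <= k < e; None means unbounded on that side."""
    lo = 0 if s is None else bisect_left(sorted_keys, s)
    hi = len(sorted_keys) if e is None else bisect_left(sorted_keys, e)
    return sorted_keys[lo:hi]


keys = [10, 20, 30, 40, 50]
middle = key_slice(keys, 20, 40)  # [20, 30]: 40 itself is excluded
head = key_slice(keys, None, 30)  # [10, 20]: starts at the minimum key
tail = key_slice(keys, 30)        # [30, 40, 50]: ends at the maximum key
whole = key_slice(keys)           # all keys, like T[:]
```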

TreeSlice is a tree wrapper with a range check. It contains no references to objects, so deleting an object in the associated tree also deletes it from the TreeSlice.

  • TreeSlice[k] -> get value for key k, raises KeyError if k does not exist in range s:e
  • TreeSlice[s1:e1] -> TreeSlice object, with keys in range s1 <= key < e1
    • new lower bound is max(s, s1)
    • new upper bound is min(e, e1)

TreeSlice methods:

  • items() -> generator for (k, v) items of T, O(n)
  • keys() -> generator for keys of T, O(n)
  • values() -> generator for values of T, O(n)
  • __iter__ <==> keys()
  • __repr__ <==> repr(T)
  • __contains__(k) -> True if TreeSlice has a key k, else False, O(log(n))

prev/succ operations

  • prev_item(key) -> get (k, v) pair, where k is predecessor to key, O(log(n))
  • prev_key(key) -> k, get the predecessor of key, O(log(n))
  • succ_item(key) -> get (k,v) pair as a 2-tuple, where k is successor to key, O(log(n))
  • succ_key(key) -> k, get the successor of key, O(log(n))
  • floor_item(key) -> get (k, v) pair, where k is the greatest key less than or equal to key, O(log(n))
  • floor_key(key) -> k, get the greatest key less than or equal to key, O(log(n))
  • ceiling_item(key) -> get (k, v) pair, where k is the smallest key greater than or equal to key, O(log(n))
  • ceiling_key(key) -> k, get the smallest key greater than or equal to key, O(log(n))
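
The difference between floor/ceiling (inclusive) and predecessor/successor (strict) can be sketched with bisect over a sorted key list. This is a stdlib illustration of the semantics only, not the bintrees implementation. Raising KeyError when no such key exists follows the benchmark code in the issues below for ceiling_key and is assumed here for the other three. The sketch also does not require the looked-up key to be present, which is an assumption for prev_key and succ_key.

```python
from bisect import bisect_left, bisect_right


def floor_key(keys, key):
    """Greatest k in keys with k <= key."""
    i = bisect_right(keys, key)
    if i == 0:
        raise KeyError(key)
    return keys[i - 1]


def ceiling_key(keys, key):
    """Smallest k in keys with k >= key."""
    i = bisect_left(keys, key)
    if i == len(keys):
        raise KeyError(key)
    return keys[i]


def prev_key(keys, key):
    """Greatest k in keys with k < key (strict)."""
    i = bisect_left(keys, key)
    if i == 0:
        raise KeyError(key)
    return keys[i - 1]


def succ_key(keys, key):
    """Smallest k in keys with k > key (strict)."""
    i = bisect_right(keys, key)
    if i == len(keys):
        raise KeyError(key)
    return keys[i]


keys = [10, 20, 30]
inexact = (floor_key(keys, 25), ceiling_key(keys, 25))  # (20, 30)
exact = (floor_key(keys, 20), ceiling_key(keys, 20))    # (20, 20)
strict = (prev_key(keys, 20), succ_key(keys, 20))       # (10, 30)
```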

Heap methods

  • max_item() -> get largest (key, value) pair of T, O(log(n))
  • max_key() -> get largest key of T, O(log(n))
  • min_item() -> get smallest (key, value) pair of T, O(log(n))
  • min_key() -> get smallest key of T, O(log(n))
  • pop_min() -> (k, v), remove item with minimum key, O(log(n))
  • pop_max() -> (k, v), remove item with maximum key, O(log(n))
  • nlargest(i[,pop]) -> get list of i largest items (k, v), O(i*log(n))
  • nsmallest(i[,pop]) -> get list of i smallest items (k, v), O(i*log(n))
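
What nsmallest/nlargest return, and the effect of the optional pop flag, can be sketched on a plain dict with the stdlib heapq module. This is an assumption-laden illustration: a dict stands in for the tree, the functions below are free functions rather than methods, and the order of the returned list (largest first for nlargest) is heapq's behaviour, not something the source specifies for bintrees.

```python
import heapq


def nsmallest(mapping, i, pop=False):
    """Return the i smallest (key, value) items; remove them if pop is true."""
    result = heapq.nsmallest(i, mapping.items())
    if pop:
        for key, _ in result:
            del mapping[key]
    return result


def nlargest(mapping, i, pop=False):
    """Return the i largest (key, value) items; remove them if pop is true."""
    result = heapq.nlargest(i, mapping.items())
    if pop:
        for key, _ in result:
            del mapping[key]
    return result


data = {5: "e", 1: "a", 4: "d", 2: "b", 3: "c"}
two_smallest = nsmallest(data, 2)          # [(1, 'a'), (2, 'b')], data unchanged
two_largest = nlargest(data, 2, pop=True)  # [(5, 'e'), (4, 'd')], removed from data
remaining = sorted(data)                   # [1, 2, 3]
```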

Set methods (using frozenset)

  • intersection(t1, t2, ...) -> Tree with keys common to all trees
  • union(t1, t2, ...) -> Tree with keys from any of the trees
  • difference(t1, t2, ...) -> Tree with keys in T but not any of t1, t2, ...
  • symmetric_difference(t1) -> Tree with keys in either T or t1 but not both
  • is_subset(S) -> True if every element in T is in S (synonym issubset() exists)
  • is_superset(S) -> True if every element in S is in T (synonym issuperset() exists)
  • is_disjoint(S) -> True if T has a null intersection with S (synonym isdisjoint() exists)
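
The "using frozenset" approach can be sketched as: compute the resulting key set with frozenset operations, then rebuild a sorted container from those keys. In this sketch plain dicts stand in for trees, and values are taken from the left operand T; both choices are assumptions for illustration, not a statement about the bintrees source.

```python
def intersection(t, *others):
    """Keys common to t and all others; values come from t (assumption)."""
    keys = frozenset(t).intersection(*(frozenset(o) for o in others))
    return {k: t[k] for k in sorted(keys)}


def difference(t, *others):
    """Keys in t that appear in none of the others; values come from t."""
    keys = frozenset(t).difference(*(frozenset(o) for o in others))
    return {k: t[k] for k in sorted(keys)}


def is_subset(t, s):
    """True if every key of t is also a key of s."""
    return frozenset(t) <= frozenset(s)


def is_disjoint(t, s):
    """True if t and s share no key."""
    return frozenset(t).isdisjoint(s)


t = {1: "a", 2: "b", 3: "c"}
u = {2: "x", 3: "y", 4: "z"}
common = intersection(t, u)  # {2: 'b', 3: 'c'}
only_t = difference(t, u)    # {1: 'a'}
```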

Classmethods

  • from_keys(S[,v]) -> New tree with keys from S and values equal to v. (synonym fromkeys() exists)

Helper functions

  • bintrees.has_fast_tree_support() -> True if Cython extension is working else False (False = using pure Python implementation)

Installation

from source:

python setup.py install

or from PyPI:

pip install bintrees

Compiling the fast trees requires Cython; on Windows, a C compiler is also necessary.

Download Binaries for Windows

https://github.com/mozman/bintrees/releases

Documentation

this README.rst

bintrees can be found on GitHub.com at:

https://github.com/mozman/bintrees.git

bintrees's People

Contributors

graingert, mozman, samyaple, sciencectn

bintrees's Issues

contributing guidelines

I want to add some functionality. Would you be interested in a pull request? Or should I create a bintrees-extras package that depends on bintrees?

I want the option for ct_compare to cast strings to float before comparison so that strings that contain numerical data are sorted numerically rather than lexically.
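
The difference between the two orderings, and one possible workaround that needs no change to ct_compare, can be sketched in plain Python. This is only an illustration of the request. It does not touch the C comparison function, and the idea of converting keys to float before insertion is an assumption about what would satisfy the use case.

```python
keys = ["10", "9", "2.5", "100"]

# Lexical ordering compares character by character.
lexical = sorted(keys)             # ['10', '100', '2.5', '9']

# Numeric ordering compares the parsed values.
numeric = sorted(keys, key=float)  # ['2.5', '9', '10', '100']

# Workaround sketch: use float(key) as the container key and keep the
# original string as the value, so any sorted mapping orders numerically.
by_number = {float(k): k for k in keys}
in_numeric_order = [by_number[k] for k in sorted(by_number)]
```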

RBTree deepcopy issue

Hi!

I think I found a small issue in the RBTree deepcopy process.

I used the latest version with BUGFIX.

C:\Users\user>pip list
DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.
bintrees (2.0.6)
pip (9.0.1)
setuptools (28.8.0)

I tried to make a deepcopy, but I did not succeed.

C:\Users\user>python
Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 20:42:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import copy
>>> import bintrees
>>> t1 = bintrees.RBTree()
>>> t2 = copy.deepcopy(t1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\copy.py", line 174, in deepcopy
    y = copier(memo)
  File "C:\Python27\lib\site-packages\bintrees\abctree.py", line 234, in __deepcopy__
    self.foreach(_deepcopy, order=-1)
  File "C:\Python27\lib\site-packages\bintrees\abctree.py", line 658, in foreach
    _traverse(self._root)
  File "C:\Python27\lib\site-packages\bintrees\abctree.py", line 649, in _traverse
    func(node.key, node.value)
AttributeError: 'NoneType' object has no attribute 'key'
>>>

Can you help me with this issue?

Thank you in advance!
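
The traceback suggests that the preorder traversal dereferences the root node without first checking it for None, which fails on an empty tree. Below is a minimal sketch of the guarded pattern, using a hypothetical `TinyTree` class. It is an illustration only: it is neither the bintrees code nor its actual fix.

```python
import copy


class TinyTree:
    """Hypothetical minimal unbalanced tree; not the bintrees implementation."""

    def __init__(self):
        self._root = None  # each node is a list: [key, value, left, right]

    def insert(self, key, value):
        if self._root is None:
            self._root = [key, value, None, None]
            return
        node = self._root
        while True:
            if key == node[0]:
                node[1] = value
                return
            side = 2 if key < node[0] else 3
            if node[side] is None:
                node[side] = [key, value, None, None]
                return
            node = node[side]

    def foreach_preorder(self, func):
        def visit(node):
            if node is None:  # guard: covers the empty tree and missing children
                return
            func(node[0], node[1])
            visit(node[2])
            visit(node[3])

        visit(self._root)

    def __deepcopy__(self, memo):
        clone = TinyTree()
        # Preorder re-insertion reproduces the shape of an unbalanced tree.
        self.foreach_preorder(
            lambda k, v: clone.insert(copy.deepcopy(k, memo), copy.deepcopy(v, memo))
        )
        return clone

    def items(self):
        out = []
        self.foreach_preorder(lambda k, v: out.append((k, v)))
        return sorted(out)


empty_copy = copy.deepcopy(TinyTree())  # no AttributeError on an empty tree

original = TinyTree()
original.insert(2, ["b"])
original.insert(1, ["a"])
duplicate = copy.deepcopy(original)
```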

Thank you for supporting SortedContainers

Hi! I am the author of SortedContainers. I just wanted to thank you for your support! I have admired the bintrees project over the years and often consulted it regarding performance and API. Thank you for your contributions to Python.

Failed building wheel for bintrees

I always get this error when trying to install bintrees on a Mac with Anaconda. pip and conda are on their newest versions. Any ideas?

Command path/anaconda3/bin/python -u -c "import setuptools, tokenize;file='/private/var/folders/m9/v2wz_z356k732q0kl6rh2cdr0000gn/T/pip-build-8frgn_9p/bintrees/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /var/folders/m9/v2wz_z356k732q0kl6rh2cdr0000gn/T/pip-_6sz72j7-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/var/folders/m9/v2wz_z356k732q0kl6rh2cdr0000gn/T/pip-build-8frgn_9p/bintrees/

Don't mark it obsolete, speed tests for sortedcontainers are mistaken

Bintrees is substantially faster when inexact key or key-value search is used. The performance tests (reported at http://www.grantjenks.com/docs/sortedcontainers/performance.html) mostly ignore key ordering in the containers. Only the iter() test has something to do with ordering; the others could be served faster by any unsorted store such as a hash map. Moreover, the SortedDict API, which is based on absolute indexes and needs additional operations to retrieve the actual data, is conceptually misdesigned.

With inexact search for a ceiling key or key-value pair, bintrees performs much better. My results on a set of 100000 keys (FreeBSD 10.3/amd64, bintrees with Cython, Python 2.7.13):

$ ./bench
Time for AVLTree (key only): 3.88829088211
Time for RBTree (key only): 3.60424399376
Time for SortedDict (key only): 5.31183195114
Time for AVLTree (KVP): 3.37197518349
Time for RBTree (KVP): 3.34165000916
Time for SortedDict (KVP): 5.43365883827

So for this class of task, bintrees is roughly 1.4 to 1.6 times faster than SortedDict by these numbers.

The test code follows.


import time

import bintrees
import sortedcontainers

N = 100000 * 10

if __name__ == "__main__":
    avld = bintrees.avltree.AVLTree()
    rbd = bintrees.rbtree.RBTree()
    sd = sortedcontainers.SortedDict()
    # Insert only every 10th key, so most lookups below are inexact.
    for k in range(0, N, 10):
        avld[k] = rbd[k] = sd[k] = k * 11
    t0 = time.time()
    for k in range(N):
        try:
            t = avld.ceiling_key(k)
        except KeyError:
            pass
    t1 = time.time()
    print("Time for AVLTree (key only): %s" % (t1 - t0))
    t0 = time.time()
    for k in range(N):
        try:
            t = rbd.ceiling_key(k)
        except KeyError:
            pass
    t1 = time.time()
    print("Time for RBTree (key only): %s" % (t1 - t0))
    t0 = time.time()
    for k in range(N):
        try:
            # Ceiling lookup: bisect for the position, then fetch the key there.
            t = sd.iloc[sd.bisect_left(k)]
        except (KeyError, IndexError):
            pass
    t1 = time.time()
    print("Time for SortedDict (key only): %s" % (t1 - t0))
    t0 = time.time()
    for k in range(N):
        try:
            t = avld.ceiling_item(k)
        except KeyError:
            pass
    t1 = time.time()
    print("Time for AVLTree (KVP): %s" % (t1 - t0))
    t0 = time.time()
    for k in range(N):
        try:
            t = rbd.ceiling_item(k)
        except KeyError:
            pass
    t1 = time.time()
    print("Time for RBTree (KVP): %s" % (t1 - t0))
    t0 = time.time()
    for k in range(N):
        try:
            # Key-value pair: one extra lookup to get the value for the key.
            t = sd.iloc[sd.bisect_left(k)]
            t2 = sd[t]
        except (KeyError, IndexError):
            pass
    t1 = time.time()
    print("Time for SortedDict (KVP): %s" % (t1 - t0))
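
The speedup implied by the timings reported above can be worked out directly. This short sketch only does arithmetic on the six numbers quoted in the post; it does not rerun the benchmark, and the helper name `speedup` is hypothetical.

```python
# Timings in seconds, copied from the benchmark output quoted in the post.
timings = {
    "AVLTree (key only)": 3.88829088211,
    "RBTree (key only)": 3.60424399376,
    "SortedDict (key only)": 5.31183195114,
    "AVLTree (KVP)": 3.37197518349,
    "RBTree (KVP)": 3.34165000916,
    "SortedDict (KVP)": 5.43365883827,
}


def speedup(tree, mode):
    """How many times faster the given tree is than SortedDict in this mode."""
    baseline = timings["SortedDict (%s)" % mode]
    return baseline / timings["%s (%s)" % (tree, mode)]


ratios = {
    (tree, mode): round(speedup(tree, mode), 2)
    for tree in ("AVLTree", "RBTree")
    for mode in ("key only", "KVP")
}

for (tree, mode), ratio in sorted(ratios.items()):
    print("%s (%s): %.2fx faster than SortedDict" % (tree, mode, ratio))
```

All four ratios fall between about 1.3 and 1.7 on these numbers.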
