sprokopenko / concurrent-trees
Automatically exported from code.google.com/p/concurrent-trees
Currently,
Iterables.count(myTree.getValuesForClosestKeys(""))
can be used to count the number of keys/values in the radix tree.
This ticket is to add a size() method to the trees. This would simplify
counting, and may also be more efficient than calculating the size as above.
Note that calculating the size of a radix tree is an expensive operation with
O(n) time complexity. However, the method may be useful for debugging purposes.
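A minimal sketch of why size() is O(n), assuming a toy character trie rather than the library's own node classes: the count has to visit every node once. The class CountingTrie and its methods are hypothetical names used only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy character trie (NOT the library's node classes) illustrating an O(n) size(). */
final class CountingTrie<V> {
    private static final class Node<V> {
        final Map<Character, Node<V>> children = new TreeMap<>();
        V value; // null means that no key ends at this node
    }

    private final Node<V> root = new Node<>();

    /** Stores a non-null value under the given key, replacing any previous value. */
    void put(String key, V value) {
        Node<V> node = root;
        for (char c : key.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node<>());
        }
        node.value = value;
    }

    /** Visits every node exactly once, so the cost grows linearly with the node count. */
    int size() {
        return count(root);
    }

    private static <T> int count(Node<T> node) {
        int total = (node.value != null) ? 1 : 0;
        for (Node<T> child : node.children.values()) {
            total += count(child);
        }
        return total;
    }
}
```

Keeping a counter that is updated on every put and remove would make size() constant-time, but in a concurrent tree that counter becomes one more piece of shared state to keep consistent, which may be why a traversal is the simpler option for a debugging aid.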
Original issue reported on code.google.com by [email protected]
on 3 Dec 2013 at 10:45
Hello, unless I am confused about what to expect, I believe
getKeyValuePairsForKeysPrefixing() is not returning the correct information.
In fact, I believe it is returning "keys" that aren't even in the tree. It is
returning the values at a node, but not the full key that was stored.
What steps will reproduce the problem?
1. Run the attached TreeTest class.
Here is the output I am getting (using the most recent jar downloaded
yesterday):
**** Constructing new tree
Added key/value pair: /a/b/ -> 1
Added key/value pair: /a/blob/ -> 2
Added key/value pair: /a/blog/ -> 3
○
└── ○ /a/b
    ├── ○ / (1)
    └── ○ lo
        ├── ○ b/ (2)
        └── ○ g/ (3)
Keys prefixing /: {/, 1}
Keys prefixing /a/: {/, 1}
Keys prefixing /a/b/: {/a/b/, 1}
Keys prefixing /a/bl/:
Keys prefixing /a/blo/:
Keys prefixing /a/blob/: {/a/blob/, 2}
Keys prefixing /a/blog/: {/a/blog/, 3}
**** Constructing new tree
Added key/value pair: /a/b -> 1
Added key/value pair: /a/blob -> 2
Added key/value pair: /a/blog -> 3
○
└── ○ /a/b (1)
    └── ○ lo
        ├── ○ b (2)
        └── ○ g (3)
Keys prefixing /:
Keys prefixing /a:
Keys prefixing /a/b: {/a/b, 1}
Keys prefixing /a/bl: {/a/b, 1}
Keys prefixing /a/blo: {/a/b, 1}
Keys prefixing /a/blob: {/a/b, 1} {/a/blob, 2}
Keys prefixing /a/blog: {/a/b, 1} {/a/blog, 3}
It looks to me like the tree structure is correct, but
getKeyValuePairsForKeysPrefixing() is returning incorrect key/value pairs
for several inputs. For example, with the first tree in the example above:
Keys prefixing /: {/, 1} <- No key "/" stored; this is the node for /a/b/
Keys prefixing /a/: {/, 1} <- Ditto; no key / was stored
I am using concurrent-trees-2.1.0.jar on Fedora 17.
Original issue reported on code.google.com by [email protected]
on 5 Oct 2013 at 7:29
Deploy to Maven Central per:
https://docs.sonatype.org/display/Repository/Sonatype+OSS+Maven+Repository+Usage+Guide
Original issue reported on code.google.com by [email protected]
on 4 Jul 2012 at 2:46
Java's default UTF-16, 2-bytes-per-character string encoding, is inefficient
for strings which otherwise could be encoded with a single byte per character.
It should be possible to represent characters in the trees using only a single
byte per character, when working with compatible strings. This may reduce
the memory used for character data by up to 50%.
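A rough sketch of the idea, assuming "compatible" means every character has a code point of at most 255, so that it fits in one byte under ISO-8859-1. The class CompactKeys and its methods are hypothetical helpers, not part of the library.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper showing one-byte-per-character storage for compatible strings. */
final class CompactKeys {

    /** True if every char fits in one byte under ISO-8859-1 (code points 0..255). */
    static boolean isSingleByteCompatible(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    /** Encodes a compatible string as exactly one byte per character. */
    static byte[] encode(String s) {
        if (!isSingleByteCompatible(s)) {
            throw new IllegalArgumentException("not representable with one byte per character");
        }
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Restores the original string from its single-byte form. */
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }
}
```

A byte[] of length n holds the same characters as a char[] of length n in half the payload space. Object headers and child references in each node are unaffected, so the overall saving would be smaller than 50%.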
Original issue reported on code.google.com by [email protected]
on 20 Oct 2013 at 10:20
Expose an API to scan the input for keys stored in the tree which are prefixes
of the input.
See discussion in forum:
https://groups.google.com/forum/#!topic/concurrent-trees-discuss/_IpLEzNDFWs
Example: tree contains keys 123, 1234568, 1234569
Input: 12345690
API would return keys 123, 1234569.
This could be used for processing phone numbers.
This could be calculated in a single scan through the input, thus finding keys
which are prefixes of the input very quickly. This functionality is a subset of
InvertedRadixTree.getKeysContainedIn, and can use the same traversal algorithm.
Unit test demonstrating desired functionality:
@Test
public void testGetKeysPrefixing() throws Exception {
    ConcurrentInvertedRadixTree<Integer> tree = new ConcurrentInvertedRadixTree<Integer>(nodeFactory);
    tree.put("1234567", 1);
    tree.put("1234568", 2);
    tree.put("123", 3);
    // ○
    // └── ○ 123 (3)
    //     └── ○ 456
    //         ├── ○ 7 (1)
    //         └── ○ 8 (2)
    assertEquals("[123, 1234567]", Iterables.toString(tree.getKeysPrefixing("1234567")));
    assertEquals("[123, 1234567]", Iterables.toString(tree.getKeysPrefixing("12345670")));
    assertEquals("[123, 1234568]", Iterables.toString(tree.getKeysPrefixing("1234568")));
    assertEquals("[123, 1234568]", Iterables.toString(tree.getKeysPrefixing("12345680")));
    assertEquals("[123]", Iterables.toString(tree.getKeysPrefixing("1234569")));
    assertEquals("[123]", Iterables.toString(tree.getKeysPrefixing("123456")));
    assertEquals("[123]", Iterables.toString(tree.getKeysPrefixing("123")));
    assertEquals("[]", Iterables.toString(tree.getKeysPrefixing("12")));
    assertEquals("[]", Iterables.toString(tree.getKeysPrefixing("")));
}
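The single-scan idea described above can be sketched with a plain character trie. This is an illustration only, not the library's traversal code: PrefixScanTrie is a hypothetical class, it stores one character per node instead of compressed edges, and it ignores the empty key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical one-character-per-node trie; NOT the library's radix tree. */
final class PrefixScanTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isKey; // true if a stored key ends at this node
    }

    private final Node root = new Node();

    void put(String key) {
        Node node = root;
        for (int i = 0; i < key.length(); i++) {
            node = node.children.computeIfAbsent(key.charAt(i), c -> new Node());
        }
        node.isKey = true;
    }

    /** Returns the stored keys that are prefixes of input, shortest first, in one pass. */
    List<String> getKeysPrefixing(String input) {
        List<String> result = new ArrayList<>();
        Node node = root;
        for (int i = 0; i < input.length(); i++) {
            node = node.children.get(input.charAt(i));
            if (node == null) {
                break; // no stored key continues along this input
            }
            if (node.isKey) {
                result.add(input.substring(0, i + 1));
            }
        }
        return result;
    }
}
```

The walk follows exactly one path from the root and stops at the first character with no matching child, so the work is bounded by the length of the input rather than by the number of keys stored.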
Original issue reported on code.google.com by [email protected]
on 7 Aug 2013 at 9:03
It would be useful to support wildcard queries.
Two approaches to be investigated (both of which will be tracked in this issue):
(1) A permuterm index on top of the ConcurrentRadixTree. This would support
queries such as "<prefix>*<suffix>" on a single tree. It may be more memory
efficient than a hash-dictionary approach. See:
http://nlp.stanford.edu/IR-book/html/htmledition/permuterm-indexes-1.html
(2) A composite of a ConcurrentRadixTree and a ConcurrentReversedRadixTree. One
tree would support prefix lookup, the other suffix lookup. Query
"prefix*suffix" may return the intersection of the results from both trees,
after some post-filtering. This second approach, however, is close to the
territory of a query engine on top of multiple indexes, so if implemented it
would not belong in this project, but in http://code.google.com/p/cqengine/
Example usage for (1) would be:
public static void main(String[] args) {
    PermutermTree<Integer> tree = new ConcurrentPermutermTree<Integer>(new DefaultCharArrayNodeFactory());
    tree.put("TEST", 1);
    tree.put("TOAST", 2);
    tree.put("TEAM", 3);
    System.out.println("Keys matching 'T*T': " + Iterables.toString(tree.getKeysMatching("T", "T"))); // prefix, suffix
}
Output would be:
Keys matching 'T*T': [TOAST, TEST]
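To illustrate how approach (1) could answer such a query, here is a small sketch of a permuterm index. It uses a sorted map from java.util in place of a radix tree, and the class PermutermSketch is hypothetical. Each key gets a terminator appended and every rotation is indexed; a query "prefix*suffix" then becomes a single prefix lookup for suffix + terminator + prefix. The terminator '$' is assumed never to occur in keys.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical permuterm index backed by a sorted map instead of a radix tree. */
final class PermutermSketch {
    private static final char END = '$'; // assumption: never appears inside a key

    // Maps every rotation of key + END back to the original key.
    private final NavigableMap<String, String> rotations = new TreeMap<>();

    void put(String key) {
        String terminated = key + END;
        for (int i = 0; i < terminated.length(); i++) {
            rotations.put(terminated.substring(i) + terminated.substring(0, i), key);
        }
    }

    /** Returns keys of the form prefix + anything + suffix, in sorted order. */
    Set<String> getKeysMatching(String prefix, String suffix) {
        String probe = suffix + END + prefix;
        Set<String> matches = new TreeSet<>();
        // All rotations starting with the probe are contiguous in sorted order.
        for (Map.Entry<String, String> e : rotations.tailMap(probe, true).entrySet()) {
            if (!e.getKey().startsWith(probe)) {
                break;
            }
            matches.add(e.getValue());
        }
        return matches;
    }
}
```

With the keys TEST, TOAST and TEAM, the query ("T", "T") probes for "T$T" and finds the rotations "T$TES" and "T$TOAS", which map back to TEST and TOAST. This sketch returns matches in sorted order, which may differ from the order in the example output above. The cost is one index entry per character of each key, which is the memory trade-off the permuterm approach makes.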
Original issue reported on code.google.com by [email protected]
on 24 Mar 2013 at 10:19
The current implementation is not serializable. Loading a huge amount of data
at every start-up limits its usefulness. If the tree were serializable, we
could build it once and serialize the entire tree to disk. At start-up, we
would then only have to deserialize it to load the whole tree quickly.
Original issue reported on code.google.com by [email protected]
on 25 Mar 2013 at 2:29
Please add support for querying the longest prefix match (this is not currently
exposed in the public API). A variation of this which only matches if the key
is in fact truly a prefix would be useful, otherwise the caller would have to
run an additional prefix.startsWith(key). For example, if the tree only
contains "foo", it's the longest prefix match for anything. But in the use-case
I have, I would only want to match "foo*".
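A sketch of the stricter variation, where a result is returned only if a stored key is truly a prefix of the query. It is written against a plain HashSet so that it is self-contained; LongestPrefixSketch and longestKeyPrefixing are hypothetical names, not library API. A tree-based version would walk the query once, whereas this one simply tries each prefix from longest to shortest.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical longest-prefix lookup over a plain set of keys. */
final class LongestPrefixSketch {
    private final Set<String> keys = new HashSet<>();

    void put(String key) {
        keys.add(key);
    }

    /** Returns the longest stored key that is a prefix of query, or null if there is none. */
    String longestKeyPrefixing(String query) {
        for (int end = query.length(); end > 0; end--) {
            String candidate = query.substring(0, end);
            if (keys.contains(candidate)) {
                return candidate;
            }
        }
        return null; // no stored key is a true prefix of the query
    }
}
```

With only "foo" stored, a query such as "food" yields "foo", while "bar" yields null, so the caller needs no extra startsWith check.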
Original issue reported on code.google.com by phraktle
on 19 Nov 2012 at 10:33
In my opinion, a very useful method would be a boolean .contains() method, which
checks whether a query is contained in the tree. This should be similar to
.getKeysStartingWith(query), but stop as soon as it finds the first path
matching the query, and return true.
For example:
tree.put("TEST", 1);
tree.put("TOAST", 2);
tree.put("TEAM", 3);
tree.contains("TO") -> returns true.
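The early-exit behaviour requested here can be sketched with a sorted set from java.util standing in for the tree. PrefixContainsSketch and containsKeyStartingWith are hypothetical names. The check relies on the fact that, in sorted order, the smallest key greater than or equal to the query starts with the query whenever any key does.

```java
import java.util.TreeSet;

/** Hypothetical prefix-existence check over a sorted set of keys. */
final class PrefixContainsSketch {
    private final TreeSet<String> keys = new TreeSet<>();

    void put(String key) {
        keys.add(key);
    }

    /** True if at least one stored key starts with query; inspects a single candidate. */
    boolean containsKeyStartingWith(String query) {
        String candidate = keys.ceiling(query); // smallest key >= query, or null
        return candidate != null && candidate.startsWith(query);
    }
}
```

Only one candidate key is examined, so no full result set is built. With the existing API, checking whether tree.getKeysStartingWith(query).iterator().hasNext() returns true might give a similar result, though whether that stops early depends on how lazily the returned iterable is evaluated.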
Original issue reported on code.google.com by [email protected]
on 25 Sep 2014 at 1:34
What steps will reproduce the problem?
1. Create a ConcurrentSuffixTree
2. Insert some keys
3. Attempt to retrieve all keys by using .getKeysEndingWith("")
What is the expected output? What do you see instead?
I expect an iterable with all keys; I get null.
What version of the product are you using? On what operating system?
2.4.0 on Java 1.8.0_20
Please provide any additional information below.
.getKeysStartingWith("") and .getKeysEndingWith("") return all keys for
ConcurrentRadixTree and ConcurrentReversedRadixTree respectively.
Original issue reported on code.google.com by [email protected]
on 25 Oct 2014 at 9:30