thinkaurelius / titan

Distributed Graph Database

Home Page: http://titandb.io

License: Apache License 2.0

XSLT 0.03% Java 75.22% Shell 6.40% Groovy 0.80% HTML 0.12% Makefile 0.02% Ruby 17.02% Batchfile 0.39%

titan's Introduction

Titan is a highly scalable graph database optimized for storing and querying large graphs with billions of vertices and edges distributed across a multi-machine cluster. Titan is a transactional database that can support thousands of concurrent users, complex traversals, and analytic graph queries.

Learn More

The project homepage contains more information on Titan and provides links to documentation, getting-started guides and release downloads.

titan's People

Contributors

achinthagunasekara, astn, bdeggleston, bellingard, boliza, boorad, brennonyork, bryncooke, crisweber, dalaro, danielthomas, dkuppitz, dmill-bz, elubow, graben1437, haebin, joeferner, joshsh, mbroecheler, mikedias, mmcm, mpouttuclarke, okram, pluradj, spmallette, strongh, timwu, twilmes, vshchukin, xedin

titan's Issues

Repeatedly opening and closing the database using the Astyanax adapter leads to Socket Error

Verified that Titan does properly clean up all Astyanax resources and filed an issue with Astyanax:
Netflix/astyanax#105

The exception is:

6325301 [RetryService : 10.101.41.184(10.101.41.184):9160] DEBUG com.netflix.astyanax.thrift.ThriftConverter - Too many open files
6325301 [RetryService : 10.101.41.184(10.101.41.184):9160] ERROR org.apache.thrift.transport.TSocket - Could not configure socket.
java.net.SocketException: Too many open files
at java.net.Socket.createImpl(Socket.java:414)
at java.net.Socket.getImpl(Socket.java:477)
at java.net.Socket.setSoLinger(Socket.java:918)
at org.apache.thrift.transport.TSocket.initSocket(TSocket.java:116)
at org.apache.thrift.transport.TSocket.<init>(TSocket.java:107)
at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$1.open(ThriftSyncConnectionFactoryImpl.java:156)
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.reconnect(SimpleHostConnectionPool.java:314)
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.access$100(SimpleHostConnectionPool.java:59)
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool$1.run(SimpleHostConnectionPool.java:285)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)

getPropertyKey and getEdgeLabel should throw exceptions

Hello !

I noticed that when doing

tx.getPropertyKey("someKey");

or

tx.getEdgeLabel("someLabel");

Instead of throwing an IllegalArgumentException, as the documentation says, Titan silently creates the given type name and returns it.
This caused me a bit of trouble: admittedly it was my mistake to use getX() instead of getType(), but
I think Titan should have angrily thrown exceptions at my face :D

Thanks,
Dan
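
Editorial sketch of the distinction this report asks for: a lenient lookup that defines a missing type on the fly versus a strict lookup that throws. TypeRegistry, getOrCreate, getExisting and contains are hypothetical names invented for illustration and are not Titan's API; the sketch assumes only that type names map to some type object.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a type registry; not Titan's actual classes.
class TypeRegistry {
    private final Map<String, String> types = new HashMap<>();

    // Lenient lookup: silently defines the type when it is missing.
    String getOrCreate(String name) {
        return types.computeIfAbsent(name, n -> "type:" + n);
    }

    // Strict lookup: refuses to define anything implicitly.
    String getExisting(String name) {
        String type = types.get(name);
        if (type == null) {
            throw new IllegalArgumentException("Type is not defined: " + name);
        }
        return type;
    }

    // Reports whether a type with this name has been defined.
    boolean contains(String name) {
        return types.containsKey(name);
    }
}
```

With the strict variant, a mistyped key name fails at the call site instead of leaving a stray type definition behind.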

NPE in Mutation when deletions is null

An NPE is thrown in Mutation when deletions is null; however, deletions is always null when mutateIndex is called from addIndexEntry.

Stack trace:

java.lang.NullPointerException
at java.util.ArrayList.addAll(Unknown Source)
at com.thinkaurelius.titan.diskstorage.writeaggregation.Mutation.merge(Mutation.java:44)
at com.thinkaurelius.titan.diskstorage.writeaggregation.BufferStoreMutator.mutate(BufferStoreMutator.java:59)
at com.thinkaurelius.titan.diskstorage.writeaggregation.BufferStoreMutator.mutateIndex(BufferStoreMutator.java:45)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.addIndexEntry(StandardTitanGraph.java:906)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.persist(StandardTitanGraph.java:696)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.save(StandardTitanGraph.java:642)
at com.thinkaurelius.titan.graphdb.transaction.StandardPersistTitanTx.commit(StandardPersistTitanTx.java:171)
at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsTransaction.stopTransaction(TitanBlueprintsTransaction.java:30)
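
Editorial sketch of one way to make such a merge tolerate a missing list. The trace shows ArrayList.addAll being reached with a null argument; the class below treats null as "no entries" instead. SimpleMutation and its fields are hypothetical simplifications, not the actual Mutation class, and the String element type is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simplification of a mutation holding additions and deletions;
// the names and element type are assumptions, not Titan's code.
class SimpleMutation {
    private List<String> additions;
    private List<String> deletions;

    SimpleMutation(List<String> additions, List<String> deletions) {
        this.additions = additions;
        this.deletions = deletions;
    }

    // Merge another mutation into this one, treating null lists as empty.
    void merge(SimpleMutation other) {
        additions = mergeLists(additions, other.additions);
        deletions = mergeLists(deletions, other.deletions);
    }

    // Always returns a fresh, non-null list holding the entries of both inputs.
    private static List<String> mergeLists(List<String> first, List<String> second) {
        List<String> merged = new ArrayList<>();
        if (first != null) {
            merged.addAll(first);
        }
        if (second != null) {
            merged.addAll(second);
        }
        return merged;
    }

    List<String> getAdditions() {
        return additions;
    }

    List<String> getDeletions() {
        return deletions;
    }
}
```

An alternative with the same effect would be to have callers pass an empty list rather than null; which side should change is a design decision for the maintainers.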

Data Types of Key Indices

Consider making key indices more forgiving when it comes to data types. For instance:

gremlin> conf = new BaseConfiguration()
==>org.apache.commons.configuration.BaseConfiguration@3f72c47b
gremlin> conf.setProperty("storage.backend", "cassandra")
==>null
gremlin> conf.setProperty("storage.hostname", "127.0.0.1")
==>null
gremlin> g  = TitanFactory.open(conf)
==>standardtitangraph[cassandra]
gremlin> g.createKeyIndex("longIndex", Vertex.class)
==>null
gremlin> g.createKeyIndex("integerIndex", Vertex.class)
==>null
gremlin> g.stopTransaction(SUCCESS)
==>null
gremlin> v = g.addVertex(null)
==>v[4]
gremlin> v.setProperty("longIndex", 1000l)
==>null
gremlin> v.setProperty("integerIndex", 1000)
==>null
gremlin> g.stopTransaction(SUCCESS)
==>null
gremlin> g.V("longIndex", 1000)
gremlin> g.V("longIndex", 1000l)
==>v[4]
gremlin> g.V("integerIndex", 1000)
==>v[4]
gremlin> g.V("integerIndex", 1000l)
gremlin>

It might be good to just treat integers/longs as general numerics so that it doesn't create confusion for new users.
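
Editorial sketch of the suggestion above. In Java, Integer.valueOf(1000).equals(Long.valueOf(1000)) is false, so an index keyed on the boxed value distinguishes 1000 from 1000L, which is consistent with the transcript. One option is to widen the smaller integral types to Long before both writes and lookups. NumericKeyIndex is a hypothetical in-memory stand-in, not Titan's index implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory key index; illustrates widening only, not Titan's index code.
class NumericKeyIndex {
    private final Map<Object, String> entries = new HashMap<>();

    // Widen Byte, Short, and Integer to Long so 1000 and 1000L share one key.
    static Object normalize(Object value) {
        if (value instanceof Byte || value instanceof Short || value instanceof Integer) {
            return ((Number) value).longValue();
        }
        return value;
    }

    // Store a vertex id under the normalized property value.
    void put(Object value, String vertexId) {
        entries.put(normalize(value), vertexId);
    }

    // Look up by the normalized property value; returns null when absent.
    String get(Object value) {
        return entries.get(normalize(value));
    }
}
```

Floating-point values are deliberately left untouched here, since treating 1000 and 1000.0 as the same key is a separate decision.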

NPE on transaction commit

Hi !

I am currently trying Titan out (snapshot 0.1), and I can't commit any transactions I make.
My code goes like this:

    TitanTransaction t = g.startThreadTransaction();
    [getType for TitanKeys and TitanLabels]
    [adding vertices and edges]
    t.stopTransaction(Conclusion.SUCCESS);

(I've also tried t.commit(), in vain)

In the end I always get the following:

ERROR ContainerResponse - The RuntimeException could not be mapped to a response, re-throwing to the HTTP container
java.lang.NullPointerException
at com.thinkaurelius.titan.graphdb.transaction.StandardPersistTitanTx.isDeletedRelation(StandardPersistTitanTx.java:64)
at com.thinkaurelius.titan.graphdb.relations.factory.RelationFactoryUtil.connectRelation(RelationFactoryUtil.java:15)
at com.thinkaurelius.titan.graphdb.relations.factory.StandardPersistedRelationFactory.createExistingProperty(StandardPersistedRelationFactory.java:41)
at com.thinkaurelius.titan.graphdb.types.manager.StandardTypeFactory.createExistingType(StandardTypeFactory.java:37)
at com.thinkaurelius.titan.graphdb.types.manager.SimpleTypeManager.getType(SimpleTypeManager.java:149)
at com.thinkaurelius.titan.graphdb.types.manager.SimpleTypeManager.getType(SimpleTypeManager.java:168)
at com.thinkaurelius.titan.graphdb.transaction.AbstractTitanTx.getType(AbstractTitanTx.java:247)
at com.thinkaurelius.titan.graphdb.database.util.TypeSignature.parseTypes(TypeSignature.java:41)
at com.thinkaurelius.titan.graphdb.database.util.TypeSignature.<init>(TypeSignature.java:26)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.getSignature(StandardTitanGraph.java:856)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.getEntry(StandardTitanGraph.java:761)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.getEntry(StandardTitanGraph.java:703)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.persist(StandardTitanGraph.java:689)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.save(StandardTitanGraph.java:642)
at com.thinkaurelius.titan.graphdb.transaction.StandardPersistTitanTx.commit(StandardPersistTitanTx.java:171)
at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsTransaction.stopTransaction(TitanBlueprintsTransaction.java:30)

I am using a Cassandra backend, and had no problem creating types etc., although each time I try to use a transaction it gives me this error.

After looking at both stack trace and source, I think the NPE is caused by the " deletedEdges = null; " line before calling graphdb.save(...), in the TitanTransaction.commit() method. (although I only took a quick look so you might prove me wrong here).

Thanks for your time :)
(Please do not hesitate to ask for more details if I omitted important information)

IllegalArgument : invalid count for upper bound

Hello !

While doing some hard work on my graph which loads about 1M nodes and 15M edges in memory (well, causes Titan to keep those in cache, at least),
I encountered the following exception when trying to add a few nodes and edges:

java.lang.IllegalArgumentException: Invalid count for bound:4611686018427387903
        at com.thinkaurelius.titan.graphdb.idmanagement.IDManager.getEdgeID(IDManager.java:169)
        at com.thinkaurelius.titan.graphdb.database.idassigner.SimpleVertexIDAssigner.nextEdgeID(SimpleVertexIDAssigner.java:91)
        at com.thinkaurelius.titan.graphdb.database.idassigner.SimpleVertexIDAssigner.getNewID(SimpleVertexIDAssigner.java:52)
        at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.assignID(StandardTitanGraph.java:526)
        at com.thinkaurelius.titan.graphdb.transaction.AbstractTitanTx.registerNewEntity(AbstractTitanTx.java:111)
        at com.thinkaurelius.titan.graphdb.relations.factory.StandardPersistedRelationFactory.createNewRelationship(StandardPersistedRelationFactory.java:116)
        at com.thinkaurelius.titan.graphdb.transaction.AbstractTitanTx.addEdge(AbstractTitanTx.java:210)
    at (my program)

Do you have any idea of why this may be happening?
I'm okay with fixing it by myself if need be, but I don't really understand how I ended up with such a high value in the ID counter.

Not sure if it helps, but when I manually take a look at the titan_ids CF (I'm using a Cassandra backend), I can see the latest allocated blocks:

0 (Node) -> [2300001;2400001)
1 (Edge) -> [9900001;10000001)
2 (PropertyType) -> [1100001;1200001)

Thanks for your time :)
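
Editorial note with a small arithmetic check: the number in the exception message is exactly 2^62 - 1, which is nowhere near the ID blocks listed above (the largest ends around 10^7). That suggests, without proving it, that the printed value is a fixed upper limit rather than a counter that grew to that size. BoundCheck is a hypothetical name used only for this check.

```java
// Standalone arithmetic check; BoundCheck is a hypothetical name, not Titan code.
class BoundCheck {
    // Value copied from the exception message above.
    static final long REPORTED_BOUND = 4611686018427387903L;

    // True when the value is exactly 2^62 - 1, i.e. 62 one-bits.
    static boolean isMaxFor62Bits(long value) {
        return value == (1L << 62) - 1;
    }

    public static void main(String[] args) {
        System.out.println(isMaxFor62Bits(REPORTED_BOUND)); // prints "true"
        System.out.println(Long.bitCount(REPORTED_BOUND));  // prints "62"
    }
}
```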

Why standardtitangraph[...] as the toString().

In both embedded mode and over Cassandra, Titan's toString() is standardtitan. Can we simply name the graph class TitanGraph so it's consistent with other Blueprints implementations?

One-sided query constraints cause an exception

gremlin> hercules.outE('battled').has('time',T.gt,1).inV.name
0
Display stack trace? [yN] y
java.lang.ArrayIndexOutOfBoundsException: 0
at com.thinkaurelius.titan.graphdb.query.QueryUtil.hasFirstKeyConstraint(QueryUtil.java:54)
at com.thinkaurelius.titan.graphdb.loadingstatus.BasicLoadingStatus.loadedEdges(BasicLoadingStatus.java:57)
at com.thinkaurelius.titan.graphdb.vertices.PersistStandardTitanVertex.loadedEdges(PersistStandardTitanVertex.java:58)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.loadRelations(StandardTitanGraph.java:302)
at com.thinkaurelius.titan.graphdb.transaction.StandardPersistTitanTx.loadRelations(StandardPersistTitanTx.java:148)
at com.thinkaurelius.titan.graphdb.vertices.AbstractTitanVertex.ensureLoadedEdges(AbstractTitanVertex.java:88)
at com.thinkaurelius.titan.graphdb.vertices.StandardTitanVertex.getRelations(StandardTitanVertex.java:61)
at com.thinkaurelius.titan.graphdb.query.AtomicTitanQuery.edges(AtomicTitanQuery.java:436)
at com.thinkaurelius.titan.graphdb.query.ComplexTitanQuery.edges(ComplexTitanQuery.java:105)
at com.tinkerpop.gremlin.pipes.transform.QueryPipe.processNextStart(QueryPipe.java:80)
at com.tinkerpop.gremlin.pipes.transform.QueryPipe.processNextStart(QueryPipe.java:23)
at com.tinkerpop.pipes.AbstractPipe.next(AbstractPipe.java:81)
at com.tinkerpop.pipes.transform.IdentityPipe.processNextStart(IdentityPipe.java:20)
at com.tinkerpop.pipes.AbstractPipe.next(AbstractPipe.java:81)
at com.tinkerpop.gremlin.pipes.transform.EdgesVerticesPipe.processNextStart(EdgesVerticesPipe.java:37)
at com.tinkerpop.gremlin.pipes.transform.EdgesVerticesPipe.processNextStart(EdgesVerticesPipe.java:12)
at com.tinkerpop.pipes.AbstractPipe.next(AbstractPipe.java:81)
at com.tinkerpop.gremlin.pipes.transform.PropertyPipe.processNextStart(PropertyPipe.java:29)
at com.tinkerpop.pipes.AbstractPipe.hasNext(AbstractPipe.java:90)
at com.tinkerpop.pipes.util.Pipeline.hasNext(Pipeline.java:105)
at com.tinkerpop.gremlin.groovy.console.ResultHookClosure.call(ResultHookClosure.java:38)
at groovy.lang.Closure.call(Closure.java:425)
at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSite.invoke(PogoMetaMethodSite.java:226)
at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.call(PogoMetaMethodSite.java:64)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
at org.codehaus.groovy.tools.shell.Groovysh.setLastResult(Groovysh.groovy:324)
at org.codehaus.groovy.tools.shell.Groovysh.this$3$setLastResult(Groovysh.groovy)
at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:233)
at groovy.lang.MetaClassImpl.setProperty(MetaClassImpl.java:2384)
at groovy.lang.MetaClassImpl.setProperty(MetaClassImpl.java:3312)
at org.codehaus.groovy.tools.shell.Shell.setProperty(Shell.groovy)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.setGroovyObjectProperty(ScriptBytecodeAdapter.java:528)
at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:152)
at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:114)
at org.codehaus.groovy.tools.shell.Shell$leftShift$0.call(Unknown Source)
at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:88)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:233)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1047)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:128)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:148)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:100)
at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:267)
at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:52)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:137)
at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:57)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:233)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1047)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:128)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:148)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:66)
at com.tinkerpop.gremlin.groovy.console.Console.<init>(Console.java:50)
at com.tinkerpop.gremlin.groovy.console.Console.<init>(Console.java:57)
at com.tinkerpop.gremlin.groovy.console.Console.main(Console.java:62)
gremlin>

Limit applied client side for HBase storage adapter

Although getSlice applies columnStart and columnEnd filtering server-side, the total column count limit is only applied client-side. This limit is obviously better applied server-side. However, this might require deploying a Filter class to the HBase server(s), which makes deployment more annoying for prospective users.

Check for Iterables that also implement Iterator

For efficiency reasons, some Iterables also implemented Iterator and returned themselves. This is not the right behavior and needs to be fixed (i.e. the iterator needs to be an independent object).
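
Editorial sketch of the problem and the fix described above. When an Iterable returns itself from iterator(), every loop shares one cursor, so a second pass sees nothing. All class names here are hypothetical and stand in for whatever Titan classes were affected.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Anti-pattern: the Iterable is its own Iterator, so all loops share one cursor.
class SelfIterating implements Iterable<Integer>, Iterator<Integer> {
    private final List<Integer> items;
    private int pos = 0;

    SelfIterating(List<Integer> items) {
        this.items = items;
    }

    @Override
    public Iterator<Integer> iterator() {
        return this; // shared state: a second traversal starts where the first ended
    }

    @Override
    public boolean hasNext() {
        return pos < items.size();
    }

    @Override
    public Integer next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return items.get(pos++);
    }
}

// Fix: every call to iterator() returns an independent cursor.
class IndependentIterating implements Iterable<Integer> {
    private final List<Integer> items;

    IndependentIterating(List<Integer> items) {
        this.items = items;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int pos = 0;

            @Override
            public boolean hasNext() {
                return pos < items.size();
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return items.get(pos++);
            }
        };
    }
}

// Helper that counts the elements seen in one full traversal.
class IterationCounter {
    static int count(Iterable<Integer> source) {
        int n = 0;
        for (Integer ignored : source) {
            n++;
        }
        return n;
    }
}
```

Counting a SelfIterating over three elements twice yields 3 and then 0, while IndependentIterating yields 3 both times. The extra object per traversal is the cost of the fix.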

Loading millions of edges at once on a single vertex causes timeouts in Cassandra

When loading a million edges which are all incident on the same vertex into a TitanGraph backed by a Cassandra cluster with more than one node, a Cassandra internal timeout occurs which aborts the loading process.
This behavior is specific to having such a "supernode" with a lot of incident edges and loading all of these edges at once. Also, whether or not this behavior is observed depends on the hardware. For some systems, increasing the RPC timeout parameter solves the issue; on others it does not.

Titan against HBase in Gremlin shell hangs with Oracle JDK 7 on OS X

[19:39:39][~/dev/opensource/aurelius/titan/target/titan-0.1-SNAPSHOT-standalone/bin]$ ./gremlin.sh

         \,,,/
         (o o)
-----oOOo-(_)-oOOo-----
gremlin> conf = new BaseConfiguration()              
==>org.apache.commons.configuration.BaseConfiguration@1cbcfd
gremlin> conf.setProperty('storage.backend', 'hbase')
==>null
gremlin> conf.setProperty('storage.hostname', '127.0.0.1')
==>null
gremlin> g = TitanFactory.open(conf)
2012-08-25 19:40:43.554 java[27340:1c03] Unable to load realm info from SCDynamicStore
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.3-1240972, built on 02/06/2012 10:48 GMT
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:host.name=10.0.1.100
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:java.version=1.7.0_06
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.7.0_06.jdk/Contents/Home/jre
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:java.class.path=./../lib/activation-1.1.jar:./../lib/antlr-2.7.7.jar:./../lib/antlr-3.2.jar:./../lib/antlr-runtime-3.2.jar:./../lib/asm-3.2.jar:./../lib/asm-analysis-3.2.jar:./../lib/asm-commons-3.2.jar:./../lib/asm-tree-3.2.jar:./../lib/asm-util-3.2.jar:./../lib/astyanax-1.0.3.jar:./../lib/avro-1.4.0-cassandra-1.jar:./../lib/avro-ipc-1.5.3.jar:./../lib/blueprints-core-2.1.0.jar:./../lib/cassandra-all-1.1.0.jar:./../lib/cassandra-thrift-1.1.0.jar:./../lib/colt-1.2.0.jar:./../lib/commons-beanutils-1.7.0.jar:./../lib/commons-beanutils-core-1.8.0.jar:./../lib/commons-cli-1.2.jar:./../lib/commons-codec-1.4.jar:./../lib/commons-collections-3.2.1.jar:./../lib/commons-configuration-1.6.jar:./../lib/commons-digester-1.8.jar:./../lib/commons-el-1.0.jar:./../lib/commons-httpclient-3.1.jar:./../lib/commons-io-2.0.1.jar:./../lib/commons-lang-2.4.jar:./../lib/commons-logging-1.1.1.jar:./../lib/commons-math-2.1.jar:./../lib/commons-net-1.4.1.jar:./../lib/commons-pool-1.5.5.jar:./../lib/compress-lzf-0.8.4.jar:./../lib/concurrent-1.3.4.jar:./../lib/concurrentlinkedhashmap-lru-1.2.jar:./../lib/core-3.1.1.jar:./../lib/gremlin-groovy-2.1.0.jar:./../lib/gremlin-java-2.1.0.jar:./../lib/groovy-1.8.6.jar:./../lib/guava-12.0.jar:./../lib/hadoop-core-1.0.0.jar:./../lib/hbase-0.92.1.jar:./../lib/high-scale-lib-1.1.1.jar:./../lib/hsqldb-1.8.0.10.jar:./../lib/httpclient-4.0.1.jar:./../lib/httpcore-4.0.1.jar:./../lib/jackson-core-asl-1.8.5.jar:./../lib/jackson-jaxrs-1.5.5.jar:./../lib/jackson-mapper-asl-1.8.5.jar:./../lib/jackson-xc-1.5.5.jar:./../lib/jamm-0.2.5.jar:./../lib/jamon-runtime-2.3.1.jar:./../lib/jansi-1.5.jar:./../lib/jasper-compiler-5.5.23.jar:./../lib/jasper-runtime-5.5.23.jar:./../lib/javax.servlet-api-3.0.1.jar:./../lib/jaxb-api-2.1.jar:./../lib/jaxb-impl-2.2.3-1.jar:./../lib/je-4.0.92.jar:./../lib/jersey-core-1.4.jar:./../lib/jersey-json-1.11.jar:./../lib/jersey-server-1.4.jar:./../lib/jets3t-0.7.1.jar:./../lib/jettison-1.3.jar:./../lib/jetty-6.1.26.jar:./../lib/jetty-util-6.1.26.jar:./../lib/jline-0.9.94.jar:./../lib/joda-time-2.0.jar:./../lib/jruby-complete-1.6.5.jar:./../lib/json-simple-1.1.jar:./../lib/jsp-2.1-6.1.14.jar:./../lib/jsp-api-2.1-6.1.14.jar:./../lib/jsr305-1.3.9.jar:./../lib/kfs-0.3.jar:./../lib/kryo-1.04.jar:./../lib/libthrift-0.7.0.jar:./../lib/log4j-1.2.16.jar:./../lib/metrics-core-2.0.3.jar:./../lib/minlog-1.2.jar:./../lib/netty-3.2.4.Final.jar:./../lib/org.apache.servicemix.bundles.commons-csv-1.0-r706900_3.jar:./../lib/oro-2.0.8.jar:./../lib/pipes-2.1.0.jar:./../lib/protobuf-java-2.4.0a.jar:./../lib/reflectasm-1.01.jar:./../lib/rexster-core-2.1.0.jar:./../lib/servlet-api-2.5-6.1.14.jar:./../lib/servlet-api-2.5.jar:./../lib/slf4j-api-1.6.1.jar:./../lib/slf4j-log4j12-1.6.1.jar:./../lib/snakeyaml-1.6.jar:./../lib/snappy-java-1.0.4.1.jar:./../lib/snaptree-0.1.jar:./../lib/stax-api-1.0.1.jar:./../lib/stringtemplate-3.2.jar:./../lib/titan-0.1-SNAPSHOT.jar:./../lib/uuid-3.2.0.jar:./../lib/velocity-1.7.jar:./../lib/xmlenc-0.52.jar:./../lib/zookeeper-3.4.3.jar:.:
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/Users/saslani/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/var/folders/lm/6b9gd41n7sg8fv7pxf9hczj00000gn/T/
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:os.name=Mac OS X
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:os.arch=x86_64
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:os.version=10.7.4
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:user.name=saslani
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:user.home=/Users/saslani
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Client environment:user.dir=/Users/saslani/dev/opensource/aurelius/titan/target/titan-0.1-SNAPSHOT-standalone/bin
12/08/25 19:40:43 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=127.0.0.1:2181 sessionTimeout=180000 watcher=hconnection
12/08/25 19:40:43 INFO zookeeper.ClientCnxn: Opening socket connection to server /127.0.0.1:2181
12/08/25 19:40:43 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is [email protected]
12/08/25 19:40:43 INFO client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
12/08/25 19:40:43 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
12/08/25 19:40:43 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x139605342f50009, negotiated timeout = 40000
12/08/25 19:40:43 INFO client.HBaseAdmin: Started disable of titan
12/08/25 19:40:45 INFO client.HBaseAdmin: Disabled titan
12/08/25 19:40:50 INFO client.HBaseAdmin: Started enable of titan

...console hangs...

Config:

Titan 0.1
HBase 0.94

$ mvn -v
Apache Maven 3.0.4 (r1232337; 2012-01-17 02:44:56-0600)
Maven home: /usr/local/Cellar/maven/3.0.4/libexec
Java version: 1.7.0_06, vendor: Oracle Corporation
Java home: /Library/Java/JavaVirtualMachines/jdk1.7.0_06.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.7.4", arch: "x86_64", family: "mac"

The same commands work when the Gremlin shell and HBase are started against Apple's JDK 1.6.0_33

Outdated wiki information

Hi :)

This is low-priority, but I thought this was worth saying:

  • storage.transactions default value is true, while the wiki says it's false
  • flush-ids default value is true, while the wiki says it's false

(seen in titan/titan0.1 branch)

Do not use hyperbole in error messages.

I would avoid such "human exceptions":

Need to define particular key as indexed before it is being used!

The Blueprints style is to use no punctuation:

It is not possible to set a key as indexable once it has been used

Reconnecting to Titan From Gremlin

I'm not sure if this is a problem or not. It is an edge case for sure, one that likely wouldn't happen in the real world, but during testing and tearing stuff down it's an irritant, and I'm not sure whether it will manifest as a problem somewhere else, so...

gremlin> conf = new BaseConfiguration()
==>org.apache.commons.configuration.BaseConfiguration@301abf87
gremlin> conf.setProperty("storage.backend", "cassandra")
==>null
gremlin> conf.setProperty("storage.hostname", "127.0.0.1")
==>null
gremlin> g = TitanFactory.open(conf)
==>standardtitangraph[cassandra]
gremlin> g.createKeyIndex("someid", Vertex.class)
==>null
gremlin> g.stopTransaction(SUCCESS)
==>null
gremlin> v = g.addVertex(null)
==>v[4]
gremlin> v.setProperty("someid", 123)
==>null
gremlin> g.stopTransaction(SUCCESS)
==>null
gremlin> g.shutdown()
==>null

-- stop and restart cassandra

gremlin> g = TitanFactory.open(conf)
==>standardtitangraph[cassandra]
gremlin> g.shutdown()
==>null

-- stop cassandra, delete the cassandra related data directories and restart cassandra

gremlin> g = TitanFactory.open(conf)
Exception in storage backend.
Display stack trace? [yN] y
com.thinkaurelius.titan.core.GraphStorageException: Exception in storage backend.
        at com.thinkaurelius.titan.diskstorage.cassandra.thriftpool.UncheckedGenericKeyedObjectPool.borrowObject(UncheckedGenericKeyedObjectPool.java:44)
        at com.thinkaurelius.titan.diskstorage.cassandra.thriftpool.UncheckedGenericKeyedObjectPool.genericBorrowObject(UncheckedGenericKeyedObjectPool.java:74)
        at com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftOrderedKeyColumnValueStore.getSlice(CassandraThriftOrderedKeyColumnValueStore.java:157)
        at com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftOrderedKeyColumnValueStore.getSlice(CassandraThriftOrderedKeyColumnValueStore.java:192)
        at com.thinkaurelius.titan.diskstorage.util.OrderedKeyColumnValueIDManager.getIDBlock(OrderedKeyColumnValueIDManager.java:68)
        at com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftStorageManager.getIDBlock(CassandraThriftStorageManager.java:208)
        at com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool.renewBuffer(StandardIDPool.java:73)
        at com.thinkaurelius.titan.graphdb.database.idassigner.StandardIDPool.<init>(StandardIDPool.java:42)
        at com.thinkaurelius.titan.graphdb.database.idassigner.SimpleVertexIDAssigner.<init>(SimpleVertexIDAssigner.java:42)
        at com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration.getIDAssigner(GraphDatabaseConfiguration.java:308)
        at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.<init>(StandardTitanGraph.java:83)
        at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:60)
        at com.thinkaurelius.titan.core.TitanFactory$open.call(Unknown Source)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:42)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
        at groovysh_evaluate.run(groovysh_evaluate:39)
        at groovysh_evaluate$run.call(Unknown Source)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:42)
        at groovysh_evaluate$run.call(Unknown Source)
        at org.codehaus.groovy.tools.shell.Interpreter.evaluate(Interpreter.groovy:67)
        at org.codehaus.groovy.tools.shell.Interpreter$evaluate.call(Unknown Source)
        at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:152)
        at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:114)
        at org.codehaus.groovy.tools.shell.Shell$leftShift$0.call(Unknown Source)
        at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:88)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:233)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1047)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:128)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:148)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:100)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:267)
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:52)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:137)
        at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:57)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:233)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1047)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:128)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:148)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:66)
        at com.thinkaurelius.titan.util.gremlin.Console.<init>(Console.java:52)
        at com.thinkaurelius.titan.util.gremlin.Console.<init>(Console.java:59)
        at com.thinkaurelius.titan.util.gremlin.Console.main(Console.java:64)
Caused by: InvalidRequestException(why:Keyspace titan does not exist)
        at org.apache.cassandra.thrift.Cassandra$set_keyspace_result.read(Cassandra.java:4790)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
        at org.apache.cassandra.thrift.Cassandra$Client.recv_set_keyspace(Cassandra.java:480)
        at org.apache.cassandra.thrift.Cassandra$Client.set_keyspace(Cassandra.java:467)
        at com.thinkaurelius.titan.diskstorage.cassandra.thriftpool.CTConnectionFactory.makeObject(CTConnectionFactory.java:69)
        at org.apache.commons.pool.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:1190)
        at com.thinkaurelius.titan.diskstorage.cassandra.thriftpool.UncheckedGenericKeyedObjectPool.borrowObject(UncheckedGenericKeyedObjectPool.java:42)
        ... 58 more
gremlin>

Basically, you can't reconnect to Titan from the Gremlin console if Cassandra is reinitialized. The workaround I found is to exit Gremlin and start over. It doesn't seem right that Gremlin should throw an error here, since Gremlin shouldn't care whether Cassandra is empty or not. It certainly doesn't care the first time Cassandra and Gremlin are started.

Efficiency problems with Cassandra backend

Hello there,

I'm guessing this is a known issue, but still:

When using Cassandra through Thrift, each mutation is serialized and deserialized,
and the current version of the Cassandra backend creates one huge batch mutation containing everything.
This means that when I create 100k edges, 100k mutations are serialized in one go. That eats lots of RAM on both sides (the client while serializing, the server while deserializing), takes a long time to process on the Cassandra side (possibly hitting the internal query timeout), and adds latency.

I think there should be a limit (possibly a configurable one) to the number of mutations per Cassandra batch.
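A minimal sketch of the proposed limit, assuming the mutations are available as a plain list. `MutationBatcher`, `partition`, and `maxBatchSize` are hypothetical names for illustration and are not part of Titan's Cassandra backend; each resulting batch would be sent as its own, smaller request.

```java
import java.util.ArrayList;
import java.util.List;

final class MutationBatcher {
    /** Splits items into consecutive batches of at most maxBatchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int maxBatchSize) {
        if (maxBatchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxBatchSize) {
            int end = Math.min(start + maxBatchSize, items.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

With 100k mutations and a limit of, say, 1000, this yields 100 batches, which bounds the memory needed per request on both client and server.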

Thanks,
Dan

NoClassDefFoundError in bin/gremlin.sh

Running bin/gremlin.sh in cygwin causes a NoClassDefFoundError:

java.lang.NoClassDefFoundError: com/thinkaurelius/titan/tinkerpop/gremlin/Console
Caused by: java.lang.ClassNotFoundException: com.thinkaurelius.titan.tinkerpop.gremlin.Console
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: com.thinkaurelius.titan.tinkerpop.gremlin.Console. Program will exit.
Exception in thread "main"

This is apparently caused by using the wrong path separator (":") for Windows.
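For illustration only, here is how the separator difference looks from the Java side. A Windows JVM expects classpath entries joined with ";", while Unix-like systems use ":". `ClasspathBuilder` and the jar names in the usage note are hypothetical. This is not the fix to bin/gremlin.sh itself, which is a shell script.

```java
import java.io.File;
import java.util.List;

final class ClasspathBuilder {
    /** Joins classpath entries with an explicit separator. */
    static String join(List<String> entries, String separator) {
        return String.join(separator, entries);
    }

    /** Joins classpath entries with the separator the running JVM expects. */
    static String joinForCurrentPlatform(List<String> entries) {
        // File.pathSeparator is ";" on Windows and ":" on Unix-like systems.
        return join(entries, File.pathSeparator);
    }
}
```

Calling `ClasspathBuilder.join(Arrays.asList("lib/a.jar", "lib/b.jar"), ";")` produces the form a Windows JVM can parse, whereas joining with ":" would be read there as a single, nonexistent entry.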

Should say: "Unknown property key"

gremlin> jupiter.query().labels("knows").has("since",2011).has("stars",5).vertices()
Unknown property type: since

A property has keys and values, not types.

For certain id block sizes, BerkeleyJE fails test case

When the blocksize is set to be anywhere in [8191,65534] the TitanGraph.createAndRetrieveComprehensive test case fails. This id range seems to indicate a problem with ids in a certain bit range for BerkeleyDB. Edges are successfully written, but retrieving them by label fails (returns an empty result set from storage backend).

g.getIndexedKeys() throws NPE

This happens on a fresh instance of Cassandra and BerkeleyDB. The stack trace follows the examples below.

gremlin> g = TitanFactory.open('target/testdb')
==>standardtitangraph[local]
gremlin> g.getIndexedKeys()
java.lang.NullPointerException
Display stack trace? [yN]
gremlin> g.createKeyIndex('name',Vertex.class)
==>null
gremlin> g.getIndexedKeys()
java.lang.NullPointerException
Display stack trace? [yN]
gremlin> g.stopTransaction(SUCCESS)
==>null
gremlin> g.getIndexedKeys()
java.lang.NullPointerException
Display stack trace? [yN]


... create with BasicConfiguration to Cassandra.
gremlin> g.getIndexedKeys()
java.lang.NullPointerException
Display stack trace? [yN] y
java.lang.NullPointerException
at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsTransaction.getIndexedKeys(TitanBlueprintsTransaction.java:125)
at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsGraph.getIndexedKeys(TitanBlueprintsGraph.java:95)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoCachedMethodSite.invoke(PojoMetaMethodSite.java:189)
at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:53)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:42)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:112)
at groovysh_evaluate.run(groovysh_evaluate:39)
at groovysh_evaluate$run.call(Unknown Source)

In memory filtering for Neighborhood ID queries

Currently, a neighborhood vertex id query will cause all edges matching the query to be created in memory (and the neighborhood subsequently determined by following those edges) unless the query can be answered by disk indexes alone (i.e. all property constraints are keyed and in the correct order). While most queries will be covered by keys, it will nevertheless be a significant improvement to answer such neighborhood queries by doing a raw retrieval and filtering the ByteBuffers in memory.

Caching previously answered queries

Titan needs to keep track of what it has loaded in memory. As a trivial example, if we do two vertex.getEdges() calls in a row, we don't want the second one to hit the storage layer again loading all edges. Titan knows that it already loaded the edges and hence the second call only retrieves edges from main memory.

This method works for all edges and for EdgeQueries that ask for edges of a specific type, by having each vertex keep track of the types and directions of the edges that have already been loaded. However, for more complex EdgeQueries (e.g. those that have a property constraint), keeping track of all EdgeQueries that have already been loaded per vertex is too cumbersome and memory-intensive.

As an intermediate solution (between keeping track of all queries and none), we can keep track of the last X queries using a concurrent hash map with soft key references (and boolean values) storing EdgeQueries.
This requires that the method that checks whether the edges for a particular query have been loaded also consults the transaction and, likewise, that each new load adds the EdgeQuery to the map. That is, the vertex has to consult the tx and forward complex edge queries to the tx. This model adapts to memory requirements, since softly referenced keys can be reclaimed when memory runs low.
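A rough sketch of the "last X queries" bookkeeping. Note the substitution: instead of a concurrent map with soft key references, this uses a size-bounded, access-ordered `LinkedHashMap` behind a synchronized wrapper, so eviction is driven by a fixed capacity rather than by memory pressure. `RecentQueryCache` and its methods are hypothetical names, and the type parameter `Q` stands in for an EdgeQuery with proper `equals`/`hashCode`.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class RecentQueryCache<Q> {
    private final Map<Q, Boolean> loaded;

    RecentQueryCache(final int maxEntries) {
        // accessOrder=true keeps the least recently used entry first,
        // so removeEldestEntry evicts it once the capacity is exceeded.
        this.loaded = Collections.synchronizedMap(
            new LinkedHashMap<Q, Boolean>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Q, Boolean> eldest) {
                    return size() > maxEntries;
                }
            });
    }

    /** Records that the edges for this query have been loaded. */
    void markLoaded(Q query) {
        loaded.put(query, Boolean.TRUE);
    }

    /** Returns true if the query is still tracked; a hit also refreshes its recency. */
    boolean isLoaded(Q query) {
        return loaded.get(query) != null;
    }
}
```

A miss in `isLoaded` only means the query must go to the storage layer again, so evicting an entry costs performance but not correctness.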

Could we have the possibility to retry a failed transaction?

Hi :)

Currently, when a transaction throws on commit/stopTransaction because of a locking error, everything gets destroyed and if I want to try again later I have to redo all of my previous computations in a new transaction...

Would it be possible to have a tryCommitting method, or some way to backup the changes so that we can retry committing a changeset to Titan?
(I could do it manually... but it pretty much ruins everything and isn't very maintainable)
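The manual approach mentioned above could look roughly like the sketch below. It is an assumption-laden illustration, not a Titan API: `TransactionRetry` and `retry` are hypothetical names, and the supplied unit of work is expected to open, populate, and commit its own transaction on every call. It therefore re-runs the whole computation and does not preserve a changeset between attempts.

```java
import java.util.function.Supplier;

final class TransactionRetry {
    /**
     * Runs the unit of work up to maxAttempts times and returns the first
     * successful result. If every attempt fails, the last failure is rethrown.
     */
    static <T> T retry(Supplier<T> work, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.get();
            } catch (RuntimeException e) {
                // For example, a locking failure on commit; try again.
                last = e;
            }
        }
        throw last;
    }
}
```

A real version would probably catch only the locking exception type and wait between attempts; both are left out here to keep the sketch short.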

Thanks in advance,
Dan

Problem with using TitanGraph with more than one thread

Hi,
I'm getting an exception from TitanGraph when I use something like this:

TitanGraph graph = TitanFactory.open(GRAPH_PATH);
graph.createKeyIndex("name", Vertex.class); 

graph.getVertices("name", name).iterator() // in another thread

the Exception is

Exception in thread "Thread-1" java.lang.UnsupportedOperationException: getVertices only supports indexed keys since TitanGraph does not support vertex iteration outside the transaction
    at com.thinkaurelius.titan.graphdb.transaction.AbstractTitanTx.getVertices(AbstractTitanTx.java:467)
    at com.thinkaurelius.titan.graphdb.transaction.AbstractTitanTx.getVertices(AbstractTitanTx.java:443)
    at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsGraph.getVertices(TitanBlueprintsGraph.java:137)

It works fine when using TitanGraph in just one thread, but I don't really understand why it works that way. Am I missing something, or is it a bug?

Cheers

Release reserved id blocks

Currently, any id block that is reserved by a client cannot be released, that is, given back to the database so that it can reassign that block. This means that the unused ids are lost when a client is shut down.

There is already a method stub in StorageManager for id block release (commented out). Implementing this feature would require:

  • The StorageManager implementations put unused ids back into the common pool. This is easy. What is harder is using those ids again in the future. Strategy: With probability X scan the last Y entries in the id block table for the particular partition id. If there is a hole in that list, pick a hole randomly and return it as the id block. Otherwise, proceed like we currently do. The contract of the getIDBlock method is already such that it might return smaller blocks than requested.
  • IDPool implementations should release unused id blocks when closed (easy)
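The reuse strategy in the first bullet can be sketched in memory as follows. Everything here is a simplified assumption: blocks are identified by a running index, the "id block table" is a `HashSet` for a single partition, and `IdBlockAllocator`, `scanProbability` (X), and `scanWindow` (Y) are hypothetical names rather than Titan code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

final class IdBlockAllocator {
    private final Set<Long> claimed = new HashSet<>();
    private final double scanProbability; // "X" in the strategy above
    private final int scanWindow;         // "Y" in the strategy above
    private final Random random;
    private long nextBlock = 0;

    IdBlockAllocator(double scanProbability, int scanWindow, Random random) {
        this.scanProbability = scanProbability;
        this.scanWindow = scanWindow;
        this.random = random;
    }

    /** Returns a block index, reusing a released block from the scan window when chosen to scan. */
    synchronized long acquire() {
        if (random.nextDouble() < scanProbability) {
            // Collect holes among the last scanWindow block indexes.
            List<Long> holes = new ArrayList<>();
            for (long b = Math.max(0, nextBlock - scanWindow); b < nextBlock; b++) {
                if (!claimed.contains(b)) {
                    holes.add(b);
                }
            }
            if (!holes.isEmpty()) {
                long reused = holes.get(random.nextInt(holes.size()));
                claimed.add(reused);
                return reused;
            }
        }
        // No scan or no hole found: proceed as before with a fresh block.
        long fresh = nextBlock++;
        claimed.add(fresh);
        return fresh;
    }

    /** Puts an unused block back into the common pool. */
    synchronized void release(long block) {
        claimed.remove(block);
    }
}
```

In a distributed setting, claiming a hole would additionally need the same conflict handling as claiming a fresh block, which this single-process sketch does not model.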

Avoid the use of null in internal implementations

As discussed in issue #57, except where an external API contract requires (Blueprints for instance), consider:

  • Using Guava Optional or Objects.firstNonNull to guard against null and to return a defined absent value or empty collection, instead of relying on null. Where an external interface requires null, convert the Optional with orNull(): http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/base/Optional.html#orNull()
  • Use collections that do not allow null keys or values. There's still an opportunity for null to cause problems, given null can mean a non-existent key, a null key with a null value, or a null value with many default Java collections
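A small sketch of the pattern. To stay dependency-free it uses `java.util.Optional` as a stand-in for Guava's `Optional`; `orElse(null)` plays the role of Guava's `orNull()` at the boundary to an external API. `PropertyLookup` and its methods are hypothetical names for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

final class PropertyLookup {
    private final Map<String, String> values = new HashMap<>();

    /** Rejects null keys and values up front, so a null can never be stored. */
    void put(String key, String value) {
        values.put(Objects.requireNonNull(key, "key"),
                   Objects.requireNonNull(value, "value"));
    }

    /** Internal API: absence is explicit in the return type. */
    Optional<String> find(String key) {
        return Optional.ofNullable(values.get(key));
    }

    /** Boundary to an external contract that requires null for "not found". */
    String findOrNull(String key) {
        return find(key).orElse(null);
    }
}
```

Because `put` never stores null, a missing result from `find` can only mean that the key does not exist, which removes the ambiguity described in the second bullet.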

I have to say though, Titan does an excellent job generally about guarding against undesired state via Preconditions and assertions, but adopting patterns like these gives you those advantages for free, rather than requiring separate checks. See:

http://code.google.com/p/guava-libraries/wiki/UsingAndAvoidingNullExplained

Gremlin Configuration Helpers

Provide some helpers (mostly for Gremlin) to generate the BaseConfiguration object. It gets a bit tedious typing this:

Configuration conf = new BaseConfiguration();
conf.setProperty("storage.backend","cassandra");
conf.setProperty("storage.hostname","127.0.0.1");

Perhaps, just having something that could generate different BaseConfiguration objects with all the defaults for each of the different modes of operation for Titan. Or, maybe a 'create' method on TitanFactory in addition to 'open' ("create" in the sense of creating a configuration and opening the graph...might not be the right word) :

TitanGraph g = TitanFactory.create(Mode.CASSANDRA_LOCAL_SERVER)

kind of reminds me of the TinkerGraphFactory.createTinkerGraph() method that spins up the toy graph to play with.
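One possible shape for such a helper, sketched with a plain `Map` so that it needs no Commons Configuration dependency. `TitanConfigs` and `defaultsFor` are hypothetical names; the `Mode` constant and the two property keys are taken from the snippets above, and no other defaults are assumed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TitanConfigs {
    /** Modes of operation; only the one named in this issue is sketched. */
    enum Mode { CASSANDRA_LOCAL_SERVER }

    /** Returns the default settings for the given mode as key/value pairs. */
    static Map<String, String> defaultsFor(Mode mode) {
        Map<String, String> conf = new LinkedHashMap<>();
        switch (mode) {
            case CASSANDRA_LOCAL_SERVER:
                conf.put("storage.backend", "cassandra");
                conf.put("storage.hostname", "127.0.0.1");
                break;
            default:
                throw new IllegalArgumentException("Unsupported mode: " + mode);
        }
        return conf;
    }
}
```

The entries could then be copied into a BaseConfiguration with one `setProperty` call per pair before the graph is opened.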

IO Packages

Since g.V and g.E don't behave in Titan the way they do in other Blueprints implementations, the Gremlin options saveGraphML, saveGraphSON and saveGML won't work right, as the IO writers won't be able to simply iterate over all vertices and edges for output.

I suspect that's not a huge deal, considering that no one would be dumping billions of vertices/edges to these file formats. However, it's worth documenting or otherwise warning users of the limitation somehow.

Cassandra and Astyanax tests failing

Are there prerequisites that need to be met for these tests to succeed?

Tests in error: 
  testOpenClose(com.thinkaurelius.titan.graphdb.cassandra.InternalCassandraGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftStorageManager
  testMultipleDatabases(com.thinkaurelius.titan.graphdb.cassandra.InternalCassandraGraphTest): Could not instantiate storage manager class     com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftStorageManager
  testBasic(com.thinkaurelius.titan.graphdb.cassandra.InternalCassandraGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftStorageManager
  testTypes(com.thinkaurelius.titan.graphdb.cassandra.InternalCassandraGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftStorageManager
  testConfiguration(com.thinkaurelius.titan.graphdb.cassandra.InternalCassandraGraphTest): Could not instantiate storage manager class     com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftStorageManager
  testTransaction(com.thinkaurelius.titan.graphdb.cassandra.InternalCassandraGraphTest): Could not instantiate storage manager class     com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftStorageManager
  testCreateDelete(com.thinkaurelius.titan.graphdb.cassandra.InternalCassandraGraphTest): Could not instantiate storage manager class     com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftStorageManager
  testPropertyIndexPersistence(com.thinkaurelius.titan.graphdb.cassandra.InternalCassandraGraphTest): Could not instantiate storage manager class     com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftStorageManager
  testIndexRetrieval(com.thinkaurelius.titan.graphdb.cassandra.InternalCassandraGraphTest): Could not instantiate storage manager class     com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftStorageManager
  testTypeGroup(com.thinkaurelius.titan.graphdb.cassandra.InternalCassandraGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftStorageManager
  testCreateAndRetrieveComprehensive(com.thinkaurelius.titan.graphdb.cassandra.InternalCassandraGraphTest): Could not instantiate storage manager class    com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftStorageManager
  testQuery(com.thinkaurelius.titan.graphdb.cassandra.InternalCassandraGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftStorageManager
  createAndRetrieveMedium(com.thinkaurelius.titan.graphdb.cassandra.InternalCassandraGraphTest): Could not instantiate storage manager class     com.thinkaurelius.titan.diskstorage.cassandra.CassandraThriftStorageManager
  testOpenClose(com.thinkaurelius.titan.graphdb.astyanax.InternalAstyanaxGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.astyanax.AstyanaxStorageManager
  testMultipleDatabases(com.thinkaurelius.titan.graphdb.astyanax.InternalAstyanaxGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.astyanax.AstyanaxStorageManager
  testBasic(com.thinkaurelius.titan.graphdb.astyanax.InternalAstyanaxGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.astyanax.AstyanaxStorageManager
  testTypes(com.thinkaurelius.titan.graphdb.astyanax.InternalAstyanaxGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.astyanax.AstyanaxStorageManager
  testConfiguration(com.thinkaurelius.titan.graphdb.astyanax.InternalAstyanaxGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.astyanax.AstyanaxStorageManager
  testTransaction(com.thinkaurelius.titan.graphdb.astyanax.InternalAstyanaxGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.astyanax.AstyanaxStorageManager
  testCreateDelete(com.thinkaurelius.titan.graphdb.astyanax.InternalAstyanaxGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.astyanax.AstyanaxStorageManager
  testPropertyIndexPersistence(com.thinkaurelius.titan.graphdb.astyanax.InternalAstyanaxGraphTest): Could not instantiate storage manager class     com.thinkaurelius.titan.diskstorage.astyanax.AstyanaxStorageManager
  testIndexRetrieval(com.thinkaurelius.titan.graphdb.astyanax.InternalAstyanaxGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.astyanax.AstyanaxStorageManager
  testTypeGroup(com.thinkaurelius.titan.graphdb.astyanax.InternalAstyanaxGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.astyanax.AstyanaxStorageManager
  testCreateAndRetrieveComprehensive(com.thinkaurelius.titan.graphdb.astyanax.InternalAstyanaxGraphTest): Could not instantiate storage manager class     com.thinkaurelius.titan.diskstorage.astyanax.AstyanaxStorageManager
  testQuery(com.thinkaurelius.titan.graphdb.astyanax.InternalAstyanaxGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.astyanax.AstyanaxStorageManager
  createAndRetrieveMedium(com.thinkaurelius.titan.graphdb.astyanax.InternalAstyanaxGraphTest): Could not instantiate storage manager class com.thinkaurelius.titan.diskstorage.astyanax.AstyanaxStorageManager

Documentation should be for users, not developers.

I noticed that this is all the documentation that exists for HBase:

https://github.com/thinkaurelius/titan/wiki/Using-HBase

It is a section for "developers" and is about running a test suite. This is not sufficient as product documentation.

Documentation needs to be:

  1. About the user (developer documentation is secondary at best)
  2. Hand holding with plenty of toy code examples that take the user from starting Titan, creating a graph, and querying it.
  3. Rich with diagrams and images to make it interesting and engaging.

Thanks,
Marko.

Build fails on JDK 7 on OS X due to no snappyjava in java.library.path

$ mvn -v 
Apache Maven 3.0.4 (r1232337; 2012-01-17 02:44:56-0600)
Maven home: /usr/local/Cellar/maven/3.0.4/libexec
Java version: 1.7.0_06, vendor: Oracle Corporation
Java home: /Library/Java/JavaVirtualMachines/jdk1.7.0_06.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.7.4", arch: "x86_64", family: "mac"

$ mvn clean package
Running com.thinkaurelius.titan.diskstorage.astyanax.InternalAstyanaxKeyColumnValueTest
java.lang.reflect.InvocationTargetException
...
Caused by: java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860)
    at java.lang.Runtime.loadLibrary0(Runtime.java:845)
    at java.lang.System.loadLibrary(System.java:1084)
    at org.xerial.snappy.SnappyNativeLoader.loadLibrary(SnappyNativeLoader.java:52)
    ... 25 more

Workaround:

  1. Download libsnappyjava.jnilib from Google Code
  2. Copy to a directory on the java.library.path, e.g. /usr/lib/java/
  3. Rename to libsnappyjava.dylib per the snappy issue
  4. chmod 755 /usr/lib/java/libsnappyjava.dylib
  5. mvn clean package should now work
  6. ???
  7. Profit
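To find a suitable directory for step 2, the entries of java.library.path can be listed from the JVM itself. A minimal sketch; `LibraryPath` and its methods are hypothetical helper names.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class LibraryPath {
    /** Splits a raw path string on the given separator, skipping empty entries. */
    static List<String> parse(String raw, String separator) {
        List<String> dirs = new ArrayList<>();
        for (String dir : raw.split(Pattern.quote(separator))) {
            if (!dir.isEmpty()) {
                dirs.add(dir);
            }
        }
        return dirs;
    }

    /** Returns the directories the running JVM searches for native libraries. */
    static List<String> entries() {
        String raw = System.getProperty("java.library.path", "");
        return parse(raw, File.pathSeparator);
    }
}
```

Printing `LibraryPath.entries()` with the same JDK that runs Maven shows which directories are searched for the snappy native library.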

Querying Titan using Labels

Hi,

I would be interested in querying Titan for an Iterable(TitanEdge) using a TitanLabel as argument (finding all edges with the given type).

As I could not find a way to do it: is there currently any way to do it, and if not, can it be done?
(The way I understand things, Titan should be indexing my edge labels, and thus TitanLabel queries should be feasible.)

Thanks !
