Comments (29)
Hey @Dharun -- So I've written a Cassandra backend in https://github.com/barakmich/cayley/tree/cassandra_backend -- which does a bit more optimization, I think, than what you're describing.
Plus, I've had it kicking for a little while. It's at least as fast (usually faster) than Mongo.
I implore you to give it a shot -- if it works for you, I'll send it out for review and we can get into the nitty-gritty there.
from cayley.
@k0105 We decided to postpone Cassandra integration until we have a few architectural changes implemented in the currently supported backends. There is also ongoing work on a generic NoSQL backend to minimize the code needed to support new databases from this family. When we finish it, we will port the existing NoSQL backends to it (Mongo, GAE) and will probably add a few new backends like Cassandra. The generic backend for KV is nearly finished, meaning that work on the NoSQL layer can start in the near term.
from cayley.
Backend support was moved to a separate project. Cassandra will be supported once we finalize a tuple store interface there.
from cayley.
If my understanding is correct, the main problem with supporting things like Cassandra and DynamoDB is that there is no good/easy/efficient way to "get all the nodes" from the database. At least in the case of DynamoDB, "scans" are very expensive.
One solution may be to use memstore along with a key/value store for persistence.
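A minimal sketch of that idea, assuming a plain dict stands in for the key/value store. The class and method names are hypothetical and are not Cayley's memstore API. The store is read in full once at startup, and after that "get all the nodes" is answered from memory:

```python
class WriteThroughStore:
    """In-memory quad set backed by a key/value store (a plain dict stands in here)."""

    def __init__(self, kv):
        self.kv = kv
        # One full read at startup; later "all nodes" queries never touch the KV store.
        self.quads = set(kv.values())

    def add(self, quad):
        # Persist first, then update the in-memory copy.
        self.kv[repr(quad)] = quad
        self.quads.add(quad)

    def all_nodes(self):
        # Nodes are every subject and object seen so far.
        subjects = {q[0] for q in self.quads}
        objects = {q[2] for q in self.quads}
        return sorted(subjects | objects)
```

The trade-off is that the whole graph has to fit in memory, so this only moves the expensive scan to startup rather than removing it.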
from cayley.
My understanding is that Cassandra replicates all data to each node. So it's possible that newly added data will not be on all nodes at that moment, but under normal circumstances it will reach all nodes quickly.
Also, other graph databases use Cassandra and other backends like it. It's a well-proven storage system with built-in scaling.
from cayley.
Titan is a graph database built on top of Cassandra. Maybe it could help in such cases.
from cayley.
Titan has not been on a prosperous path for a long time, with no proper response from its team. Lots of bugs too.
MQL (Cayley) is easier to understand and code in than Gremlin (Titan).
from cayley.
I hear you..
from cayley.
I implemented a Cassandra backend today. Everything works, but I think Cayley is an awkward fit with Cassandra clustering keys. Here is a sample query to explain the issue:
g.V("Daniel Craig").Out("ACT").All()
Cayley first asks me for all quads containing ACT as a predicate and then asks me for all quads containing Daniel Craig, loading a potentially large dataset into memory. It then intersects the two result sets to find what they have in common and returns the result.
Theoretically, that could all have been a single Cassandra query with the proper schema:
SELECT object FROM quads WHERE subject = 'Daniel Craig' AND predicate = 'ACT'
I haven't been able to find a way to implement this inside Cayley. I did model it after the Mongo backend, which seems to be doing some hashing that may not be necessary in Cassandra, so I may be approaching this incorrectly.
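To make the difference concrete, here is a toy sketch in Python. It is not Cayley's or Cassandra's actual code; the data and function names are invented for illustration. The first function mirrors the two-scans-then-intersect pattern described above, and the second looks up a single compound key, roughly what a table keyed on subject and then predicate would allow:

```python
from collections import defaultdict

# Toy quads as (subject, predicate, object); the data is made up for illustration.
QUADS = [
    ("Daniel Craig", "ACT", "Skyfall"),
    ("Daniel Craig", "ACT", "Casino Royale"),
    ("Daniel Craig", "BORN_IN", "Chester"),
    ("Judi Dench", "ACT", "Skyfall"),
]


def out_by_intersection(quads, subject, predicate):
    """Two single-column scans, intersected in memory."""
    by_predicate = {q for q in quads if q[1] == predicate}
    by_subject = {q for q in quads if q[0] == subject}
    return sorted(q[2] for q in by_predicate & by_subject)


def build_sp_index(quads):
    """Index keyed on (subject, predicate), loosely analogous to a compound primary key."""
    index = defaultdict(list)
    for s, p, o in quads:
        index[(s, p)].append(o)
    return index


def out_by_compound_key(index, subject, predicate):
    """One keyed lookup; no intermediate result sets are materialized."""
    return sorted(index.get((subject, predicate), []))
```

Both functions return the same objects for ("Daniel Craig", "ACT"), but the first touches every quad that matches either column, while the second reads only the entries stored under that one key.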
from cayley.
A Cassandra backing store for Cayley sounds very useful, especially since other backends already exist for Cayley.
I am interested in modelling CFGs (computation flow graphs) and parametric equations with Cayley, and Cassandra would really help at large scale.
I did not find any Cassandra-specific MQL mapping layer in the branch. Can you give me more info so I can see where the code is at and pick it up?
from cayley.
Great to see there is a Cassandra backend in development. Thanks for your work on it, Barak. I am pulling down the branch now and will give it a run and come back with a review for you.
from cayley.
+1
from cayley.
I pulled the Cassandra backend code, but I cannot build it. The godep restore steps are not working because the folder structures are different. Can you publish instructions on how to build the binary?
I am not a Go coder, so any help will be greatly appreciated.
from cayley.
This backend can't be compiled against the newest version.
from cayley.
@jaiganeshvazhkudai what do you think about making a storage backend for http://www.scylladb.com/? It's 100% Cassandra compatible and has better performance.
See https://github.com/scylladb/scylla
from cayley.
ScyllaDB looks like the best news in ages. I definitely support this.
I am wondering how binary compatible the driver is?
from cayley.
Hello everyone,
Is there any ongoing work on this? My team (graduate students) and I are working with graph databases, but we need a distributed backend. If we get some direction and the current status of this, we can maybe help.
from cayley.
@mastayoda - I got Cayley working with Cassandra 3.0 as part of a test I was working on. It needed some very minor mods to get it to work, but I am no Go programmer; my skills lie in Java / Python.
I have put it up on my blog here for all to use:
http://www.sitelabs.com/?p=29
It should be enough for you and your team to start more work on this - hope it helps.
from cayley.
@thompson42 this is great! It will help a lot for sure. I will discuss this with my team and let you guys know what the roadmap will be if we choose Cayley as our codebase. This is for a big research project.
from cayley.
@mastayoda - your other alternative is TinkerPop / Titan Graph DB if you are looking for an excellent horizontally scalable graph DB backed by Cassandra.
Titan is a very large codebase, and in those terms I think if your aim is to hack away at the source code then Cayley might be quite a bit more approachable.
from cayley.
@thompson42, you are totally right. We have been evaluating TitanDB for months, but it is pretty buggy and the project is too quiet. The team is now owned by a company and they are building a new database that I guess is closed source. Our group is trying to build a nasty-fast and scalable graph database which supports both static and dynamic graphs. This obviously requires many things:
- Distributed backend: Cassandra, ScyllaDB, etc.
- Indexing capabilities, also distributed and highly available, such as Lucene for example.
- A powerful front end. The TinkerPop stack is amazing; we would love to have this with Cayley.
- A clean and modular graph indexing scheme. This is where Cayley looks great: it is clean and modular, especially for attaching many backends. We would build the indexing engine in this layer or in a new layer.
- A good interface for distributed processing, specifically Spark. We need to compute heavy and distributed things, such as PageRank for example.
As you can see, TitanDB has many of these things, but it is giving us a very hard time, even for importing data and indexing, and the codebase is too complicated for us to analyze everything, fix all the bugs, and then start implementing our own stuff.
On the other hand, Cayley has the quad store model, which I don't know will be appropriate for our purpose. Also, the Gremlin-inspired query language is nice, but too trivial in comparison with the TinkerPop stack. If we could have an interface to the Gremlin Server with Cayley, it would be killer.
There is a lot of work to be done, but our options are very limited, and it's hard to decide which direction to take.
from cayley.
I tried using ArangoDB in the meantime and I am impressed with what I was able to do with it this week.
from cayley.
@jaiganeshvazhkudai I just finished evaluating ArangoDB. I'm very impressed, to be honest. Thanks for pointing that out! I evaluated it months ago, but their new update looks beautiful.
from cayley.
Spent two days wrapping my head around the quirks of their AQL. But their Java driver is good enough to insert and retrieve vertices and edges.
from cayley.
This has been open for over 3 years now and for over one year on the "Later." list which states its entries should be implemented in about 6 months. Could anyone provide a brief status update? I don't want to complain, but am just genuinely interested in the current situation and roadmap. Also: GRAKN has Cassandra support - does it thus currently have an edge or how are they positioned with respect to Cayley? And what about ArangoDB which you mentioned? It seems awesome - any chance we'll see it supported in the near future?
Thanks in advance.
from cayley.
That sounds amazing. Thank you for letting me know.
from cayley.
Is there any update on this?
from cayley.
@dennwc so can this issue be closed / moved then?
from cayley.
This issue is now tracked here
from cayley.