Cassandra backend about cayley HOT 29 CLOSED

cayleygraph avatar cayleygraph commented on July 17, 2024
Cassandra backend

from cayley.

Comments (29)

barakmich avatar barakmich commented on July 17, 2024 3

Hey @Dharun -- So I've written a Cassandra backend in https://github.com/barakmich/cayley/tree/cassandra_backend -- which does a bit more optimization, I think, than what you're describing.

Plus, I've had it kicking for a little while. It's at least as fast (usually faster) than Mongo.

I implore you to give it a shot -- if it works for you, I'll send it out for review and we can get into the nitty-gritty there.

dennwc avatar dennwc commented on July 17, 2024 3

@k0105 We decided to postpone Cassandra integration until we have a few architectural changes implemented in the currently supported backends. There is also ongoing work on a generic NoSQL backend to minimize the code needed to support new databases from this family. When we finish it, we will port the existing NoSQL backends (Mongo, GAE) to it and will probably add a few new backends such as Cassandra. The generic backend for KV is nearly finished, meaning that work on the NoSQL layer can start in the near term.

dennwc avatar dennwc commented on July 17, 2024 1

Backend support was moved to a separate project. Cassandra will be supported once we finalize a tuple store interface there.

alimoeeny avatar alimoeeny commented on July 17, 2024

If my understanding is correct, the main problem with supporting things like Cassandra and DynamoDB is that there is no good/easy/efficient way to "get all the nodes" from the database. At least in the case of DynamoDB, "scans" are very expensive.
One solution may be to use memstore along with a key/value store for persistence.
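
A minimal Go sketch of that idea, under stated assumptions: the KV interface, mapKV, and WriteThroughStore below are hypothetical names invented for illustration and are not part of Cayley. Every write goes to the key/value store first, and the node set is kept in memory so that enumerating nodes never needs a backend scan.

```go
package main

import (
	"fmt"
	"sort"
)

// KV is a hypothetical minimal key/value interface; a real store such as
// Cassandra or DynamoDB would sit behind something like it.
type KV interface {
	Put(key, value string) error
}

// mapKV is an in-process stand-in for a remote key/value store.
type mapKV map[string]string

func (m mapKV) Put(key, value string) error {
	m[key] = value
	return nil
}

// WriteThroughStore persists every write and keeps the node set in memory.
type WriteThroughStore struct {
	nodes   map[string]struct{} // in-memory node set, cheap to enumerate
	backing KV                  // durable copy of every quad
}

func NewWriteThroughStore(backing KV) *WriteThroughStore {
	return &WriteThroughStore{nodes: map[string]struct{}{}, backing: backing}
}

// AddQuad persists the quad first, then updates the in-memory node set.
func (s *WriteThroughStore) AddQuad(subject, predicate, object string) error {
	key := subject + "\x00" + predicate + "\x00" + object
	if err := s.backing.Put(key, ""); err != nil {
		return err
	}
	s.nodes[subject] = struct{}{}
	s.nodes[object] = struct{}{}
	return nil
}

// AllNodes enumerates nodes from memory, so no backend scan is issued.
func (s *WriteThroughStore) AllNodes() []string {
	out := make([]string, 0, len(s.nodes))
	for n := range s.nodes {
		out = append(out, n)
	}
	sort.Strings(out)
	return out
}

func main() {
	s := NewWriteThroughStore(mapKV{})
	if err := s.AddQuad("Daniel Craig", "ACT", "Skyfall"); err != nil {
		panic(err)
	}
	fmt.Println(s.AllNodes())
}
```

The sketch leaves out reloading the in-memory set after a restart, which would still need one full read of the backing store, and it assumes the node set fits in memory.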

chrispassas avatar chrispassas commented on July 17, 2024

My understanding is that Cassandra replicates all data to each node. So it's possible that right after new data is added it will not yet be on all nodes, but under normal circumstances it will reach all nodes quickly.

Also, other graph databases use Cassandra and other backends like it. It's a well-proven storage system with built-in scaling.

Acconut avatar Acconut commented on July 17, 2024

Titan is a graph database built on top of Cassandra. Maybe it could help in such cases.

 avatar commented on July 17, 2024

Titan has not been on a prosperous path for a long time, with no proper response from their team. Lots of bugs too.

MQL (Cayley) is easier to understand and code in than Gremlin (Titan).

keremgocen avatar keremgocen commented on July 17, 2024

I hear you..

thdxr avatar thdxr commented on July 17, 2024

I implemented a Cassandra backend today. Everything works, but I think Cayley is an awkward fit with Cassandra clustering keys. Here is a sample query to explain the issue:

g.V("Daniel Craig").Out("ACT").All()

Cayley first asks me for all quads containing ACT as a predicate and then asks me for all quads containing Daniel Craig, loading a potentially large dataset into memory. It then intersects the two result sets to find what they have in common and returns the result.

Theoretically that could all have been a single Cassandra query with the proper schema:

SELECT object FROM quads WHERE subject = 'Daniel Craig' AND predicate = 'ACT'

I haven't been able to find a way to implement this inside Cayley. I did model it after the mongo backend which seems to be doing some hashing that may not be necessary in Cassandra so I may be approaching this incorrectly.
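
The difference between the two plans can be sketched in Go with plain in-memory maps. This is an illustration only, not Cayley's iterator code or a Cassandra driver call: Store, spKey, OutByIntersection, and OutByCompositeKey are hypothetical names, and the sample quads are made up for the demo. The composite map stands in for a table keyed on (subject, predicate).

```go
package main

import (
	"fmt"
	"sort"
)

// Quad is a simplified subject-predicate-object statement (label omitted).
type Quad struct {
	Subject, Predicate, Object string
}

// spKey is a hypothetical composite key, standing in for a table whose
// key covers both subject and predicate.
type spKey struct {
	Subject, Predicate string
}

// Store keeps two per-field indexes and one composite index.
type Store struct {
	quads       []Quad
	bySubject   map[string][]int   // subject -> positions in quads
	byPredicate map[string][]int   // predicate -> positions in quads
	bySP        map[spKey][]string // (subject, predicate) -> objects
}

func NewStore(quads []Quad) *Store {
	s := &Store{
		quads:       quads,
		bySubject:   map[string][]int{},
		byPredicate: map[string][]int{},
		bySP:        map[spKey][]string{},
	}
	for i, q := range quads {
		s.bySubject[q.Subject] = append(s.bySubject[q.Subject], i)
		s.byPredicate[q.Predicate] = append(s.byPredicate[q.Predicate], i)
		k := spKey{q.Subject, q.Predicate}
		s.bySP[k] = append(s.bySP[k], q.Object)
	}
	return s
}

// OutByIntersection loads both candidate sets and intersects them,
// mirroring the two-lookup plan described in the comment above.
func (s *Store) OutByIntersection(subject, predicate string) []string {
	inPredicate := make(map[int]bool, len(s.byPredicate[predicate]))
	for _, i := range s.byPredicate[predicate] {
		inPredicate[i] = true
	}
	var out []string
	for _, i := range s.bySubject[subject] {
		if inPredicate[i] {
			out = append(out, s.quads[i].Object)
		}
	}
	sort.Strings(out)
	return out
}

// OutByCompositeKey answers the same question with one keyed lookup.
func (s *Store) OutByCompositeKey(subject, predicate string) []string {
	out := append([]string(nil), s.bySP[spKey{subject, predicate}]...)
	sort.Strings(out)
	return out
}

func main() {
	s := NewStore([]Quad{
		{"Daniel Craig", "ACT", "Skyfall"},
		{"Daniel Craig", "ACT", "Casino Royale"},
		{"Daniel Craig", "BORN_IN", "Chester"},
		{"Judi Dench", "ACT", "Skyfall"},
	})
	fmt.Println(s.OutByIntersection("Daniel Craig", "ACT"))
	fmt.Println(s.OutByCompositeKey("Daniel Craig", "ACT"))
}
```

Both calls return the same objects, but the first touches every quad with predicate ACT while the second reads only the entries stored under the (subject, predicate) key.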

 avatar commented on July 17, 2024

A Cassandra backing store for Cayley sounds very useful, especially since other backends already exist for Cayley.

I am interested in modelling CFG (computation flow graphs) and parametric equations with Cayley, and Cassandra would really help at large scale.

I did not find any Cassandra-specific MQL mapping layer in the branch. Can you share more info so I can see where the code is at and pick it up?

thompson42 avatar thompson42 commented on July 17, 2024

Great to see there is a Cassandra backend in development. Thanks for your work on it, Barak. I am pulling down the branch now and will give it a run and come back with a review for you.

payneio avatar payneio commented on July 17, 2024

+1

jaiganeshvazhkudai avatar jaiganeshvazhkudai commented on July 17, 2024

I pulled the Cassandra backend code, but I cannot build it. The godep restore steps are not working because the folder structures are different. Can you publish instructions on how to build the binary?
I am not a Go coder, so any help will be greatly appreciated.

ustczen avatar ustczen commented on July 17, 2024

This backend can't be compiled against the newest version.

jcmartins avatar jcmartins commented on July 17, 2024

@jaiganeshvazhkudai what do you think about making a storage backend for http://www.scylladb.com/? It's 100% Cassandra compatible and has better performance.
See https://github.com/scylladb/scylla

 avatar commented on July 17, 2024

ScyllaDB looks like the best news in ages. I definitely support this.
I am wondering how binary compatible the driver is.

vsantosu avatar vsantosu commented on July 17, 2024

Hello everyone,

Is there any ongoing work on this? My team (graduate students) and I are working with graph databases, but we need a distributed backend. If we have some direction and the current status of this, we can maybe help.

thompson42 avatar thompson42 commented on July 17, 2024

@mastayoda - I got Cayley working with Cassandra 3.0 as part of a test I was working on. It needed some very minor mods to get it to work, but I am no Go programmer; my skills lie in Java / Python.

I have put it up on my blog here for all to use:
http://www.sitelabs.com/?p=29

It should be enough for you and your team to start more work on this - hope it helps.

vsantosu avatar vsantosu commented on July 17, 2024

@thompson42 this is great! It will help a lot for sure. I will discuss this with my team and let you guys know what the roadmap will be if we choose Cayley as our codebase. This is for a big research project.

thompson42 avatar thompson42 commented on July 17, 2024

@mastayoda - your other alternative is TinkerPop / Titan Graph DB if you are looking for an excellent horizontally scalable graph DB backed by Cassandra.

Titan is a very large codebase, and in those terms I think that if your aim is to hack away at the source code then Cayley might be quite a bit more approachable.

vsantosu avatar vsantosu commented on July 17, 2024

@thompson42, you are totally right. We have been evaluating TitanDB for months, but it is pretty buggy and the project is too quiet. The team is now owned by a company and they are building a new database that I guess is closed source. Our group is trying to build a nasty-fast and scalable graph database which supports both static and dynamic graphs. This obviously requires many things:

  1. Distributed backend: Cassandra, ScyllaDB, etc.
  2. Indexing capabilities, also distributed and highly available, such as Lucene for example.
  3. A powerful front end. The TinkerPop stack is amazing; we would love to have this with Cayley.
  4. A clean and modular graph indexing scheme. This is where Cayley looks great: it is clean and modular, especially for attaching many backends. We would build the indexing engine in this layer or in a new layer.
  5. A good interface for distributed processing, specifically Spark. We need to compute heavy and distributed things, such as PageRank for example.

As you can see, TitanDB has many of these things, but it is giving us a very hard time, even for importing data and indexing, and the codebase is too complicated for us to analyze everything, fix all the bugs, and then start implementing our stuff.

On the other hand, Cayley has the quad store model, which I don't know will be appropriate for our purpose. Also, the Gremlin-inspired query language is nice, but too trivial in comparison with the TinkerPop stack. If we could have an interface to the Gremlin Server with Cayley, it would be killer.

There is a lot of work to be done, but our options are very limited, and it's hard to decide which direction to take.

jaiganeshvazhkudai avatar jaiganeshvazhkudai commented on July 17, 2024

I tried using ArangoDB meanwhile and I am impressed with what I was able to do with it this week.

On Mar 31, 2016, at 7:44 PM, thompson42 wrote:

@mastayoda - your other alternative is TinkerPop / Titan Graph DB if you are looking for an excellent horizontally scalable graph DB backed by Cassandra.

Titan is a very large codebase, and in those terms I think that if your aim is to hack away at the source code then Cayley might be quite a bit more approachable.


vsantosu avatar vsantosu commented on July 17, 2024

@jaiganeshvazhkudai I just finished evaluating ArangoDB, and I'm very impressed to be honest. Thanks for pointing that out! I evaluated it months ago, but their new update looks beautiful.

jaiganeshvazhkudai avatar jaiganeshvazhkudai commented on July 17, 2024

Spent two days wrapping my head around the quirks of their AQL, but their Java driver is good enough to insert and retrieve vertices and edges.

On Apr 1, 2016, at 2:05 PM, Victor O. Santos Uceta wrote:

@jaiganeshvazhkudai I just finished evaluating ArangoDB, and I'm very impressed to be honest. Thanks for pointing that out! I evaluated it months ago, but their new update looks beautiful.


k0105 avatar k0105 commented on July 17, 2024

This has been open for over 3 years now, and for over one year on the "Later." list, which states that its entries should be implemented in about 6 months. Could anyone provide a brief status update? I don't want to complain; I am just genuinely interested in the current situation and roadmap. Also: GRAKN has Cassandra support. Does it thus currently have an edge, or how is it positioned with respect to Cayley? And what about ArangoDB, which you mentioned? It seems awesome. Any chance we'll see it supported in the near future?

Thanks in advance.

k0105 avatar k0105 commented on July 17, 2024

That sounds amazing. Thank you for letting me know.

hashgupta avatar hashgupta commented on July 17, 2024

Is there any update on this?

iddan avatar iddan commented on July 17, 2024

@dennwc so can this issue be closed / moved then?

iddan avatar iddan commented on July 17, 2024

This issue is now tracked here
