
Postgres backend! (cayley, closed)

cayleygraph commented on July 30, 2024
Postgres backend!

from cayley.

Comments (21)

einthusan commented on July 30, 2024

@laprice lol seriously.. is that benchmark a fake? Quite surprised to see you say that, I'm guessing you have done your bit of homework on this? Can you share.. thanks!

barakmich commented on July 30, 2024

Would love it. I talked with @erikstmartin about this and he was thinking about taking a whack at it. If we want to discuss it here, that'd be cool.

Also, sorry Erik, for the restructure, but I think you'll agree it's the gopher thing to do.

erikstmartin commented on July 30, 2024

I'd definitely love to discuss a Postgres backend more. While native JSON documents seem appealing, I'd love to come up with a backend that's based on database/sql to make it more general, with the potential of being used by other SQL databases as well.

@barakmich I'm definitely cool with the restructure, I was actually going to do some refactoring to make things more idiomatic as well :)

pbnjay commented on July 30, 2024

I've done an initial hack at a Postgres backend: pbnjay/cayley@365ef70. Create the DB and then run ./cayley http -db=postgres -dbpath="dbname=cayley user=pbnjay"

I could definitely use feedback / tips / comments. I need to tune the schema and/or indexes, and borrow the IDLru from the mongo implementation, but it's usable. Also, loading isn't done in bulk yet, so it took about 30 minutes to import the movie data. I'll probably use COPY and disable the constraints during the load to speed that up.

I intend to put a good deal of effort into this for a work project, so I'm more than happy to do what's necessary to get it merged eventually.
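
As a rough illustration of the bulk-loading idea: short of COPY, one way to cut per-row overhead is to batch several triples into a single multi-row INSERT. The sketch below only builds the statement text and its argument list. The `triples` table, its column names, and the `Triple` struct are assumptions for illustration, not the schema in the linked commit.

```go
package main

import (
	"fmt"
	"strings"
)

// Triple is a hypothetical row shape; the real schema may differ.
type Triple struct {
	Subject, Predicate, Object, Provenance string
}

// buildBulkInsert returns one INSERT statement with a ($n,$n+1,...) group
// per triple, plus the flattened arguments in matching order.
func buildBulkInsert(ts []Triple) (string, []interface{}) {
	if len(ts) == 0 {
		return "", nil
	}
	groups := make([]string, 0, len(ts))
	args := make([]interface{}, 0, len(ts)*4)
	for i, t := range ts {
		n := i * 4
		groups = append(groups, fmt.Sprintf("($%d,$%d,$%d,$%d)", n+1, n+2, n+3, n+4))
		args = append(args, t.Subject, t.Predicate, t.Object, t.Provenance)
	}
	query := "INSERT INTO triples (subj, pred, obj, prov) VALUES " + strings.Join(groups, ",")
	return query, args
}

func main() {
	q, args := buildBulkInsert([]Triple{
		{"alice", "follows", "bob", ""},
		{"bob", "follows", "carol", ""},
	})
	fmt.Println(q)
	fmt.Println(len(args))
}
```

The result could be passed to `db.Exec(query, args...)` from database/sql. Large inputs would need to be split into chunks, since a single statement can only carry a limited number of bind parameters.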

barakmich commented on July 30, 2024

Woah! Very neat. I'm about to go to sleep but wanted to comment first to say awesome and I'll give it a run in the morning.

How did you find writing an interface up to this point? Things you thought were clunky or odd could certainly help improve things in the future!

jonathanmarvens commented on July 30, 2024

@pbnjay

Well, that was quick :) ! I'm definitely gonna play with it in a few hours. Thanks for hacking on this!

- Jonathan

jonathanmarvens commented on July 30, 2024

@pbnjay

BTW, nice username ... wanna guess why I'm in the kitchen right now ;p ?

- Jonathan

pbnjay commented on July 30, 2024

@barakmich just wrote up some initial comments / suggestions in #38 - nothing super painful though. Once I get a better handle on Iterator I might have some ideas there too (it seems to be acting as an iterator and a set?).

@jonathanmarvens What can I say I've written a lot of Go/Postgres code lately! ...and do a lot of work on graphs. This is like the first open-source project that squarely hits my particular niche. That said, my code has a long way to go...

barakmich commented on July 30, 2024

Yeah, that makes sense. Thoughts on the best way to register when compiled in? (Perhaps init()?)
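
A minimal sketch of the init() idea, modeled on how database/sql drivers register themselves: each backend adds a constructor to a shared registry when its package is compiled in. All names here (`Register`, `NewTripleStore`, `TripleStore`, `postgresStore`) are hypothetical and not cayley's actual API.

```go
package main

import (
	"fmt"
	"sync"
)

// TripleStore is a stand-in for the real backend interface.
type TripleStore interface {
	Name() string
}

// NewFunc builds a store from a backend-specific path or connection string.
type NewFunc func(dbpath string) (TripleStore, error)

var (
	mu       sync.RWMutex
	registry = map[string]NewFunc{}
)

// Register makes a backend available under name. It panics on duplicates,
// mirroring the behaviour of sql.Register.
func Register(name string, f NewFunc) {
	mu.Lock()
	defer mu.Unlock()
	if _, dup := registry[name]; dup {
		panic("backend already registered: " + name)
	}
	registry[name] = f
}

// NewTripleStore looks up a registered backend and constructs it.
func NewTripleStore(name, dbpath string) (TripleStore, error) {
	mu.RLock()
	f, ok := registry[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown backend %q", name)
	}
	return f(dbpath)
}

// postgresStore would live in its own package in a real layout.
type postgresStore struct{ dsn string }

func (p *postgresStore) Name() string { return "postgres" }

// init runs when the package is linked in, so importing it is enough.
func init() {
	Register("postgres", func(dbpath string) (TripleStore, error) {
		return &postgresStore{dsn: dbpath}, nil
	})
}

func main() {
	ts, err := NewTripleStore("postgres", "dbname=cayley")
	fmt.Println(ts.Name(), err)
}
```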

Iterator does act as an iterator and a set, kind of. It's a set of all paths that could possibly get to this location. Sometimes those are the same as a set of nodes, sometimes not.

Looking over your code, there are a couple immediate wins you can have.

A big one is to create a struct to pass around as your TSVal -- by keeping a little more state -- effectively, make a triple struct with a few ID fields -- you can kill a whole bunch of queries generated from https://github.com/pbnjay/cayley/blob/365ef7096c729601e31c07a5a66db371061d71fe/graph/postgres/triplestore.go#L175
Another thing to do sooner rather than later (because it makes playing with it painful) is to fill your TODO of multi-INSERT or COPY FROM.
An LRU will then be the next helpful part.
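
A sketch of the struct suggestion, under assumed names: if the value a triple iterator hands back already carries the IDs of its subject, predicate and object, asking for one direction becomes a field read rather than another query. `TripleVal`, its fields, and the `Direction` constants are illustrative and not the types in the linked file.

```go
package main

import "fmt"

// Direction names one position of a triple (hypothetical constants).
type Direction int

const (
	Subject Direction = iota
	Predicate
	Object
)

// TripleVal is an opaque value a backend could return as its TSVal for a
// triple. Keeping the node IDs next to the row ID avoids a lookup query
// each time a caller asks for one end of the triple.
type TripleVal struct {
	ID     int64 // row ID of the triple itself
	SubjID int64
	PredID int64
	ObjID  int64
}

// NodeID returns the node ID stored for direction d without touching the DB.
func (t TripleVal) NodeID(d Direction) (int64, bool) {
	switch d {
	case Subject:
		return t.SubjID, true
	case Predicate:
		return t.PredID, true
	case Object:
		return t.ObjID, true
	}
	return 0, false
}

func main() {
	v := TripleVal{ID: 7, SubjID: 1, PredID: 2, ObjID: 3}
	id, ok := v.NodeID(Object)
	fmt.Println(id, ok)
}
```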

pbnjay commented on July 30, 2024

I added a (tweaked) LRU cache which helps somewhat, and deferred constraints during AddTripleSet. Movie data now loads in under 10 mins for me. At least you only need to load data once!

Also fixed all the dropped cursors. I'll definitely work on the TSVal tweaks you mention next. Cheers
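
For reference, a small LRU of the kind discussed here, mapping node names to database IDs, can be built on container/list. This is a generic sketch under assumed names (`IDCache`, `Get`, `Put`), not the IDLru from the mongo backend or the tweaked version mentioned above, and it is not safe for concurrent use.

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct {
	key string
	id  int64
}

// IDCache is a fixed-size least-recently-used cache from name to ID.
type IDCache struct {
	max   int
	order *list.List               // front = most recently used
	items map[string]*list.Element // key -> element holding *entry
}

func NewIDCache(max int) *IDCache {
	return &IDCache{max: max, order: list.New(), items: make(map[string]*list.Element)}
}

// Get returns the cached ID and marks the key as recently used.
func (c *IDCache) Get(key string) (int64, bool) {
	el, ok := c.items[key]
	if !ok {
		return 0, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).id, true
}

// Put inserts or updates a key, evicting the least recently used entry
// when the cache grows past its capacity.
func (c *IDCache) Put(key string, id int64) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).id = id
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key: key, id: id})
	if c.order.Len() > c.max {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	c := NewIDCache(2)
	c.Put("alice", 1)
	c.Put("bob", 2)
	c.Get("alice")    // alice is now the most recently used key
	c.Put("carol", 3) // over capacity, so bob is evicted
	_, ok := c.Get("bob")
	fmt.Println(ok)
}
```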

pbnjay commented on July 30, 2024

@barakmich
Definitely running into some hiccups now wrt documenting interface contracts.

First, Next() and Check() must set it.Last on success, and they must NOT set it.Last=nil on failure. Tags rely on this value, so I didn't run into the problem in my first version of the implementation, whose simple test cases had no tags.
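
This first contract can be illustrated with a toy iterator. The type below is a simplified stand-in, not cayley's Iterator interface: Next and Check update Last only on success and leave it untouched on failure, so code that reads Last afterwards still sees the most recent match.

```go
package main

import "fmt"

// sliceIterator walks a fixed list of values. Last holds the most recent
// value produced by a successful Next or Check.
type sliceIterator struct {
	vals []int64
	pos  int
	Last interface{}
}

// Next advances the iterator. On success it records the value in Last.
// On exhaustion it returns false and deliberately leaves Last alone.
func (it *sliceIterator) Next() (interface{}, bool) {
	if it.pos >= len(it.vals) {
		return nil, false
	}
	v := it.vals[it.pos]
	it.pos++
	it.Last = v
	return v, true
}

// Check reports whether v is in the set. A hit updates Last; a miss does not.
func (it *sliceIterator) Check(v interface{}) bool {
	for _, x := range it.vals {
		if x == v {
			it.Last = v
			return true
		}
	}
	return false
}

func main() {
	it := &sliceIterator{vals: []int64{1, 2}}
	it.Next()
	it.Next()
	_, ok := it.Next() // exhausted: Last keeps the previous value
	fmt.Println(ok, it.Last)
}
```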

Second, if you want to use different representations of TSVal for Nodes and Triples, then it is probably better to use separate iterator implementations (unlike the mongodb driver, which uses the same representation for both). This mostly stems from some gremlin code that seems to discard simple wrapper types. Case in point: type NodeTSVal int64 causes GetNameFor to be called with both NodeTSVal (most invocations) and int64 (ex: gremlin/session.go).
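
One defensive way to cope with the wrapper type being stripped is a type switch that accepts both the named type and its underlying int64. The `GetNameFor` below is a stand-in backed by a hypothetical in-memory table, written only to show the switch, and is not the backend's real method.

```go
package main

import "fmt"

// NodeTSVal is the wrapper type from the comment above.
type NodeTSVal int64

// names is a hypothetical stand-in for a database lookup.
var names = map[int64]string{1: "alice", 2: "bob"}

// GetNameFor accepts either a NodeTSVal or a bare int64, since some callers
// may hand back the underlying value with the wrapper type stripped.
func GetNameFor(v interface{}) (string, bool) {
	var id int64
	switch x := v.(type) {
	case NodeTSVal:
		id = int64(x)
	case int64:
		id = x
	default:
		return "", false
	}
	name, ok := names[id]
	return name, ok
}

func main() {
	a, _ := GetNameFor(NodeTSVal(1))
	b, _ := GetNameFor(int64(2))
	fmt.Println(a, b)
}
```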

kortschak commented on July 30, 2024

@pbnjay How tight is your interaction with pq? Recent posts on golang-nuts about pgx (http://github.com/jackc/pgx) look interesting. The author has provided a benchmark suite (https://github.com/jackc/go_db_bench) and his results (https://gist.github.com/jackc/c402b42244d3390f26c6).

pbnjay commented on July 30, 2024

My code isn't too reliant on pq. It would be easy enough to swap out. I
don't see COPY support either, but I haven't looked too closely yet.

I'll certainly look into it / give it a shot. I don't think the current
bottleneck is driver query times though.

Jeremy

On Wed, Jul 16, 2014 at 8:10 PM, Dan Kortschak wrote:

> @pbnjay How tight is your interaction with pq? Recent posts on golang-nuts
> about pgx https://github.com/jackc/pgx look interesting (the author has
> provided a benchmark suite https://github.com/jackc/go_db_bench - his
> results https://gist.github.com/jackc/c402b42244d3390f26c6).

laprice commented on July 30, 2024

I'm looking at this and have forked pbnjay's version and am in the process of updating it to use the QuadStore interface.

Beyond that, what level of additional work would be needed to make this backend part of the mainline cayley tree?

pbnjay commented on July 30, 2024

The main issue I ran into was that since cayley does all the result iteration itself, it's hard to turn the tree into a flattened joined SQL query. So performance is pretty abysmal for nontrivial queries.

I started a separate branch with a new Query wrapper to allow top-down-ish optimization, but it has stalled as I got pulled into a different project at work.
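
To make the contrast concrete: a multi-hop traversal evaluated iterator by iterator issues many small lookups, whereas a top-down planner could emit a single self-join. The sketch below builds such a statement for a chain of predicates. The `triples` table and its `subj`, `pred`, `obj` columns are assumed for illustration, and `pathQuery` is a hypothetical helper, not the Query wrapper from the stalled branch.

```go
package main

import (
	"fmt"
	"strings"
)

// pathQuery builds one SQL statement that follows preds in order from a
// starting node, joining the triples table to itself once per extra hop.
// It returns the query text and its positional arguments.
func pathQuery(start string, preds []string) (string, []interface{}) {
	if len(preds) == 0 {
		return "", nil
	}
	var b strings.Builder
	args := []interface{}{start}
	fmt.Fprintf(&b, "SELECT t%d.obj FROM triples t1", len(preds))
	for i := 2; i <= len(preds); i++ {
		fmt.Fprintf(&b, " JOIN triples t%d ON t%d.subj = t%d.obj", i, i, i-1)
	}
	b.WriteString(" WHERE t1.subj = $1")
	for i, p := range preds {
		args = append(args, p)
		fmt.Fprintf(&b, " AND t%d.pred = $%d", i+1, i+2)
	}
	return b.String(), args
}

func main() {
	q, args := pathQuery("alice", []string{"follows", "follows"})
	fmt.Println(q)
	fmt.Println(args...)
}
```

For two hops this yields one statement with a single JOIN, which the database can plan as a whole instead of answering one lookup per intermediate node.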

laprice commented on July 30, 2024

@pbnjay I've spent most of this session just bringing the postgres branch up to date with the rest of cayley. So I haven't delved into performance at all. It might be possible to do something using WITH RECURSIVE or fiddling with indexes to get acceptable performance. I'm not too worried about the SQL side of things yet. Get it working; get it working right; then make it faster :-)

pbnjay commented on July 30, 2024

Like I said, the SQL side is perfectly easy, it's cayley's bottom-up query iteration that is a huge bottleneck for RDBMS backends.

So a word of warning: Cayley uses tags for query results (even if you don't name any in the query). Where I stood when I stopped was a reasonably performant backend, but with no tagging implemented, which is when I discovered this issue. Basically it needs to be rewritten because of this, so build tags into your queries from the start. You can probably just aggregate arrays or something similar. There's not a whole lot of documentation on this end of things, so I'm just trying to dump everything I learned for you to pick up.

einthusan commented on July 30, 2024

I'm just curious why no one is using pgx when it seems to be "better" than lib/pq. If someone can clarify this for me, it would be appreciated.

laprice commented on July 30, 2024

pq has the advantage of a longer history, more active users, and not
bypassing database/sql.

This means that it's less likely to contain hidden bugs and more likely to
be easily fixable if one is discovered.

Also lies, damn lies and benchmarks.


laprice commented on July 30, 2024

dude, you asked why I thought pq had more uptake and I gave you my opinion.

benchmarks are always problematic; especially if they weren't reproduced by
two or three independent parties.

And this isn't the place for that discussion; it's completely irrelevant to
this effort, unless you want to provide a PR.

Go post on reddit if you feel a need to evangelize.


einthusan commented on July 30, 2024

@laprice sorry, you misunderstood me. I just wanted to know because this project mentioned it and then dropped all talk about it. I wasn't trying to evangelize, just wondering if I should switch now from pgx to pq for a project of mine. But in any case, it's getting off-topic as you said. Thanks for the feedback.
