Comments (21)
@laprice lol seriously.. is that benchmark a fake? Quite surprised to see you say that, I'm guessing you have done your bit of homework on this? Can you share.. thanks!
from cayley.
Would love it. I talked with @erikstmartin about this and he was thinking about taking a whack at it. If we want to discuss it here, that'd be cool.
Also, sorry Erik, for the restructure, but I think you'll agree it's the gopher thing to do.
from cayley.
I'd definitely love to discuss a postgres backend more. While native JSON documents seem appealing, I'd love to come up with a backend that's based more on database/sql to make it more general, with the potential of being used by other SQL databases as well.
@barakmich I'm definitely cool with the restructure, I was actually going to do some refactoring to make things more idiomatic as well :)
from cayley.
I've done an initial hack at a postgres backend: pbnjay/cayley@365ef70. Create the DB and then run ./cayley http -db=postgres -dbpath="dbname=cayley user=pbnjay"
I could definitely use feedback / tips / comments. I need to tune the schema and/or indexes, and borrow the IDLru from the mongo implementation, but it's usable. Also, loading isn't done in bulk yet, so it took about 30 mins to import the movie data. I'll be using COPY along with probably disabling the constraints to speed that up.
I intend to put a good deal of effort into this for a work project, so I'm more than happy to do what's necessary to get it merged eventually.
from cayley.
Woah! Very neat. I'm about to go to sleep but wanted to comment first to say awesome and I'll give it a run in the morning.
How did you find writing an interface up to this point? Things you thought were clunky or odd could certainly help improve things in the future!
from cayley.
Well, that was quick :) ! I'm definitely gonna play with it in a few hours. Thanks for hacking on this!
- Jonathan
from cayley.
BTW, nice username ... wanna guess why I'm in the kitchen right now ;p ?
- Jonathan
from cayley.
@barakmich just wrote up some initial comments / suggestions in #38 - nothing super painful though. Once I get a better handle on Iterator I might have some ideas there too (it seems to be acting as an iterator and a set?).
@jonathanmarvens What can I say I've written a lot of Go/Postgres code lately! ...and do a lot of work on graphs. This is like the first open-source project that squarely hits my particular niche. That said, my code has a long way to go...
from cayley.
Yeah, that makes sense. Thoughts on the best way to register when compiled in? (Perhaps init()?)
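For illustration, here is a minimal sketch of init()-based registration, modeled loosely on how database/sql drivers register themselves. The names Register, Open and NewStoreFunc are hypothetical and are not Cayley's actual API; the constructor returns a placeholder value rather than a real store.

```go
package main

import (
	"fmt"
	"sync"
)

// NewStoreFunc is a hypothetical constructor signature for a backend.
type NewStoreFunc func(dbpath string) (interface{}, error)

var (
	mu       sync.RWMutex
	backends = make(map[string]NewStoreFunc)
)

// Register makes a backend available under name. Like database/sql.Register,
// it panics on a nil constructor or a duplicate name.
func Register(name string, fn NewStoreFunc) {
	mu.Lock()
	defer mu.Unlock()
	if fn == nil {
		panic("register: nil constructor for " + name)
	}
	if _, dup := backends[name]; dup {
		panic("register: backend registered twice: " + name)
	}
	backends[name] = fn
}

// Open looks up a registered backend and calls its constructor.
func Open(name, dbpath string) (interface{}, error) {
	mu.RLock()
	fn, ok := backends[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown backend %q", name)
	}
	return fn(dbpath)
}

// In a real layout this init would live in the backend's own package, so that
// importing the package (even as a blank import) is enough to register it.
func init() {
	Register("postgres", func(dbpath string) (interface{}, error) {
		return "postgres store for " + dbpath, nil // placeholder store
	})
}

func main() {
	store, err := Open("postgres", "dbname=cayley user=pbnjay")
	fmt.Println(store, err)
	_, err = Open("nosuch", "")
	fmt.Println(err)
}
```

With this shape, the -db flag value only needs to be passed to Open; backends that are not compiled in simply produce an "unknown backend" error.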
Iterator does act as an iterator and a set -- kind of. It's a set of all paths that could possibly get to this location. Sometimes those are the same as a set of nodes, sometimes not.
Looking over your code, there are a couple immediate wins you can have.
A big one is to create a struct to pass around as your TSVal -- by keeping a little more state -- effectively, make a triple struct with a few ID fields -- you can kill a whole bunch of queries generated from https://github.com/pbnjay/cayley/blob/365ef7096c729601e31c07a5a66db371061d71fe/graph/postgres/triplestore.go#L175
Another thing to do sooner rather than later (because it makes playing with it painful) is to fill your TODO of multi-INSERT or COPY FROM.
An LRU will then be the next helpful part.
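As a sketch of the multi-INSERT option, the helper below builds one statement with Postgres-style numbered placeholders for a whole batch. The table and column names (triples, subj, pred, obj) are assumptions, not the schema from the linked commit.

```go
package main

import (
	"fmt"
	"strings"
)

// Triple is a simplified stand-in for the store's triple type.
type Triple struct{ Subject, Predicate, Object string }

// buildMultiInsert returns a single INSERT statement covering every triple,
// plus the flattened argument list to pass alongside it.
// Table and column names are assumptions for illustration.
func buildMultiInsert(triples []Triple) (string, []interface{}) {
	if len(triples) == 0 {
		return "", nil
	}
	var sb strings.Builder
	sb.WriteString("INSERT INTO triples (subj, pred, obj) VALUES ")
	args := make([]interface{}, 0, len(triples)*3)
	for i, t := range triples {
		if i > 0 {
			sb.WriteString(", ")
		}
		n := i * 3
		fmt.Fprintf(&sb, "($%d, $%d, $%d)", n+1, n+2, n+3)
		args = append(args, t.Subject, t.Predicate, t.Object)
	}
	return sb.String(), args
}

func main() {
	q, args := buildMultiInsert([]Triple{
		{"alice", "follows", "bob"},
		{"bob", "follows", "carol"},
	})
	fmt.Println(q)
	fmt.Println(len(args))
}
```

The query and arguments can then go to a single Exec call on a database/sql handle. Postgres limits the number of bind parameters per statement, so a real loader would split large imports into batches.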
from cayley.
I added a (tweaked) LRU cache which helps somewhat, and deferred constraints during AddTripleSet. Movie data now loads in under 10 mins for me. At least you only need to load data once!
Also fixed all the dropped cursors. I'll definitely work on the TSVal tweaks you mention next. Cheers
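For readers unfamiliar with the pattern, a minimal LRU cache for name-to-ID lookups might look like the sketch below. This is not the IDLru from the mongo backend; the IDCache type and its methods are hypothetical, and it is not safe for concurrent use.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is what each list element stores.
type entry struct {
	key string
	id  int64
}

// IDCache is a hypothetical fixed-size LRU cache mapping node names to IDs.
type IDCache struct {
	capacity int
	order    *list.List // front = most recently used
	items    map[string]*list.Element
}

func NewIDCache(capacity int) *IDCache {
	return &IDCache{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Get returns the cached ID and marks the key as recently used.
func (c *IDCache) Get(key string) (int64, bool) {
	el, ok := c.items[key]
	if !ok {
		return 0, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).id, true
}

// Put stores the ID and evicts the least recently used key when over capacity.
func (c *IDCache) Put(key string, id int64) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).id = id
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key: key, id: id})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	c := NewIDCache(2)
	c.Put("alice", 1)
	c.Put("bob", 2)
	c.Get("alice")     // alice is now the most recently used key
	c.Put("carol", 3)  // evicts bob
	_, ok := c.Get("bob")
	fmt.Println(ok)
}
```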
from cayley.
@barakmich Definitely running into some hiccups now wrt documenting interface contracts.
First, Next() and Check() must set it.Last upon success. They must also NOT set it.Last=nil on failure. Tags rely on this value, so I didn't run into this problem with the simple test cases without tags in my first version of the implementation.
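To make that contract concrete, here is a toy iterator that follows it. The sliceIterator type and the method signatures are simplified assumptions and not Cayley's real Iterator interface; only the rule about Last is taken from the comment above.

```go
package main

import "fmt"

// TSVal is an opaque value handle, as in the discussion above.
type TSVal interface{}

// sliceIterator is a toy iterator over a fixed list of values.
type sliceIterator struct {
	vals []TSVal
	pos  int
	Last TSVal // most recent successful result; tag code reads this
}

// Next advances the iterator. On success it records the value in Last.
// On failure it leaves Last untouched rather than resetting it to nil.
func (it *sliceIterator) Next() (TSVal, bool) {
	if it.pos >= len(it.vals) {
		return nil, false
	}
	v := it.vals[it.pos]
	it.pos++
	it.Last = v
	return v, true
}

// Check reports whether v is in the set, updating Last only on success.
// The == comparison assumes the stored values are comparable types.
func (it *sliceIterator) Check(v TSVal) bool {
	for _, have := range it.vals {
		if have == v {
			it.Last = v
			return true
		}
	}
	return false
}

func main() {
	it := &sliceIterator{vals: []TSVal{int64(1), int64(2)}}
	for {
		if _, ok := it.Next(); !ok {
			break
		}
	}
	fmt.Println(it.Last) // still the last successful value

	ok := it.Check(int64(9))
	fmt.Println(ok, it.Last) // a failed Check leaves Last alone

	ok = it.Check(int64(1))
	fmt.Println(ok, it.Last)
}
```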
Second, if you want to use different representations of TSVal for Nodes and Triples, then it is probably better to use separate iterator implementations (unlike the mongodb driver, which uses the same representation for both). This mostly stems from some gremlin code that seems to discard simple wrapper types. Case in point: type NodeTSVal int64 causes GetNameFor to be called with both NodeTSVal (most invocations) and int64 (ex: gremlin/session.go).
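One defensive way to cope with both forms is a type switch that accepts the wrapper and the bare int64. This is only a sketch: the in-memory names map stands in for a database lookup, and this GetNameFor is a free function with an assumed signature, not the real method.

```go
package main

import "fmt"

// NodeTSVal is the simple wrapper type from the example above.
type NodeTSVal int64

// names stands in for a database lookup (assumption for illustration).
var names = map[int64]string{1: "alice", 2: "bob"}

// GetNameFor accepts both the wrapper type and the bare int64 that some
// callers end up passing once the wrapper has been stripped.
func GetNameFor(v interface{}) (string, bool) {
	var id int64
	switch t := v.(type) {
	case NodeTSVal:
		id = int64(t)
	case int64:
		id = t
	default:
		return "", false
	}
	name, ok := names[id]
	return name, ok
}

func main() {
	fmt.Println(GetNameFor(NodeTSVal(1)))
	fmt.Println(GetNameFor(int64(2)))
	fmt.Println(GetNameFor("not an id"))
}
```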
from cayley.
@pbnjay How tight is your interaction with pq? Recent posts on golang-nuts about pgx (https://github.com/jackc/pgx) look interesting. The author has provided a benchmark suite (https://github.com/jackc/go_db_bench) and his results (https://gist.github.com/jackc/c402b42244d3390f26c6).
from cayley.
My code isn't too reliant on pq. It would be easy enough to swap out. I
don't see COPY support either, but I haven't looked too closely yet.
I'll certainly look into it / give it a shot. I don't think the current
bottleneck is driver query times though.
Jeremy
On Wed, Jul 16, 2014 at 8:10 PM, Dan Kortschak wrote:
> @pbnjay How tight is your interaction with pq? Recent posts on golang-nuts about pgx (https://github.com/jackc/pgx) look interesting (the author has provided a benchmark suite, https://github.com/jackc/go_db_bench, and his results, https://gist.github.com/jackc/c402b42244d3390f26c6).
from cayley.
I'm looking at this and have forked pbnjay's version and am in the process of updating it to use the QuadStore interface.
Beyond that; what level of additional work would be needed to make this backend part of the mainline cayley tree?
from cayley.
The main issue I ran into was that since cayley does all the result iteration itself, it's hard to turn the tree into a flattened joined SQL query. So performance is pretty abysmal for nontrivial queries.
I started a separate branch with a new Query wrapper to allow top-down-ish optimization, but it has stalled as I got pulled into a different project at work.
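To show what "flattened" means here, the sketch below turns a fixed chain of predicates into one self-joined query instead of one round trip per hop. It is not the Query wrapper from that branch; buildPathQuery and the schema (a triples table with subj, pred, obj columns) are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// buildPathQuery builds a single SQL statement that follows preds in order,
// starting from the node start, by self-joining the triples table once per hop.
// Table and column names are assumptions for illustration.
func buildPathQuery(start string, preds []string) (string, []interface{}) {
	if len(preds) == 0 {
		return "", nil
	}
	var sb strings.Builder
	fmt.Fprintf(&sb, "SELECT t%d.obj FROM triples t1", len(preds))
	for i := 2; i <= len(preds); i++ {
		// Each hop starts where the previous hop ended.
		fmt.Fprintf(&sb, " JOIN triples t%d ON t%d.subj = t%d.obj", i, i, i-1)
	}
	sb.WriteString(" WHERE t1.subj = $1")
	args := []interface{}{start}
	for i, p := range preds {
		fmt.Fprintf(&sb, " AND t%d.pred = $%d", i+1, i+2)
		args = append(args, p)
	}
	return sb.String(), args
}

func main() {
	q, args := buildPathQuery("alice", []string{"follows", "likes"})
	fmt.Println(q)
	fmt.Println(args...)
}
```

Building a query like this needs the whole path up front, which is the top-down view that bottom-up iteration does not provide.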
from cayley.
@pbnjay I've spent most of this session just bringing the postgres branch up to date with the rest of cayley. So I haven't delved into performance at all. It might be possible to do something using WITH RECURSIVE or fiddling with indexes to get acceptable performance. I'm not too worried about the SQL side of things yet. Get it working; get it working right; then make it faster :-)
from cayley.
Like I said, the SQL side is perfectly easy, it's cayley's bottom-up query iteration that is a huge bottleneck for RDBMS backends.
So a word of warning: Cayley uses tags for query results (even if you don't name any in the query). Where I stood when I stopped was a reasonably performant backend, but with no tagging implemented, which is when I discovered this issue. Basically it needs to be rewritten because of this... so build tags into your queries from the start. You can probably just aggregate arrays or something similar. There's not a whole lot of documentation on this end of things, so I'm just trying to dump everything I learned for you to pick up.
from cayley.
I'm just curious as to why no one is using pgx when it seems to be more "better" than lib/pq. If someone can clarify this for me, it would be appreciated.
from cayley.
pq has the advantage of a longer history and more active users and not
bypassing db/sql.
This means that it's less likely to contain hidden bugs and more likely to
be easily fixable if one is discovered.
Also lies, damn lies and benchmarks.
On Sat, Dec 27, 2014 at 5:15 AM, Einthusan Vigneswaran wrote:
> Im just curious as to why no one is using pgx when it seems to be more "better" than pq.. if someone can clarify this for me, it would be appreciated.
from cayley.
dude, you asked why I thought pq had more uptake and I gave you my opinion.
benchmarks are always problematic; especially if they weren't reproduced by
two or three independent parties.
And this isn't the place for that discussion; it's completely irrelevant to
this effort, unless you want to provide a PR.
Go post on reddit if you feel a need to evangelize.
On Sun, Dec 28, 2014 at 3:29 PM, Einthusan Vigneswaran wrote:
> @laprice lol seriously.. is that benchmark a fake? Quite surprised to see you say that, I'm guessing you have done your bit of homework on this? Can you share.. thanks!
from cayley.
@laprice sorry you misunderstood me, I just wanted to know because this project mentioned it and then dropped all talk about it. I wasn't trying to evangelize, just wondering if I should switch now from pgx to pq for a project of mine. But in any case, it's getting off-topic as you said. Thanks for the feedback.
from cayley.