
Comments (21)

aschwarte10 commented on May 18, 2024

@jeenbroekstra: yes, I will try to look into this by Friday

from rdf4j.

abrokenjester commented on May 18, 2024

@aschwarte10 is this something you could perhaps take a look at?

abrokenjester commented on May 18, 2024

There's no simple way to execute only that specific test, I'm afraid; this is due to how manifest-based test suites are generated (at runtime). Either just run the complete FederationSparqlTest suite (with sq14 re-enabled), or create a local copy of that specific test case so you can quickly execute it without needing to run the entire suite.

aschwarte10 commented on May 18, 2024

@jeenbroekstra: I analysed the issue and also understood the cause.

The issue is that the optimizer evaluates the inner subquery (i.e. the SLICE) as the right argument of the join (i.e., in a nested loop fashion) while it needs to be evaluated at once - ideally first.

See below for the relevant excerpt of the query plan:

NaryJoin
   StatementPattern
      Var (name=person)
      Var (name=_const_aba78b99_uri, value=http://xmlns.com/foaf/0.1/homepage, anonymous)
      Var (name=homepage)
   Slice ( limit=3 )
      Projection
         ProjectionElemList
            ProjectionElem "person"
            ProjectionElem "name"
         Order
            OrderElem (ASC)
               Var (name=name)
            OwnedTupleExpr org.eclipse.rdf4j.sail.memory.MemoryStoreConnection@1e6bd367
               NaryJoin
                  StatementPattern
                     Var (name=person)
                     Var (name=_const_f5e5585a_uri, value=http://www.w3.org/1999/02/22-rdf-syntax-ns#type, anonymous)
                     Var (name=_const_e1df31e0_uri, value=http://xmlns.com/foaf/0.1/Person, anonymous)
                  StatementPattern
                     Var (name=person)
                     Var (name=_const_23b7c3b6_uri, value=http://xmlns.com/foaf/0.1/name, anonymous)
                     Var (name=name)

One possible fix is to always set the cardinality of a SLICE to 1 in the federation statistics calculator (i.e., it needs to have a lower cardinality than the other join arguments). I'm not sure this is the ideal solution, though. I also tested this locally, and with this change the test works fine. Note: the relevant code is in org.eclipse.rdf4j.sail.federation.optimizers.EvaluationStatistics.CardinalityCalculator.
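As a rough illustration of the cardinality idea (not the actual RDF4J code), the sketch below orders join arguments by an estimated cardinality and pins any Slice to 1 so that it sorts first. The class `SliceCardinalitySketch`, the `Arg` type, and the raw estimates are all hypothetical stand-ins; the real calculator works on algebra nodes and may behave differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model of join-argument ordering by estimated cardinality.
// None of these types are RDF4J classes; they are hypothetical stand-ins.
class SliceCardinalitySketch {

    // A join argument: its kind ("StatementPattern", "Slice", ...) and a raw estimate.
    static final class Arg {
        final String kind;
        final double rawEstimate;

        Arg(String kind, double rawEstimate) {
            this.kind = kind;
            this.rawEstimate = rawEstimate;
        }
    }

    // Assumed rule from the discussion: a Slice is always estimated at 1,
    // regardless of what its raw estimate would have been.
    static double cardinality(Arg arg) {
        return "Slice".equals(arg.kind) ? 1.0 : arg.rawEstimate;
    }

    // Order join arguments by ascending estimated cardinality (stable sort).
    static List<Arg> order(List<Arg> args) {
        List<Arg> sorted = new ArrayList<>(args);
        sorted.sort(Comparator.comparingDouble(SliceCardinalitySketch::cardinality));
        return sorted;
    }

    public static void main(String[] args) {
        List<Arg> joinArgs = Arrays.asList(
                new Arg("StatementPattern", 1000.0),
                new Arg("Slice", 5000.0));
        for (Arg a : order(joinArgs)) {
            System.out.println(a.kind + " -> " + cardinality(a));
        }
    }
}
```

In this toy model the Slice only moves to the front if every other argument is estimated above 1, which is the "lower cardinality than the other join arguments" condition mentioned above.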

Do you have an opinion on this?

abrokenjester commented on May 18, 2024

Thanks for looking into this. We (mostly Mark) did some related fixes to join ordering not too long ago. I'll need to look up the details to see what exactly we did there and how it may affect the FederationSail.

I'm always a bit wary of any approach where optimizing in a certain way actually makes a functional difference (rather than just a performance difference). What is it in the join arguments that makes it crucial they're evaluated in a specific order - and is that always true, or just in the case of federated evaluation? Rather than tweaking the cardinalities, we should perhaps adapt the optimizer to look for that case and leave the order alone in this specific situation.

If that turns out to be too hard I'm fine with a fix that just modifies the cardinalities - though it might be good to leave an in-code comment to the effect that this should be looked at again in the future.

aschwarte10 commented on May 18, 2024

Here are some additional insights:

  • the NaryJoin algebra node (i.e. the relevant join) is specific to the Federation Sail. I guess that in the other, regular Sail implementations the join order is not changed
  • the statistics implementation and the join order optimizer are also specific to the Federation Sail

The main issue is: the sub select query needs to be evaluated as a whole (such that the order and limit work properly). With the current join order (i.e. the sub select being the right argument of the join) it is evaluated using the intermediate results as inputs (i.e. in a nested loop fashion).

So somehow the join needs to be fixed so that the sub select is evaluated as a whole, either by using a different join technique (e.g. a hash-based join) or by using an appropriate join order.

abrokenjester commented on May 18, 2024

I guess that means that a fix in the cardinality can be done with minimal side effects. A different join strategy might be more robust in the long run, but it sounds like a lot more work, so perhaps go for the quick fix.

Can you put up a PR so we can test it out?

pulquero commented on May 18, 2024

This looks like it could be related to #134. I believe the bug is actually in the ProjectionIteration.

aschwarte10 commented on May 18, 2024

@pulquero: in this case the issue is definitely not in the ProjectionIteration.

The cause is the join order (or, if you want to keep this join order, an inappropriate join implementation).

Consider the following example for an illustration of the problem:

The optimizer currently produces the following join algebra node:

LEFT: ?person foaf:homepage ?homepage
RIGHT: SELECT ?person ?name { ... } ORDER BY ?name LIMIT 3

What happens due to the nested loop join (and can easily be seen with a debugger):

  • compute the intermediate result set S for the LEFT operator
  • for each tuple (i.e. BindingSet) in S, evaluate the RIGHT operator

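Obviously, this yields incorrect results according to SPARQL semantics (depending on the data), because the ORDER BY and LIMIT are then applied separately for each input tuple rather than once over the whole sub select. See the example sq14 provided by the W3C for the concrete case.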

Note that this works fine if the RIGHT operator is evaluated at once (and not in a nested loop).
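The difference can be reproduced with a small, self-contained model. This is only a sketch over toy data, not RDF4J code: `SubselectJoinSketch`, its fields, and the three-person data set are made up for illustration, and the sub select is reduced to "order by name, then limit".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Toy model (not RDF4J code) of joining LEFT with an ORDER BY / LIMIT sub select.
class SubselectJoinSketch {

    // (person, name) rows matched inside the sub select; hypothetical data.
    static final List<String[]> NAMES = Arrays.asList(
            new String[] {"p1", "Alice"},
            new String[] {"p2", "Bob"},
            new String[] {"p3", "Carol"});

    // Persons that have a homepage, i.e. the LEFT operand; hypothetical data.
    static final List<String> WITH_HOMEPAGE = Arrays.asList("p2", "p3");

    // Sub select: ORDER BY name, then LIMIT. If boundPerson is non-null,
    // ?person is already bound, as happens in a nested loop join.
    static List<String> subselect(String boundPerson, int limit) {
        return NAMES.stream()
                .filter(row -> boundPerson == null || row[0].equals(boundPerson))
                .sorted(Comparator.comparing((String[] row) -> row[1]))
                .limit(limit)
                .map(row -> row[0])
                .collect(Collectors.toList());
    }

    // Nested loop: the sub select is re-evaluated for every LEFT tuple,
    // so the LIMIT is applied per input binding.
    static List<String> nestedLoop(int limit) {
        List<String> result = new ArrayList<>();
        for (String person : WITH_HOMEPAGE) {
            result.addAll(subselect(person, limit));
        }
        return result;
    }

    // The sub select is evaluated once as a whole, then joined with LEFT.
    static List<String> evaluatedAtOnce(int limit) {
        return subselect(null, limit).stream()
                .filter(WITH_HOMEPAGE::contains)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("nested loop:       " + nestedLoop(2));
        System.out.println("evaluated at once: " + evaluatedAtOnce(2));
    }
}
```

With a limit of 2, the nested loop variant returns both p2 and p3, while evaluating the sub select once returns only p2, since p3 is not among the first two names.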

@jeenbroekstra: I will try to find some time and provide the change in the federation sail join order optimizer by the end of this week.

pulquero commented on May 18, 2024

I assume if joining two slices then both left and right will have to be evaluated at once.

pulquero commented on May 18, 2024

Could this (and potentially the other issue) be fixed by changing evaluate() to do something like:

    if (TupleExprs.containsProjection(join.getRightArg())) {
        result = new HashJoinIteration(this, result, join.getArg(i));
    }
    else {
        result = new ParallelJoinCursor(this, result, join.getArg(i));
    }

Would just mean I could fix it in my custom EvaluationStrategy without waiting for a patch for 2.9.
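For readers unfamiliar with the technique, here is a minimal hash join sketch using plain Java collections. It does not use RDF4J's HashJoinIteration; the class `HashJoinSketch`, the `binding` helper, and the single join variable are assumptions made for illustration. The point is that the right operand is materialized once and indexed, so a sub select on the right side keeps its own ORDER BY and LIMIT semantics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal hash join over variable bindings; hypothetical, not the RDF4J implementation.
class HashJoinSketch {

    // Helper to build a binding from alternating variable names and values.
    static Map<String, String> binding(String... keyValues) {
        Map<String, String> b = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            b.put(keyValues[i], keyValues[i + 1]);
        }
        return b;
    }

    // Joins two binding lists on one shared variable.
    // Assumes joinVar is bound in every binding on both sides.
    static List<Map<String, String>> hashJoin(List<Map<String, String>> left,
            List<Map<String, String>> right, String joinVar) {
        // Build phase: index the right operand, which was evaluated exactly once.
        Map<String, List<Map<String, String>>> index = new HashMap<>();
        for (Map<String, String> r : right) {
            index.computeIfAbsent(r.get(joinVar), k -> new ArrayList<>()).add(r);
        }
        // Probe phase: look up each left binding and merge the matches.
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> l : left) {
            List<Map<String, String>> matches =
                    index.getOrDefault(l.get(joinVar), Collections.emptyList());
            for (Map<String, String> r : matches) {
                Map<String, String> merged = new HashMap<>(l);
                merged.putAll(r);
                result.add(merged);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> left = Arrays.asList(
                binding("person", "p2", "homepage", "h2"),
                binding("person", "p3", "homepage", "h3"));
        // Result of the sub select, already ordered and limited.
        List<Map<String, String>> right = Arrays.asList(
                binding("person", "p1", "name", "Alice"),
                binding("person", "p2", "name", "Bob"));
        System.out.println(hashJoin(left, right, "person"));
    }
}
```

In the example data only p2 appears on both sides, so the join produces a single merged binding.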

aschwarte10 commented on May 18, 2024

@pulquero: using the HashJoinIteration might be the best solution. With the above change the SLICE cost is estimated at 1, meaning that it is reordered to come first in the federation sail. When I find the time (maybe later in the afternoon) I will also try the HashJoin approach.

pulquero commented on May 18, 2024

It would be good if it works. I'm not a fan of solving this problem by changing the optimizer. I feel optimizers should always be optional/non-functional and the EvaluationStrategy should be functional. The distinction becomes quite important when customizing the provided Sails.

aschwarte10 commented on May 18, 2024

@pulquero, @jeenbroekstra: I evaluated whether the use of HashJoinIteration also solves the problem, and indeed (as expected) the different join strategy works fine. Unfortunately, the code is not as nice as before. Maybe you have suggestions for rewriting it; see the pull request. One idea would be to evaluate entirely with hash joins if there is at least one subquery (this could then be done in a separate method).

pulquero commented on May 18, 2024

As this is a fairly key fix, can it be patched back to the 2.9 branch, please?

abrokenjester commented on May 18, 2024

@pulquero I assume you mean the 1.0 branch? That'll be taken care of as soon as this is merged in and the sync PR (2-0 -> 1.0) is merged as well.

pulquero commented on May 18, 2024

I mean the old 2.9, because I think I remember you saying that critical bugs would be back-ported into maintenance releases.

abrokenjester commented on May 18, 2024

You may have misunderstood - that was the case before we moved over, but from my point of view the Sesame code base is now simply closed. At least, I can not really afford to spend time on backporting bug fixes from RDF4J to Sesame. It's also why we started the RDF4J 1.0 branch in the first place: to give you (and other projects that rely on Java 7) a release they can more easily migrate to.

But if you really need this before you can migrate to RDF4J, then what you can do is backport the fix to the Sesame 2.9 branch yourself. Even if we do not do an official 2.9.1 release you can deploy a snapshot build in your own project. And who knows? If there's others who need this and/or I can find some spare time we can decide to push out a last patch version of Sesame after all.

pulquero commented on May 18, 2024

How soon will there be an RDF4J 1.0 milestone in central? If there is one in the pipeline in a week or so, then I can just about wait and migrate; otherwise I will need to backport into 2.9. Unfortunately, this issue is very problematic.

abrokenjester commented on May 18, 2024

First RDF4J 1.0 milestone is scheduled for next week. Basically as soon as this issue's PR is done and merged, and the branches synced and stabilized. If you want I can already deploy nightly snapshot builds of the branch to Sonatype, starting today.

pulquero commented on May 18, 2024

FYI: the use of TupleExprs.containsProjection(rightArg) is appropriate given the use of ParallelJoins. It is not sufficient for other join strategies; in that case TupleExprs.containsProjection(rightArg) has to be replaced with

TupleExprs.containsProjection(expr) || (expr instanceof OwnedTupleExpr2 && ((OwnedTupleExpr2) expr).getQuery() != null)

as expr when executed directly by the owner is also essentially a sub-query/projection.
