Comments (5)

jsteemann commented on June 24, 2024

@DougGarno55 : I just tried exactly what you wrote above using 3.11, and for me the indexes are indeed used.
Here is what I did:
Create collections:

arangod> db._create("A");
[ArangoCollection 3157380, "A" (type document, status loaded)]
arangod> db._create("B");
[ArangoCollection 3157383, "B" (type document, status loaded)]

Insert data:

arangod> db._query(`   FOR d IN 1..50000
...>    INSERT {
...>       "Project ID": d,
...>       "Str Version": TO_STRING(d)
...>    } INTO A`);
[object GeneralArrayCursor, count: 0, cached: false]
arangod> db._query(`
...>    FOR d IN 1..50000
...>    INSERT {
...>       "Project ID": d,
...>       "Str Version": TO_STRING(d)
...>    } INTO B`);
[object GeneralArrayCursor, count: 0, cached: false]

Create indexes:

arangod> db.A.ensureIndex({ fields: ["Project ID"], type: "persistent" });
{ 
  "cacheEnabled" : false, 
  "deduplicate" : true, 
  "estimates" : true, 
  "fields" : [ 
    "Project ID" 
  ], 
  "id" : "A/3257402", 
  "name" : "idx_1793161154257747968", 
  "selectivityEstimate" : 0.9998779445868424, 
  "sparse" : false, 
  "type" : "persistent", 
  "unique" : false, 
  "isNewlyCreated" : true 
}
arangod> db.B.ensureIndex({ fields: ["Project ID"], type: "persistent" });
{ 
  "cacheEnabled" : false, 
  "deduplicate" : true, 
  "estimates" : true, 
  "fields" : [ 
    "Project ID" 
  ], 
  "id" : "B/3257410", 
  "name" : "idx_1793161159029817344", 
  "selectivityEstimate" : 0.9998779445868424, 
  "sparse" : false, 
  "type" : "persistent", 
  "unique" : false, 
  "isNewlyCreated" : true 
}

Explain query:

arangod> db._explain(`   FOR docA IN A
...>    FOR docB IN B
...>    FILTER docB["Project ID"] == docA["Project ID"]
...>    RETURN {
...>       "Project ID": docA["Project ID"],
...>       "Str Version": docB["Str Version"]
...>    }`);
Query String (182 chars, cacheable: true):
    FOR docA IN A
    FOR docB IN B
    FILTER docB["Project ID"] == docA["Project ID"]
    RETURN {
       "Project ID": docA["Project ID"],
       "Str Version": docB["Str Version"]
    }

Execution plan:
 Id   NodeType           Est.   Comment
  1   SingletonNode         1   * ROOT
  9   IndexNode         50000     - FOR docA IN A   /* persistent index scan, index only (projections: `Project ID`) */    
  8   IndexNode         50000       - FOR docB IN B   /* persistent index scan, index scan + document lookup (projections: `Str Version`) */    
  6   CalculationNode   50000         - LET #4 = { "Project ID" : docA.`Project ID`, "Str Version" : docB.`Str Version` }   /* simple expression */   /* collections used: docA : A, docB : B */
  7   ReturnNode        50000         - RETURN #4

Indexes used:
 By   Name                      Type         Collection   Unique   Sparse   Cache   Selectivity   Fields             Stored values   Ranges
  9   idx_1793161154257747968   persistent   A            false    false    false       99.99 %   [ `Project ID` ]   [  ]            *
  8   idx_1793161159029817344   persistent   B            false    false    false       99.99 %   [ `Project ID` ]   [  ]            (docB.`Project ID` == docA.`Project ID`)

Optimization rules applied:
 Id   RuleName
  1   move-calculations-up
  2   move-filters-up
  3   move-calculations-up-2
  4   move-filters-up-2
  5   use-indexes
  6   remove-filter-covered-by-index
  7   remove-unnecessary-calculations-2
  8   reduce-extraction-to-projection

79 rule(s) executed, 2 plan(s) created, peak mem [b]: 0, exec time [s]: 0.00032

As you can see, the explain plan shows that the indexes are indeed used.

So I guess you created the indexes on A and B in a different way.
Your example does not show how the indexes were created, but my guess is that they are sparse. If your indexes on A and B are sparse, they will indeed not be used for the query in question.
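
To check the sparse guess, one could look at the index definitions directly. The following is only a sketch: `describeIndexesOn` is a hypothetical helper (not part of ArangoDB), and it assumes the index descriptions have the same shape as the `ensureIndex` output shown above. The second sample entry is made up purely for illustration.

```javascript
// Hypothetical helper, not an ArangoDB API: given index descriptions shaped
// like the ensureIndex() output above, keep the single-field indexes on one
// attribute and report their name, type and sparse flag.
function describeIndexesOn(indexes, attribute) {
  return indexes
    .filter((idx) => idx.fields.length === 1 && idx.fields[0] === attribute)
    .map((idx) => ({ name: idx.name, type: idx.type, sparse: idx.sparse }));
}

// Sample data: the first entry is copied from the output above,
// the second one is invented to show what a sparse index would look like.
const sampleIndexes = [
  { name: "idx_1793161154257747968", type: "persistent", fields: ["Project ID"], sparse: false },
  { name: "made_up_sparse_index", type: "persistent", fields: ["Project ID"], sparse: true },
];

const found = describeIndexesOn(sampleIndexes, "Project ID");

// Names of the indexes that are sparse and therefore suspect for this query.
console.log(found.filter((idx) => idx.sparse).map((idx) => idx.name));
```

In arangosh, the input would presumably come from the collection's index list instead of the sample array.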

from arangodb.

DougGarno55 commented on June 24, 2024

In case it makes any difference, I did everything within the web interface, not the command line shell.
The indexes I used were persistent indexes. Here is a screen shot of the index creation.

[image: screenshot of the index creation]

jsteemann commented on June 24, 2024

@DougGarno55 : what immediately comes to my mind is that in your screenshot example, you have enclosed the attribute name ("Project ID") in extra double quotes.
I think this is what's causing the problems. Can you try creating the index without enclosing the attribute name in extra double quotes?
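
A small sketch of why the extra quotes matter, under the assumption that the text typed into the index form is stored literally as the field name. `lookupField` is a hypothetical helper that mimics a literal top-level attribute lookup; it is not how ArangoDB is implemented.

```javascript
// A document shaped like the ones inserted into A and B above.
const doc = { "Project ID": 42, "Str Version": "42" };

// Hypothetical helper: look up a top-level attribute by its literal name,
// returning null when the document has no such attribute.
function lookupField(document, fieldName) {
  return Object.prototype.hasOwnProperty.call(document, fieldName)
    ? document[fieldName]
    : null;
}

const plainName = 'Project ID';    // attribute name as stored in the documents
const quotedName = '"Project ID"'; // same text with the extra double quotes

console.log(lookupField(doc, plainName));  // 42
console.log(lookupField(doc, quotedName)); // null
console.log(plainName === quotedName);     // false
```

Under that assumption, an index on the quoted name covers an attribute that none of the documents have, so a filter on `Project ID` cannot use it.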

DougGarno55 commented on June 24, 2024

Ah, that works now. Here is the explain output:

Query String (156 chars, cacheable: true):
 FOR docA IN A
 FOR docB IN B
 FILTER docB["Project ID"] == docA["Project ID"]
 RETURN {
 "Project ID": docA["Project ID"],
 "Str Version": docB["Str Version"]
 }
 

Execution plan:
 Id   NodeType           Est.   Comment
  1   SingletonNode         1   * ROOT
  9   IndexNode         50000     - FOR docA IN A   /* persistent index scan, index only, projections: `Project ID` */    
  8   IndexNode         50000       - FOR docB IN B   /* persistent index scan, projections: `Str Version` */    
  6   CalculationNode   50000         - LET #4 = { "Project ID" : docA.`Project ID`, "Str Version" : docB.`Str Version` }   /* simple expression */   /* collections used: docA : A, docB : B */
  7   ReturnNode        50000         - RETURN #4

Indexes used:
 By   Name     Type         Collection   Unique   Sparse   Selectivity   Fields             Ranges
  9   IDX_aa   persistent   A            false    false        99.99 %   [ `Project ID` ]   *
  8   IDX_bb   persistent   B            false    false        99.65 %   [ `Project ID` ]   (docB.`Project ID` == docA.`Project ID`)

Optimization rules applied:
 Id   RuleName
  1   move-calculations-up
  2   move-filters-up
  3   move-calculations-up-2
  4   move-filters-up-2
  5   use-indexes
  6   remove-filter-covered-by-index
  7   remove-unnecessary-calculations-2
  8   reduce-extraction-to-projection

I was under the impression that attribute names containing spaces have to be quoted.
So this is not a bug, just user error.
Thanks for the clarification.

dothebart commented on June 24, 2024

Closing as solved.
