For a simple query like FOR doc IN s FILTER doc._key == "12345" RETURN doc, the profiling output looks like this:
Execution plan:
 Id   NodeType                    Site   Calls   Par   Items   Filtered   Runtime [s]   Comment
  1   SingletonNode               COOR       1     -       1          0       0.00000   * ROOT
  7   SingleRemoteOperationNode   COOR       1     -       1          0       0.00072   - FOR doc IN s FILTER doc.`_key` == "12345" /* primary index scan */
  5   ReturnNode                  COOR       1     -       1          0       0.00000   - RETURN doc
Indexes used:
 By   Name      Type      Collection   Unique   Sparse   Cache   Selectivity   Fields       Stored values   Ranges
  7   primary   primary   s            true     false    false      100.00 %   [ `_key` ]   [ ]             "12345"
Optimization rules applied:
 Id   Rule Name                        Id   Rule Name
  1   use-indexes                       3   remove-unnecessary-calculations-2
  2   remove-filter-covered-by-index    4   optimize-cluster-single-document-operations
Query Statistics:
 Writes Exec   Writes Ign   Doc. Lookups   Scan Full   Scan Index   Cache Hits/Misses   Filtered   Requests   Peak Mem [b]   Exec Time [s]
           0            0              0           0            1               0 / 0          0          0              0         0.00106
The optimize-cluster-single-document-operations rule is active, and the execution plan contains no nodes with DBS in the Site column and therefore no REMOTE/GATHER operations. Calls and Items are 1, and the Requests count under Query Statistics is 0.
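As a rough sketch of that check, the Site column can be read from the profile text programmatically. The helper names (node_sites, runs_only_on_coordinator) are hypothetical and not part of ArangoDB; the regular expression only assumes that each plan row starts with an Id, a node type ending in "Node", and a Site of COOR or DBS, as in the output shown here.

```python
import re

# One row of the "Execution plan" table: Id, NodeType, Site.
# Index, rule, and statistics rows do not match because their second
# column is not a word ending in "Node".
PLAN_ROW = re.compile(r"^\s*(\d+)\s+(\w+Node)\s+(COOR|DBS)\b")


def node_sites(profile_text):
    """Return (node_type, site) pairs for every plan row in the text."""
    pairs = []
    for line in profile_text.splitlines():
        match = PLAN_ROW.match(line)
        if match:
            pairs.append((match.group(2), match.group(3)))
    return pairs


def runs_only_on_coordinator(profile_text):
    """True if at least one plan row exists and none has Site DBS."""
    sites = [site for _, site in node_sites(profile_text)]
    return bool(sites) and all(site == "COOR" for site in sites)


# Plan rows copied from the single-document profile.
single_document_profile = """\
1 SingletonNode COOR 1 - 1 0 0.00000 * ROOT
7 SingleRemoteOperationNode COOR 1 - 1 0 0.00072 - FOR doc IN s FILTER doc.`_key` == "12345" /* primary index scan */
5 ReturnNode COOR 1 - 1 0 0.00000 - RETURN doc
"""

print(runs_only_on_coordinator(single_document_profile))  # True
```

A plan that contains any row with Site DBS makes runs_only_on_coordinator return False, which corresponds to the case where DB-Servers take part in the query.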
The explain output doesn't include Calls, Items, and Requests, but it does show you which optimizer rules are active, and you can see whether the execution plan contains REMOTE/GATHER nodes.
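The same two signals can be checked on a plan object rather than on printed text. This is a minimal sketch, assuming the plan is available as a dict with a "rules" list and a "nodes" list whose entries carry a "type" key, which is how the JSON form of an explain result is commonly structured. The function name summarize_plan and the sample dict are hypothetical.

```python
RULE = "optimize-cluster-single-document-operations"
CLUSTER_NODE_TYPES = {"RemoteNode", "GatherNode"}


def summarize_plan(plan):
    """Report whether the rule was applied and which REMOTE/GATHER nodes exist.

    `plan` is assumed to look like {"nodes": [{"type": ...}, ...], "rules": [...]}.
    """
    node_types = [node.get("type") for node in plan.get("nodes", [])]
    return {
        "rule_applied": RULE in plan.get("rules", []),
        "cluster_nodes": [t for t in node_types if t in CLUSTER_NODE_TYPES],
    }


# Hand-written stand-in for the single-document plan, reduced to the two
# keys the helper reads.
single_document_plan = {
    "nodes": [
        {"type": "SingletonNode"},
        {"type": "SingleRemoteOperationNode"},
        {"type": "ReturnNode"},
    ],
    "rules": [
        "use-indexes",
        "remove-filter-covered-by-index",
        "remove-unnecessary-calculations-2",
        RULE,
    ],
}

print(summarize_plan(single_document_plan))
```

For a plan where the rule did not fire, rule_applied would be False and cluster_nodes would list the RemoteNode and GatherNode entries.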
The optimizer rule is not applied if you filter for two different documents, like FOR doc IN s FILTER doc._key == "12345" OR doc._key == "54321" RETURN doc (not a single-document operation). The profile shows REMOTE/GATHER nodes, several Calls, and a Requests count of, for example, 9 for a collection with 3 shards, which tells you that every DB-Server was contacted at least once.
Execution plan:
 Id   NodeType        Site   Calls   Par   Items   Filtered   Runtime [s]   Comment
  1   SingletonNode   DBS        3     -       3          0       0.00001   * ROOT
  7   IndexNode       DBS        3     0       2          0       0.00033   - FOR doc IN s /* primary index scan, index scan + document lookup, 3 shard(s) */
 10   RemoteNode      COOR       9     -       2          0       0.00005   - REMOTE
 11   GatherNode      COOR       4     -       2          0       0.00005   - GATHER /* parallel, unsorted */
  5   ReturnNode      COOR       4     -       2          0       0.00001   - RETURN doc
Indexes used:
 By   Name      Type      Collection   Unique   Sparse   Cache   Selectivity   Fields       Stored values   Ranges
  7   primary   primary   s            true     false    false      100.00 %   [ `_key` ]   [ ]             (doc.`_key` IN [ "12345", "54324" ])
Optimization rules applied:
 Id   Rule Name                        Id   Rule Name                           Id   Rule Name
  1   replace-or-with-in                4   remove-unnecessary-calculations-2    7   parallelize-gather
  2   use-indexes                       5   scatter-in-cluster                   8   async-prefetch
  3   remove-filter-covered-by-index    6   remove-unnecessary-remote-scatter
Query Statistics:
 Writes Exec   Writes Ign   Doc. Lookups   Scan Full   Scan Index   Cache Hits/Misses   Filtered   Requests   Peak Mem [b]   Exec Time [s]
           0            0              2           0            2               0 / 0          0          9              0         0.00330
The rule is also not applicable when you query a View, even when asking only for a single document.
When using custom shard keys (not _key but, say, sk), the query explain output tells you the targeted shard if you filter on a single shard key value, like FOR doc IN t FILTER doc.sk == 4 RETURN doc (shard: s8010016):
Execution plan:
 Id   NodeType                  Site   Par     Est.   Comment
  1   SingletonNode             DBS                 1   * ROOT
  2   EnumerateCollectionNode   DBS    ✓     100000   - FOR doc IN t /* full collection scan, shard: s8010016 */ FILTER (doc.`sk` == 4) /* early pruning */
  8   RemoteNode                COOR         100000   - REMOTE
  9   GatherNode                COOR         100000   - GATHER /* parallel, unsorted */
  5   ReturnNode                COOR         100000   - RETURN doc
When you filter on multiple shard key values, like FILTER doc.sk IN [1, 4], all shards are contacted (3 shard(s)), even if the values belong to a single shard:
Execution plan:
 Id   NodeType                  Site   Par     Est.   Comment
  1   SingletonNode             DBS                 1   * ROOT
  2   EnumerateCollectionNode   DBS    ✓     100000   - FOR doc IN t /* full collection scan, 3 shard(s) */ FILTER (doc.`sk` IN [ 1, 4 ]) /* early pruning */
  8   RemoteNode                COOR         100000   - REMOTE
  9   GatherNode                COOR         100000   - GATHER /* parallel, unsorted */
  5   ReturnNode                COOR         100000   - RETURN doc
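The two explain outputs differ only in the shard annotation inside the node comment ("shard: s8010016" versus "3 shard(s)"). A small sketch of how that annotation could be classified, assuming the comment text has exactly the two forms shown in this thread; shard_targeting is a hypothetical helper name.

```python
import re

SINGLE_SHARD = re.compile(r"shard: (\w+)")
MULTI_SHARD = re.compile(r"(\d+) shard\(s\)")


def shard_targeting(comment):
    """Classify the shard annotation found in an explain comment string."""
    match = SINGLE_SHARD.search(comment)
    if match:
        return ("single", match.group(1))
    match = MULTI_SHARD.search(comment)
    if match:
        return ("multiple", int(match.group(1)))
    return ("unknown", None)


# Comment fragments copied from the two explain outputs.
print(shard_targeting("full collection scan, shard: s8010016"))
print(shard_targeting("full collection scan, 3 shard(s)"))
```

The first call reports a single targeted shard together with its name, the second reports the number of shards involved.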
You can also observe the difference in the profiling output by looking at Calls and Requests, although you should add a LIMIT 1 so that you get either 3 or 9 requests. Without it, you get a higher number that depends on the number of documents, possibly also the batchSize and other factors, and you would need to compare the stats.
With an arangosearch View, a search-alias View, or an inverted index, the profiling output shows similar request counts even when asking for only a single shard key and document, so all shards appear to be contacted (for the inverted index, the plan does show 3 shard(s)).