
High performance sorted queries · bevy · open issue, 15 comments

iiYese commented on June 21, 2024:

Comments (15)

alice-i-cecile commented on June 21, 2024:

This is a follow-up to #1470. The other thing that this opens up is efficiently querying (using dynamic queries only, since enum variants aren't types) for enum variants. If we can store tables in a sorted form, requesting all Status::Dead entities becomes genuinely fast, rather than just sugar for a .filter operation.

iiYese commented on June 21, 2024:

When should the command be executed

In the next sync point. So basically the first run would be without dense iteration, the second run will need to rebuild the sort cache, and every run after that will be dense (until you have a mutation, which will start this whole process again).

Victoronz commented on June 21, 2024:

First of all, thanks for the issue!

I'd like to go through the parts of your suggested solution:

  1. Can be done, see below

  2. This is not doable by default. It would require mutable access to Tables, i.e. &mut Tables, which is only
    obtainable by engine code with &mut World (see the bevy::ecs::storage docs); even with interior mutability, this
    would still not work. To mutate tables by default, you would need mutable access to tables whenever you can
    obtain a query. Since systems can have aliasing and disjoint queries, both within themselves and across parallel
    systems, this is not possible. Sort calls don't appear in function signatures, so the scheduler can't work around
    them either.

    To mutate a table/world you need an exclusive system or a QueryData impl that has complete, unique access to a
    table, which leads me to make this replacement suggestion:
    A set of exclusive systems that expose the same API as the sorts implemented in #13417, without the QueryIter
    parameter and commands that can run those systems.

    These would no longer use transmute functionality internally, just sort tables directly.
    Some catches with this are that archetypes can share tables, and we'd need to sort sparse sets as well.
    This may lead to unexpected behavior on the user side.
    It'd also be less "table sorts" and more "archetype sorts"

    With my suggestion, a user could manually order archetype sorts to happen in front of or after their systems, or spawn
    commands in their systems to do that.

  3. For each of these archetypes we'd store some kind of "last_sorted_by key". Together with change detection this would allow exiting early. I don't see the need to have queries themselves cache the sorts with this.

  4. Definitely perf improvements to be had here! However this will be heavily data set dependent, so good benchmarks would be needed.

  5. Functionality-wise, Iterator::nth does this. It may be useful to have some kind of indexing here though; this could be done with dedicated functions on the QuerySorted*Iter structs.

On the example itself (other than the already presented issues above):

  • FilteredEntityRef and EntityRef currently act the same in type position, so better use the shorter one.
  • To sort by EntityRef you need to include it in the original QueryData, lest you alias with disjoint mutable queries.
  • nit: the sort method that uses a cmp function is sort_by

iiYese commented on June 21, 2024:
  2. Yes it is. Flecs already does this. You would not need mut access to tables; this is why a command is used. See the first point in the additional notes. Essentially the first run of the system will not have dense iteration; it will enqueue the command to reorder tables, and the second run will have dense iteration. I don't like the alternate suggestion; it's too clumsy.
  3. You cannot store caches at an archetype level. Archetypes are not enough. If you have archetypes (A, B) & (A, B, C) with tables that look like this:
// A, B
A(0), B
A(4), B
A(5), B

// A, B, C
A(1), B, C
A(2), B, C
A(3), B, C

And you want to sort by A you need to first iter (A, B): 0..=0 then (A, B, C): 0..=2 then (A, B): 1..=2. This is just a simple case with 2 archetypes and 1 system & 1 sort key. Queries are the only place you can cache this information. Caching at a query level also allows you to do things like:

  • Sort by B
  • Stable sort by A

If you want different sorts in different systems. You will get dense iteration across A & not completely dense but still some dense iteration across B.

  4. Benchmarks are irrelevant as the worst case is simply what we already have now, which is completely random access.

  5. Iterator::nth still iterates (https://doc.rust-lang.org/src/core/iter/traits/iterator.rs.html#305); it is not an O(1) operation, as this would be.

FilteredEntityRef and EntityRef currently act the same in type position, so better use the shorter one.

I don't think you actually understand the difference between these. EntityRef gets read access to all of an entity's components, meaning you'll block parallelism if you allow this safely. FilteredEntityRef just acts like EntityRef but only has access to a subset of components; in this case it'd be whatever appears in the D: QueryData part of the query.

To sort by EntityRef you need to include it in the original QueryData, lest you alias with disjoint mutable queries.

That is another reason why it has FilteredEntityRef. You can transmute this from any D: QueryData.

iiYese commented on June 21, 2024:

I've updated the seek example to be more flexible.

Victoronz commented on June 21, 2024:
  1. It seems I misunderstood you on this. What would this command spawning look like? You say that this would use
    thread-local commands; does this mean we would be using Commands implicitly? Or are we having the users pass it?
    I still think the explicit sort systems would be useful for users, though separately, since it doesn't seem to cover your
    needs here.

  2. In that case, we'd be mutating QueryState, right? Does this fit with current ECS plans?
    In terms of execution, I would still not want it by default, because this would be hidden behavior and potentially
    introduce unexpected costs for the user on first iteration. How about a Cached enum as an additional parameter,
    instead of chained method calls? The user decides explicitly whether the sort is cached, no need for implicit cost other
    than a single, well-predicted branch. The enum would have two variants, one for cached and uncached behavior each.

  3. I'd still like to see two cases: The new version that uses adjacent runs of data vs the current version, on a dataset with
    no adjacent items after sort, and a dataset that is already sorted except for one item, i.e. the worst and best case. AFAIK,
    there are no benchmarks on this yet, so I would not like to be guessing how big of a difference it makes.

  4. I wasn't clear on this, that's what I meant!

On FilteredEntityRef: if that's how it works, I didn't actually understand it. This ability needs more docs; neither transmute_lens nor the struct itself says it can be transmuted from any QueryData.

iiYese commented on June 21, 2024:
  1. On main there's already World::commands. The problem is that this currently takes &mut World, whereas queries get &World, so the addition of World::parallel_commands, which would take &World, or just making the command queue thread local, would allow queries to enqueue commands.

  2. There's nothing incompatible about this with future ECS features. Either on QueryState or Query is fine. It will not be hidden behavior because it will be explicitly stated in the docs & there will be a way to opt out of it. With regards to an additional param, I prefer well-designed defaults with ways to opt out instead of noisy configuration.

  3. This is not guessing. I can confidently assert the difference this would make. You do not need this to be implemented to observe what is well established hardware behavior. It is the reason bevy and other archetypal ECSs tend to be orders of magnitude faster at iteration than things like sparse set ECSs. If you really want you can play around with flecs which already has this implementation.

Victoronz commented on June 21, 2024:
  1. Taking a broader view, are there any other things that need caching on the Query level? To me this sounds like a problem
    that isn't unique to sorting queries. Would it make sense to make use of some shared infrastructure here?

    I think we disagree here. Of course this will depend on implementation, but as per my current understanding:
    My issue with this is that sort itself would queue the caching, and no_reorder would unqueue that
    caching within the same method chain.
    Not wanting to cache sorts will introduce that configuration noise you speak of into an iter chain that one
    would like to for-loop over.
    How about we have some call that takes Query mutably and mutates some queue_sort_cache setting, causing
    subsequent sorts to follow that behavior? (Sorts would use an already present cache regardless of this setting.)
    If you sort the same query twice in a different way in the same system, the "no-reorder noise" can be placed
    before the iteration loop, and if a user still wants a sort to be cached afterwards, they can queue the
    command manually.

    This would also mean we could introduce some global setting that enables or disables this behavior entirely.
    We would still need a default, but this would be cleaner IMO.

  2. I know that it is faster and can be by a magnitude, and I guess this wouldn't be a blocker, but I would like to be able to show
    by how much!

Victoronz commented on June 21, 2024:

3./4./5. Most improvements to QuerySortedIter should also be applied to QueryManyIter and its future offshoot types.
QuerySortedIter is basically a special case of the more general QueryManyIter.

iiYese commented on June 21, 2024:

Taking a broader view, are there any other things that need caching on the Query level? To me this sounds like a problem that isn't unique to sorting queries. Would it make sense to make use of some shared infrastructure here?

With queries as entities queries can have row polymorphism like everything else that's at an ECS level. This means caching is no longer limited to either:

  • A monolith struct with redundant fields
  • A zoo of types

Query entities can have different components for whatever caches they need. Aside from sort caches they will need other cache structures as we add new features. Two things that come to mind:

  • Immanent: Entity relations (makes query caches more structural, whereas currently they're very flat)
  • Much further down the line when we catch up to flecs: Query planner (changes the order in which terms are evaluated for faster query construction)

I would not worry about queries as entities as they're one of the prerequisite/accompanying features for relations and the sort cache can be refactored when they land.

Victoronz commented on June 21, 2024:

Another question I have about the final API example:
Currently, a user calls iter, then sort, here this is reversed. When sort also lands for iter_many, how would this work?
The user can no longer specify a subset of a query to sort. I.e. if a user wishes to sort 3 out of a 1000 entities, then they do not wish to pay for a full query sort in advance.
Additionally, since the actual table is only sorted after the first run, the query.sort call would have to return a sorted Vec of query results to iterate over. Entity is not a simple index into this, and its matching item would instead have to be found in a more expensive manner.

iiYese commented on June 21, 2024:

Currently, a user calls iter, then sort, here this is reversed. When sort also lands for iter_many, how would this work?

This will either remove the existing API or exist alongside it.

The user can no longer specify a subset of a query to sort. I.e. if a user wishes to sort 3 out of a 1000 entities, then they do not wish to pay for a full query sort in advance.

This sounds like an esoteric misfeature. How would a user know exactly which subset of items they would want sorted in the first place? There are zero guarantees about iteration order.

Additionally, since the actual table is only sorted after the first run, the query.sort call would have to return a sorted Vec of query results to iterate over. Entity is not a simple index into this, and its matching item would instead have to be found in a more expensive manner.

There should be no single-use vectors. QueryState/Query (wherever the cache ends up) gets 2 new fields:

  • sort_cache: Vec<(TableID, RangeInclusive<usize>)>
  • sort_cursor: (usize, usize)

If .seek is not used the sort_cursor is reset to (0, 0).

Victoronz commented on June 21, 2024:

The user can no longer specify a subset of a query to sort. I.e. if a user wishes to sort 3 out of a 1000 entities, then they do not wish to pay for a full query sort in advance.

This sounds like an esoteric misfeature. How would a user know exactly which subset of items they would want sorted in the first place? There are zero guarantees about iteration order.

E.g. if we want to sort the children of an entity: the Children or some other component stores a list of entities, which we retrieve by iter_many, then sort. I do not find this esoteric at all.
This would only be relevant if this is intended to replace the existing API.

iiYese commented on June 21, 2024:

That's not a subset of entities in a query; that's mapping an entity list. Which is at best niche because:

  • Children already has sort functions which you can use instead to get persistent ordering that you don't sort every run.
  • Component hooks & observers will allow people to ensure any data structures with entities they need sorting for are always sorted.
  • Children & bevy_hierarchy will be replaced with relations which will not have entity lists. The only way to get sorting period then will be at a query level.

Adamkob12 commented on June 21, 2024:

Essentially the first run of the system will not have dense iteration, it will enqueue the command to reorder tables & the second run will have dense iteration.

When should the command be executed? If it's executed right away and reorders the Table, then it's very likely that the components will be mutated before the second sort is needed, requiring another table reordering just in time for the sorted query.

The logical conclusion is that the reordering of the table must happen right before the system that requires the sort runs (or after the last mutable access, but it's impossible to know when that would be).

After thinking about it, I think there are 3 options that allow for dense, sorted iteration:

  1. Designate certain archetypes as "hot", meaning their tables would have to sort themselves after each mutation (only one sort per archetype)
  2. Let the user call a command to sort archetypes manually (A bit of a footgun)
  3. Reorder the table right at the .sort() call, and then iterate densely (need a special type of Query for that, because no one must access the table at the time of the sort)
