elidupree / time-steward

Write games and simulations in Rust, using reactive programming for smoothness and replicability.

License: MIT License

Rust 97.06% HTML 0.40% JavaScript 1.96% CSS 0.51% Nix 0.07%

time-steward's Issues

Optimize by not rehashing DeterministicRandomIds

DeterministicRandomId can be used as a hash table key without applying another hash function to it. Its cousin, FieldId, can also be used directly by XORing the ColumnId with part of the RowId. Since TimeIds are supposed to be unique, they can probably be used the same way, although that would mean committing to never generating "beginning of moment" ExtendedTimes. (Generating those is not currently implemented, but is not yet forbidden either.)

This could be implemented as a custom Hasher with std::collections::HashMap. However, we may be writing a custom hash map type anyway.
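The custom-Hasher route can be sketched like this. Note that `DeterministicRandomId` here is a stand-in newtype for illustration, not the crate's real type: the Hasher treats the key's bytes as already uniformly distributed and simply passes them through.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

// Illustrative stand-in for the crate's id type.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct DeterministicRandomId(u64);

// A Hasher that does no real hashing: the id is already random, so we just
// fold its bytes into the output.
#[derive(Default)]
struct PassThroughHasher(u64);

impl Hasher for PassThroughHasher {
    fn write(&mut self, bytes: &[u8]) {
        // For an 8-byte id this reconstructs the id itself.
        for &b in bytes {
            self.0 = (self.0 << 8) | b as u64;
        }
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

type IdMap<V> = HashMap<DeterministicRandomId, V, BuildHasherDefault<PassThroughHasher>>;

fn id_map_demo() -> Option<&'static str> {
    let mut map: IdMap<&'static str> = IdMap::default();
    map.insert(DeterministicRandomId(0x9e3779b97f4a7c15), "event");
    map.get(&DeterministicRandomId(0x9e3779b97f4a7c15)).copied()
}
```

A custom hash map type would make the pass-through explicit rather than routing it through the Hasher trait at all.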

Write a TimeSteward that takes advantage of parallelism

If N random events – near the same time, but at different locations – have very low chance of interfering with each other, then we can theoretically make use of N processors at almost 100% efficiency.

Naturally, parallelism raises some practical challenges. However, this is an important goal.

Decide on a license

I'm considering a bunch of license options, from the least restrictive (MIT) to the most restrictive (AGPL).

There are 2 main problem scenarios I want to avoid:

  1. A random developer considers using the time steward for their project, but doesn't do it because the license is incompatible (or because they don't know that it is).
  2. Somebody makes a proprietary version of the TimeSteward, and the Free version gets abandoned.

1 is much more likely than 2, obviously, so I should consider leaning towards less restrictive licenses, but that doesn't necessarily give me an easy answer.

There's a third scenario that might be nice to optimize for, but might be too difficult: 3. A big game studio makes a commercial game using the time steward and doesn't pay me any $$$ >:-(

I mean, I'd like to be paid for my labor, but it might be too impractical to do that – the easiest way would be if I hold onto the copyright for all code in the time steward, but it would be nice to be able to receive contributions from other Free Software developers without doing weird copyright negotiations.

Garbage collection?

TimeSteward is theoretically ideal for incremental garbage collection: it is already obligated to represent its data as small, separable chunks with links between them, and retain immutable copies of old state.

(Here follows some not-perfectly-explained thoughts; maybe I will rewrite them when I'm in a clearer mental state.)

The basic idea is to record when each object is created, and incrementally trace through all objects based on their links, starting at the TimeSteward globals. When a trace is finished, it then iterates through all objects that have been allocated, and drops the ones that were created before the trace started but not reached by the trace.

In practice, implementing this will be very complicated. Some things to consider:

  • In what ways does the garbage collection need to support concurrency?
  • Garbage collection depends on user code to implement related traits (such as tracing). But it should remain memory-safe even if the user implements them wrong.
  • A neat way to do it would be to trace the data at a particular ExtendedTime – specifically, a time when forget_before() has been called, meaning that the data can no longer change before that. As the forgotten time goes forward, eventually everything that's inaccessible will be dropped. However, what about the case where lots of computation is done WITHOUT the forgotten time moving forward? If you ran out of memory while doing that, and the garbage collector wasn't able to free that memory, that would be bad. If we required client code to be able to retain snapshots of the full history instead of just a single moment, this can be handled well, and that would be a nice feature in general… But we might need fancier data structures for that, since it would be too inefficient to simply clone whole DataTimelines.
  • If we're doing this much memory management, maybe it would be possible to avoid using malloc() as well? There's some possible ideas about using memory pools and moving old objects around, leaving behind pointers to where they moved to. Of course, there's a trade-off between saving the work of malloc() and doing extra work moving objects and each time you follow a link (to check whether it has moved).
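The core of the trace described above can be sketched as follows. Everything here is illustrative (a hypothetical `Trace` trait and plain u64 object ids); a real version would trace incrementally, bounding the work done per step.

```rust
use std::collections::HashSet;

type ObjectId = u64;

// User code would implement something like this so the collector can follow
// links; memory safety must not depend on it being implemented correctly.
trait Trace {
    fn links(&self) -> Vec<ObjectId>;
}

// Mark everything reachable from the globals. Objects allocated before the
// trace started but never marked can then be dropped.
fn reachable(globals: &[ObjectId], get_links: &dyn Fn(ObjectId) -> Vec<ObjectId>) -> HashSet<ObjectId> {
    let mut marked: HashSet<ObjectId> = HashSet::new();
    let mut stack: Vec<ObjectId> = globals.to_vec();
    while let Some(id) = stack.pop() {
        if marked.insert(id) {
            // A real implementation would do this incrementally; here we
            // trace to completion in one pass.
            stack.extend(get_links(id));
        }
    }
    marked
}
```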

Is QueryOffset::Before worth it?

I wanted to include this for auditing, so that the auditing code can query immediately before and after an event. Then I exposed it to the Accessor query interface because it was easy to do so. But is it good?

  • It adds significant boilerplate to every query, and I've never actually used the Before variant in the few examples I've made so far.
  • A DataTimeline that wanted to provide Before queries could implement that as a separate query type of its own, in the same way that you could make a DataTimeline that answers queries about X seconds in the past.
  • It creates weirdness if you create a DataTimeline in an event and then query it using Before in the same event. (This is related to the question of whether DataTimelineCells should need a creation time.)

Caching features

Caching can be important for efficiency, but it is inherently dangerous to determinism. Maybe we should provide caching types that are impossible to use unsafely and/or can have runtime checks enabled to detect whether they cause nondeterminism.

Constant caches can be included in a Basics::Constants. They should probably serialize to nothing by default. If we write consistency checks, they have to work differently from field consistency checks (since caches, unlike fields, are allowed to differ between simulations).

Maybe we can use adapton for this? It is not currently easy to learn (few examples), but if it turns out to fit this use case, adopting it and writing our own examples would probably be better than implementing our own caching system from scratch.

Code cleanup

So far, I have been focusing on assembling code quickly so that I can have a working prototype. This has left the code in a somewhat messy state: structs and impls aren't in a consistent order, vestigial glue code is still used in some places, and many unnecessary warnings have not been fixed.

I intend to go through the code and do a cleanup pass, at some point as I approach an MVP.

It may be convenient for this to wait on more API stabilization.

Properly deal with the #[derive] bounds issue

Because of rust-lang/rust#26925, Basics currently requires a bunch of unnecessary supertraits, just to make it possible to use #[derive] in situations where you should normally be able to use it. Eventually, we should have a better approach. Possible solutions:

  • Implement the traits manually. Probably a bad idea.
  • Solve the problem in the Rust compiler. This is likely too difficult for me to do myself, especially because of the backwards-compatibility issues, as discussed in the issue.
  • Use rust-derivative to derive with custom bounds. A decent compromise, except that rust-derivative doesn't support PartialOrd, Ord, Serialize, or Deserialize. The first two are planned (mcarton/rust-derivative#3), but it's not clear what the schedule is. I might someday consider submitting pull requests to rust-derivative (and/or serde?), if no one else resolves it first.

Eliminate large single-operation costs

One advantage of the TimeSteward is that you will be able to take snapshots (such as for a save file or certain networking things) asynchronously. That is, you will be able to copy all fields incrementally, without stalling the simulation for the user.

This purpose is kind of defeated by the fact that we currently use HashMap to store all the fields. HashMap must synchronously move all current field data when reallocating the table. We should use a map type that can resize incrementally.

When deciding what data structure to use, we should also consider how it might enable other potential long-term goals, like persistence or concurrency.
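One way to amortize resizing, sketched hypothetically below, is to keep two generations of table and migrate a bounded number of entries per operation, so no single insert ever pays the full rehash cost. (std's HashMap is used internally only for brevity; a real version would manage its own buckets so the inner tables never reallocate.)

```rust
use std::collections::HashMap;
use std::hash::Hash;

const MIGRATE_PER_OP: usize = 4;

struct IncrementalMap<K: Hash + Eq + Clone, V> {
    old: HashMap<K, V>, // previous generation, being drained
    new: HashMap<K, V>, // current generation, receiving all writes
    threshold: usize,
}

impl<K: Hash + Eq + Clone, V> IncrementalMap<K, V> {
    fn new() -> Self {
        IncrementalMap { old: HashMap::new(), new: HashMap::new(), threshold: 8 }
    }

    // Move a few entries from the old generation; O(1) amortized per call.
    fn migrate_some(&mut self) {
        for _ in 0..MIGRATE_PER_OP {
            let key = match self.old.keys().next() {
                Some(k) => k.clone(),
                None => return,
            };
            let value = self.old.remove(&key).unwrap();
            self.new.insert(key, value);
        }
    }

    fn insert(&mut self, key: K, value: V) {
        self.migrate_some();
        if self.old.is_empty() && self.new.len() >= self.threshold {
            // Start a new generation instead of rehashing everything at once.
            self.threshold *= 2;
            self.old = std::mem::replace(&mut self.new, HashMap::with_capacity(self.threshold));
        }
        self.old.remove(&key);
        self.new.insert(key, value);
    }

    fn get(&self, key: &K) -> Option<&V> {
        self.new.get(key).or_else(|| self.old.get(key))
    }

    fn len(&self) -> usize {
        self.old.len() + self.new.len()
    }
}
```

The same two-generation idea also composes reasonably with snapshotting, since the old generation is read-only except for migration.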

Automated testing of TimeSteward internals

Our main tool for testing TimeSteward behavior should be cross-verified time stewards: if two time stewards receive the same valid input and give different results, there is an internal error. However, this testing should be careful not to give false positives when the caller gives invalid input.

I began implementing something like this, but ran into trouble with Rust polymorphism limitations. In the short term, working around them may require a whole pile of macros.

We can also make a wrapper class that tests whether a TimeSteward obeys the valid_since() rules.
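The comparison at the heart of cross-verification can be sketched generically. The closures below stand in for real TimeSteward snapshot-and-serialize calls, which is an assumption about the eventual API, not the API itself:

```rust
// Run two steward implementations on the same input and compare serialized
// snapshots at chosen times. Differing snapshots on identical valid input
// indicate an internal error; returns the first time at which they diverge.
fn cross_verify<A, B>(mut steward_a: A, mut steward_b: B, times: &[u64]) -> Result<(), u64>
where
    A: FnMut(u64) -> Vec<u8>,
    B: FnMut(u64) -> Vec<u8>,
{
    for &t in times {
        if steward_a(t) != steward_b(t) {
            return Err(t);
        }
    }
    Ok(())
}
```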

Automated testing of TimeSteward callers

There are various ways that TimeSteward callers can misbehave, which we should find ways to audit for.

  • Using nondeterministic code in predictors or events.
  • Using field types that have lossy serialization or don't have an exact match between Eq and serialization equality.
  • Using unsafe_now() improperly.
  • Using (column, event, predictor) types that are not included in Basics::IncludedTypes.
  • Using nonrandom data in (column, event, predictor) ids.

Rethink valid_since() and forget_before() API

When I created the valid_since() concept, forget_before() didn't exist, and "when can snapshots be taken?" was the same as "when can fiat events be modified?". Now, simple_flat defaults to retaining enough data to take old snapshots, but can't insert old fiat events.

forget_before() is designed to allow memory to be freed – after you call forget_before(), you can't do anything before that time (except refer to snapshots you already took), but all TimeSteward implementors retain the ability to take new snapshots in all cases EXCEPT where you call forget_before(). Currently, valid_since() only determines when you're allowed to create fiat events, but its name isn't quite right for that.

It seems inconsistent that TimeSteward implementors are required to report valid_since() but aren't required to report the most recent time they have forgotten-before.

EventHandle should not implement Ord

Ordering EventHandles by time forces their Eq implementation to be "equality by ExtendedTime", but this means that 2 different event handles compare equal even if one of them is an obsolete prediction that has been destroyed and replaced by a new prediction. This is confusing.

Event handles should probably implement Eq by object identity.

simple_flat and simple_full currently depend on this ordering, and it seems generally desirable to be ABLE to put event handles in sorted data structures, so we should provide a simple wrapper that implements Ord.
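A minimal sketch of both halves, with illustrative types (EventData and its time field are stand-ins): equality by object identity on the handle itself, and a separate opt-in wrapper that orders by time.

```rust
use std::cmp::Ordering;
use std::rc::Rc;

struct EventData {
    time: u64,
}

#[derive(Clone)]
struct EventHandle(Rc<EventData>);

// Equality by object identity: two handles are equal only if they point at
// the same allocation, so an obsolete prediction never equals its replacement.
impl PartialEq for EventHandle {
    fn eq(&self, other: &Self) -> bool {
        Rc::ptr_eq(&self.0, &other.0)
    }
}
impl Eq for EventHandle {}

// Opt-in ordering by time, kept out of EventHandle itself.
struct ByTime(EventHandle);

impl PartialEq for ByTime {
    fn eq(&self, other: &Self) -> bool {
        (self.0).0.time == (other.0).0.time
    }
}
impl Eq for ByTime {}
impl PartialOrd for ByTime {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for ByTime {
    fn cmp(&self, other: &Self) -> Ordering {
        (self.0).0.time.cmp(&(other.0).0.time)
    }
}
```

The wrapper makes the "equal times, different events" semantics explicit at the use site instead of hiding it in the handle's own Eq.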

Take advantage of GPU computations?

The TimeSteward model assumes that a very large number of things do individual, relatively small computations according to the same rules. Theoretically, this is ideal for massively parallel programming.

This is a far-future goal, because the state of GPU programming support (across target platforms) is not very good currently, and there may be incompatibilities between our current implementations and GPU abilities (for instance, function pointers are not necessarily compatible with GPU control flow limitations).

Modification protocol

Currently, queries have a structured protocol, but for modifications, you just pass in a closure that takes an &mut DataTimeline.

I haven't found any technical reason why modifications would benefit from a stricter protocol, but there might be one, and the current arrangement seems strange.

A stricter protocol could also help audit that the canonical modification behavior is always the same.

API stabilization

Most of the current API functions are okay.

The most important immediate issue is related to serialization. time_steward should provide features that make it easy to:

  • Serialize a snapshot
  • Deserialize a snapshot
  • Construct a new TimeSteward from a deserialized snapshot + predictors (+ constants?)

Is unsafe_now() the best way to serve its purpose?

Should rng(), random_id(), and constants() remain as they are? Right now, they are trait methods, which means that they could be implemented in different ways by each trait implementor. But they have simple, fixed ways they are supposed to behave. This leads to duplicate code and potential bugs.

ValidSince should have a method indicating whether it includes a particular base time. Perhaps it should implement Ord for itself as well.

insert_fiat_event() and erase_fiat_event() should probably return Result <(), FiatEventOperationError>.
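The ValidSince proposal could look like the following. The variant names and the u64 base time are simplifying guesses at the intent; the Ord implementation is written so that "less" means "permits earlier times".

```rust
use std::cmp::Ordering;

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum ValidSince {
    TheBeginning,
    Before(u64),
    After(u64),
}

impl ValidSince {
    /// Whether operations at `time` are permitted.
    fn includes(&self, time: u64) -> bool {
        match *self {
            ValidSince::TheBeginning => true,
            ValidSince::Before(t) => time >= t,
            ValidSince::After(t) => time > t,
        }
    }
}

impl Ord for ValidSince {
    fn cmp(&self, other: &Self) -> Ordering {
        use ValidSince::*;
        match (*self, *other) {
            (TheBeginning, TheBeginning) => Ordering::Equal,
            (TheBeginning, _) => Ordering::Less,
            (_, TheBeginning) => Ordering::Greater,
            (Before(a), Before(b)) | (After(a), After(b)) => a.cmp(&b),
            // Before(t) permits t itself, so it is "earlier" than After(t).
            (Before(a), After(b)) => a.cmp(&b).then(Ordering::Less),
            (After(a), Before(b)) => a.cmp(&b).then(Ordering::Greater),
        }
    }
}

impl PartialOrd for ValidSince {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

#[derive(Debug, PartialEq)]
enum FiatEventOperationError {
    InvalidTime,
    InvalidInput,
}

// Sketch of the proposed Result-returning check for fiat event insertion.
fn insert_fiat_event_checked(valid_since: ValidSince, time: u64) -> Result<(), FiatEventOperationError> {
    if valid_since.includes(time) { Ok(()) } else { Err(FiatEventOperationError::InvalidTime) }
}
```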

Query by reference?

An event might want to look around in a medium-sized DataTimeline in a way that would be more efficient using references than by first copying all the data it might use.

This is tricky because it involves making the query API much more complex, and probably returning guards rather than plain references.

There might be other approaches that could accomplish the same thing.

Fix endianness issues

Currently, we rely on SipHasher, which is definitely not endian-safe because the default implementations of Hasher functions use mem::transmute(). We MIGHT be able to work around this by having SiphashIdGenerator use only write(), and implementing the rest of the Hasher functions in an endian-safe way.

However, we also need to be on the lookout for Hash implementations that are not endian safe. If #[derive (Hash)] doesn't always produce endian-safe code, we will have to avoid Hash and Hasher entirely.

Whatever solution we use, we should create #[test] functions that check the output of a few known inputs to make sure the generation is behaving consistently for every build of time_steward.
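The "route everything through write() with an explicit byte order" idea can be sketched as a generic wrapper. This is an illustration, not the crate's SiphashIdGenerator; the inner hasher could be anything, and write_u8/write() are already endian-neutral so only the multi-byte defaults need overriding.

```rust
use std::hash::Hasher;

// Force every multi-byte integer write through `write()` in little-endian
// order, so the digest is identical regardless of the platform's endianness.
struct EndianSafe<H: Hasher>(H);

impl<H: Hasher> Hasher for EndianSafe<H> {
    fn finish(&self) -> u64 {
        self.0.finish()
    }
    fn write(&mut self, bytes: &[u8]) {
        self.0.write(bytes);
    }
    // Override the defaults, which use native byte order.
    fn write_u16(&mut self, n: u16) { self.0.write(&n.to_le_bytes()); }
    fn write_u32(&mut self, n: u32) { self.0.write(&n.to_le_bytes()); }
    fn write_u64(&mut self, n: u64) { self.0.write(&n.to_le_bytes()); }
    fn write_usize(&mut self, n: usize) { self.0.write(&(n as u64).to_le_bytes()); }
    fn write_i64(&mut self, n: i64) { self.0.write(&n.to_le_bytes()); }
}

// The kind of known-answer consistency a #[test] could check: hashing an
// integer through the wrapper equals hashing its little-endian bytes directly.
fn demo_consistent() -> bool {
    use std::collections::hash_map::DefaultHasher;
    let mut a = EndianSafe(DefaultHasher::new());
    a.write_u64(42);
    let mut b = DefaultHasher::new();
    b.write(&42u64.to_le_bytes());
    a.finish() == b.finish()
}
```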

EventRng may also be a concern. The Rng functions that generate floats use mem::transmute(), which probably isn't safe. Since floats are forbidden anyway, we can override those defaults with a simple panic. fill_bytes() currently does not use transmute, but has a comment implying that doing so might be reasonable under some circumstances, so we need to beware that the rand crate might change implementations in a way that causes trouble for us.

Make better manual Debug impls

With our cyclic data structures, the default Debug impl overflows the stack instead of displaying something reasonable. I should fix this by making manual impls that are somehow restrained in their recursion.
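One restrained pattern, shown here on an illustrative parent/child structure (the cycle runs through the parent back-links): print only an identifying field for the links that close the cycle, and recurse only in the acyclic direction.

```rust
use std::cell::RefCell;
use std::fmt;
use std::rc::{Rc, Weak};

struct Node {
    id: u64,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

impl fmt::Debug for Node {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("Node")
            .field("id", &self.id)
            // Print only the parent's id; never recurse upward into the cycle.
            .field("parent", &self.parent.borrow().upgrade().map(|p| p.id))
            .field("children", &self.children.borrow())
            .finish()
    }
}

fn debug_demo() -> String {
    let parent = Rc::new(Node { id: 1, parent: RefCell::new(Weak::new()), children: RefCell::new(Vec::new()) });
    let child = Rc::new(Node { id: 2, parent: RefCell::new(Rc::downgrade(&parent)), children: RefCell::new(Vec::new()) });
    parent.children.borrow_mut().push(child);
    format!("{:?}", parent)
}
```

A depth counter threaded through a custom formatter would generalize this to structures whose cycles aren't known in advance.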

Provide convenient floating-point emulation

Ordinary floating-point numbers are nondeterministic. However, users will certainly want to use them. If we don't provide convenient emulation, they will be tempted to try to circumvent the rules or implement their own questionably-safe alternatives.

MPFR may be suitable for this?

Review all panic messages

Some of my panic messages are good. Others are not. Others shouldn't be panics at all, but Results instead.

Use a faster hash algorithm?

Time spent hashing is currently a minority of the overhead, but not insignificant (10%-ish).

Apparently, siphash128 is no longer "experimental" (what's the hard rationale for this?).

HighwayHash is also worth considering (apparently it's much faster with SIMD? Although that improvement would be dependent on platform support).

Refine TimeSteward macros

  • All macros that accept struct definitions should permit trailing commas. Allowing trailing commas in other contexts is also desirable.
  • Also, where clauses.
  • Consider whether it's possible to remove the [] requirement from generic parameters and where clauses.

There's a bit of the trade-off between usability and maintainability here, so it's not necessarily good to allow more things just because we can (if the implementation is too complicated).

When generic associated types become available…

  • The API can finally all be defined in the actual api module, rather than in a macro.
  • Accessor can have an associated read-guard type so that snapshots don't have to keep RefCells when they would be happier just returning regular references.

Optimization features

One possibility: user can provide a function FieldId->[(PredictorId, RowId)] that lists predictors you KNOW will be invalidated by a change to that field, then have that predictor run its get() calls with an input called "promise_inferred" or something so that we don't spend time and memory recording the dependency. (Can we also do something like this for events? It would be at least a little more complicated.)

Another: a predictor might have a costly computation to find the exact time of a future event, which it won't need to do if it gets invalidated long before that time comes. For that, we can provide a defer_until(time) method. (This is probably premature optimization until/unless we actually develop a simulation where it would help.)

Put the "rowless" code at the top level, since it's almost as complete as the old code

What to do with the old code? Delete it? Put it in a subdirectory?

Deleting seems appropriate, considering that we do have the git history. If we want to move it into a subdirectory AND have it continue compiling successfully, it would require rewriting a lot of module paths in use statements. And it certainly seems desirable not to spend time compiling it when we only want to use the new stuff.

To be able to conveniently reference the old code, maybe I should just make a branch at the last commit where the old code still exists in the repo.

This change isn't trivial:

  • At the top level, api.rs and api_macros.rs will be deleted, but deterministic_random_id.rs is shared with the new code in its current form, and lib.rs will need edits.
  • src/support/collision_detection/ is old, but the other things in src/support/ are compatible.
  • src/implementation_support/common.rs has a few reused functions, which should be merged into the current src/rowless/implementation_support/common.rs, but is mostly old. src/implementation_support/insert_only.rs is shared. src/implementation_support/data_structures.rs isn't actually used by the new code at all, but isn't dependent on the old code, so it shouldn't be deleted (but it's not technically proper for it to stay where it is if it doesn't help implement time stewards?) src/implementation_support/list_of_types.rs is old.
  • Everything in src/stewards/ is old.
  • A bunch of the examples are old.
  • The new-API examples will need to be updated to remove rowless:: from all their use statements, but otherwise should work. We also need to update the links in the HTML files after removing "rowless" from some example names.
  • Most of the code in src/rowless/ deliberately uses relative paths to ease this change.

Automated profiling of TimeSteward simulations

There are a lot of statistics that would be useful for developing/optimizing TimeSteward simulations. Many of them aren't trivial to compute using client-side code. We should include features for getting some of the statistics, such as:

  • A visualization of how event dependencies propagate throughout the simulation
  • Distribution of (number of dependent events) over time (think "how far back in time can I go before one event will explode to the whole simulation")
  • Distribution of sizes of fields (which we can approximate through the Serialize trait)
  • Stats about loops; in most simulations, we hope that almost everything happens on the first iteration, so detecting the frequency of iterations beyond the first is useful.

Fiat events need to be serializable

To make a standard way of synchronizing a simulation over the network, we need a way to transmit the fiat events, which means that they need to be serializable.

It's not obvious what the API for this should be. Is the user obligated to make a struct and implement Fn for it? Should we create our own trait for events (and maybe for predictors as well)? Can we provide macros that make this easier, to make up for losing the convenience of plain closures?
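One of those options, an explicit event trait with a stable wire encoding, might look like this. The trait name, method set, and byte format are all placeholders for discussion, not a decided API.

```rust
// A fiat event as an explicit trait object instead of a bare closure, so it
// can be transmitted over the network.
trait FiatEvent {
    /// Apply the event's effect (the accessor argument is omitted here).
    fn execute(&self);
    /// Stable, deterministic wire encoding.
    fn serialize(&self) -> Vec<u8>;
}

// An illustrative concrete event.
struct SpawnCircle {
    x: i64,
    y: i64,
}

impl FiatEvent for SpawnCircle {
    fn execute(&self) {
        // Would mutate simulation state via an accessor.
    }
    fn serialize(&self) -> Vec<u8> {
        let mut bytes = Vec::new();
        bytes.extend_from_slice(&self.x.to_le_bytes());
        bytes.extend_from_slice(&self.y.to_le_bytes());
        bytes
    }
}
```

A macro could derive both methods from the struct definition, recovering most of the convenience of plain closures.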

SimpleTimeline interface polishing

  • Should it really be able to report the time/event that set the data to its current value? This makes serialized snapshots bigger, and is often unnecessary or misleading (as I found in simple_diffusion when I modified a SimpleTimeline just to change a field other than the one based on the last change time). If it DOES report, it should presumably report the EventHandle instead of just the ExtendedTime. Originally, I only included this feature because it happened to be easy to provide, but it adds some annoyances. On the other hand, storing the time as a 64-bit object inside VaryingData is only a small amount of memory overhead, and this is SimpleTimeline, not an especially optimized timeline.
  • Theoretically, it no longer needs to force wrapping its data in an Option. You could construct it with whatever initial value you wanted. However, this is still awkward if it's required to report the event, because the initial value won't have an associated event. Discarding the initial value in forget_before() would also make the code more complicated.
  • What if timelines can only be created in events, and are forbidden from being queried before the creation time? Then we wouldn't need a separate "initial value", and could always return just data (or data + EventHandle).
  • In the absence of query-by-reference, is it worth optimizing by making a query implementation that has a fn(&VaryingData)->Value generic parameter and returns a Value? Or is this something that should be handled on the TimeSteward-API end?
  • Is there a nice way to make several variants of the SimpleTimeline concept? Say, ones that do or don't report EventHandles, ones that do or don't have the query-tracking tree...

Tracking issue for current very-disruptive API changes

Currently, I hesitate to make too many more test cases (and even support libraries) because I'm going to have to update all of them in loads of places when I make API changes.

I'm hoping to settle these ones in particular:

Checklist for "deciding what to do":

  • #32, garbage collection (I don't necessarily need to implement garbage collection, just figure out how it will affect the DataHandle API)
  • #34, SimpleTimeline interface
  • #35, query by reference
  • #36, QueryOffset::Before
  • #46, StewardData

Checklist for actually implementing it:

  • #32, garbage collection
  • #34, SimpleTimeline interface
  • #35, query by reference
  • #36, QueryOffset::Before
  • #46, StewardData

Networking support

The TimeSteward is designed with networking in mind – especially for the case of keeping a simulation synchronized on 2 or more computers. Any full TimeSteward implementation is inherently suitable for networking, but we should go beyond this. The time_steward crate should provide a default networking system to do this, so that developers can easily build a networked simulation without having to write very much of their own networking code.

Provide a deterministic alternative to HashMap

We could implement a deterministic HashMap type (i.e. one where the iteration order depends only on the elements contained). For instance, it could use linear hashing and have each bucket contain a sorted vector (or B-tree) of elements.

We might want to make groups more inherent to the TimeSteward. Without special features, storing even a deterministic HashMap in the TimeSteward costs O(n) operations per event that makes a single insertion or deletion. Reducing that back to O(1) would be desirable. On the other hand, storing large amounts of data in a single field is discouraged, so we might not want to spend extra effort to support doing that. And even if we do support it, a deterministic HashMap type may still be useful DURING single events.
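A toy version of the idea, assuming a fixed-key hash (DefaultHasher::new() uses fixed keys, so its output is stable across runs): keep entries sorted by hash of key, so iteration order depends only on the contents. Real linear hashing with sorted buckets would avoid the O(n) middle insert shown here, and a real version must also break ties between colliding hashes, which this sketch ignores.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

struct DeterministicMap<K, V> {
    // Sorted by the 64-bit hash of each key; hash collisions are ignored
    // here but must be tie-broken deterministically in a real version.
    entries: Vec<(u64, K, V)>,
}

impl<K: Hash + Eq, V> DeterministicMap<K, V> {
    fn new() -> Self {
        DeterministicMap { entries: Vec::new() }
    }

    fn hash_key(key: &K) -> u64 {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        h.finish()
    }

    fn insert(&mut self, key: K, value: V) {
        let h = Self::hash_key(&key);
        match self.entries.binary_search_by_key(&h, |e| e.0) {
            Ok(i) if self.entries[i].1 == key => self.entries[i].2 = value,
            Ok(i) | Err(i) => self.entries.insert(i, (h, key, value)),
        }
    }

    fn get(&self, key: &K) -> Option<&V> {
        let h = Self::hash_key(key);
        self.entries
            .binary_search_by_key(&h, |e| e.0)
            .ok()
            .filter(|&i| self.entries[i].1 == *key)
            .map(|i| &self.entries[i].2)
    }

    // Iteration order depends only on the set of keys, never insertion order.
    fn iter(&self) -> impl Iterator<Item = (&K, &V)> {
        self.entries.iter().map(|e| (&e.1, &e.2))
    }
}

fn order_is_deterministic() -> bool {
    let mut a = DeterministicMap::new();
    let mut b = DeterministicMap::new();
    for i in 0..10u32 {
        a.insert(i, ());
    }
    for i in (0..10u32).rev() {
        b.insert(i, ());
    }
    let ka: Vec<u32> = a.iter().map(|(k, _)| *k).collect();
    let kb: Vec<u32> = b.iter().map(|(k, _)| *k).collect();
    ka == kb
}
```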

Use new polymorphism features when they arrive in Rust

TimeStewardLifetimedMethods and TimeStewardStaticMethods are hacks to work around the current limitations of Rust polymorphism. In the future, Rust will hopefully provide features that allow us to do these things with only one trait, TimeSteward.

This will also make it much easier to write code that is generic in the TimeSteward type. Our libraries, such as the collision detection, will no longer need to use awkward macros.

This will break compatibility with older versions of the TimeSteward, but some breaking changes are inevitable for this.

Figure out what I really mean by StewardData

Currently, various parts of the code require trait StewardData, but I'm not sure they have the same actual requirements, or if the requirements I've chosen are exactly the correct ones.

Also, if/when StewardData is actually the correct concept, if it's just a collection of supertraits, I probably want to make a blanket impl so that you don't have to implement it yourself all over the place.

Provide an easy way to generate large batches of ColumnId, etc.

We currently place upon the user the unchecked requirement to use secure random data for ColumnId, PredictorId, and EventId construction. Lazy users typing in nonrandom numbers would be awful, so it is critical that we make it as easy as possible to do the right thing.

This presumably needs to be cross-platform (not just a shell script that you can run on Linux).

We should also do more automatic checks to try to guarantee randomness (for instance, ban 0, and have more user-friendly checks for when you accidentally use the same id more than once).
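A cross-platform batch generator might be as small as this sketch. RandomState's per-instance keys provide process-local entropy without extra crates, which is convenient for illustration only: a real tool should draw from a cryptographically secure source (e.g. the getrandom crate), and the `ColumnId(0x…)` output format is an assumption.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

// NOT cryptographically secure; illustration only.
fn random_u64() -> u64 {
    loop {
        let n = RandomState::new().build_hasher().finish();
        if n != 0 {
            // 0 is banned so that accidentally zeroed ids are caught.
            return n;
        }
    }
}

// Emit ready-to-paste id constants.
fn id_batch(count: usize) -> Vec<String> {
    (0..count)
        .map(|_| format!("ColumnId(0x{:016x})", random_u64()))
        .collect()
}
```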

Make a BTreeMap-like data structure with heavily optimized deque operations

This is currently needed by SimpleTimeline. New tracked-queries are almost always inserted at the end of the structure, but it needs to be possible to insert them in the middle in less than O(n) time. So I currently use BTreeMap, but that isn't efficient because it takes O(log n) time to insert at the end (or remove from the beginning, as forget_before() requires). This is a significant chunk of the current CPU overhead.

There is no other already written code that would use this, but a lot of possible TimeSteward algorithms would benefit from it.

I have an idea for a modified B-tree: the structure would keep pointers to the first and last leaf, and instead of only the root being allowed to have as few as 2 children, that relaxed condition would apply to every node on the left and right spines. That way, insertions at the end could fill each node efficiently from empty to full without doing lots of operations further up the tree.

Initial documentation

Without documentation, the TimeSteward is essentially useless to anyone but me, and not ideal even for myself. I need to do a serious pass at documenting all of the important features.

Currently, a few things have documentation comments, but it is haphazard.

This may have to wait on more API stabilization.

Split off a different crate for shared implementation details?

I made a bunch of implementation details public in case someone wants to implement trait TimeSteward in a different crate. However, I then crammed them all into one submodule so that they wouldn't clog up the documentation for TimeSteward users. Worse, for the ones that were macros, I labeled them #[doc (hidden)].

A logical thing to do instead would be to move the implementation support into a separate crate. Then it wouldn't appear in the TimeSteward USER documentation at all, but COULD be properly documented for the sake of TimeSteward implementors.

This is probably a long way off, due to various inconveniences. It will become more important if people start wanting to implement TimeSteward, or if the data structures in the implementation details become good enough and stable enough that I should provide them as separate libraries.

Better debug output for simply_synchronized

Currently, simply_synchronized has a few weaknesses:

  • Every error is a panic
  • There is no way to test the first moment at which a Predictor gives different results (note: this is because time stewards are NOT required to run the same predictors at the same times, so it's a little harder to define how to sync them)
  • The error messages don't contain all of the information that could be useful (e.g. a full log of the queries made by the first inconsistent event; a snapshot of the state immediately before the problem, so that you can rerun it)
  • Doesn't completely distinguish between TimeSteward internal errors and client errors (test_lots() helps with this, but see #24)

Standardize using ExtendedTime rather than base time in all API functions

I've been steadily exposing ExtendedTime more and more, and at this point, there's no reason not to go the rest of the way.

This applies to snapshot_before(), valid_since(), updated_until_before(), and forget_before().

This is mostly just an elegance thing, but it may be useful to allow snapshots of unusual ExtendedTimes for debug-examining stuff.

insert_fiat_event() could still use base time + id, because it's similar to the interface for creating predictions, and it supports automatically protecting the user from colliding fiat event ids with prediction ids.

The main downside of this change is that it would obligate the user to call beginning_of() themselves, but that seems tolerable.

Support upper time limits for step()

IncrementalTimeSteward::step() currently has a bit of a problem: if you're using a flat TimeSteward, you might have some free time to take a bunch of steps, but you can't afford to step beyond time X, the time of the next frame. The problem is that if IncrementalTimeSteward::updated_until_before() is lower than X now, you don't know whether it will be lower than X after one step.

One approach would be to add an IncrementalTimeSteward::next_step_time()->ExtendedTime method. However, this isn't forwards-compatible with concurrent TimeStewards that do a bunch of concurrent operations during step().

So, I propose giving step() a second argument, making it fn step (&mut self, limit: Option <ExtendedTime>)->bool. It would be guaranteed not to advance the settled time (see #38) as far as limit. It would also return false if there was no more work to do.
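The proposed signature, with a toy implementation and the frame-loop driver it enables (the trait and types are illustrative stand-ins for the real IncrementalTimeSteward, and the base time is simplified to u64):

```rust
trait IncrementalSteward {
    /// Do one unit of work, but never advance the settled time as far as
    /// `limit`. Returns false when there is no more work to do before `limit`.
    fn step(&mut self, limit: Option<u64>) -> bool;
}

// Toy steward that settles one time unit per step.
struct Toy {
    settled: u64,
    target: u64,
}

impl IncrementalSteward for Toy {
    fn step(&mut self, limit: Option<u64>) -> bool {
        let next = self.settled + 1;
        if next > self.target || limit.map_or(false, |l| next >= l) {
            return false; // stepping further would reach `limit`
        }
        self.settled = next;
        true
    }
}

/// Use spare frame time without ever stepping past the next frame boundary.
fn run_until_frame(steward: &mut impl IncrementalSteward, frame_time: u64) -> bool {
    let mut did_work = false;
    while steward.step(Some(frame_time)) {
        did_work = true;
    }
    did_work
}
```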
