Comments (5)

davidchisnall commented on August 23, 2024

My biggest concern (which I think I wrote in the design doc for the claims mechanism) is that this makes freeing an O(n) operation in the number of hazard slots, which is likely to be some small number times the number of compartments.

I wonder if there's something clever (or, at least, 'clever') that we can do with a bloom filter to let the allocator know if a given hazard set is likely to contain a given pointer. Ideally there would also be a fast path for the case where no hazard set contains it, though clearing the bloom filter entries is tricky if the filter is global, because recalculating the minimum set requires storing everything.

I am not sure I understand the synchronisation in this model. Currently, the allocator does not run with interrupts disabled at any point (except when reacquiring a lock, which will probably change at some point). As such, I'm not sure how you avoid a race between the allocator seeing that there are no hazards and freeing. I believe this could be addressed by having each hazard table hold a capability to a ptraddr_t owned by the allocator and have the allocator insert the address of an object to be freed into it, as soon as it has found the base address but before it checks hazard pointers.

I'm also not sure how this would scale to multicore systems.

from cheriot-rtos.

nwf-msr commented on August 23, 2024

> My biggest concern (which I think I wrote in the design doc for the claims mechanism) is that this makes freeing an O(n) operation in the number of hazard slots, which is likely to be some small number times the number of compartments.

I debated using per-thread state rather than the SealedHeapHazards thing. It might require that our privileged library run with ASR for a little bit to get the current thread structure and the array of hazards therein, and it would need to traverse a collection of threads. If you prefer O(threads) to O(compartments), that could be a way to go.

> I wonder if there's something clever (or, at least, 'clever') that we can do with a bloom filter to let the allocator know if a given hazard set is likely to contain a given pointer. Ideally there would also be a fast path for the case where no hazard set contains it, though clearing the bloom filter entries is tricky if the filter is global, because recalculating the minimum set requires storing everything.

There are "counting bloom filters" that support fast removal of elements so long as none of the internal counters saturate (IIRC). I wonder how well we could do with just a single cheap hash function.
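A minimal host-side sketch of that idea, assuming one small counter per bucket and a single multiplicative hash over the object's base address. The class name, bucket count, and hash constant are all illustrative assumptions; nothing here is taken from the cheriot-rtos allocator. Saturated counters are left stuck rather than decremented, so the filter can report false positives but never a false negative.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch, not cheriot-rtos code: a counting filter keyed on
// object base addresses, using one cheap hash function.
template<std::size_t Buckets = 64>
class HazardCountingFilter
{
    static_assert((Buckets & (Buckets - 1)) == 0,
                  "Buckets must be a power of two");
    static constexpr std::uint8_t Saturated = 0xFF;

    std::array<std::uint8_t, Buckets> counters{};
    std::size_t                       live = 0; // hazards currently recorded

    // Assumes objects are at least 8-byte aligned, so the low three bits
    // carry no information. The multiplier is the usual Fibonacci constant.
    static std::size_t bucket(std::uintptr_t address)
    {
        std::uint32_t mixed =
          static_cast<std::uint32_t>(address >> 3) * 2654435761u;
        return (mixed >> 16) & (Buckets - 1);
    }

  public:
    // Record a hazard on `address`.
    void insert(std::uintptr_t address)
    {
        std::uint8_t &counter = counters[bucket(address)];
        if (counter != Saturated)
        {
            counter++;
        }
        live++;
    }

    // Remove a previously recorded hazard. A saturated counter is never
    // decremented, because its true count is no longer known.
    void remove(std::uintptr_t address)
    {
        std::uint8_t &counter = counters[bucket(address)];
        assert(live != 0 && counter != 0);
        if (counter != Saturated)
        {
            counter--;
        }
        live--;
    }

    // False means "definitely no hazard"; true means "scan the slots".
    bool may_contain(std::uintptr_t address) const
    {
        return counters[bucket(address)] != 0;
    }

    // Global fast path: no hazards are recorded anywhere.
    bool is_empty() const
    {
        return live == 0;
    }
};
```

In this model, free would consult `is_empty()` and then `may_contain()` before falling back to the linear scan of hazard slots.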

If we were OK with making hazard registration require finding the object header, like claims do, we can use a bit to indicate that there might be a hazard on an object.

> I am not sure I understand the synchronisation in this model. Currently, the allocator does not run with interrupts disabled at any point (except when reacquiring a lock, which will probably change at some point). As such, I'm not sure how you avoid a race between the allocator seeing that there are no hazards and freeing. I believe this could be addressed by having each hazard table hold a capability to a ptraddr_t owned by the allocator and have the allocator insert the address of an object to be freed into it, as soon as it has found the base address but before it checks hazard pointers.

That's the purpose of the "initial taint" in free() and why registering a hazard looks for taint: as soon as that initial taint is registered, it is no longer possible to create a new hazard for a given object.

> I'm also not sure how this would scale to multicore systems.

Indeed, that's tricky, since this builds on IRQ deferral to achieve atomicity.

davidchisnall commented on August 23, 2024

Some thoughts from waking up in the middle of the night:

The main purpose of the hazard mechanism is for very short-lived claims. If you’re doing a longer-term claim, the cost of calling heap_claim is amortised. As such, we don’t need per-thread per-compartment hazards. We can, in fact, get away with two hazard slots per thread, because that lets us support memcpy between untrusted objects. We could manage with one if we did some bounce buffering but two seems like the minimum for convenience.

We can assume hazard slots are clobbered on every cross-compartment call. You can always reestablish hazards on return.

I think we can implement this fairly easily:

  1. Add a hazard region that is provided to the allocator as an MMIO capability.
  2. In the loader, place a store-only capability (yay, our first use case for these!) to this into each register save area.
  3. Provide a switcher API to load this capability.

This leaves a potential race in the scanning. I suggest that we address this by having the allocator expose a free epoch counter. This is incremented before each scan of the hazard list and after each free. The flow for acquiring a hazard is therefore:

  1. Read the epoch counter.
  2. Insert the capability into your hazard slot.
  3. If the epoch counter is odd, wait until it is even (futex).
  4. Test the tag on the capability.

If the epoch word uses a 16-bit counter, then we can use a priority-inheriting wait, so whichever thread is in the allocator gets a short boost from threads trying to use the hazard mechanism.

On the fast path, this doesn’t require any cross-compartment calls, assuming that we’ve already got the epoch counter, and we can put an acquire-hazards function in a shared library. (I really should extend MMIO capabilities to allow read-only views so that we can make the epoch counter an MMIO capability.)

The extra costs in free are:

  • Two extra pointer checks (load, ctestsubset) per thread.
  • Two counter updates.
  • Storing hazard states until they are gone and freeing the hazard list later (maybe defer this until an allocation fails?)
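The allocator side of those costs might look like the sketch below: one counter update before the scan, a bounds check against each of the two slots per thread, and one counter update afterwards. Plain addresses and a range comparison stand in for capabilities and `ctestsubset`, and `ThreadCount`, `HazardRegion`, and `free_with_hazard_scan` are hypothetical names. Bumping the epoch back to even when the free is refused is an assumption, made so that waiting threads are released either way.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t ThreadCount    = 4; // hypothetical system size
constexpr std::size_t SlotsPerThread = 2; // as proposed in the thread

struct HeapObject
{
    std::uintptr_t base;
    std::size_t    length;
};

// Model of the shared hazard region: an epoch word plus two slots per
// thread. A slot value of 0 means "no hazard".
struct HazardRegion
{
    std::atomic<std::uint32_t> epoch{0};
    std::array<std::atomic<std::uintptr_t>, ThreadCount * SlotsPerThread>
      slots{};
};

// Returns true if `release` was called, or false if a hazard kept the
// object alive and the caller must retry later.
template<typename Release>
bool free_with_hazard_scan(HazardRegion     &region,
                           const HeapObject &object,
                           Release         &&release)
{
    assert(object.length != 0);

    region.epoch.fetch_add(1); // first counter update: epoch becomes odd

    bool hazarded = false;
    for (auto &slot : region.slots)
    {
        // Stand-in for the per-slot load and subset test.
        std::uintptr_t address = slot.load();
        if (address >= object.base &&
            address - object.base < object.length)
        {
            hazarded = true;
            break;
        }
    }

    if (!hazarded)
    {
        release(object);
    }

    region.epoch.fetch_add(1); // second counter update: epoch is even again
    return !hazarded;
}
```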

There are some potential availability risks here. A thread can keep an object live for an unbounded amount of time with the hazard mechanism. If we zero the hazard pointers on compartment call, then any yielding operation will implicitly zero them, so that’s probably fine.

Thoughts?

nwf-msr commented on August 23, 2024

I'm happy with 2 hazards per thread, presumed clobbered on cross-compartment calls, and for them to be stored in a contiguous array set up by the loader and shared as you describe. It's a little unfortunate to need to go to the switcher for the per-thread cap, but that could plausibly be a privileged library call rather than a full compartment call.

I don't understand where your proposal distinguishes between removing a hazard declaration to an object that is still live and removing a hazard declaration to an object that has been freed while the hazard was in place?

Matt pointed me at the "pass the pointer" scheme in https://dl.acm.org/doi/10.1145/3437801.3441596, which we might want to adapt. It relies on atomics, which we can emulate readily enough, and on having a copy of the hazard pointers ("handover") that's used to walk objects closer to free. However, as with almost all discussions of hazard pointers, it assumes cooperation, in the sense that one only frees something that has been retired from global visibility. We cannot depend on that, so declaring a hazard will inherently be more complex. We need to do something to ensure progress, which probably means preventing the same object(s) from being juggled between hazard slots forever, and that in turn probably means either linear time (to examine object headers or other hazard slots) or space (an expanded revocation bitmap).

davidchisnall commented on August 23, 2024

There is a small, bounded number of objects whose lifetime is prolonged by the hazard mechanism (2 × the number of threads). As such, my plan (this is the part I haven’t implemented yet) is to have the allocator maintain a list of objects that it has not freed because they have hazards. When it sees an object on the hazard list that someone is trying to free, it copies the pointer to the other list. On every malloc or free, if this list is not empty, we recheck any queued objects and try to free them. This is quadratic in the worst case, but with a very small n.
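A rough host-side model of that plan, assuming hazards are compared by object base address and that a `released` vector stands in for actually returning memory to the heap. The class `DeferredFrees` and its methods are hypothetical names for illustration, since the comment above notes that this part was not yet implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of an allocator that queues frees blocked by hazards.
class DeferredFrees
{
    std::vector<std::uintptr_t> hazardSlots; // two per thread, 0 = empty
    std::vector<std::uintptr_t> pending;     // frees blocked by a hazard
    std::vector<std::uintptr_t> released;    // stands in for the real free

    bool is_hazarded(std::uintptr_t base) const
    {
        return std::find(hazardSlots.begin(), hazardSlots.end(), base) !=
               hazardSlots.end();
    }

    // Recheck every queued object: O(pending * slots), both of which are
    // bounded by twice the number of threads.
    void retry_pending()
    {
        auto firstFree = std::partition(
          pending.begin(), pending.end(), [this](std::uintptr_t base) {
              return is_hazarded(base);
          });
        released.insert(released.end(), firstFree, pending.end());
        pending.erase(firstFree, pending.end());
    }

  public:
    explicit DeferredFrees(std::size_t threads)
      : hazardSlots(2 * threads, 0)
    {
    }

    // Set or clear (with base == 0) one hazard slot.
    void set_hazard(std::size_t slot, std::uintptr_t base)
    {
        assert(slot < hazardSlots.size());
        hazardSlots[slot] = base;
    }

    // Called on every free: retry the queue, then free or queue `base`.
    void free_object(std::uintptr_t base)
    {
        retry_pending();
        if (is_hazarded(base))
        {
            pending.push_back(base);
        }
        else
        {
            released.push_back(base);
        }
    }

    // Called on every malloc: only the retry step is modelled here.
    void on_malloc()
    {
        retry_pending();
    }

    std::size_t pending_count() const
    {
        return pending.size();
    }

    std::size_t released_count() const
    {
        return released.size();
    }
};
```

The model does not guard against the same object being freed twice while it is queued; a real allocator would need to.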
