Comments (34)

ollie-etl commented on June 25, 2024

@Noah-Kennedy: The behaviour of your benchmark in #151 is exactly what I meant by trying to keep to the principle of "least surprise". Your benchmark is submitting almost all operations prior to the benchmark measurement.

I think a builder API is very useful, not least because I'd like to be able to link submissions, but I maintain that "futures do nothing until polled" is the only sane behavior. It allows us, both in benchmarks and in reality, to define operations outside of the main execution loop

from tokio-uring.

Noah-Kennedy commented on June 25, 2024

@ollie-etl I'd agree, it seems like we were talking past each other in a lot of ways. I don't think we necessarily disagree on the core issues with the current API and on what largely needs to be done. I'll see if I can get my draft PR with my proposed API changes up tonight (my time, not yours), so that you can take a look. If I trim out anything related to SQE linking or async cancellation, I should be able to land it pretty quickly.

ollie-etl commented on June 25, 2024

@FrankReh I've also been thinking about this. I think deferred work is the way to go, partly because this is how Future is described, and we don't match that behavior currently:

Futures alone are inert; they must be actively polled to make progress, meaning that each time the current task is woken up, it should actively re-poll pending futures that it still has an interest in.

FrankReh commented on June 25, 2024

@ollie-etl I'm not disagreeing with you but asking for a clarification. Are you saying you think the function that returns the future for getting a response shouldn't be putting an entry in the squeue yet? We have both seen that adding an sqe to the squeue can run into the problem of the squeue needing to be submitted because it is full and that means a syscall which has to block long enough to drain the squeue. (I don't mean block in a bad way here, the current thread is going to have to make the syscall, whether it is when the operation is created or when it is first waited on, anyway.)

I just want to make sure you are leaning towards more work being done the first time the future is polled, rather than having some of the work done before the poll.

I'm almost on the fence on this but do like that the future could be dropped and nothing would have been put into the submission queue. Except ... I also like the ability at the moment to get an operation sent to the uring interface without having to wait for it - that comes in handy in drop functions when we want to use the uring interface to close or shutdown something.

Operations are likely to get more options, perhaps with a builder api of their own. Perhaps that would enable the option of directly adding to the submission queue in some cases but not in others.

ollie-etl commented on June 25, 2024

@FrankReh yes, I think the work should only be put on the squeue on poll - I think that's more in keeping with the spirit of futures. I think that is perhaps least surprising for users too.

I briefly started implementing this to see what it'd look like, but haven't got very far yet

FrankReh commented on June 25, 2024

@ollie-etl Perhaps orthogonal to what you're doing, but I wanted to make sure you were aware that it is possible the kernel is not being informed of entries added to the ring in a timely manner, because the submission sync isn't called for every entry, or even for every 'n' entries. I think of it like a flush, and I don't know how long a cache line is allowed to be stale on two CPUs before the stale copy is replaced.

Noah-Kennedy commented on June 25, 2024

So, what I'd like from an API perspective is to have oneshot futures first submit an op and then return a future referencing the in-flight op which provides support for operations like cancellation.

I'd also like to have something like a builder API for constructing an op and operating on it pre-submission. I've been meaning to get around to implementing both.

Noah-Kennedy commented on June 25, 2024

I'm going to be opening a PR hopefully tomorrow to restructure how we do op submissions. I'm actually thinking that it would be better to make the transition to in-flight explicit rather than implicit, as is proposed here. It turns out that the change proposed here actually makes reasoning about or implementing linked SQEs kinda hard.

ollie-etl commented on June 25, 2024

It turns out that the change proposed here actually makes reasoning about or implementing linked SQEs kinda hard.

I'm not sure I agree. I too have been thinking about this (#165), and although I haven't got round to linking SQEs yet, I have given it some consideration and don't see the blockers. Maybe they're the unknown unknowns.

Noah-Kennedy commented on June 25, 2024

I'm toying around with things now, and this is considerably more nuanced than I had thought. Let's postpone any work on this until after SQE linking is supported.

Noah-Kennedy commented on June 25, 2024

I'm working on this right now, and hoping to get this in soon™.

oliverbunting commented on June 25, 2024

@Noah-Kennedy It's maybe not fundamental, but it is quite central to the implementation approach I'm toying with for linking.

Noah-Kennedy commented on June 25, 2024

@oliverbunting I think we should discuss the SQE linking then, not this.

ollie-etl commented on June 25, 2024

I think deferred ops are actually more important. That is because I think the behavior of the current API is pretty close to being a bug: it's certainly not intuitive, and it goes against the documented behavior for implementations of the Future trait, whereas linking SQEs is a feature.

ollie-etl commented on June 25, 2024

@Noah-Kennedy See #169

Noah-Kennedy commented on June 25, 2024

@FrankReh @oliverbunting I'd actually like us to go another route: towards more granular control of when an op is in-flight. I think this approach here has a couple of issues that more granular control solves better.

First, deferring work to time of poll produces inconsistent cancellation behavior. Say that a user creates a future to send data over a socket. If you submit the op when constructing the future, you have predictable drop/cancellation behavior: you know that the write will either occur or error even if you drop the future. If you defer that work, you have no predictable behavior here. This produces a footgun and a source of confusion.

Second, being able to control when an Op hits the queue is super useful for optimization. I'd really like to be able to submit an op and get it in the queue before I spawn off a task to handle the response and do further work, for instance. This allows you to save round trips through the scheduler and can actually be a pretty significant optimization under certain benchmarks.

If we give the user more granular control, we can solve these issues and open up a lot of interesting patterns without really sacrificing much:

/// Constructed but unsubmitted op
pub struct Unsubmitted<T> { /* ... */ }

/// Operation which has been pushed to the squeue
pub struct InFlightSingleShot<T> { /* ... */ }

/// Example read implementation.
///
/// This constructs an unsubmitted read op. The user explicitly controls when
/// this gets submitted, which is useful for op linking and other
/// pre-submission operations.
///
/// Note: for simplicity's sake I'm ignoring IoBuf.
pub fn read(&self, buf: Vec<u8>) -> Unsubmitted<Vec<u8>> {
    // Build the op, providing a custom vtable, either via a 'static trait
    // object or a custom vtable like in Waker. This vtable basically just
    // specifies what post-processing to do on the returned data at completion.
    build_op(&READ_VTABLE, buf)
}

impl<T> Unsubmitted<T> {
    // There would be other fns that let you do things like build chains;
    // unfortunately, I'm still working out the precise details here.

    // TODO: does this return io::Result? This should handle any reasonable
    // transient failure, or flush the squeue and receive completions if
    // needed. There are other, more permanent failure causes, but I'm not
    // sure if those should panic or error.
    fn submit(self) -> InFlightSingleShot<T>;
}

impl<T> Future for InFlightSingleShot<T> {
    type Output = io::Result<T>;
    // ...
}

/// Still not sure if I like having IntoFuture here.
impl<T> IntoFuture for Unsubmitted<T> {
    type Output = io::Result<T>;
    type IntoFuture = InFlightSingleShot<T>;
    // ...
}

I have yet to fully get this implemented in my PR unfortunately but I have largely solidified an API. I think that this is probably the middle ground that we want. It removes the footgun by making things much more explicit, and is kinda required to do SQE linking.
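The drop/cancellation difference in the first point above can be sketched with plain std types standing in for uring submission (the `Cell` counter and `cancellation_demo` are purely illustrative): the eager path "submits" at construction time, so dropping the future cannot undo it, while the lazy path never touches the queue if the future is dropped before its first poll.

```rust
use std::cell::Cell;

// Returns (submitted count after dropping the eager op,
//          submitted count after dropping the lazy op).
fn cancellation_demo() -> (u32, u32) {
    let submitted = Cell::new(0u32);

    // Eager: the op is "submitted" at construction; the future merely
    // tracks its completion afterwards.
    let eager = {
        submitted.set(submitted.get() + 1); // stand-in for pushing the sqe
        async { /* await the completion */ }
    };
    drop(eager); // dropping the future does not unsubmit the op
    let after_eager_drop = submitted.get();

    // Lazy: nothing is submitted until the future is first polled.
    let lazy = async {
        submitted.set(submitted.get() + 1); // would only run on poll
    };
    drop(lazy); // never polled, so nothing ever reached the queue
    (after_eager_drop, submitted.get())
}

fn main() {
    let (after_eager_drop, after_lazy_drop) = cancellation_demo();
    assert_eq!(after_eager_drop, 1); // eager op went out despite the drop
    assert_eq!(after_lazy_drop, 1);  // lazy op never hit the queue
}
```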

oliverbunting commented on June 25, 2024

@Noah-Kennedy wrt SQE linking, I think that's fundamentally wrong. I'll finish a draft PR tonight which proves this. It won't be pretty, but it will show my point.

oliverbunting commented on June 25, 2024

@Noah-Kennedy you'll note I have provided an override for immediate submission, although that is explicit

Noah-Kennedy commented on June 25, 2024

@oliverbunting can you give me a short explanation of how this is wrong? I know that there are ways to do this entirely in-driver, but everything that I've messed with there led to a huge increase in complexity and a reduction in performance.

ollie-etl commented on June 25, 2024

@Noah-Kennedy Sure, let's keep things abstract to start with, although #165 is a concrete example. First let's consider deferred work only, not linking SQEs.

Implementing this is quite straightforward. At op creation, we do work to create an SQE; we do this regardless of whether we defer work or not. To defer work until poll, we simply store the op as pending until it is polled, at which point it gets entered into the queue. As we were doing this work anyway, the cost is approximately identical; indeed, I have benchmarks in #165 to prove this. For use cases which are approximately:

Op::create_an_op().await

I strongly suspect the compiler does basically the same thing in both cases.

On to #165 in particular: the above is what we do, with the addition of enqueue(), which is the manual override for "I want to submit this right away".

We can now front-load computations like your benchmark in #151, giving us the option to move work out of the critical section and a performance boost in the region of interest:

// pre-compute work
let mut js = JoinSet::new();

for _ in 0..opts.iterations {
    js.spawn_local(tokio_uring::no_op());
}

// timing-critical region
while let Some(res) = js.join_next().await {
    res.unwrap().unwrap();
}

Linking SQEs is orthogonal to the problem; I think with a builder API you could do it with or without deferred work. I'm going to explain it with deferred work, because that is what I have considered.

My approach basically boils down to having a new op type, Op<Link<A,B>, LinkMarker>, which we can create from any op builder (Send, Accept, Connect, etc.). When this is polled, we enqueue both ops and poll the first. When it completes, we return (<A as Future>::Output, Op<B,_>).
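The shape described can be sketched with plain std futures (the `Link` struct and `noop_waker` below are illustrative only; a real tokio-uring implementation would also push both sqes with the IOSQE_IO_LINK flag set, which is elided here): polling the link drives the first future, and completion yields its output together with the second, still-unpolled future.

```rust
use std::future::{ready, Future, Ready};
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Two chained ops: polling drives `first`; on its completion we hand back
/// its output together with the second, still-unpolled future.
struct Link<A, B> {
    first: Option<A>,
    second: Option<B>,
}

impl<A, B> Future for Link<A, B>
where
    A: Future + Unpin,
    B: Unpin,
{
    type Output = (A::Output, B);

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut();
        let first = this.first.as_mut().expect("polled after completion");
        match Pin::new(first).poll(cx) {
            Poll::Ready(out) => {
                this.first = None;
                let second = this.second.take().expect("polled after completion");
                Poll::Ready((out, second))
            }
            Poll::Pending => Poll::Pending,
        }
    }
}

// A no-op waker, sufficient for polling futures by hand.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

// Drive a Link of two already-ready "ops" and return both outputs in order.
fn link_demo() -> (i32, i32) {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);

    let mut link: Link<Ready<i32>, Ready<i32>> =
        Link { first: Some(ready(1)), second: Some(ready(2)) };

    // One poll completes the first op and yields the second future.
    let (a, mut second) = match Pin::new(&mut link).poll(&mut cx) {
        Poll::Ready(pair) => pair,
        Poll::Pending => unreachable!("first op is ready immediately"),
    };
    let b = match Pin::new(&mut second).poll(&mut cx) {
        Poll::Ready(v) => v,
        Poll::Pending => unreachable!("second op is ready immediately"),
    };
    (a, b)
}

fn main() {
    assert_eq!(link_demo(), (1, 2));
}
```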

Noah-Kennedy commented on June 25, 2024

Let's slow down a bit. What does this change make easier or less dangerous? It still isn't clear to me that this actually improves anything for the complexity that it adds to future work. I'd like us to try to model the semantics of a uring operation as closely as we can, and this really just seems to obfuscate the underlying semantics of uring.

I really can't see any cases that this helps, other than the very contrived situation with that benchmark. If you can show me situations where this improves things, I'd be a lot more onboard here. At present, this really just seems to add some indirection to things. What does this improve or allow us to improve, either for us as maintainers, or for users?

ollie-etl commented on June 25, 2024

@FrankReh I'm less concerned about matching io_uring, and more concerned that our futures violate the published behaviour of Future. As I've at least partially demonstrated, it doesn't hurt performance.

Noah-Kennedy commented on June 25, 2024

This is the behavior of futures, but it isn't a general invariant of async APIs, and in fact it's actually pretty common not to do this. Futures themselves do no work unless polled, largely because of the readiness model they follow.

In fact, it's actually super common for reasons of efficiency or API requirements to do work up-front and return a future that tracks the progress of an in-flight operation. For example, IIRC reqwest does this by putting a request in flight and returning a future which tracks its progress: https://docs.rs/reqwest/latest/src/reqwest/async_impl/client.rs.html#1453

Completion-based stuff seems to be a common case where people don't defer work until polling.

ollie-etl commented on June 25, 2024

@Noah-Kennedy I think with a bit of reinterpretation, your and my approaches actually have a lot in common

ollie-etl commented on June 25, 2024

My approach is rather three states, not two:

Builder -> Deferred -> In flight

The way I'm approaching Builder in the first instance is to use the Op struct as the builder; by that, I mean the Op<THIS_OP_TYPE,_>. Those structs already have all the info required to build an operation.

I'll give an example for fsync, although I'll have a PR ready shortly as well. I thought I'd do this first, as links are a short step from this:

pub trait Buildable
where
    Self: 'static + Sized + Completable,
{
    /// The CqeType which results from submission
    type CqeType; // = SingleCQE; // if associated type defaults were stable

    /// Consume the builder to create the sqe
    fn create_sqe(&mut self) -> squeue::Entry;

    /// Was submit_with in `Op`
    fn submit(mut self) -> io::Result<Op<Self, <Self as Buildable>::CqeType>> {
        // ...
    }
}

pub(crate) struct Fsync {
    fd: SharedFd,
    flags: Option<types::FsyncFlags>,
}

impl Op<Fsync> {
    pub(crate) fn fsync(fd: &SharedFd) -> io::Result<Op<Fsync>> {
        Fsync { fd: fd.clone(), flags: None }.submit()
    }

    pub(crate) fn datasync(fd: &SharedFd) -> io::Result<Op<Fsync>> {
        Fsync { fd: fd.clone(), flags: Some(types::FsyncFlags::DATASYNC) }.submit()
    }
}

impl Buildable for Fsync {
    type CqeType = op::SingleCQE;

    fn create_sqe(&mut self) -> squeue::Entry {
        let mut opcode = opcode::Fsync::new(types::Fd(self.fd.raw_fd()));

        if let Some(flags) = self.flags {
            opcode = opcode.flags(flags);
        }

        opcode.build()
    }
}

ollie-etl commented on June 25, 2024

Likewise, I'll get my stuff up. I'd be very surprised if either approach is right first time out. Let's get our PRs up and take some time to review. I suspect we'll come to a hybrid, which is usually the way of things.

Noah-Kennedy commented on June 25, 2024

Likewise, I'll get my stuff up. I'd be very surprised if either approach is right first time out. Let's get our PRs up and take some time to review. I suspect we'll come to a hybrid, which is usually the way of things.

Yeah, my thoughts as well. I think a builder is actually a logical second step once the "constructed and linkable" and "in flight" stages are in place. The "Unsubmitted/Prepared" state could be obtained via the builder or via a call to a function like read, and I think it's easier to omit the builder initially and add it in a follow-up.

Noah-Kennedy commented on June 25, 2024

It will be interesting to compare the two approaches. I suspect we'll pull my phases in with your builder.

ollie-etl commented on June 25, 2024

I, as usual, have gone the other way. I was thinking that if I had a builder, linking to the next op is a logical way of consuming a builder

Noah-Kennedy commented on June 25, 2024

I had initially done something similar before deciding that the approach I took would allow us to better hide a lot of API internals and more nicely integrate this with non-builder functions in the interface.

It basically just produces a nicer API, although the two are quite similar semantically.

ollie-etl commented on June 25, 2024

See #171 for a builder framework

ollie-etl commented on June 25, 2024

@Noah-Kennedy This is starting to become a pain point for me in deployment: I really want to optimize by creating ops while there is space in the ring. That could probably be solved another way, but a builder with a try-submit op would also solve it.

I think we agree that what we want here is more granularity of state. Focusing on BuilderPhase -> Op, I think we should do the following to cover all use cases.

Build the op struct. This does not enqueue the operation, and it is not a future; we simply have a structure representing the op, and we can choose what to do with it. We can:

  • submit() - place immediately on the queue and return a future. If it's not possible to place it on the queue, cause the queue to advance such that it is.
  • try_submit() - try to place it on the queue, like above, but fail if there is no space. Needs thought w.r.t. linked ops, which may have to advance the queue regardless.
  • defer() - create a future, but only do work on poll.
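The three consumption paths above can be sketched as follows. This is a hypothetical shape, not tokio-uring API; the squeue is modeled as a plain Vec (with `clear()` standing in for submitting to the kernel) so the control flow is visible, and all names are illustrative.

```rust
/// A stand-in for the squeue: fixed capacity, "advanced" by draining.
struct Queue {
    entries: Vec<&'static str>,
    capacity: usize,
}

/// The built-but-unsubmitted op (BuilderPhase in the comment above).
#[derive(Debug)]
struct OpBuilder(&'static str);

/// Already on the squeue; a real version would be a Future over the CQE.
struct InFlight(&'static str);

/// Enqueued only when first polled; a real version would implement Future.
struct Deferred(&'static str);

impl OpBuilder {
    /// Place on the queue immediately; if full, advance (here: drain) first.
    fn submit(self, q: &mut Queue) -> InFlight {
        if q.entries.len() == q.capacity {
            q.entries.clear(); // stand-in for submitting to the kernel
        }
        q.entries.push(self.0);
        InFlight(self.0)
    }

    /// Like submit, but fail rather than advance the queue when full.
    fn try_submit(self, q: &mut Queue) -> Result<InFlight, OpBuilder> {
        if q.entries.len() == q.capacity {
            return Err(self);
        }
        q.entries.push(self.0);
        Ok(InFlight(self.0))
    }

    /// Do no queue work now; the returned future would do it on first poll.
    fn defer(self) -> Deferred {
        Deferred(self.0)
    }
}

// Returns (queue length at the end, whether try_submit failed when full).
fn builder_demo() -> (usize, bool) {
    let mut q = Queue { entries: Vec::new(), capacity: 2 };

    let _a = OpBuilder("read").submit(&mut q);            // 1 entry queued
    let _b = OpBuilder("write").try_submit(&mut q).unwrap(); // 2 entries, full
    let full = OpBuilder("fsync").try_submit(&mut q).is_err(); // rejected
    let _c = OpBuilder("close").defer();                  // touches nothing

    (q.entries.len(), full)
}

fn main() {
    assert_eq!(builder_demo(), (2, true));
}
```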

Thoughts on #171 would be appreciated.

FrankReh commented on June 25, 2024

Have started discussion #218. I think we need a place to talk not only about the next step but also the long term goals of the new API.
