When using mainloop::threaded::Mainloop , the callback

Possible undefined behavior due to missing `Send` for callbacks, about jnqnfe/pulse-binding-rust

Comments (6)

jnqnfe commented on June 15, 2024

The callbacks are executed in a thread dedicated to the threaded mainloop, however when this happens, the threaded mainloop thread is holding the threaded mainloop lock! Outside of callbacks when you use any PA objects or anything else used within your callbacks, you must hold the lock yourself, thus the lock ensures only one thread at a time is touching those objects.

There is thus no unsafety. Marking the callbacks as Send would make no difference for threaded mainloop use, but could negatively impact use with the non-threaded mainloop. If I recall correctly it was for this reason that I did not add the Send constraint.
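The locking model described above can be illustrated with a minimal standard-library sketch. This is only an analogy: the `Shared` type and `run_model` function are hypothetical, and an ordinary `Mutex` stands in for the threaded mainloop lock. No PulseAudio API is used.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for state touched by both user code and callbacks.
struct Shared {
    events: Vec<&'static str>,
}

fn run_model() -> Vec<&'static str> {
    // One lock plays the role of the threaded mainloop lock.
    let lock = Arc::new(Mutex::new(Shared { events: Vec::new() }));

    // User thread: takes the lock before touching shared state.
    let mut guard = lock.lock().unwrap();
    guard.events.push("user: setup");

    // "Mainloop" thread: must take the same lock before running a callback.
    let loop_lock = Arc::clone(&lock);
    let handle = thread::spawn(move || {
        // Blocks for as long as the user thread holds the lock.
        let mut guard = loop_lock.lock().unwrap();
        guard.events.push("mainloop: callback");
    });

    guard.events.push("user: more work");
    drop(guard); // releasing the lock lets the callback run

    handle.join().unwrap();
    let events = lock.lock().unwrap().events.clone();
    events
}
```

Because the user thread takes the lock before the second thread is spawned, the "callback" entry is always recorded last. The real crate cannot enforce this at compile time, since nothing forces user code to take the lock.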

from pulse-binding-rust.

zroug commented on June 15, 2024

I don't think that is enough.

Consider the following program:

use libpulse_binding::context::Context;
use libpulse_binding::context::State;
use libpulse_binding::mainloop::threaded::Mainloop;
use std::rc::Rc;

fn main() {
    let mut mainloop = Mainloop::new().unwrap();
    let mut ctx = Context::new(&mainloop, "Test").unwrap();
    ctx.connect(None, libpulse_binding::context::flags::NOFLAGS, None)
        .unwrap();

    mainloop.start().unwrap();
    loop {
        match ctx.get_state() {
            State::Unconnected => {}
            State::Connecting => {}
            State::Authorizing => {}
            State::SettingName => {}
            State::Ready => break,
            State::Failed => unimplemented!(),
            State::Terminated => unimplemented!(),
        }
    }

    let x = Rc::new(0);
    let x2 = x.clone();

    ctx.introspect().get_server_info(move |_| loop {
        x2.clone();
    });

    loop {
        x.clone();
    }
}

Very often this causes SIGILL, meaning that undefined behavior has definitely happened. (Make sure that the compiler doesn't optimize away the loops if you want to try it.) This happens because the Rc is cloned simultaneously from multiple threads, which only Arc supports. Safe Rust stops this error at compile time:

use std::rc::Rc;

fn main() {
    let x = Rc::new(0);
    let x2 = x.clone();

    std::thread::spawn(move || loop {
        x2.clone();
    });
    loop {
        x.clone();
    }
}

This program fails to compile with the following error:

error[E0277]: std::rc::Rc<i32> cannot be sent between threads safely
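For comparison, here is a minimal sketch of the same pattern with Arc, which does compile. The function name and the bounded loops are illustrative choices, not part of the original report.

```rust
use std::sync::Arc;
use std::thread;

fn shared_across_threads() -> usize {
    let x = Arc::new(0);
    let x2 = Arc::clone(&x);

    // `Arc<i32>` is `Send`, so moving `x2` into the spawned thread compiles.
    let handle = thread::spawn(move || {
        for _ in 0..1_000 {
            // Atomic increment of the count, then decrement when `_tmp` drops.
            let _tmp = Arc::clone(&x2);
        }
    });
    for _ in 0..1_000 {
        let _tmp = Arc::clone(&x);
    }
    handle.join().unwrap();

    // The spawned thread has finished and dropped `x2`; only `x` remains.
    Arc::strong_count(&x)
}
```

The reference count stays consistent here because Arc updates it atomically, which is exactly the cost Rc avoids.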


jnqnfe commented on June 15, 2024

> I don't think that is enough.
>
> Consider the following program:
>
> ...
>
> Very often this causes SIGILL, meaning that undefined behavior has definitely happened. (Make sure that the compiler doesn't optimize away the loops if you want to try it.) This happens because the Rc is cloned simultaneously from multiple threads, which only Arc supports. Safe Rust stops this error at compile time:
>
> ...
>
> This program fails to compile with the following error:
>
> error[E0277]: std::rc::Rc<i32> cannot be sent between threads safely

You're doing it completely wrong! Re-read what I said before about the mainloop lock and then note that your example does not make use of this lock anywhere. This is the key reason why you experience problems in your example.

You need to take a look at the big example at the end of the threaded mainloop documentation.

Use of Rc instead of Arc for objects that get used in the callbacks is perfectly valid here. Correct use of the threaded mainloop requires lock()ing and unlock()ing the mainloop lock, thus controlling when the mainloop can run and execute callbacks. You do not need a further layer of locking done by Arc on objects you make use of in the callbacks. It would be completely redundant.

Standard simple protection mechanisms of Arc and Send cannot be used to create a safe usage model that catches simple user mistakes here. If Send was a required property on callbacks, this:

1. Does not do anything to ensure that the threaded mainloop is used correctly with its lock.
2. Would force use of Arc instead of Rc to wrap objects used in callbacks, thus forcing an unnecessary second level of locking and thus inefficiency.
3. Would force users of the standard mainloop to pointlessly make use of Arc, thus creating inefficiency.
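The effect of a Send bound on what a callback may capture can be sketched as follows. The `LocalCallback` and `SendCallback` aliases are hypothetical and are not the crate's actual callback signatures.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Hypothetical callback types, not the crate's actual signatures.
type LocalCallback = Box<dyn FnMut()>;
type SendCallback = Box<dyn FnMut() + Send>;

fn demo() -> (usize, usize) {
    // Without a `Send` bound, a callback may capture cheap `Rc`/`Cell` state.
    let local_hits = Rc::new(Cell::new(0usize));
    let hits = Rc::clone(&local_hits);
    let mut local_cb: LocalCallback = Box::new(move || hits.set(hits.get() + 1));

    // With a `Send` bound, the capture above would be rejected at compile time
    // (`Rc<Cell<usize>>` is not `Send`), so shared state needs `Arc` plus
    // atomics or a mutex.
    let shared_hits = Arc::new(AtomicUsize::new(0));
    let hits = Arc::clone(&shared_hits);
    let mut send_cb: SendCallback = Box::new(move || {
        hits.fetch_add(1, Ordering::SeqCst);
    });

    local_cb();
    send_cb();
    send_cb();
    (local_hits.get(), shared_hits.load(Ordering::SeqCst))
}
```

Neither alias says anything about whether a mainloop lock is held when the callback runs, which is the first point above.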

Implemented correctly, your example would be:

extern crate libpulse_binding;

use libpulse_binding::context::Context;
use libpulse_binding::mainloop::threaded::Mainloop;
use std::rc::Rc;
use std::cell::RefCell;
use std::ops::Deref;

fn main() {
    let mut mainloop = Rc::new(RefCell::new(Mainloop::new()
        .expect("Failed to create mainloop")));

    let mut context = Rc::new(RefCell::new(Context::new(mainloop.borrow().deref(), "Test")
        .expect("Failed to create new context")));

    // Set up context state change callback
    {
        let ml_ref = Rc::clone(&mainloop);
        let context_ref = Rc::clone(&context);
        context.borrow_mut().set_state_callback(Some(Box::new(move || {
            let state = unsafe { (*context_ref.as_ptr()).get_state() };
            match state {
                libpulse_binding::context::State::Ready |
                libpulse_binding::context::State::Failed |
                libpulse_binding::context::State::Terminated => {
                    // Using `signal(false)` we return, giving up the lock and waking any
                    // `wait()`ing threads, which then re-obtain the lock before continuing.
                    unsafe { (*ml_ref.as_ptr()).signal(false); }
                },
                _ => {},
            }
        })));
    }

    // Connect context
    // This is safe to do before grabbing the mainloop lock because we have not started the mainloop yet!
    context.borrow_mut().connect(None, libpulse_binding::context::flags::NOFLAGS, None)
        .expect("Failed to connect context");

    // Grab the mainloop lock, preventing the mainloop from running
    mainloop.borrow_mut().lock();
    // Get the mainloop up and started (but it will be blocked from looping because we hold the lock)
    mainloop.borrow_mut().start().expect("Failed to start mainloop");

    // Wait for context to be ready
    loop {
        match context.borrow().get_state() {
            libpulse_binding::context::State::Ready => { break; },
            libpulse_binding::context::State::Failed |
            libpulse_binding::context::State::Terminated => {
                eprintln!("Context state failed/terminated, quitting...");
                mainloop.borrow_mut().unlock();
                mainloop.borrow_mut().stop();
                return;
            },
            // If not a clear success/fail state change, ignore.
            // By `wait()`ing this thread gets paused and the lock released,
            // which will be automatically re-obtained upon resuming, thus we
            // allow the mainloop to take the lock and make some progress.
            _ => { mainloop.borrow_mut().wait(); },
        }
    }

    // We no longer care about reacting to state changes so clear the callback
    context.borrow_mut().set_state_callback(None);

    let x = Rc::new(0);
    let x2 = x.clone();

    // Guard to use for protection against spurious wakeup of our thread
    let mut guard = Rc::new(RefCell::new(true));

    // Setup introspection work (done in a callback)
    // Note that we still hold the lock and so it will not execute until we give up the lock!
    {
        let ml_ref = Rc::clone(&mainloop);
        let guard_ref = Rc::clone(&guard);
        context.borrow().introspect().get_server_info(move |_| {
            x2.clone();
            unsafe {
                *guard_ref.as_ptr() = false;
                (*ml_ref.as_ptr()).signal(false);
            }
        });
    }

    // We can temporarily give up the lock and give the mainloop a chance to do some work by `wait()`ing.
    // In fact by `wait()`ing our thread will remain paused until something signals us to wake up.
    // Note that we do this in a loop with a guard variable to guard against spurious wakeups.
    while *guard.borrow() {
        mainloop.borrow_mut().wait();
    }

    // Continue with local work
    x.clone();

    // Clean shutdown (optional - will be done automatically by destructors)
    mainloop.borrow_mut().unlock();
    mainloop.borrow_mut().stop();
}

Note that your loops around the x and x2 cloning are not present in my modified example; they have no place in the fixed example, or at least the one in the callback doesn't, because if you loop infinitely on cloning x2, the callback will never end and so will never set the guard state and wake up the main thread to let it get to where x gets cloned. I.e. now that we've implemented things properly, x cloning and x2 cloning never overlap.
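The wait-in-a-loop pattern used with the guard variable is the same one used with condition variables in general. Here is a minimal standard-library sketch of it, assuming a `Mutex` in place of the mainloop lock and `Condvar::notify_one` in place of `signal(false)`. The function name is illustrative only.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

fn wait_for_callback() -> bool {
    // The boolean plays the role of the guard flag; the mutex plays the role
    // of the mainloop lock.
    let pair = Arc::new((Mutex::new(true), Condvar::new()));

    let worker_pair = Arc::clone(&pair);
    let worker = thread::spawn(move || {
        let (lock, cvar) = &*worker_pair;
        let mut pending = lock.lock().unwrap();
        *pending = false; // the "callback" records that it has run...
        cvar.notify_one(); // ...and wakes the waiting thread
    });

    let (lock, cvar) = &*pair;
    let mut pending = lock.lock().unwrap();
    // Re-check the flag after every wakeup; a wakeup alone proves nothing.
    while *pending {
        // `wait` releases the lock while sleeping and re-acquires it on wakeup.
        pending = cvar.wait(pending).unwrap();
    }
    let still_pending = *pending;
    drop(pending);
    worker.join().unwrap();
    still_pending
}
```

Whichever thread wins the race for the lock, the waiting side only proceeds once the flag has been cleared under the lock.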


zroug commented on June 15, 2024

I'm perfectly aware that I wasn't using the library correctly at all. That was deliberate. But the example wasn't about using it correctly. It was about how using it incorrectly causes undefined behavior. I wasn't using any unsafe blocks. Safe Rust guarantees me that, no matter how wrongly I use a library, any undefined behavior must be the fault of the library that uses unsafe.

Consider this section from the Rustonomicon.

Much of the Rust standard library also uses Unsafe Rust internally. These implementations have generally been rigorously manually checked, so the Safe Rust interfaces built on top of these implementations can be assumed to be safe.

The need for all of this separation boils down a single fundamental property of Safe Rust:

No matter what, Safe Rust can't cause Undefined Behavior.

The design of the safe/unsafe split means that there is an asymmetric trust relationship between Safe and Unsafe Rust. Safe Rust inherently has to trust that any Unsafe Rust it touches has been written correctly. On the other hand, Unsafe Rust has to be very careful about trusting Safe Rust.


zroug commented on June 15, 2024

To clarify: I agree that requiring users of the simple mainloop to use unnecessary synchronization primitives is not optimal. But to avoid that, some big changes to the exposed API are necessary.

I think the best option would be to use an abstraction similar to Futures or use Futures directly. One could then make Context generic over the mainloop implementation, with a Mainloop trait, that has an associated constant, that determines whether synchronization is required or not.


jnqnfe commented on June 15, 2024

> I'm perfectly aware that I wasn't using the library correctly at all. That was deliberate. But the example wasn't about using it correctly. It was about how using it incorrectly causes undefined behavior. I wasn't using any unsafe blocks.

Oh, okay, that was not clear.

Yes, the safety guarantees of this library are not perfect, but it's also not an easy problem to solve given the complexities and requirements of the underlying C library.

> Safe Rust guarantees me that, no matter how wrongly I use a library, any undefined behavior must be the fault of the library that uses unsafe.

Well that's not exactly true. For instance, you could create a reference that's actually a null pointer or an invalid UTF-8 String or &str and pass that into a library that is designed to not expect this. That would obviously be your fault not the library's. Of course this requires you using unsafe code and perhaps you meant only when using the library incorrectly with purely "safe" code. And it would be more correct to say that it is a safety issue with the library (a bug or design flaw) rather than the fault of the library (blame).

> Consider this section from the Rustonomicon.
>
> Much of the Rust standard library also uses Unsafe Rust internally. These implementations have generally been rigorously manually checked, so the Safe Rust interfaces built on top of these implementations can be assumed to be safe.
>
> The need for all of this separation boils down a single fundamental property of Safe Rust:
>
> No matter what, Safe Rust can't cause Undefined Behavior.
>
> The design of the safe/unsafe split means that there is an asymmetric trust relationship between Safe and Unsafe Rust. Safe Rust inherently has to trust that any Unsafe Rust it touches has been written correctly. On the other hand, Unsafe Rust has to be very careful about trusting Safe Rust.

This relates to the design of the language and standard libraries only. It does not define a baseline standard that all 3rd-party libraries (or at least those hosted in crates.io) must live up to. Ideally all Rust libraries should indeed follow the same standard, but this text does not make it a requirement and it's not always feasible.

You could argue that this library of mine should have things marked as unsafe as it stands but this would be awful since it would have to apply to almost everything considering that the design of the PulseAudio mainloop requires that PulseAudio objects should only be used with the lock held for thread safety reasons, and that's simply just not possible to guarantee at build time.

> To clarify: I agree that requiring users of the simple mainloop to use unnecessary synchronization primitives is not optimal.

To be clear, the particular distinction is between threaded and non-threaded ("standard") mainloop use. There is a "simple" interface also offered via the libpulse-simple-binding crate. It's also worth noting that the libpulse-glib-binding crate provides a third mainloop implementation built upon the mainloop of the glib framework.

"that has an associated constant, that determines whether synchronization is required or not"

I do not believe this is possible. I believe that each unique callback signature would have to be an associated type of the trait, with Send being a part of the types in one case and not in the other.
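A minimal sketch of the associated-type approach follows. Everything in it is hypothetical: the `CallbackPolicy` trait, the `Standard` and `Threaded` marker types, and the `Ctx` struct do not exist in the crate and are named here only for illustration.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;

// Hypothetical trait: the callback type is chosen per mainloop kind.
trait CallbackPolicy {
    type Callback;
    fn invoke(cb: &mut Self::Callback);
}

struct Standard; // stands for the non-threaded mainloop
struct Threaded; // stands for the threaded mainloop

impl CallbackPolicy for Standard {
    type Callback = Box<dyn FnMut()>; // no `Send` required
    fn invoke(cb: &mut Self::Callback) {
        (*cb)()
    }
}

impl CallbackPolicy for Threaded {
    type Callback = Box<dyn FnMut() + Send>; // `Send` required
    fn invoke(cb: &mut Self::Callback) {
        (*cb)()
    }
}

// Hypothetical context type, generic over the policy.
struct Ctx<P: CallbackPolicy> {
    callback: Option<P::Callback>,
}

impl<P: CallbackPolicy> Ctx<P> {
    fn new() -> Self {
        Ctx { callback: None }
    }
    fn set_callback(&mut self, cb: P::Callback) {
        self.callback = Some(cb);
    }
    fn fire(&mut self) {
        if let Some(cb) = self.callback.as_mut() {
            P::invoke(cb);
        }
    }
}

fn demo() -> (u32, u32) {
    // Standard policy: an `Rc` capture is accepted.
    let local = Rc::new(Cell::new(0u32));
    let l = Rc::clone(&local);
    let mut standard: Ctx<Standard> = Ctx::new();
    standard.set_callback(Box::new(move || l.set(l.get() + 1)));
    standard.fire();

    // Threaded policy: captures must be `Send`, so `Arc` and an atomic are used.
    let shared = Arc::new(AtomicU32::new(0));
    let s = Arc::clone(&shared);
    let mut threaded: Ctx<Threaded> = Ctx::new();
    threaded.set_callback(Box::new(move || {
        s.fetch_add(1, Ordering::SeqCst);
    }));
    threaded.fire();
    threaded.fire();

    (local.get(), shared.load(Ordering::SeqCst))
}
```

A real callback API has many distinct signatures, so each one would presumably need its own associated type. The sketch shows only the shape of the idea.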

> But to avoid that, some big changes to the exposed API are necessary.
>
> I think the best option would be to use an abstraction similar to Futures or use Futures directly. One could then make Context generic over the mainloop implementation, with a Mainloop trait, that has an associated constant, that determines whether synchronization is required or not.

None of this stuff is new to me, I've wrestled with these issues conceptually before. I'm not at all opposed to changing things where there is a clear benefit, but benefits must be weighed against drawbacks; this situation is difficult and requires compromise in one area or another. Thus far I have felt it best to lean in the direction of what is best (in terms of efficiency for example) for those who use the library correctly, avoiding things that help hold the hand of those who make big mistakes (like not using the mainloop lock) which also notably negatively affects those who use it correctly, like forcing a second layer of locking does. This after all is supposed to just be a very thin layer through which to interface with the PulseAudio client library.

There is already a Mainloop trait - see pulse-binding/src/mainloop/api.rs.

Yes, it could hypothetically be possible to also have a similar trait (to the existing Mainloop one) for Context, from which ContextStandardML and ContextThreadedML types (or similar) could be derived, with the difference being use of the Send requirement on callbacks. (This in no way requires using Futures, btw.) The former would be used with the non-threaded mainloop, and the latter with the threaded mainloop. Such a change would significantly affect code in at least all of the following modules: context, context::introspect, context::scache, context::subscribe, context::ext_device_manager, context::ext_device_restore, and context::ext_stream_restore. Either all of this would be combined into a single trait or, if left broken up as it is, every one of the several impl blocks would have to be converted to a trait and implemented for both context types. It would certainly be a significant change and it's only hypothetical. It would need to be assessed as being compatible with the "simple" interface and glib mainloop.

This would ensure that Send constraints do not impact users of the "standard" (non-threaded) mainloop.

However, with respect to the threaded mainloop, you never get away from the requirement of having to use its mainloop lock, in addition to the lock that would then be required to wrap non-Send objects (Arc<Mutex<T>>). Thus you force users who do everything correctly to have to go through two layers of locking for no functional benefit as I've already expressed.

Furthermore, it in no way guarantees full thread safety, since if you fail to use the mainloop lock then things are still unsafe: the PulseAudio client library mainloop does its thing, which involves grabbing the mainloop lock itself and using PulseAudio objects, up to the point of executing user callbacks. Only if a callback then tries to lock an Arc<Mutex<T>> object would it halt, and only if you already hold that object's lock. By this point, of course, thread safety is already broken.

i.e.

1. User thread fails to grab mainloop lock.
2. User thread locks Arc<Mutex<FOO>> object and starts doing stuff...
3. Mainloop thread successfully obtains its lock, even though it shouldn't right now because the user thread should be holding it but isn't.
4. Mainloop thread thus starts running a mainloop iteration, starting with various bits of work leading up to running user callbacks.
5. Mainloop thread starts running a user callback which, if it happens to involve this FOO object, will only then come to a halt trying to lock it, since the user thread is still using it (presuming it still is).

This safety issue (2 & 4 running in parallel) is impossible to protect against with the design of the mainloop as it is by changes such as adding Send constraints. User code must make use of the mainloop lock in achieving thread safety.

We could consider forcing users to wrap the mainloop as Arc<Mutex<Mainloop>> and pass a &MutexGuard<T> in every single method call on a PulseAudio object, as a means of proving that the mainloop is locked (or change the lock() method on the mainloop to create a custom MainloopGuard object, thus avoiding the use of Arc and Mutex). But besides the ugliness of this, it does nothing to prevent the user passing in a reference to a completely different mainloop instance. To try to block that you'd have to explore solutions involving blocking users from making more than one mainloop instance (but some users may legitimately need to do so), or blocking more than one per thread, or somehow using thread locals to check the pointer matches a recorded one, or something. Nightmare.
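A rough sketch of the guard-as-proof idea, and of the loophole just described, might look like this. All of the types here (`Mainloop`, `MainloopGuard`, `Context`) are hypothetical stand-ins built on `std::sync::Mutex`, not the crate's types.

```rust
use std::sync::{Mutex, MutexGuard};

// Hypothetical stand-ins; none of these are the crate's real types.
struct Mainloop {
    lock: Mutex<()>,
}

struct MainloopGuard<'a> {
    _inner: MutexGuard<'a, ()>,
}

struct Context {
    state: u32,
}

impl Mainloop {
    fn new() -> Self {
        Mainloop { lock: Mutex::new(()) }
    }
    // Locking hands back a guard object instead of requiring a manual unlock.
    fn lock(&self) -> MainloopGuard<'_> {
        MainloopGuard { _inner: self.lock.lock().unwrap() }
    }
}

impl Context {
    // Callers must present a guard as evidence that *a* mainloop is locked.
    fn get_state(&self, _proof: &MainloopGuard<'_>) -> u32 {
        self.state
    }
}

fn demo() -> (u32, u32) {
    let ml_a = Mainloop::new();
    let ml_b = Mainloop::new(); // an unrelated mainloop
    let ctx_on_a = Context { state: 4 }; // imagined as belonging to `ml_a`

    let guard_a = ml_a.lock();
    let right = ctx_on_a.get_state(&guard_a);
    drop(guard_a);

    // Nothing ties the guard to the context's own mainloop, so this compiles too.
    let guard_b = ml_b.lock();
    let wrong = ctx_on_a.get_state(&guard_b);

    (right, wrong)
}
```

Both calls succeed, which is the weakness: the type system sees a guard but cannot tell which mainloop it came from.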

You've also got to consider the fact that we already mark objects like Context as Send (so that they can be sent from one user thread to another with only an Arc wrapper rather than as Arc<Mutex<T>>), so as it is, unless users explicitly choose to wrap our crate objects as Arc<Mutex<T>>, having Send as a requirement on the callbacks only impacts user and 3rd-party objects (forcing those to be Arc<Mutex<T>> wrapped). Properly implementing your proposed change would thus additionally involve removing Send from the crate objects, thus also forcing these to be Arc<Mutex<T>> wrapped to be passed around, thus forcing a second lock layer (in addition to the mainloop lock) on use of those as well.

Something one might consider is replacing use of the actual PulseAudio threaded mainloop entirely, to try and achieve throwing out the use of the threaded mainloop lock of the existing implementation in favour of Arc<Mutex<T>> based locking only, but (1) the execution of a loop of the mainloop must be atomic, and you cannot really guarantee this without a lock for the mainloop itself (you cannot have it lock the context, use it, unlock it, then start the user callback which then itself locks the context; you'd have to pass the already locked context into the callback, whilst preventing the callback from unlocking it itself). And (2) when I looked into this previously, it seemed that it would not be possible without completely replacing the entire PulseAudio client library.

Note that when I last spoke to the PulseAudio guys about Rust, they were perfectly open to PulseAudio itself being converted to Rust, but I've not yet had the time to explore that, and it's going to be difficult to do so while maintaining the existing C interface for non-Rust use. It is also now of questionable value with PipeWire potentially lined up to replace PulseAudio in the near future.

The one particular notable benefit of moving to requiring Send on callbacks is helping catch small mistakes where user or 3rd-party objects are used within callbacks as well as in user threads outside of the protection of the mainloop lock. The objects could only be used in callbacks if Send, which requires something like placing them in Arc<Mutex<T>>, which then ensures they are locked whenever used, blocking incorrect use (except possible deadlocks). But the drawback is, as already discussed, the inefficiency of forcing the extra locking. I am not sure that it is not best to just leave things as they are, thus increasing the likelihood of user code crashing and thus them discovering that they've made a mistake, or simply expecting users to take proper care, just as they've always had to do in the past in languages without the safety Rust tries to bring, and thus avoiding creating inefficiency for those users doing everything correctly. My decision thus far has always been to live with the lack of protection against user mistakes, keeping things efficient. I am not sure that I'm going to change my mind here, considering the fact that adding Send constraints cannot completely solve the problem.

As to Futures. This is something I (and others) have wanted to bring to this library for quite some time, and I expected it to have been achieved by now. Unfortunately it is pretty difficult to figure out precisely how to marry it to the C callback interfaces here. I have spent some time previously trying to figure it out but it really is not easy, and I recall there being one particular note in the design of Futures that really threw me. I cannot recall the details now, but it related to a referenced object that could apparently be replaced at any time, which stumped me as to how on earth this could be worked with (my objects would need to keep a copy of the reference/pointer of this thing, but yet it could become invalid at any moment, it seemed, without any way of knowing). I do fully intend to take another stab at it when I can. Someone else did get fed up waiting and implemented a crate that layered a subset of support over the top of my crate, but I was not impressed with it when I took a brief look; it did not seem that they really understood things properly. Anyway, (1) if and when this does get used instead of the current callbacks, I expect that it may well force Send even though sometimes things will be run on the same thread, and (2) I actually expect that it does not avoid the need for the existing mainloop lock to be used (I cannot recall whether I'd gotten as far as considering that properly last time I looked at its use). If the latter is so, as I expect it must be, then a move to Futures will not in any way completely solve the safety issues (not that I expected it to).

