facebookexperimental / libunifex Goto Github PK

Unified Executors

License: Other

CMake 1.66% C++ 98.29% Shell 0.05%

libunifex's Issues

Add a get_scheduler() CPO for querying the current scheduler from the receiver

This would allow algorithms to be written that can be written in such a way that they just use the ambient current scheduler rather than having to explicitly pass the scheduler into the algorithm.

This should be done in conjunction with having the on() algorithm inject the current scheduler into the receiver passed to the successor. This part depends on #15.

Is paring heap a better structure for intrusive heap?

The following code implements a variant of paring heap:

template <typename T, T* T::*Next, T* T::*Parent, T* T::*Children, typename Key, Key T::*SortKey>
class intrusive_paring_heap {
 public:
  intrusive_paring_heap() noexcept : root_(nullptr) {}

  bool empty() const noexcept {
    return root_ == nullptr;
  }

  T* top() const noexcept {
    assert(!empty());
    return root_;
  }

  void insert(T* item) noexcept {
    root_ = merge(root_, item);
  }

  T* pop() noexcept {
    assert(!empty());
    auto item = root_;
    T *x, *y, *list = nullptr;
    while ((x = root_->*Children)) {
      if ((root_->*Children = y = x->*Next))
        root_->*Children = root_->*Children->*Next;
      list = push_front(list, merge(x, y));
    }
    x = nullptr;
    while ((y = list)) {
      list = list->*Next;
      x = merge(x, y);
    }
    root_ = x;
    if(root_) root_->*Parent = nullptr;
    item->*Children = item->*Parent = item->*Next = nullptr;
    return item;
  }

  void remove(T* item) noexcept {
    T* parent = item->*Parent;
    if (parent == nullptr) {
      pop();
    }
    else {
      parent->*Children = remove_node(parent->*Children, item);
      if (item->*Children) {
        item->*Children->*Parent = nullptr;
        root_ = merge(root_, item->*Children);
      }
      item->*Children = item->*Parent = item->*Next = nullptr;
    }
  }

 private:
  T* root_;

  static inline T* push_front(T* list, T* x) noexcept {
    x->*Next = list;
    return x;
  }

  static inline T* remove_node(T* list, T* x) noexcept {
    T* p = list;
    if (x == list) return x->*Next;
    while (p && p->*Next != x) p = p->*Next;
    if (p) p->*Next = x->*Next;
    return list;
  }

  static inline T* merge(T* h1, T* h2) noexcept {
    if (!h1) return h2;
    if (!h2) return h1;
    if (h2->*SortKey < h1->*SortKey) std::swap(h1, h2);
    h2->*Next = h1->*Children;
    h1->*Children = h2;
    h2->*Parent = h1;
    h1->*Next = nullptr;
    return h1;
  }
};

However, it seems to have much more branch conditions; and it requires an extra slot of pointer and nullptr initialisation.

(It passes all tests but I didn't test more cases. Also, I didn't whether checkout the binary layout is good enough.)

Use intrusive_list in more places

Replace the manual intrusive list implementations in various places with the intrusive_list class.

manual_event_loop
thread_unsae_event_loop
timed_single_thread_context (uses both prev/next to support efficient removals)
trampoline_scheduler
inplace_stop_token (uses both prev/next to support efficient removals)

Experiment with a low-latency Windows I/O scheduler

Implement some of the low-latency ideas from llfio's experiments with sender/receiver that allow I/O operations to complete prior to the completion event being posted to the IOCP queue by polling the OVERLAPPED structure for a completion status and executing the continuation before the notification is delivered via GetQueuedCompletionStatus().

This would require placing the OVERLAPPED structure outside of the operation-state (perhaps using a pool-allocator) so that the operation-state can be destroyed by the continuation before receiving the completion event and then returning the OVERLAPPED structure to a free-list when the completion event eventually comes in.

io_uring.h is included in a public interface header file

Right now libunifex includes io_uring.h in a public interface header file, and requires cmake targets consuming libunifex to include liburing. liburing is not a commonly installed library, plus it is currently undergoing rapid development, so doing this is somewhat anti-social to quick and easy use of libunifex.

Suggested solutions in order of my personal preference:

Personally speaking, I think it is very doable that use of liburing can be kept exclusively internal to source files, and not be required in header files. You just need to reorganise your implementation a bit. Where I'd like to reach is the ability to ship precompiled binaries and a set of headers without imposing extra install steps upon end users.
You can replicate the bare minimum necessary of liburing into public header files, and run a series of static asserts in the source files to ensure that your replicated edition is binary compatible with latest liburing. I've been known to do this in my own code from time to time, and for kernel APIs, maintenance burden is generally excellent.
You can bundle a copy of liburing in with libunifex, either directly as source, or via a git submodule.
Finally, you can leave things as they currently are, where you silently disable io uring support if the end user hasn't installed liburing. I greatly dislike this choice.

This is really a question of direction, which you guys need to choose. I raise this now before we dig a deeper i/o uring packaging hole.

Add pipe support for io_uring_context

Add ability to create OS unidirectional anonymous/named pipes that can be used for communication between processes on the same machine.

Incorrect blocking type conditional check in via()

In the blocking customization point within via():
if (predBlocking == blocking_kind::never && succBlocking == blocking_kind::never) { return blocking_kind::never; } else if ( predBlocking == blocking_kind::always_inline && predBlocking == blocking_kind::always_inline) { return blocking_kind::always_inline;}

the second condition in the else if should be succBlocking == blocking_kind::always_inline

Remove the `unifex::cpo` namespace

Currently, many of the algorithms/basis-operations that are customisable have been placed in the unifex::cpo namespace.

Whether or not a function is customisable should be an implementation detail from the perspective of code calling the function and so including the mechanism used to implement its customisability in the name of the operation seems inappropriate.

We should move these CPOs in the unifex::cpo namespace into the top-level unifex namespace (or other sub-namespaces for logical groupings) to give the library a more consistent interface between customisable and non-customisable functions.

Make task<T> a typed_sender

Currently it lacks value_types and error_types typedefs, and an implementation of connect.

Add a sender-version of take_until()

The current implementation of take_until() only supports passing streams.

An overload of this algorithm should be added that supports taking senders.

However, the naming take_until() seems to imply many values and so may not be so appropriate for senders which only return a single result. Perhaps something like stop_when() would be more appropriate?

Implement schedule_after() for io_epoll_context and io_uring_context

The schedulers for these two execution contexts do not yet satisfy the 'time_scheduler' concept because they don't implement the schedule_after() CPO, only the schedule_at() and now() CPOs.

We should add implementations of schedule_after() for these contexts.

Use consistent naming for stream and non-stream algorithms

There are a number of algorithms that have been given different names based on whether they are applied to a sender or to a stream.

e.g.

transform(sender, func) / transform_stream(stream, func)
on(pred, succ) / on_stream(scheduler, stream)
via(succ, pred) / via_stream(scheduler, stream)
typed_via(succ, pred) / typed_via_stream(scheduler, stream)

These function pairs should either be merged to be overloads that dispatch to different implementations based on what kind of parameters are passed, or should be named the same but put into separate namespaces.

Can anyone help to explain why we should wrap the operation implementation into a struct _op?

Can anyone help to explain why we should wrap the operation implementation into a struct _op. Is that due to we want to remove cvref of Receiver?

struct _op {
struct type;
};

-stdlib=libc++ should not be forced on when on Linux

Right now you force it on always if clang is installed, and libc++ is installed. This is anti-social. The vast majority of end users use clang with libstdc++ on Linux, not libc++. They would be surprised if linking in libunifex when using clang on Linux by link errors. If end users specifically want libc++ on Linux, they'll explicitly ask for -stdlib=libc++ when configuring cmake.

Please change the default on Linux.

Crash in p1897.cpp

The following code crashes with clang 9

libunifex/examples/p1897.cpp

Line 114 in fa37742

vec[idx] = vec[idx] + i + idx;

auto indexed_for_sender =
    indexed_for(
      std::move(just_sender),
      execution::seq,
      ranges::iota_view{3},
      [](int idx, std::vector<int>& vec, const int& i){
          vec[idx] = vec[idx] + i + idx; //!
      });

because vec is empty.
I could not determine why. My very rough understanding is that it should have size 3?

Stacktrace if helps

#0  0x0000000000404b54 in main::$_0::operator() (this=0x7fffffffd9e0, idx=0, vec=..., i=@0x7fffffffd9d8: 10) at ../examples/p1897.cpp:114
#1  0x0000000000404b0c in std::__1::__invoke<main::$_0&, int&, std::__1::vector<int, std::__1::allocator<int> >&, int&> (__f=..., __args=@0x7fffffffd9d8: 10, __args=@0x7fffffffd9d8: 10, __args=@0x7fffffffd9d8: 10) at /usr/lib/llvm-9/bin/../include/c++/v1/type_traits:3530
#2  0x0000000000404a9c in std::__1::invoke<main::$_0&, int&, std::__1::vector<int, std::__1::allocator<int> >&, int&> (__f=..., __args=@0x7fffffffd9d8: 10, __args=@0x7fffffffd9d8: 10, __args=@0x7fffffffd9d8: 10) at /usr/lib/llvm-9/bin/../include/c++/v1/functional:2848
#3  0x00000000004049b7 in unifex::_ifor::_receiver<execution::sequenced_policy, ranges::iota_view, main::$_0, unifex::_tfx::_receiver<unifex::_sync_wait::_thread_unsafe_receiver<std::__1::vector<int, std::__1::allocator<int> >, unifex::unstoppable_token&&>::type, main::$_1>::type>::type::apply_func_with_policy<std::__1::vector<int, std::__1::allocator<int> >, int> (range=..., func=..., values=@0x7fffffffd9d8: 10, values=@0x7fffffffd9d8: 10) at ../source/../include/unifex/indexed_for.hpp:57
#4  0x00000000004048a2 in unifex::_ifor::_receiver<execution::sequenced_policy, ranges::iota_view, main::$_0, unifex::_tfx::_receiver<unifex::_sync_wait::_thread_unsafe_receiver<std::__1::vector<int, std::__1::allocator<int> >, unifex::unstoppable_token&&>::type, main::$_1>::type>::type::set_value<std::__1::vector<int, std::__1::allocator<int> >, int>(std::__1::vector<int, std::__1::allocator<int> >&&, int&&) && (this=0x7fffffffd9e0, values=@0x7fffffffd9d8: 10, values=@0x7fffffffd9d8: 10) at ../source/../include/unifex/indexed_for.hpp:79
#5  0x0000000000404869 in unifex::_rec_cpo::_set_value_fn::_impl<false>::operator()<unifex::_ifor::_receiver<execution::sequenced_policy, ranges::iota_view, main::$_0, unifex::_tfx::_receiver<unifex::_sync_wait::_thread_unsafe_receiver<std::__1::vector<int, std::__1::allocator<int> >, unifex::unstoppable_token&&>::type, main::$_1>::type>::type, std::__1::vector<int, std::__1::allocator<int> >, int> (this=0x7fffffffd688, r=..., values=@0x7fffffffd9d8: 10, values=@0x7fffffffd9d8: 10) at ../source/../include/unifex/receiver_concepts.hpp:59
#6  0x000000000040482d in unifex::_rec_cpo::_set_value_fn::operator()<unifex::_ifor::_receiver<execution::sequenced_policy, ranges::iota_view, main::$_0, unifex::_tfx::_receiver<unifex::_sync_wait::_thread_unsafe_receiver<std::__1::vector<int, std::__1::allocator<int> >, unifex::unstoppable_token&&>::type, main::$_1>::type>::type, std::__1::vector<int, std::__1::allocator<int> >, int> (this=0x4093d8 <unifex::_rec_cpo::set_value>, r=..., values=@0x7fffffffd9d8: 10, values=@0x7fffffffd9d8: 10) at ../source/../include/unifex/receiver_concepts.hpp:47
#7  0x00000000004047eb in unifex::_just::_op<unifex::_ifor::_receiver<execution::sequenced_policy, ranges::iota_view, main::$_0, unifex::_tfx::_receiver<unifex::_sync_wait::_thread_unsafe_receiver<std::__1::vector<int, std::__1::allocator<int> >, unifex::unstoppable_token&&>::type, main::$_1>::type>::type, std::__1::vector<int, std::__1::allocator<int> >, int>::type::start() &::{lambda(std::__1::vector<int, std::__1::allocator<int> >&&, int&&)#1}::operator()(std::__1::vector<int, std::__1::allocator<int> >&&, int&&) const (this=0x7fffffffd7e0, 
    values=@0x7fffffffd9d8: 10, values=@0x7fffffffd9d8: 10) at ../source/../include/unifex/just.hpp:46
#8  0x00000000004047a7 in std::__1::__invoke_constexpr<unifex::_just::_op<unifex::_ifor::_receiver<execution::sequenced_policy, ranges::iota_view, main::$_0, unifex::_tfx::_receiver<unifex::_sync_wait::_thread_unsafe_receiver<std::__1::vector<int, std::__1::allocator<int> >, unifex::unstoppable_token&&>::type, main::$_1>::type>::type, std::__1::vector<int, std::__1::allocator<int> >, int>::type::start() &::{lambda(std::__1::vector<int, std::__1::allocator<int> >&&, int&&)#1}, std::__1::vector<int, std::__1::allocator<int> >, int> (__f=..., 
    __args=@0x7fffffffd9d8: 10, __args=@0x7fffffffd9d8: 10) at /usr/lib/llvm-9/bin/../include/c++/v1/type_traits:3536
#9  0x0000000000404743 in std::__1::__apply_tuple_impl<unifex::_just::_op<unifex::_ifor::_receiver<execution::sequenced_policy, ranges::iota_view, main::$_0, unifex::_tfx::_receiver<unifex::_sync_wait::_thread_unsafe_receiver<std::__1::vector<int, std::__1::allocator<int> >, unifex::unstoppable_token&&>::type, main::$_1>::type>::type, std::__1::vector<int, std::__1::allocator<int> >, int>::type::start() &::{lambda(std::__1::vector<int, std::__1::allocator<int> >&&, int&&)#1}, std::__1::tuple<std::__1::vector<int, std::__1::allocator<int> >, int>, 0ul, 1ul>(unifex::_just::_op<unifex::_ifor::_receiver<execution::sequenced_policy, ranges::iota_view, main::$_0, unifex::_tfx::_receiver<unifex::_sync_wait::_thread_unsafe_receiver<std::__1::vector<int, std::__1::allocator<int> >, unifex::unstoppable_token&&>::type, main::$_1>::type>::type, std::__1::vector<int, std::__1::allocator<int> >, int>::type::start() &::{lambda(std::__1::vector<int, std::__1::allocator<int> >&&, int&&)#1}&&, std::__1::tuple<std::__1::vector<int, std::__1::allocator<int> >, int>&&, std::__1::__tuple_indices<0ul, 1ul>) (__f=..., 
    __t=...) at /usr/lib/llvm-9/bin/../include/c++/v1/tuple:1376
#10 0x00000000004046b2 in std::__1::apply<unifex::_just::_op<unifex::_ifor::_receiver<execution::sequenced_policy, ranges::iota_view, main::$_0, unifex::_tfx::_receiver<unifex::_sync_wait::_thread_unsafe_receiver<std::__1::vector<int, std::__1::allocator<int> >, unifex::unstoppable_token&&>::type, main::$_1>::type>::type, std::__1::vector<int, std::__1::allocator<int> >, int>::type::start() &::{lambda(std::__1::vector<int, std::__1::allocator<int> >&&, int&&)#1}, std::__1::tuple<std::__1::vector<int, std::__1::allocator<int> >, int> >(unifex::_just::_op<unifex::_ifor::_receiver<execution::sequenced_policy, ranges::iota_view, main::$_0, unifex::_tfx::_receiver<unifex::_sync_wait::_thread_unsafe_receiver<std::__1::vector<int, std::__1::allocator<int> >, unifex::unstoppable_token&&>::type, main::$_1>::type>::type, std::__1::vector<int, std::__1::allocator<int> >, int>::type::start() &::{lambda(std::__1::vector<int, std::__1::allocator<int> >&&, int&&)#1}&&, std::__1::tuple<std::__1::vector<int, std::__1::allocator<int> >, int>&&) (__f=..., __t=...) at /usr/lib/llvm-9/bin/../include/c++/v1/tuple:1385
#11 0x00000000004045fc in unifex::_just::_op<unifex::_ifor::_receiver<execution::sequenced_policy, ranges::iota_view, main::$_0, unifex::_tfx::_receiver<unifex::_sync_wait::_thread_unsafe_receiver<std::__1::vector<int, std::__1::allocator<int> >, unifex::unstoppable_token&&>::type, main::$_1>::type>::type, std::__1::vector<int, std::__1::allocator<int> >, int>::type::start() & (this=0x7fffffffd9c0) at ../source/../include/unifex/just.hpp:44
#12 0x00000000004045c9 in unifex::_start::_fn::_impl<false>::operator()<unifex::_just::_op<unifex::_ifor::_receiver<execution::sequenced_policy, ranges::iota_view, main::$_0, unifex::_tfx::_receiver<unifex::_sync_wait::_thread_unsafe_receiver<std::__1::vector<int, std::__1::allocator<int> >, unifex::unstoppable_token&&>::type, main::$_1>::type>::type, std::__1::vector<int, std::__1::allocator<int> >, int>::type> (this=0x7fffffffd828, op=...) at ../source/../include/unifex/sender_concepts.hpp:55
#13 0x000000000040406d in unifex::_start::_fn::operator()<unifex::_just::_op<unifex::_ifor::_receiver<execution::sequenced_policy, ranges::iota_view, main::$_0, unifex::_tfx::_receiver<unifex::_sync_wait::_thread_unsafe_receiver<std::__1::vector<int, std::__1::allocator<int> >, unifex::unstoppable_token&&>::type, main::$_1>::type>::type, std::__1::vector<int, std::__1::allocator<int> >, int>::type> (this=0x4093d4 <unifex::_start::start>, op=...) at ../source/../include/unifex/sender_concepts.hpp:44
#14 0x0000000000402cdc in unifex::_sync_wait_cpo::_fn::operator()<unifex::_tfx::_sender<unifex::_ifor::_sender<unifex::_just::_sender<std::__1::vector<int, std::__1::allocator<int> >, int>::type, execution::sequenced_policy, ranges::iota_view, main::$_0>::type, main::$_1>::type, unifex::unstoppable_token, std::__1::vector<int, std::__1::allocator<int> > > (this=0x409004 <unifex::sync_wait>, sender=..., stopToken=...) at ../source/../include/unifex/sync_wait.hpp:194

Index entries in API Reference should be anchor links

It would make https://github.com/facebookexperimental/libunifex/blob/master/doc/api_reference.md easier to navigate.

Add Windows IOCP-based I/O executor

Implement a simple single-threaded event loop that uses Win32 IOCP to dispatch async I/O completion events so that we can build and run examples that do file I/O on the Windows platform.

We can borrow some of the implementation details from cppcoro's io_service class.

Make most/all algorithms customisable

One of the design goals is to allow algorithms to be customised when invoked with certain types to allow more efficient implementations of that algorithm to be used. One of the key use-cases for this is to allow efficient fusing of composed operations for certain execution enivronments.

eg. so that two chained gpu-tasks can be chained gpu-side without needing to bounce back off the CPU, or similarly so that two I/O operations can be chained kernel-side so that we don't need a kernel transition to start a subsequent operation.

While some of the algorithms have already been turned into customisation points, there are still a large number of algorithms which have not.

The following algorithms are not currently CPOs:

delay(scheduler, duration)
just(values...)
let(predecessor, successorFactory)
next_adapt_stream(stream, nextAdapter)
adapt_stream(stream, nextAdapter, cleanupAdapter)
adapt_stream(stream, nextAndCleanupAdapter)
on_stream(scheduler, stream)
on(predecessor, successor)
reduce_stream(stream, initialState, reduceFunc)
single(sender) -> stream
stop_immediately(stream)
sync_wait(sender, stopToken)
sync_wait_r<R>(sender, stopToken)
take_until(source, trigger)
then_execute(scheduler, predecessor, func)
transform_stream(stream, func)
transform(pred, func)
type_erase<Ts...>(stream)
typed_via_stream(scheduler, stream)
typed_via(successor, predecessor)
via_stream(scheduler, stream)
via(successor, predecessor)
when_all(senders...)
with_allocator(sender, allocator)
with_query_value(sender, cpo, value)

Compile errors with Version 16.8.0 Previews

Any support for new version ? lot of compilation errors

Add initial implementation of "many sender" concept

Add a "many sender" concept that can send a sequence of calls to a "many receiver":

zero or more set_next() calls followed by;
a call to one of set_value(), set_done() or set_error().

Add a bulk_schedule(scheduler, count) algorithm that sends count calls to set_next(r, idx).
This should have a default implementation in terms of schedule(scheduler)

Add a get_execution_policy(receiver) CPO that allows a many-sender to query whether the receiver is able to support unsequenced/concurrent invocations of set_next().

Add some basic many-sender algorithms.

reduce()
for_each()
transform()
to_vector()

coroutines are required for some of the example

I have tried to build the library with gcc which, unfortunately, doesn't support coroutines, yet. However, the build unconditionally builds the examples and most of them depend on headers including <experimental/coroutines>. It would be helpful to have a cmake option to avoid building the corresponding examples.

Examples fail to compile due to missing pthread dependentcy

While compiling libunifex, I am getting the following error (and many similar ones for other examples):

Linking CXX executable libunifex-build/examples/heap_allocate_operation
FAILED: libunifex-build/examples/heap_allocate_operation 
: && /usr/bin/c++  -O3 -DNDEBUG   libunifex-build/examples/CMakeFiles/heap_allocate_operation.dir/heap_allocate_operation.cpp.o  -o libunifex-build/examples/heap_allocate_operation  -Wl,-rpath,libunifex-build/source libunifex-build/source/libunifex.so && :
/usr/bin/ld: libunifex-build/examples/CMakeFiles/heap_allocate_operation.dir/heap_allocate_operation.cpp.o: undefined reference to symbol 'pthread_create@@GLIBC_2.2.5'
/usr/bin/ld: /lib/x86_64-linux-gnu/libpthread.so.0: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status

Pthread is declared as PRIVATE dependency of libunifex here:

libunifex/source/CMakeLists.txt

Line 28 in e4b3158

pthread)

But examples do not link to phread directly.

It is enough to change pthread dependency from PRIVATE to PUBLIC ot make it build.

System: Ubuntu 19.10

$ gcc --version
gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008

Adopt a concepts-emulation facility to replace raw enable_if-based constraints

We ideally want to be able to switch to using C++20 concepts if it is available in the current compiler but fall back to traditional SFINAE techniques if not. The hope is that this will give better compile-times and better error-messages when C++20 concepts are available.

We may be able to port and use the concepts emulation bits from the range-v3 library for this purpose.

`submit_cpo` starts the operation twice

It looks like 'break' operator is missed in submit_cpo which leads to double invoking of the operation in some cases.

https://github.com/facebookexperimental/libunifex/blob/master/include/unifex/submit.hpp#L124

The sender/receiver concepts described in doc/concepts.md are out of date

Bring them into sync with what is actually defined in the code, which should more or less mirror what is in P0443R13.

Add static_thread_pool execution context

Add a simple thread-pool scheduler to allow building examples that incorporate parallelism.

Add support for building on Windows and MSVC

Make sure CMake files support Windows
Add CI builds for Windows
Fix up any compilation issues found on MSVC

The via() and on() algorithms should take a scheduler

The via() and on() algorithms currently take a sender that represents the execution context transition rather than taking the scheduler. This means the caller needs to explicitly call schedule(s) to get the sender to use.

These algorithms should be changed to take a scheduler instead.

We may need to add two new algorithms that compose senders in the way that these algorithms currently do to replace their utility.

Merge via() and typed_via()

The typed_via() algorithm should be incorporated as an overload of the via() algorithm rather than having a separate name.

The implementation should switch based on whether or not the sender is a typed-sender or not.

Generalise range_stream to adapt arbitrary ranges to async-streams

The current implementation of range_stream produces a stream of integers in some specified range.

This was largely built just so we had some kind of stream that we could test against, but it would be much more useful to change this to take an arbitrary range (view/container) and allow construction of a stream that produces the elements of that range.

Existing uses of the range_stream could be changed to range_stream(std::views::iota(min, max)).
Or we could rename the existing stream iota_stream and then add a more general range_stream.

Rename is_stop_never_possible to is_stop_ever_possible

The usage of this query quite often ends up needing to be negated which ends up with a double negative (not is stop never possible) instead of the positive form (is stop ever possible).

eg.

if constexpr (!is_stop_never_possible_v<StopToken>) {
  // handle possibility of stop being requested
}

Inverting this and using the positive form would allow this code to read better:

if constexpr (is_stop_ever_possible_v<StopToken>) {
  // handle possibility of stop being requested.
}

Adopt a unit-testing framework

The tests we have at the moment are mostly smoke-tests/examples that check that things compile and at least pass some basic runtime checks.

If we want to write more comprehensive tests then we should consider making use of a unit-test framework (eg. googletest, catch2, etc.) for writing the tests.

Consistently catch exceptions thrown from set_value() calls

The requirement for set_value() implementations was relaxed to allow them to throw exceptions.
However, this now means we need to audit all call-sites to unifex::set_value() to ensure that we catch exceptions and reflect the error back to unifex::set_error().

Some of this could be helped by a unifex::nothrow_set_value() helper that wraps up this pattern for us conditionally based on whether calling set_value() is noexcept or not.

Places where this needs to be done:

inline_scheduler
manual_event_loop::schedule() operation
ready_done_sender
stop_immediately concrete_receiver

We also need to update pretty much all of these senders to have error_types that indicates that they might send an exception_ptr since they don't yet know whether they will be connected to a receiver that has a potentially-throwing set_value() method.

Add contract checking support to schedule/connect/start

The code assumes that the contracts will be followed and invokes undefined behavior when they are not. This reduces confidence when landing changes. Introduce wrappers that check contracts and terminate on violations. Use these wrappers in the schedule/connect/start cpos to increase confidence in changes.

Build failing on commit 98339251024dac5f82a62411ea902127864b8aba

The build on the main branch just fails in so many ways. I was wondering if there's a relatively stable commit or branch I can check out to get the build through.

Or maybe get a reference environment (compiler version, OS etc).

Attached build error log for reference.
build.log.gz

Typo in doc/overview.md

https://github.com/facebookexperimental/libunifex/blob/master/doc/overview.md#heterogeneous-results

However, coroutines currently have the limitation that the return-type of a co_await expression can only be a single type (it is deduced from the return-type of the await_ready() method).

It should be "await_resume() method" instead of "await_ready() method", isn't it?

Libunifex does not compile on Linux

It appears that there is a namespace called linux. For some reason, Clang 9 #define's linux to be 1 on my platform (Ubuntu 18.04). I'm using the clang-9 from apt.llvm.org. Error below:

/usr/bin/clang++-9 -I../source/../include -Iinclude -stdlib=libc++ -std=gnu++2a -MD -MT source/CMakeFiles/unifex.dir/linux/monotonic_clock.cpp.o -MF source/CMakeFiles/unifex.dir/linux/monotonic_clock.cpp.o.d -o source/CMakeFiles/unifex.dir/linux/monotonic_clock.cpp.o -c ../source/linux/monotonic_clock.cpp
In file included from ../source/linux/monotonic_clock.cpp:16:
../source/../include/unifex/linux/monotonic_clock.hpp:24:11: error: expected identifier or '{'
namespace linux {
^
:392:15: note: expanded from here
#define linux 1
^

Get ASAN builds running in CI

Add an easy example so people can write their own schedulers.

For people wanting to write their own schedulers, can you please add an easy example? Most of the code I dug through is either too-complex or too tightly coupled. Here is something I came up with - A scheduler running on the same thread which produces values in a loop with a sleep interval. However -

I am not really sure that the code is in-line with how schedulers should be written in unifex.
I am doing std::move() on the same object multiple times withing the loop - Should not be legal.
Also, the code is still a bit long / complex - Can this be trimmed down? Do I really need task_base in the code?
Is this code ok, if we use sync_wait(when_all(...)) with Senders made from other schedulers?

Out-of-context: Is there a convenience function to create Receivers? I am not able to handle the errors the sender is producing.

The sleepy_scheduler code:

#include <unifex/scheduler_concepts.hpp>

#include <iostream>
#include <thread>

#include <csignal>

#include <unistd.h>


class sleepy_ctx;


struct task_base {};

template <typename Receiver>
struct _op {
    class type;
};

template <typename Receiver>
class _op<Receiver>::type final : task_base {
    
    using stop_token_type = unifex::stop_token_type_t<Receiver&>;

    public:
    
    template <typename Receiver2>
    explicit type(Receiver2&& receiver, sleepy_ctx* ctx)
    : receiver_((Receiver2 &&) receiver), 
      ctx_(ctx)
    {}

    void start() noexcept;

    private:

    static void start_impl(task_base* t, int iters, int sleeps) noexcept {
        auto& self = *static_cast<type*>(t);
        for (int i = 0; i < iters; ++i) {
            int ret = sleep(sleeps);
            if (ret != 0) {
                //unifex::set_error(std::move(self.receiver_), ret);
                unifex::set_value(std::move(self.receiver_), -ret); // TODO: set error
            } else {
                unifex::set_value(std::move(self.receiver_), i);
            }
        }
        unifex::set_done(std::move(self.receiver_));
    }

    UNIFEX_NO_UNIQUE_ADDRESS Receiver receiver_;
    sleepy_ctx* const ctx_;
};

template <typename Receiver>
using sleepy_operation = typename _op<std::remove_cvref_t<Receiver>>::type;


class sleepy_sender {
    
    // blocking is a customisation point which takes Sender sleepy_sender as parameter
    // TODO: Understand how this works
    friend constexpr unifex::blocking_kind 
        tag_invoke(unifex::tag_t<unifex::blocking>, const sleepy_sender&) noexcept {
        // Since sleepy_operation.start is guaranteed to call reveiver (via set_value etc) 
        // in same thread before start returns.
        return unifex::blocking_kind::always_inline; 
    }

    // SenderOf<int>
    public:
    template <
        template <typename...> class Variant,
        template <typename...> class Tuple>
    using value_types = Variant<Tuple<int>>;

    // int error types
    template <template <typename...> class Variant>
    using error_types = Variant<int>;

    static constexpr bool sends_done = true;

    explicit sleepy_sender(sleepy_ctx* ctx) noexcept
    : ctx_(ctx)
    {}

    template <typename Receiver>
    sleepy_operation<Receiver> connect(Receiver&& receiver) const& {
        std::cout << "?? thread=" << std::this_thread::get_id() << ": sleepy_sender::connect\n";
        return sleepy_operation<Receiver>{(Receiver &&) receiver, ctx_};
    }

    private:
    sleepy_ctx* const ctx_;
};


struct sleepy_sched {
    
    explicit sleepy_sched(sleepy_ctx* ctx) noexcept 
    : ctx_(ctx) 
    {}

    sleepy_sender schedule() const noexcept {
        return sleepy_sender{ctx_};
    }

    friend bool operator==(sleepy_sched a, sleepy_sched b) noexcept {
        return a.ctx_ == b.ctx_;
    }

    friend bool operator!=(sleepy_sched a, sleepy_sched b) noexcept {
        return a.ctx_ != b.ctx_;
    }

private:
    sleepy_ctx* ctx_;
};


struct sleepy_ctx {
    
    sleepy_ctx(int counts, int sleeps) 
    : counts_{counts}, sleeps_{sleeps}
    {}

    sleepy_sched get_scheduler() {
        return sleepy_sched{this};
    }

    int counts_;
    int sleeps_;
};


template <typename Receiver>
inline void _op<Receiver>::type::start() noexcept {
    std::cout << "?? thread=" << std::this_thread::get_id() << ": sleepy_operation::start\n";
    _op<Receiver>::type::start_impl(this, ctx_->counts_, ctx_->sleeps_);
}


#include <unifex/sync_wait.hpp>
#include <unifex/transform.hpp>

void signal_handler(int signal) {} // do nothing on sigint

using namespace unifex;

int main() {
    // The signal handler does not do anything. However SIGINT interrupts
    // our sleep call, and causes set_error calls
    std::signal(SIGINT, signal_handler); 

  
    sleepy_ctx ioctx{5, 2}; // iterate 5 time, with 2 second sleep
    auto sched = ioctx.get_scheduler(); 

    std::cout << "main thread=" << std::this_thread::get_id() << '\n';

    auto begin_sender = schedule(sched);
    std::cout << "main thread=" << std::this_thread::get_id() << ": after schedule\n";

    // TODO: how to hook a Receiver here which also handles the set_error calls?
    auto x1 = transform(begin_sender, [](int sval) {
        std::cout << "?? thread=" << std::this_thread::get_id() << " value=" << sval << '\n';
        return 1;
    });
    std::cout << "main thread=" << std::this_thread::get_id() << ": after transform\n";

    auto out = sync_wait(x1);
    std::cout << "main thread=" << std::this_thread::get_id() << ": after sync_wait\n";

    return 0;
}

Stack overflow caused by repeat_effect_until

An operation that repeats infinitely using repeat_effect_until will eventually cause an stack overflow. Take the following example:

    auto n = 0;
    sync_wait( repeat_effect_until(
        transform(
            schedule( inline_scheduler{} ), [&]() { printf( "%d\n", n++ ); } ),
        [] { return false; } ) );

Invoking start function of the sender operation state will trigger a set_value call on the repeat_effect_until receiver which recursively calls the op.start() again and eventually, trigger an stack overflow.

trampoline_scheduler should support cancellation

In example/stream_cancellation.cpp, we're using stop_when together with trampoline_scheduler, but since trampoline_scheduler doesn't yet support cancellation, the stop signal is ignored. That makes this a pretty useless example. :-P

Fix lifetime correctness for manual_lifetime_union

It is technically invalid to reinterpret cast the bytes in manual_lifetime_union to a manual_lifetime without first using placement new to construct an instance of that object

Instead we should replace the get<T>() method that returns a manual_lifetime<T> with construct<T>(), construct_from<T>() and destruct<T>() methods that in-place construct an object and have get<T>() return a reference to an already-constructed T value instead.

e.g. an interface something like this:

template<typename... Ts>
struct manual_lifetime_union {
  template<typename T, typename... Args>
  std::add_lvalue_reference_t<T> construct(Args&&... args);

  template<typename T, typename Factory>
  std::add_lvalue_reference_t<T> construct_from(Factory&& f);

  template<typename T>
  void destruct();

  template<typename T>
  std::add_lvalue_reference_t<T> get() &;
  template<typename T>
  std::add_lvalue_reference_t<std::add_const_t<T>> get() const &;
  // etc...

};

Support task<void>

Also, task<> should be an alias for task<void>.

Add Windows Thread Pool I/O scheduler

Add an implementation of an I/O scheduler that makes use of the Windows Thread Pool APIs to schedule work and handle I/O completion events.

The current windows_thread_pool implementation supports scheduling CPU-work and timers but does not yet support any file/socket/pipe I/O.

Add support for senders whose result-type is dependent on the receiver type

One of the patterns supported by senders is to allow contextual information from the calling site (receiver) to be injected into the callee (sender) using the receiver passed to the connect() operation.

For example, the get_stop_token() CPO can be called by a sender to query the stop-token contextual information from the receiver passed to connect(). Similarly, the get_allocator(context) CPO allows a sender to query the current allocator.

There will be cases where the result-type of a sender is dependent on the contextual information provided through the receiver. For example, a reduce algorithm that cumulates elements from a stream into a std::vector might produce a std::vector<T, Allocator> that uses the allocator obtained from calling get_allocator(context).

This means that the sender::value_types and sender::error_types type-aliases may not yet have enough information to determine the types of the values they will send. However, once we have access to a receiver (ie. in the connect() function or the operation-state object returned from it) we can correctly calculate the value_types/error_types that will be sent.

Thus, to support using these types of senders with algorithms that need to know what types of values/errors will be sent we should probably allow/require defining the value_types/error_types on the operation-state type instead-of/as-well-as on the sender type.

Add socket support to io_uring_context

Add support for tcp/ip sockets and udp/ip sockets using the io_uring_context event-loop.

Add support for async cleanup to senders

One of the main capabilities not yet supported by senders in libunifex is the abillity to have them perform asynchronous cleanup.

The stream abstraction has both a next() and a cleanup() operation which return senders. When you are finished with a stream you must call cleanup() and wait for it to complete before destroying the stream. We should be able to have the same facilities for senders.

This can allow a result to be produced with lower latency as the cleanup can be deferred until after the result has been processed.

Some use-cases

A stop_immediately() algorithm for senders that immediately completes the current op with 'done' and asynchronously requests the source operation to stop. The cleanup operation would then wait for the source operation to complete.

We could make when_all() immediately complete with error/done in the case that one of the input senders completed with error/done. Then other operations could be cancelled and the cleanup would wait for these operations to complete.

Reference

See also P1662R0 for a discussion of Async RAII for coroutines.

Generalise coroutine task<T> type to support passing through receiver context queries to parent receiver/coroutine

We would ideally like to be able to define coroutine tasks that can pass through context from the caller to child coroutines to allow passing things like stop_token, current scheduler, allocator transparently through algorithms implemented as coroutines in the same way that we do for most senders.

Since coroutines are inherently type-erased, this will mean that we need to add the ability to parameterise the task type on the set of query CPOs that should be passed through from the parent context.

template<typename T, typename... CPOs>
class task { ... };

template<typename T>
using cancellable_task = task<T,
  overload<inplace_stop_token(const this_&) noexcept>(get_stop_token)>>;

This will probably require hooking up the adaption of senders to the SenderAwaitable type to turn them into awaitables in the await_transform() rather than relying on operator co_await() so that we have the type-information from the promise early enough when defining the coroutine-receiver type to pass to the sender's connect() implementation.

Note that for some CPOs we may need to implement some special logic to adapt/type-erase across the boundary. eg. we might want to adapt from whatever stop-token type the caller has to the target stop-token type by attaching a stop_callback to the caller's stop-token. Some investigation will be required to determine what the appropriate strategy should be here.

Add ability to reschedule onto the current scheduler

Add overloads of the schedule(), schedule_after() algorithms that don't take a scheduler and that use whatever the current scheduler from the receiver is (ie. using get_scheduler() - see #20).

This would simplify writing some kinds of code, like a timeout() algorithm.

template<typename Op, typename Duration>
auto timeout(Op op, Duration duration) {
  return take_until(std::move(op), schedule_after(duration));
}

Whereas currently we'd need to do something like:

template<typename Op, typename Scheduler, typename Duration>
auto timeout(Op op, Scheduler sched, Duration duration) {
  return take_until(std::move(op), schedule_after(sched, duration));
}

Q. Should schedule() be named reschedule()?

Make senders awaitable in task<T> coroutines

... by adding an appropriate await_transform to task<T>. Also, consider whether to make senders from unifex awaitable in non-task<T> coroutines.

facebookexperimental / libunifex Goto Github PK

libunifex's Issues

Some use-cases

Reference

Recommend Projects

Recommend Topics

Recommend Org

Jobs