Comments (2)

vxgmichel commented on July 28, 2024

Hmm interesting problem!

It turns out that aiostream can handle streams of streams using the advanced operators, so that's probably what you're after here. The missing part is the ability to split a stream into a stream of streams, where each item is forwarded to a substream chosen by a given predicate.
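
To make the idea concrete before involving aiostream, here is a rough stdlib-only sketch of routing items to per-key substreams. It is not the aiostream approach: the names `count_up`, `drain` and `split_into_lists` are made up for illustration, each substream is modelled as an `asyncio.Queue`, and every substream is simply collected into a list.

```python
import asyncio
from typing import AsyncIterator, Callable, Dict, Hashable, List

_DONE = object()  # sentinel marking the end of a substream


async def count_up(n: int) -> AsyncIterator[int]:
    # Hypothetical source: yields 0..n-1, giving control back to the loop each time.
    for i in range(n):
        await asyncio.sleep(0)
        yield i


async def drain(queue: asyncio.Queue) -> List[int]:
    # Consume one substream until the sentinel shows up.
    items: List[int] = []
    while True:
        item = await queue.get()
        if item is _DONE:
            return items
        items.append(item)


async def split_into_lists(
    source: AsyncIterator[int], key_function: Callable[[int], Hashable]
) -> Dict[Hashable, List[int]]:
    queues: Dict[Hashable, asyncio.Queue] = {}
    consumers: Dict[Hashable, asyncio.Task] = {}
    async for item in source:
        key = key_function(item)
        if key not in queues:
            # First item for this key: open a substream and start its consumer.
            queues[key] = asyncio.Queue()
            consumers[key] = asyncio.create_task(drain(queues[key]))
        await queues[key].put(item)
    # The source is exhausted, so close every substream.
    for queue in queues.values():
        await queue.put(_DONE)
    return {key: await task for key, task in consumers.items()}


result = asyncio.run(split_into_lists(count_up(6), lambda x: x % 2 == 0))
print(result)
```

The queues here are unbounded, so the producer never waits for a slow consumer; a real operator would probably want some form of backpressure.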

Here's an example of a split operator that would do just that:

from typing import AsyncIterable, TypeVar, Callable, AsyncIterator

from aiostream import pipable_operator, stream, pipe
from aiostream.core import streamcontext, Streamer
from aiostream.aiter_utils import AsyncExitStack
from anyio import create_memory_object_stream, BrokenResourceError
from anyio.abc import ObjectSendStream

T = TypeVar("T")
K = TypeVar("K")


@pipable_operator
async def split(
    source: AsyncIterable[T], key_function: Callable[[T], K], max_buffer_size: float = 0
) -> AsyncIterator[tuple[K, Streamer[T]]]:
    # One send stream per key seen so far
    mapping: dict[K, ObjectSendStream[T]] = {}
    # The exit stack closes every sender once the source is exhausted,
    # which in turn ends the corresponding substreams
    async with AsyncExitStack() as stack:
        async with streamcontext(source) as source:
            async for chunk in source:
                key = key_function(chunk)
                if key not in mapping:
                    # First item for this key: create a substream and yield it
                    sender, receiver = create_memory_object_stream[T](
                        max_buffer_size=max_buffer_size
                    )
                    mapping[key] = await stack.enter_async_context(sender)
                    yield key, streamcontext(receiver)
                try:
                    await mapping[key].send(chunk)
                except BrokenResourceError:
                    # The consumer has closed this substream: drop the item
                    pass

Note how it uses a key function to tell where each produced item belongs. Here's an example of this operator being used:

import pytest  # the asyncio marker below requires the pytest-asyncio plugin


@pytest.mark.asyncio
async def test_split():
    def is_even(x: int) -> bool:
        return x % 2 == 0

    def split_stream(
        key: bool, stream: Streamer[int], *_
    ) -> AsyncIterable[int | list[int]]:
        match key:
            case True:
                return stream | pipe.accumulate(initializer=0) | pipe.takelast(1)
            case False:
                return stream[:3] | pipe.list() | pipe.takelast(1)

    xs = (
        stream.range(10, interval=0.1)
        | split.pipe(is_even)
        | pipe.starmap(split_stream)
        | pipe.flatten()
        | pipe.list()
    )
    assert await xs == [[1, 3, 5], 20]

Here the key function is simply whether the item is even or odd. Then starmap can be used to apply specific stream operations depending on this predicate. For the sake of this example, the even numbers are summed together while the first 3 odd numbers are gathered as a list. Then both results are produced using the advanced flatten operator.
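
As a quick sanity check on the expected value in the assertion, the same arithmetic can be sketched synchronously with the standard library only. This does not use aiostream at all; it just recomputes the two branch results by hand.

```python
from itertools import islice


def is_even(x: int) -> bool:
    return x % 2 == 0


items = list(range(10))

# Even branch: sum of all even numbers in the source
even_total = sum(x for x in items if is_even(x))

# Odd branch: only the first 3 odd numbers, gathered as a list
first_three_odds = list(islice((x for x in items if not is_even(x)), 3))

# The odd branch is listed first, matching the order in the assertion
expected = [first_three_odds, even_total]
print(expected)
```

The list comes first in the result presumably because the odd branch can finish as soon as it has seen 5, whereas the even branch only finishes once the source is exhausted.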

Here's a diagram of the corresponding pipeline:

graph TD;
    A(range) --> B(split);
    B --> C(starmap);
    C --> D(accumulate);
    D --> E(takelast);
    C --> F(take);
    F --> G(list);
    G --> H(takelast);
    E --> I(flatten);
    H --> I;
    I --> J(list);  

Does that correspond to your use case?

parkerbjur commented on July 28, 2024

Thanks! Yeah, this is awesome.
