tokio-rs / turmoil

Add hardship to your tests

License: MIT License

Rust 100.00%

turmoil's Issues

Re-starting a crashed host with bounce panics

Repro:

#[test]
fn restart_host_after_crash() -> Result {
    let mut sim = Builder::new().build();

    sim.host("h", || async { future::pending().await });

    // crash and step to execute the err handling logic
    sim.crash("h");
    sim.step()?;

    // restart and step to ensure the host software runs
    sim.bounce("h");
    sim.step()?;

    Ok(())
}

running 1 test
thread 'sim::test::restart_host_after_crash' panicked at 'missing host', src/sim.rs:143:43
stack backtrace:
   0: rust_begin_unwind
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
   2: core::panicking::panic_display
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:139:5
   3: core::panicking::panic_str
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:123:5
   4: core::option::expect_failed
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/option.rs:1879:5
   5: core::option::Option<T>::expect
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/option.rs:741:21
   6: turmoil::sim::Sim::run_with_hosts
             at ./src/sim.rs:143:22
   7: turmoil::sim::Sim::bounce
             at ./src/sim.rs:125:9
   8: turmoil::sim::test::restart_host_after_crash
             at ./src/sim.rs:606:9
   9: turmoil::sim::test::restart_host_after_crash::{{closure}}
             at ./src/sim.rs:596:5
  10: core::ops::function::FnOnce::call_once
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/ops/function.rs:251:5
  11: core::ops::function::FnOnce::call_once
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/ops/function.rs:251:5

On crash, the host's runtime (rt) is removed, so the subsequent bounce cannot find it and panics with 'missing host'.
https://github.com/tokio-rs/turmoil/blob/main/src/sim.rs#L321-L322

Cannot build project with turmoil

I am trying to use turmoil for one of my projects, but it is failing to build.

steps to reproduce:

cargo new test-turmoil
cd test-turmoil
cargo add turmoil
cargo check

yields

error[E0433]: failed to resolve: could not find `UnhandledPanic` in `runtime`
  --> /Users/mpostma/Documents/code/rust/turmoil/src/rt.rs:94:42
   |
94 |         .unhandled_panic(tokio::runtime::UnhandledPanic::ShutdownRuntime)
   |                                          ^^^^^^^^^^^^^^ could not find `UnhandledPanic` in `runtime`

error[E0433]: failed to resolve: could not find `UnhandledPanic` in `runtime`
   --> /Users/mpostma/Documents/code/rust/turmoil/src/rt.rs:108:43
    |
108 |     local.unhandled_panic(tokio::runtime::UnhandledPanic::ShutdownRuntime);
    |                                           ^^^^^^^^^^^^^^ could not find `UnhandledPanic` in `runtime`

error[E0599]: no method named `unhandled_panic` found for mutable reference `&mut tokio::runtime::Builder` in the current scope
  --> /Users/mpostma/Documents/code/rust/turmoil/src/rt.rs:94:10
   |
94 |         .unhandled_panic(tokio::runtime::UnhandledPanic::ShutdownRuntime)
   |          ^^^^^^^^^^^^^^^ method not found in `&mut tokio::runtime::Builder`

error[E0599]: no method named `unhandled_panic` found for struct `LocalSet` in the current scope
   --> /Users/mpostma/Documents/code/rust/turmoil/src/rt.rs:108:11
    |
108 |     local.unhandled_panic(tokio::runtime::UnhandledPanic::ShutdownRuntime);
    |           ^^^^^^^^^^^^^^^ method not found in `LocalSet`

This seems to be caused by the fact that the tokio dependency in the turmoil project is set to 0.19, but unhandled_panic is not part of this version.

I tried to patch the tokio version in turmoil and use the path dependency, but this still does not work.

This is on both macOS and a fresh linux VM.

Fix AsyncRead impl for TcpStream

Is your feature request related to a problem? Please describe.

Yes, turmoil::net::TcpStream does not behave like tokio::net::TcpStream.

AsyncRead is broken if the supplied buf does not have capacity for the next message.

#[test]
fn read_buf_smaller_than_msg() -> Result {
    let mut sim = Builder::new().build();

    sim.client("server", async {
        let listener = bind().await?;
        let (mut s, _) = listener.accept().await?;

        s.write_u64(1234).await?;

        Ok(())
    });

    sim.client("client", async {
        let mut s = TcpStream::connect(("server", PORT)).await?;

        let mut buf = [0; 1];
        // panic!: buf.len() must fit in remaining()
        let _r = s.read(&mut buf).await?;

        Ok(())
    });

    sim.run()
}

See:

turmoil/src/net/tcp/stream.rs

Line 137 in 2d0fadd

buf.put_slice(bytes.as_ref());

Describe the solution you'd like

Align turmoil with tokio::net.
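
A minimal sketch of the partial-read behavior being asked for, using only std types. RecvBuffer and its methods are hypothetical names for illustration, not turmoil's internals: the idea is to copy at most as many bytes as the caller's buffer holds and keep the remainder queued for the next read.

```rust
use std::collections::VecDeque;

/// Hypothetical receive side of a simulated stream: whole messages arrive,
/// but a read may ask for fewer bytes than the next message holds.
pub struct RecvBuffer {
    pending: VecDeque<u8>,
}

impl RecvBuffer {
    pub fn new() -> Self {
        Self { pending: VecDeque::new() }
    }

    /// Queue a message that arrived from the peer.
    pub fn push_message(&mut self, bytes: &[u8]) {
        self.pending.extend(bytes.iter().copied());
    }

    /// Copy at most `buf.len()` bytes into `buf` and return how many were
    /// copied. Leftover bytes stay queued instead of triggering a panic.
    pub fn read(&mut self, buf: &mut [u8]) -> usize {
        let n = buf.len().min(self.pending.len());
        for (slot, byte) in buf.iter_mut().zip(self.pending.drain(..n)) {
            *slot = byte;
        }
        n
    }
}
```

With this shape, a one-byte read of an eight-byte message returns 1 and the remaining seven bytes are returned by later reads.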

Make the tracing sink configurable

Tracing currently accepts a path to a file. Make this more flexible by accepting any Write. Using stdout is useful for short-running tests.
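
One possible shape for this, sketched with std only. TraceSink is a hypothetical type, not an existing turmoil API; it is generic over std::io::Write so a file, stdout, or an in-memory buffer all work.

```rust
use std::io::{self, Write};

/// Hypothetical trace sink that accepts any `Write`, not just a file path.
pub struct TraceSink<W: Write> {
    out: W,
}

impl<W: Write> TraceSink<W> {
    pub fn new(out: W) -> Self {
        Self { out }
    }

    /// Write one event per line to the underlying writer.
    pub fn record(&mut self, event: &str) -> io::Result<()> {
        writeln!(self.out, "{event}")
    }

    /// Hand back the writer, e.g. to inspect an in-memory buffer in a test.
    pub fn into_inner(self) -> W {
        self.out
    }
}
```

A short-running test could construct it with TraceSink::new(std::io::stdout()), while a long-running one could pass a std::fs::File.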

Tokio `enable_io()` error when binding a listener

I was trying to initialize a simulator that runs a closure which binds to localhost, like this:

#[test]
fn test_main() -> Result {
    let mut sim = Builder::new()
        .build();

    sim.client("10.129.11.11", async {
        let (mut sock, addr) = TcpListener::bind((IpAddr::from(Ipv6Addr::UNSPECIFIED), 8080))
            .await?
            .accept()
            .await?;
        sock.write_i32(124).await?;
        Ok(())
    });
    sim.run()
}

And I get an error:

It seems turmoil doesn't call tokio's enable_io() method when initializing the tokio runtime?
Related code in turmoil/src/rt.rs:

fn init() -> (Runtime, LocalSet) {
    let mut builder = tokio::runtime::Builder::new_current_thread();

    #[cfg(tokio_unstable)]
    builder.unhandled_panic(tokio::runtime::UnhandledPanic::ShutdownRuntime);

    let tokio = builder.enable_time().start_paused(true).build().unwrap();

    tokio.block_on(async {
        // Sleep to "round" `Instant::now()` to the closest `ms`
        tokio::time::sleep(Duration::from_millis(1)).await;
    });

    (tokio, new_local())
}

Support loopback

Hosts currently may only bind to 0.0.0.0.

https://github.com/tokio-rs/turmoil/blob/main/src/net/tcp/listener.rs#L28

Add support to bind 127.0.0.1 to unblock loopback scenarios. We need to decide how network topology is affected by these changes. For example, it doesn't make sense to allow partitions within a host.

Add warning for blocking tasks that block the sim

Related to #139, we should print a warning when a blocking task is still active in the runtime and prevents the next tick from happening. This can be done by 1) adding a blocking task count metric to tokio-metrics and 2) spinning up a background thread that checks this metric along with some sort of tick count. The thread starts printing once the tick cannot progress.

Add simulated PRNG

We should support deterministic PRNG for usage for retries, hashmaps, etc. We can accomplish this by providing a deterministic version of RandomState.
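
As a rough illustration of the RandomState idea, std already allows a HashMap to be built with a fixed-key hasher. The aliases below are hypothetical names; the sketch assumes DefaultHasher's default construction stays fixed-key, and std does not guarantee its algorithm across Rust releases, so this gives run-to-run determinism for a given toolchain only.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

/// Build-hasher that creates `DefaultHasher::default()` every time, so the
/// same key hashes to the same value on every run (unlike `RandomState`).
pub type DeterministicState = BuildHasherDefault<DefaultHasher>;

/// Hypothetical alias for a map with run-to-run stable iteration order.
pub type DeterministicMap<K, V> = HashMap<K, V, DeterministicState>;

/// Insert the keys in order and return them in the map's iteration order.
pub fn iteration_order(keys: &[&str]) -> Vec<String> {
    let mut map: DeterministicMap<String, usize> = DeterministicMap::default();
    for (i, key) in keys.iter().enumerate() {
        map.insert(key.to_string(), i);
    }
    map.keys().cloned().collect()
}
```

A seeded variant would need a custom BuildHasher that mixes the simulation seed into each hasher; that part is left out here.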

Add a condensed tracing format

Currently, tracing emits a "pretty-print" JSON format for all events. Some scenarios warrant seeing a more condensed version of the output.

Make the format configurable. Perhaps it could look like this?

src(dot) | dst(dot) | what(send, recv, etc.) | timestamp | ...
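
A small sketch of what such a formatter might look like. The function name, the use of IpAddr for src and dst, and the millisecond timestamp are assumptions for illustration only.

```rust
use std::net::IpAddr;
use std::time::Duration;

/// Render one network event as a single pipe-separated line:
/// src | dst | what | elapsed simulated time in milliseconds.
pub fn condensed(src: IpAddr, dst: IpAddr, what: &str, elapsed: Duration) -> String {
    format!("{src} | {dst} | {what} | {}ms", elapsed.as_millis())
}
```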

Look into spans for network tracing

See comments in #48 re: spans.

For TcpStream and UdpSocket spans might simplify the context needed for each event, ie syn, fin, etc.

Support bouncing a host

Is your feature request related to a problem? Please describe.
No. This is new functionality.

Describe the solution you'd like
Hosts in the simulation are simply futures. During run_until() I'd like to have the ability to "bounce" a host (cancel, join and restart).

Explore state exploration in turmoil

State exploration entails navigating all (or some portion of) the possible states a program can enter during its execution. Model checkers exist (TLA+, P, etc.), but they require building a model that is separate from the actual implementation.

turmoil provides an interesting opportunity where we are running all of the real code, but with a simulated network. The network provides a place to both view states and control state transitions. Can we expose the right APIs to make state exploration possible?

Note that this approach differs from fuzzing the network, which is already possible today.

Support async dns resolution

Tokio's ToSocketAddrs (https://docs.rs/tokio/latest/tokio/net/trait.ToSocketAddrs.html) is an async operation under the hood, which presents surface area for DNS to hang, etc.

Implement additional UDP features

It would be useful to extend the current UDP model with the following features:

Randomized packet corruption/truncation
Randomized packet duplication/retransmission
Randomized packet reordering - this can be accomplished by having some jitter assigned to each packet.
~~Preferring new packets instead of old on full receive buffers - currently we drop new packets on full buffers but this isn't usually what network stacks do or what applications expect.~~ - turns out this is exactly what stacks do - see #128 (comment)
Setting the MTU for a path and being able to drop and/or truncate packets larger than that value
Simulate bufferbloat (i.e. latency increases by some function as the number of packets being buffered increases).
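
Two of the items above (reordering via per-packet jitter and dropping packets above an MTU) can be sketched with std only. LinkConfig, XorShift, and schedule are hypothetical names, and the tiny xorshift generator stands in for whatever seeded RNG the simulation would actually use.

```rust
use std::time::Duration;

/// Tiny deterministic PRNG (xorshift64) so the sketch needs no external crates.
pub struct XorShift(u64);

impl XorShift {
    pub fn new(seed: u64) -> Self {
        Self(seed.max(1)) // xorshift state must be non-zero
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical per-path settings; not turmoil's API.
pub struct LinkConfig {
    pub base_latency: Duration,
    pub max_jitter_ms: u64,
    pub mtu: usize,
}

/// Takes (sent_at, payload) pairs and returns (deliver_at, payload) pairs.
/// Packets larger than the MTU are dropped; the rest get a random jitter
/// and are ordered by delivery time, which can reorder them.
pub fn schedule(
    link: &LinkConfig,
    rng: &mut XorShift,
    packets: &[(Duration, Vec<u8>)],
) -> Vec<(Duration, Vec<u8>)> {
    let mut out: Vec<(Duration, Vec<u8>)> = packets
        .iter()
        .filter(|(_, payload)| payload.len() <= link.mtu)
        .map(|(sent_at, payload)| {
            let jitter = Duration::from_millis(rng.next_u64() % (link.max_jitter_ms + 1));
            (*sent_at + link.base_latency + jitter, payload.clone())
        })
        .collect();
    out.sort_by_key(|(deliver_at, _)| *deliver_at);
    out
}
```

Because the generator is seeded, the same seed yields the same delivery schedule, which keeps the reordering reproducible.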

Loopback is incomplete

The following scenarios exist for client -> server within the same host:

(Only Tcp is shown, but we need to handle it for Udp as well)

// bind | connect

// 0s | 127.0.0.1
// client: local Ok(127.0.0.1:49582), peer Ok(127.0.0.1:1234)
// server: 127.0.0.1:49582, local Ok(127.0.0.1:1234), peer Ok(127.0.0.1:49582)

// 127.0.0.1 | 127.0.0.1
// client: local Ok(127.0.0.1:49622), peer Ok(127.0.0.1:1234)
// server: 127.0.0.1:49622, local Ok(127.0.0.1:1234), peer Ok(127.0.0.1:49622)

// 0s | 192.168.1.42
// client: local Ok(192.168.1.42:49716), peer Ok(192.168.1.42:1234)
// server: 192.168.1.42:49716, local Ok(192.168.1.42:1234), peer Ok(192.168.1.42:49716)

// 127.0.0.1 | 192.168.1.42
// Error: Os { code: 61, kind: ConnectionRefused, message: "Connection refused" }

The first two work as expected, including setting the correct local|peer_addr on each side of the stream. The last two cause panics today due to holes in the stop-gap implementation for loopback.

We need to address this with workarounds and/or include this in the refactor being discussed in #132 .

Error `ConnectionRefused` when binding UdpSocket to `localhost`

This line in the UdpSocket implementation suggests that it can be used with localhost, :: or 0.0.0.0.

But if we try to change the binding address from "unspecified" to "localhost" in the udp tests, they all fail with ConnectionRefused error.

If this is the expected behaviour for the socket, that part of documentation can be seen as somewhat misleading.

Regex matching throws exception in Pair

When using hold with regular expressions, Pair throws an exception because it expects the two IpAddr to be different.
Here is a minimal test:

    #[test] 
    #[cfg(feature = "regex")]
    fn hold_all() -> Result {
        let mut sim = Builder::new().build();

        sim.host("host", || { async { future::pending().await } });
        sim.client("client", async {  
                hold(regex::Regex::new(r".*")?, regex::Regex::new(r".*")?);
                Ok(())
        });
    
        sim.run()?;
        Ok(())
    }

Fails with:

thread 'sim::test::hold_all' panicked at 'assertion failed: `(left != right)`
  left: `192.168.0.1`,
 right: `192.168.0.1`', src/top.rs:35:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'sim::test::hold_all' panicked at 'a spawned task panicked and the LocalSet is configured to shutdown on unhandled panic', /Users/foo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.26.0/src/task/local.rs:603:17

support ephemeral port assignments

I tried the following code:

#[test]
fn ephemeral_port() -> Result {
    let mut sim = Builder::new().build();

    sim.client("client", async {
        let sock = bind_to(0).await?;

        // turmoil should assign a port to the ephemeral range
        assert_ne!(sock.local_addr()?.port(), 0);
        assert!(sock.local_addr()?.port() >= 49152);

        Ok(())
    });

    sim.run()
}

It would be nice to support ephemeral port assignment. This is useful for clients that don't care about the specific port number; they just need a free port.

From https://www.rfc-editor.org/rfc/rfc6335#section-6:

o the System Ports, also known as the Well Known Ports, from 0-1023
(assigned by IANA)

o the User Ports, also known as the Registered Ports, from 1024-
49151 (assigned by IANA)

o the Dynamic Ports, also known as the Private or Ephemeral Ports,
from 49152-65535 (never assigned)
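
A sketch of how port 0 could map onto the dynamic range quoted above. PortAllocator is a hypothetical helper, not part of turmoil; it hands out the next free port in 49152-65535 and wraps around when it reaches the end.

```rust
use std::collections::BTreeSet;

/// First port of the dynamic (ephemeral) range from RFC 6335.
pub const EPHEMERAL_START: u16 = 49152;

/// Hypothetical per-host port table.
pub struct PortAllocator {
    in_use: BTreeSet<u16>,
    next: u16,
}

impl PortAllocator {
    pub fn new() -> Self {
        Self { in_use: BTreeSet::new(), next: EPHEMERAL_START }
    }

    /// Port 0 means "pick any free ephemeral port"; any other value is bound
    /// as requested. Returns `None` if the port (or the whole range) is taken.
    pub fn bind(&mut self, requested: u16) -> Option<u16> {
        if requested != 0 {
            return self.in_use.insert(requested).then_some(requested);
        }
        // Scan the dynamic range at most once, resuming where we left off.
        let span = u16::MAX - EPHEMERAL_START + 1;
        for _ in 0..span {
            let candidate = self.next;
            self.next = if candidate == u16::MAX { EPHEMERAL_START } else { candidate + 1 };
            if self.in_use.insert(candidate) {
                return Some(candidate);
            }
        }
        None
    }
}
```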

Document determinism guidelines

Turmoil is built on the concept of deterministic execution. Structures such as HashMap initialize with a non-deterministic RandomState, which undermines this. Both the internals of turmoil and applications using it need to buy in.

e.g. HashMap, HashSet, tokio::select!, etc.

Document the guidelines.

Clean up network topology semantics

The simulation has the ability to manually and randomly change network conditions during the simulation. This was initially designed for the datagram (UDP) APIs, and does not fully translate to streams (TCP), namely dropping messages. The goal of the simulation is not to test that TCP works, rather it aims to test that applications built over TCP work correctly. These applications lean on the guarantees that TCP provides, ie message order.

Currently, one can apply two types of network partitions:

partition: All messages are dropped. Works for datagram. Not supported on established streams, however it works for new connections as we only send one message for the 3-way handshake.

See: https://github.com/tokio-rs/turmoil/blob/main/src/world.rs#L250

hold: Hold all messages "on the network". Works for both modes.

The goal of this issue is to figure out consistent semantics and naming for both networking modes.

spawn_blocking blocks the sim runtime

main...lucio/spawn-blocking-bug#diff-ace3e8abab9fb7b84efd253a7cea095084172b5cc3431426e3391305a554b152R46

With this example code it's possible to never run the client future, as the server one will hang until all spawn_blocking calls complete. The real answer here is to not use threads, since they remove determinism. But this is still surprising behavior. The workaround is to use another thread provider, like a different tokio runtime (where you call spawn_blocking on that) or std::thread.

cc @MarinPostma @mcches

Return errors instead of panicking, when sending invalid packets.

Currently turmoil will panic if a packet is sent to an IP address that does not exist,
since this will result in an invalid access to the index map in top.rs.

This does not mirror the behavior of tokio or std sockets, and panicking seems too extreme,
especially since some applications may create such sockets, expecting errors instead of
panics.

Therefore it might be advantageous to return errors instead of panicking in World::send_message.

Example

This example will panic.

fn main() -> Result {
     let mut sim = Builder::new().build();
     sim.client("client", async move {
         let _ = net::TcpStream::connect("192.168.30.1:80").await?;
         Ok(())
     });

     sim.run()
 }

thread 'main' panicked at 'IndexMap: key not found', ~/dev/turmoil/src/top.rs:221:25
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Error: JoinError::Panic(Id(1), ...)
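
A sketch of the proposed behavior under stated assumptions: Topology and its methods are hypothetical stand-ins for the index map in top.rs, and ConnectionRefused is only a placeholder for whichever error kind is eventually chosen. The lookup returns an io::Error for an unknown destination instead of indexing and panicking.

```rust
use std::collections::HashMap;
use std::io;
use std::net::IpAddr;

/// Hypothetical stand-in for the host table: one inbox per known address.
pub struct Topology {
    inboxes: HashMap<IpAddr, Vec<Vec<u8>>>,
}

impl Topology {
    pub fn new() -> Self {
        Self { inboxes: HashMap::new() }
    }

    /// Register a host so that messages can be delivered to it.
    pub fn register(&mut self, addr: IpAddr) {
        self.inboxes.entry(addr).or_default();
    }

    /// Deliver a payload, or return an error if no host owns `dst`.
    /// The error kind is an assumption; the point is that it is not a panic.
    pub fn send_message(&mut self, dst: IpAddr, payload: Vec<u8>) -> io::Result<()> {
        match self.inboxes.get_mut(&dst) {
            Some(inbox) => {
                inbox.push(payload);
                Ok(())
            }
            None => Err(io::Error::new(
                io::ErrorKind::ConnectionRefused,
                format!("no host at {dst}"),
            )),
        }
    }
}
```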

Add support for SO_LINGER to `TcpStream`

[Placeholder]

Add support for `TcpStream#peek`

Method: TcpStream#peek

I have run into an issue that requires the use of this method. I would be happy to get started on a fix, but I'm uncertain on the approach to take. The issue I'm running across is (1) that self is immutably referenced:

pub async fn peek(&self, buf: &mut [u8]) -> io::Result<usize>

This was previously mutable and relied on poll_peek.

The other issue (2) is that turmoil currently implements the ReadHalf and WriteHalf using a tokio::mpsc channel. However, it doesn't look like the Receiver has an option to immutably read the internal lock-free list.

So my question is whether we should paper over the ReadHalf with an internal data structure, try to get this implemented in tokio/chan or some other potential solution I'm missing?
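
To make the first option concrete, here is a rough sketch of a read half that keeps received bytes in an internal buffer behind interior mutability, so peek can take &self. The type and method names are hypothetical, and RefCell is chosen on the assumption that each host runs on a single thread.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;

/// Hypothetical read half that owns a byte buffer instead of reading
/// straight from a channel, so the bytes can be inspected without removal.
pub struct ReadHalf {
    buffered: RefCell<VecDeque<u8>>,
}

impl ReadHalf {
    pub fn new() -> Self {
        Self { buffered: RefCell::new(VecDeque::new()) }
    }

    /// Stand-in for bytes arriving from the channel.
    pub fn deliver(&self, bytes: &[u8]) {
        self.buffered.borrow_mut().extend(bytes.iter().copied());
    }

    /// Copy up to `buf.len()` bytes without consuming them.
    pub fn peek(&self, buf: &mut [u8]) -> usize {
        let buffered = self.buffered.borrow();
        let n = buf.len().min(buffered.len());
        for (slot, byte) in buf.iter_mut().zip(buffered.iter()) {
            *slot = *byte;
        }
        n
    }

    /// Copy up to `buf.len()` bytes and remove them from the buffer.
    pub fn read(&self, buf: &mut [u8]) -> usize {
        let mut buffered = self.buffered.borrow_mut();
        let n = buf.len().min(buffered.len());
        for (slot, byte) in buf.iter_mut().zip(buffered.drain(..n)) {
            *slot = byte;
        }
        n
    }
}
```

An async version would additionally need to wait for data when the buffer is empty; that is omitted here.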

Setup CI

At min we should run a full build per PR.

See: https://github.com/jonhoo/rust-ci-conf

More interesting examples of testing distributed system building blocks

Description

It would be great to provide a few more complex examples to showcase more of Turmoil's capabilities. The examples should be succinct with lightweight dependencies, but need functional testing for:

Fault tolerance
Scalability to many nodes

Proposal

Food for thought - https://martinfowler.com/articles/patterns-of-distributed-systems/ with a few candidates:

Heartbeat: seems straightforward, can be made more complex with Gossip?
Leader - Follower: some reference implementation here for raft, seems to touch heartbeat, quorum as well - too big?
2PC
Others?? Happy to contribute

Fix socket close behavior when there is unread data

[Placeholder]

Question: Reproduce, Random and Time

I read the examples and test code in turmoil but still have some questions:

In the similar project MadSim, there is a "Test Seed" for every run to generate deterministic time and random numbers, so users can use the same seed to get exactly the same result. Can turmoil do something like this, and how?

BTW, I thought Sim::epoch() would do this, but I got a different result on every run of the code below:

use rand::SeedableRng;
use std::time::SystemTime;
use turmoil::{Builder, Result};

fn main() {
    println!("Hello, world!");
}

#[test]
fn test_main() -> Result {
    let mut sim = Builder::new()
        .epoch(SystemTime::UNIX_EPOCH)
        .rng(rand::rngs::StdRng::seed_from_u64(10))
        .build();

    sim.client("host", async {
        println!("Hello world!");

        // now() is different in every run, and there seems to be no API in turmoil to mock time.
        println!("now: {:?}", std::time::Instant::now());
        Ok(())
    });
    sim.run()
}

Can turmoil simulate IO other than network? (for example, Disk IO )

Network partition should cause 'host unreachable' not 'connection refused' for TCP

Summary: Unreachable hosts should cause an UnreachableHost rather than ConnectionRefused
on network partitions, etc.
Summary: I am not sure if Shuttle needs this level of fidelity just yet, and if anyone would notice the difference at this time. But someday simulations using Shuttle might take different actions based upon UnreachableHost vs ConnectionRefused, so it might make sense to fix.

detail

I modified the axum example by adding a single line before the client request:
turmoil::partition("client", "server");

Doing so resulted in this output:

[...]
thread 'main' panicked at examples/axum/src/main.rs:71:15:
called `Result::unwrap()` on an `Err` value: Error { kind: Connect, source: Some(Custom { kind:ConnectionRefused, error: "192.168.0.1:9999" }) }
[...]

Normally, when trying to reach a TCP server via a partitioned network, a HostUnreachable error will occur after a timeout period. A ConnectionRefused error will not occur, because ConnectionRefused happens when a box receiving a TCP SYN rejects it because there is no listener or server running on that port.

This can be demonstrated by using curl from the command line.

# in this first example, I am curling an IP address without a computer. 
# thus there is nothing to respond. it will timeout after ~3 seconds, and return host unreachable
c@intel12400 ~/t/e/axum (main) [7]> time curl -vvvvv 192.168.86.33
*   Trying 192.168.86.33:80...
* connect to 192.168.86.33 port 80 from 192.168.86.5 port 59648 failed: Host is unreachable
* Failed to connect to 192.168.86.33 port 80 after 3055 ms: Couldn't connect to server
* Closing connection
curl: (7) Failed to connect to 192.168.86.33 port 80 after 3055 ms: Couldn't connect to server

________________________________________________________
Executed in    3.06 secs      fish           external
   usr time    6.04 millis  960.00 micros    5.08 millis
   sys time    0.32 millis  315.00 micros    0.00 millis

# this second example shows when a connection refused occurs
# I am curling to a valid IP with a computer running, but nothing running on the port specified
# thus the computer receives the TCP syn request, but denies it, cause nothing is on the port
c@intel12400 ~/t/e/axum (main) [7]> time curl -vvvvv 192.168.86.5:8888
*   Trying 192.168.86.5:8888...
* connect to 192.168.86.5 port 8888 from 192.168.86.5 port 41228 failed: Connection refused
* Failed to connect to 192.168.86.5 port 8888 after 0 ms: Couldn't connect to server
* Closing connection
curl: (7) Failed to connect to 192.168.86.5 port 8888 after 0 ms: Couldn't connect to server

________________________________________________________
Executed in    5.57 millis    fish           external
   usr time    5.49 millis  701.00 micros    4.79 millis
   sys time    0.23 millis  226.00 micros    0.00 millis
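
The distinction drawn above can be written down as a small decision function. This is only a sketch: ConnectOutcome and connect_outcome are hypothetical names, and the timeout that precedes a host-unreachable error is not modeled.

```rust
/// Possible results of a simulated TCP connect attempt.
#[derive(Debug, PartialEq)]
pub enum ConnectOutcome {
    Connected,
    /// The SYN reached the host, but nothing listens on the port.
    ConnectionRefused,
    /// The SYN never reached the host (e.g. the link is partitioned).
    HostUnreachable,
}

/// Decide the outcome from whether the path is partitioned and whether the
/// destination has a listener on the requested port.
pub fn connect_outcome(partitioned: bool, listening: bool) -> ConnectOutcome {
    if partitioned {
        ConnectOutcome::HostUnreachable
    } else if !listening {
        ConnectOutcome::ConnectionRefused
    } else {
        ConnectOutcome::Connected
    }
}
```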

Time and clocks

https://github.com/tokio-rs/turmoil/blob/main/src/host.rs#L69

I think code like this should be replaced with something that sources time off a virtual clock. If so, there are two broad patterns to apply:

Pass the instant 'now' in as a param
Give the host a time source and fetch 'now' from said source
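
Both patterns can be sketched with std only. TimeSource, VirtualClock, and Host here are hypothetical illustrations, not turmoil types; time is represented as a Duration since the start of the simulation.

```rust
use std::cell::Cell;
use std::time::Duration;

/// Anything that can report the current simulated time.
pub trait TimeSource {
    fn now(&self) -> Duration;
}

/// Virtual clock that only moves when the simulation advances it.
pub struct VirtualClock {
    elapsed: Cell<Duration>,
}

impl VirtualClock {
    pub fn new() -> Self {
        Self { elapsed: Cell::new(Duration::ZERO) }
    }

    pub fn advance(&self, by: Duration) {
        self.elapsed.set(self.elapsed.get() + by);
    }
}

impl TimeSource for VirtualClock {
    fn now(&self) -> Duration {
        self.elapsed.get()
    }
}

/// Pattern 1: the caller passes 'now' in as a parameter.
pub fn is_expired(now: Duration, deadline: Duration) -> bool {
    now >= deadline
}

/// Pattern 2: the host holds a time source and asks it for 'now'.
pub struct Host<'a> {
    clock: &'a dyn TimeSource,
    deadline: Duration,
}

impl<'a> Host<'a> {
    pub fn new(clock: &'a dyn TimeSource, deadline: Duration) -> Self {
        Self { clock, deadline }
    }

    pub fn is_expired(&self) -> bool {
        self.clock.now() >= self.deadline
    }
}
```

The first pattern keeps functions pure and easy to test; the second avoids threading a timestamp through every call.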

Infinitely running.

Hi, I'm not sure whether this is a bug, but given the code below (I know it's a misuse of TcpListener),
the sim runs infinitely, while it is expected to stop within 10 logical seconds and a far shorter real-world duration.

Maybe it makes sense to add a new simulator config like realworld_duration?

#[test]
fn infinite() -> turmoil::Result {
    use std::time::SystemTime;
    use turmoil::{net::TcpListener, Builder};

    let mut sim = Builder::new().epoch(SystemTime::UNIX_EPOCH).build();
    sim.host("s", || async {


        loop {
            TcpListener::bind("0.0.0.0:80").await?;
        }
    });

    sim.run()
}

QA: How is this different from the original `simulation` project?

Add simulated disk Io

Is your feature request related to a problem? Please describe.
No. This is new functionality.

Describe the solution you'd like
Hosts have an Io concept today, but it is just network. Add the ability to write/read to/from disk, and have this state persist across host restarts.

Support holding messages after send

Is your feature request related to a problem? Please describe.
No. This is new functionality.

Describe the solution you'd like
During the simulation I'd like to place a "hold" on the link between two hosts. Any messages sent will remain in the queue while the "hold" is active. At a later time I'd like to remove the "hold", which allows delivery for the queued messages.

This is useful for tests that need to control the ordering of events across multiple hosts.
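
A minimal model of the requested semantics, with hypothetical names throughout: while a hold is active, sent messages are queued; releasing the hold delivers them in their original order.

```rust
use std::collections::VecDeque;

/// Hypothetical one-directional link between two hosts.
pub struct HeldLink {
    held: bool,
    queued: VecDeque<Vec<u8>>,
    delivered: Vec<Vec<u8>>,
}

impl HeldLink {
    pub fn new() -> Self {
        Self { held: false, queued: VecDeque::new(), delivered: Vec::new() }
    }

    /// Start holding: later sends stay "on the network".
    pub fn hold(&mut self) {
        self.held = true;
    }

    /// Stop holding and deliver everything queued, oldest first.
    pub fn release(&mut self) {
        self.held = false;
        self.delivered.extend(self.queued.drain(..));
    }

    /// Send a message: queued while held, delivered immediately otherwise.
    pub fn send(&mut self, msg: Vec<u8>) {
        if self.held {
            self.queued.push_back(msg);
        } else {
            self.delivered.push(msg);
        }
    }

    /// Messages that have reached the destination so far.
    pub fn delivered(&self) -> &[Vec<u8>] {
        &self.delivered
    }
}
```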

Support one-way partitions

Would be great if turmoil can publicly support the ability to introduce one-way partitions between hosts: host A can send messages to host B, but host B messages don't get delivered to host A.
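
One way to model this is to track blocked directions as ordered (from, to) pairs, so a two-way partition is just both directions blocked. The sketch below uses hypothetical names and host names as plain strings.

```rust
use std::collections::HashSet;

/// Hypothetical partition table keyed by direction.
pub struct Partitions {
    blocked: HashSet<(String, String)>,
}

impl Partitions {
    pub fn new() -> Self {
        Self { blocked: HashSet::new() }
    }

    /// Drop messages travelling from `from` to `to` only.
    pub fn partition_oneway(&mut self, from: &str, to: &str) {
        self.blocked.insert((from.to_string(), to.to_string()));
    }

    /// Drop messages in both directions.
    pub fn partition(&mut self, a: &str, b: &str) {
        self.partition_oneway(a, b);
        self.partition_oneway(b, a);
    }

    /// Whether a message from `from` to `to` would be delivered.
    pub fn can_deliver(&self, from: &str, to: &str) -> bool {
        !self.blocked.contains(&(from.to_string(), to.to_string()))
    }
}
```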

Support multiple network interfaces

This refactor aims to introduce the ability of nodes to have multiple addresses
in distinct subnets.

Immediate Goals

Each node should be bound to a unique Ipv4Addr AND a unique Ipv6Addr
All bound addresses should be in a predefined subnet (like 192.168.0.0/16)
The available subnets should be configurable in the Builder
Addresses can be either automatically or manually assigned
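
To illustrate the automatic assignment goal, here is a sketch of an IPv4 subnet that hands out host addresses sequentially. Subnet and its methods are hypothetical, and the sketch only supports prefix lengths from /1 to /30.

```rust
use std::net::Ipv4Addr;

/// Hypothetical IPv4 subnet with a sequential address allocator.
pub struct Subnet {
    network: u32,
    mask: u32,
    next_host: u32,
}

impl Subnet {
    pub fn new(network: Ipv4Addr, prefix_len: u8) -> Self {
        assert!((1..=30).contains(&prefix_len), "sketch supports /1 through /30");
        let mask = u32::MAX << (32 - u32::from(prefix_len));
        Self { network: u32::from(network) & mask, mask, next_host: 1 }
    }

    /// Whether `addr` falls inside this subnet.
    pub fn contains(&self, addr: Ipv4Addr) -> bool {
        (u32::from(addr) & self.mask) == self.network
    }

    /// Next unused host address, skipping the network and broadcast
    /// addresses. Returns `None` once the subnet is exhausted.
    pub fn assign(&mut self) -> Option<Ipv4Addr> {
        let broadcast_host = !self.mask; // all host bits set
        if self.next_host >= broadcast_host {
            return None;
        }
        let addr = Ipv4Addr::from(self.network | self.next_host);
        self.next_host += 1;
        Some(addr)
    }
}
```

Manual assignment could reuse contains to validate that a user-supplied address belongs to a configured subnet.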

Challenges

I have tried to implement this, and come to the conclusion that some major changes
internally AND externally would be necessary. Notably:

Nodes, and thus Rt/Host, can no longer be identified by a single IpAddr.
The best possible solution would be to identify them by something like a MAC addr,
but that would warrant major internal changes
lookup would need to return more than one possible address, thus the public API
of ToIpAddr / lookup / lookup_many would need to change. This could be a good
moment to introduce API compliance with either std::net or tokio::net
using ToIpAddr for module creation creates problems when statically assigning addresses.
Even without this refactoring, nodes with explicitly assigned address cannot have human readable
names, since their place is traded for the address assignment. Mixed IP subnets only enlarge this
problem. The current api of Sim::client / Sim::host provides no way to explicitly assign both
an Ipv4 and an Ipv6 address. In short, the current API cannot support a node with explicitly assigned
addresses, let alone a human-readable name, so changes would be necessary.

In my opinion, this amount of changes would exceed the scope of one PR, so it might be beneficial
to make step by step changes to the public API. However this warrants discussion.

Some related thoughts

While not in the scope of this refactoring, binding sockets to specific addresses may be beneficial
in std/tokio Ipv6 sockets bound to [::] can receive incoming Ipv4 packets (addresses are being mapped to Ipv6),
however the reverse is not possible. This seems like a rare edge case, so I do not know whether we
should ever support this behaviour
It might prove useful in the future to refrain from hardcoding only two possible addresses in two possible subnets per node.
Supporting a set of bound addresses+subnets might be beneficial to a) remodel localhost to use top.rs links or b) add support for multiple interfaces, thus multiple subnets, should that ever be a goal

As a reference, my current test implementation can be found here.
I have closed the corresponding PR #125, since it is already out of date.

Progress

types representing subnets
dns lookups that may return multiple IpAddrs
decoupling dns lookup and dns registration
uuid as Host/Rt identifiers
updated node creation API
subnet configuration in Builder
multiple network interfaces per node (according to subnet configuration)
socket support for binding to specific addresses

Calling `run()` after crashing a host errors

Repro:

#[test]
fn run_after_host_crashes() -> Result {
    let mut sim = Builder::new().build();

    sim.host("h", || async { future::pending().await });

    sim.crash("h");

    sim.run()
}

Fails with:

running 1 test
Error: JoinError::Cancelled(Id(1))
test sim::test::run_after_host_crashes ... FAILED

failures:

failures:
    sim::test::run_after_host_crashes

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 10 filtered out; finished in 0.00s

Support client and host software errors

Currently, we only have panic to trigger failure during simulation runs. This makes writing both the test and host software a little clunky, as we can't use ? to return on error.

To make the experience better, we can define a dynamic Error type and have both client and hosts supply a future that aligns. On each run() iteration, we can check if any host finishes with an error, end the simulation and return the error.

e.g.

pub type TurmoilResult<T = ()> = std::result::Result<T, Box<dyn std::error::Error>>;

Support binding multiple addrs within a host

Listener::bind() was added in #35, but it assumes the sole host's SocketAddr. It's both reasonable to support this (say for loopback or multiple acceptors within a process) and necessary to mirror tokio::net.

Enable testing of backpressure from TCP connections

There are certain error conditions we want to test out that only happen when the TCP connection stalls and doesn't write right away (returns Poll::Pending on write). Would be great if Turmoil allowed for that - so in the simple code below, the host b just waited forever instead of the simulation panicking with socket buffer full.

use std::{
    net::{IpAddr, Ipv4Addr},
    time::Duration,
};
use tokio::{io::AsyncWriteExt, time::sleep};
use turmoil::{
    net::{TcpListener, TcpStream},
    Result,
};

#[test]
fn want_backpressure() -> Result {
    let mut sim = turmoil::Builder::new().build();
    sim.host("b", || async {
        let listener = TcpListener::bind((IpAddr::from(Ipv4Addr::UNSPECIFIED), 9876))
            .await
            .expect("Bind to local host");
        let (mut conn, _addr) = listener.accept().await.expect("Accept conn");
        for _ in 0..10000 {
            conn.write_all(b"message").await.expect("Write");
            conn.flush().await.expect("flush");
        }
        Ok(())
    });
    sim.client("a", async move {
        let _conn = TcpStream::connect("b:9876").await.expect("Open to b");
        sleep(Duration::from_millis(100)).await;
        Ok(())
    });
    sim.run()
}
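
A sketch of the requested behavior in isolation: a bounded send buffer whose write reports Poll::Pending when full instead of panicking. SendBuffer is a hypothetical type; a real implementation would also have to store the task's waker and wake it when the peer drains data, which is omitted here.

```rust
use std::task::Poll;

/// Hypothetical bounded socket send buffer, tracked by byte count only.
pub struct SendBuffer {
    capacity: usize,
    queued: usize,
}

impl SendBuffer {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, queued: 0 }
    }

    /// Accept as many bytes as fit. When no space is left, report `Pending`
    /// so the writer stalls instead of the simulation panicking.
    pub fn poll_write(&mut self, data: &[u8]) -> Poll<usize> {
        let free = self.capacity - self.queued;
        if free == 0 {
            return Poll::Pending;
        }
        let n = free.min(data.len());
        self.queued += n;
        Poll::Ready(n)
    }

    /// The peer consumed `n` bytes, freeing space for further writes.
    pub fn drain(&mut self, n: usize) {
        self.queued -= n.min(self.queued);
    }
}
```

In the test above, host b would then simply stay pending once client a stops reading, which is the stall the issue wants to exercise.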
