tikv / fail-rs


Fail points for Rust

License: Apache License 2.0

Rust 100.00%


fail-rs's Issues

"Put test cases exercising fail points into their own test crate" — is 'crate' right?

The fail crate's crate-level doc-comment first says—

fail-rs/src/lib.rs

Lines 107 to 108 in 2cf1175

 //! this it is a best practice to put all fail point unit tests into their own 

 //! binary. Here's an example of a snippet from `Cargo.toml` that creates a

—and then later says—

fail-rs/src/lib.rs

Lines 219 to 220 in 2cf1175

 //! fail points. Put test cases exercising fail points into their own test 

 //! crate.

Should the latter advice say "binary" rather than "crate"?

cargo feature should be opt-in, not opt-out

I have a use case where I'd like to add failpoints across several of my libraries, but I'm running into friction because of the way this crate uses cargo features.

Failpoints are currently active by default and need to be disabled (opt-out) in production via the no_fail cargo feature. This is a problem when dependencies are nested a couple of levels deep, because the top-level consumer is no longer in charge of those features and can't opt out directly.

Considering that cargo features are additive, a better approach would be to make failpoints disabled by default and enabling them via a dedicated feature (opt-in). That way, the top-level application/consumer would be optionally in charge of configuring the fail environment and enabling failpoints (transparent to all intermediate libraries).

In practice, this would mean:

getting rid of the no_fail feature
making failpoints disabled by default
introducing a failpoints feature to enable them
releasing fail-0.3 with the new semantics
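
The opt-in scheme listed above could look roughly like the following sketch. The macro name `opt_in_point!` and the helper `failpoints_enabled` are hypothetical and only illustrate the gating; the feature name `failpoints` is the one proposed in this issue. With the feature off, the fail point body is not compiled at all.

```rust
/// Hypothetical stand-in for a fail point macro (not the `fail` crate's own).
/// Unless the `failpoints` feature is enabled, the body compiles to nothing.
macro_rules! opt_in_point {
    ($name:expr, $action:expr) => {{
        #[cfg(feature = "failpoints")]
        {
            $action($name);
        }
    }};
}

/// Reports whether the opt-in feature was enabled at compile time.
pub const fn failpoints_enabled() -> bool {
    cfg!(feature = "failpoints")
}

/// Example call site: the fail point only runs in builds that opted in.
pub fn write_record(log: &mut Vec<String>) {
    opt_in_point!("before_write", |name: &str| log
        .push(format!("fail point {name} hit")));
    log.push("record written".to_string());
}
```

Because cargo features are additive, a top-level application could turn such a feature on for the whole dependency graph, while intermediate libraries never mention it.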

If this sounds fine to you, I can have a look around and send a PR in the coming weeks.

/cc @BusyJay @kennytm @Hoverbear @brson

Release fail 1.0

This can probably be done shortly after upgrading to 2018: #21

I'll probably want to clean up the documentation a bit. It's overwhelming atm.

Support dependency wait

Is your feature request related to a problem? Please describe.

Make fail points support dependencies, so that one fail point waits for another before proceeding.
We can refer to the implementation of the RocksDB sync point: https://github.com/facebook/rocksdb/blob/e9e0101ca46f00e8a456e69912a913d907be56fc/test_util/sync_point.h

Describe the solution you'd like

Support writing something like this: fail::cfg("point_A", "wait(point_B)")

wait means point_A pauses until point_B has been passed.
wait_local means point_A is enabled only when point_A and point_B are processed on the same thread. It also pauses at point_A until point_B has been passed.
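
A minimal sketch of the proposed wait semantics, using only the standard library. The `SyncPoints` type and its methods are hypothetical and are not part of fail-rs or RocksDB; the sketch only shows one point blocking until another has been passed.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical registry of points that have already been passed.
#[derive(Default)]
pub struct SyncPoints {
    passed: Mutex<HashSet<String>>,
    changed: Condvar,
}

impl SyncPoints {
    /// Marks `name` as passed and wakes every waiter.
    pub fn pass(&self, name: &str) {
        self.passed.lock().unwrap().insert(name.to_string());
        self.changed.notify_all();
    }

    /// Blocks the caller until `dependency` has been passed.
    pub fn wait_for(&self, dependency: &str) {
        let mut passed = self.passed.lock().unwrap();
        while !passed.contains(dependency) {
            passed = self.changed.wait(passed).unwrap();
        }
    }
}

/// Runs two threads and returns the order in which their steps completed.
pub fn run_ordered() -> Vec<&'static str> {
    let points = Arc::new(SyncPoints::default());
    let order = Arc::new(Mutex::new(Vec::new()));

    let waiter = {
        let (points, order) = (Arc::clone(&points), Arc::clone(&order));
        thread::spawn(move || {
            // point_A pauses here until point_B has been passed.
            points.wait_for("point_B");
            order.lock().unwrap().push("point_A");
        })
    };

    order.lock().unwrap().push("point_B");
    points.pass("point_B");
    waiter.join().unwrap();

    let result = order.lock().unwrap().clone();
    result
}
```

A wait_local variant would additionally need to record which thread passed point_B, which this sketch leaves out.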

Additional context

Part of tikv/rust-rocksdb#361

crater fails to test fail-rs

I just noticed in a crater run that fail-rs is broken: https://crater-reports.s3.amazonaws.com/pr-60466/master%237840a0b753a065a41999f1fb6028f67d33e3fdd5/reg/fail-0.2.1/log.txt

It doesn't look like a problem with the crate, but I've asked @pietroalbini about it. Would be nice to have fail tested properly by crater.

Thread-local failpoints

Is your feature request related to a problem? Please describe.

Failpoint unit tests require taking a global lock, which prevents test parallelism. An alternative or complementary solution to a global lock (#23) would be a thread-local failpoint configuration, protected by a guard.

Describe the solution you'd like
Add a thread-local configuration that is protected by a guard that performs teardown.

Describe alternatives you've considered
Global locks: #23

Additional context
This would work for single-threaded test cases, but not generally for tests that require multiple threads.
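
One way to picture the request is the sketch below. `LocalScenario`, `LOCAL_CONFIG` and `local_action` are hypothetical names, not fail-rs APIs. The configuration lives in a thread-local map and the guard clears it on drop, so other threads never see it.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

thread_local! {
    // Per-thread fail point actions keyed by name (hypothetical layout).
    static LOCAL_CONFIG: RefCell<HashMap<String, String>> = RefCell::new(HashMap::new());
}

/// Hypothetical guard: clears this thread's configuration when dropped.
pub struct LocalScenario;

impl LocalScenario {
    /// Starts from a clean configuration on the current thread.
    pub fn setup() -> Self {
        LOCAL_CONFIG.with(|c| c.borrow_mut().clear());
        LocalScenario
    }

    /// Configures an action for `name` on the current thread only.
    pub fn cfg(&self, name: &str, action: &str) {
        LOCAL_CONFIG.with(|c| {
            c.borrow_mut().insert(name.to_string(), action.to_string());
        });
    }
}

impl Drop for LocalScenario {
    // Teardown: remove everything this thread configured.
    fn drop(&mut self) {
        LOCAL_CONFIG.with(|c| c.borrow_mut().clear());
    }
}

/// Looks up the action configured for `name` on the current thread.
pub fn local_action(name: &str) -> Option<String> {
    LOCAL_CONFIG.with(|c| c.borrow().get(name).cloned())
}
```

As the issue notes, a spawned thread starts with an empty map, so this shape only helps single-threaded test cases.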

fail_point! does nothing unless a FailScenario exists

Perhaps I'm doing something wrong, but I have code that looks very similar to the examples, and I can't get it to panic or otherwise respond to failpoints in the environment:

The full code is in https://github.com/sourcefrog/fail-repro

main.rs is

use fail::fail_point;

fn main() {
    println!("Has failpoints: {}", fail::has_failpoints());
    println!(
        "FAILPOINTS is {:?}",
        std::env::var("FAILPOINTS").unwrap_or_default()
    );
    fail_point!("main");
    println!("Failpoint passed");
}

When I run this:

$ FAILPOINTS=main=panic cargo +1.61 r --features fail/failpoints
    Updating crates.io index
...
     Running `target/debug/fail-repro`
Has failpoints: true
FAILPOINTS is "main=panic"
Failpoint passed

$ FAILPOINTS=main=print cargo +1.61 r --features fail/failpoints
    Finished dev [unoptimized + debuginfo] target(s) in 0.01s
     Running `target/debug/fail-repro`
Has failpoints: true
FAILPOINTS is "main=print"
Failpoint passed

In case this was broken by a later Cargo change, I tried it on both 1.76 and 1.63 and they both show the same behavior.

This is on x86_64 Linux.

API docs mention `no_fail` feature

The API docs mention the no_fail feature, but that feature no longer exists. Instead the API docs should mention, probably near the top, that failpoints are not active unless the failpoints feature is on, and its existence can be checked (after #38) statically or dynamically with has_failpoints.

cc @lucab

Cannot use `fail_point!` 3 arguments macro without importing it

Describe the bug
The fail_point! macro cannot be invoked by its fully qualified path in the three-argument case.

To Reproduce
Just try to compile:

fail::fail_point!("fail-point-3", enable, |_| {});

And you'll get:

error: cannot find macro `fail_point` in this scope
   --> my_code.rs:10
    |
10 |                     fail::fail_point!("fail-point-3", enable, |_| {});
    |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: consider importing this macro:
            fail::fail_point
    = note: this error originates in the macro `fail::fail_point` (in Nightly builds, run with -Z macro-backtrace for more info)

Expected behavior
You should be able to use the macro without importing it with use

Additional context
Looks like the issue is here:

fail-rs/src/lib.rs

Line 841 in 6645f17

fail_point!($name, $e);

The recursive macro invocation should look like this:

$crate::fail_point!($name, $e);
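
The same pattern in a self-contained form, as a sketch: `trace_point!` is a hypothetical macro that only mirrors the recursive shape of `fail_point!`. The conditional arm recurses through `$crate::`, so the inner call resolves by path and does not depend on the caller having imported the macro.

```rust
/// Hypothetical two-form macro, standing in for the shape of `fail_point!`.
#[macro_export]
macro_rules! trace_point {
    // Base form: describe the point.
    ($name:expr) => {
        format!("hit {}", $name)
    };
    // Conditional form: recurse through `$crate::` so the inner call
    // resolves even when the caller never imported `trace_point`.
    ($name:expr, $cond:expr) => {
        if $cond {
            $crate::trace_point!($name)
        } else {
            String::from("skipped")
        }
    };
}

/// Example call site for the conditional form.
pub fn describe(enabled: bool) -> String {
    trace_point!("fail-point-3", enabled)
}
```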

Clean up crate docs

The crate docs are pretty overwhelming. Figure out how to defer some of that discussion to elsewhere in the docs.

default to injecting crate name in failpoints

I think we should consider defaulting to injecting the crate name in fail_point!. Otherwise it's just too likely to have clashes if this crate is used by library crates for example.

This would need to happen at the next semver break.
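
A rough sketch of what name injection could look like. The `scoped_name!` macro is hypothetical, and it derives the crate name from the first segment of `module_path!()`, which is an assumption about how the prefix might be obtained rather than a description of fail-rs.

```rust
/// Hypothetical helper: prefixes a fail point name with the calling crate's
/// name, taken here from the first segment of `module_path!()`.
macro_rules! scoped_name {
    ($name:expr) => {
        format!(
            "{}::{}",
            module_path!().split("::").next().unwrap_or(""),
            $name
        )
    };
}

/// Example call site: yields "<crate name>::before_commit".
pub fn commit_point_name() -> String {
    scoped_name!("before_commit")
}
```

Two library crates that both define a `before_commit` point would then register distinct names.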

Support thread group

Is your feature request related to a problem? Please describe.

fail-rs uses a global registry to expose simple APIs and convenient FailPoint definitions. But this also means that tests which would otherwise run in parallel have to run one by one, with cleanup between runs, so that their configurations do not affect each other.

Describe the solution you'd like

This issue proposes using thread groups. Each test case defines a unique thread group, and every configuration is bound to exactly one thread group. Whenever a new thread is spawned, it needs to be registered to a thread group so that FailPoint reads that group's configuration. A thread that is not registered to any group belongs to a default global group.

New public APIs include:

pub fn current_fail_group() -> FailGroup;

impl FailGroup {
    pub fn register_current(&self) -> Result<()>;
    pub fn deregister_current(&self);
}

Note that users do not need to control how a thread is spawned; registering the thread before it uses any FailPoint is enough.

Describe alternatives you've considered

One solution is to pass the registry to struct constructors instead of keeping it global, but that interferes heavily with general code, because the registry has to be passed everywhere FailPoints are defined.

Another solution is #24, but it lacks threaded cases support.
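
To make the proposal concrete, here is a minimal sketch of a thread group registry. It borrows the names `FailGroup` and `current_fail_group` from the API listed above, but everything else is an assumption: the `cfg` and `action_for` helpers, the registry layout, and a `register_current` that returns nothing instead of the proposed `Result<()>`.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

// Group 0 is the default global group for unregistered threads.
static NEXT_GROUP: AtomicU64 = AtomicU64::new(1);
// (group id, fail point name, action) triples; a hypothetical layout.
static REGISTRY: Mutex<Vec<(u64, String, String)>> = Mutex::new(Vec::new());

thread_local! {
    static CURRENT_GROUP: Cell<u64> = Cell::new(0);
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct FailGroup(u64);

impl FailGroup {
    /// Creates a group with a fresh id.
    pub fn new() -> Self {
        FailGroup(NEXT_GROUP.fetch_add(1, Ordering::Relaxed))
    }
    /// Binds the current thread to this group.
    pub fn register_current(&self) {
        CURRENT_GROUP.with(|g| g.set(self.0));
    }
    /// Returns the current thread to the default global group.
    pub fn deregister_current(&self) {
        CURRENT_GROUP.with(|g| g.set(0));
    }
    /// Configures an action that only threads in this group will see.
    pub fn cfg(&self, name: &str, action: &str) {
        let entry = (self.0, name.to_string(), action.to_string());
        REGISTRY.lock().unwrap().push(entry);
    }
}

pub fn current_fail_group() -> FailGroup {
    FailGroup(CURRENT_GROUP.with(|g| g.get()))
}

/// Looks up the most recent action for `name` in the current thread's group.
pub fn action_for(name: &str) -> Option<String> {
    let group = current_fail_group().0;
    REGISTRY
        .lock()
        .unwrap()
        .iter()
        .rev()
        .find(|(g, n, _)| *g == group && n == name)
        .map(|(_, _, a)| a.clone())
}
```

Each test would create its own group and register its worker threads to it, so parallel tests read disjoint configurations.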

Upgrade to Rust 2018

After TiKV itself is successfully upgraded (tikv/tikv#3896) we can bump fail to Rust 2018 as well. Do a major version bump.

Support conditionally enabling fail_points without the third lambda argument

Is your feature request related to a problem? Please describe.
In most of my failpoints I need the condition that enables a fail point, but I rarely use the return feature. Nevertheless, I'm forced to use the three-argument version of the macro and define some return value that makes sense for my function.

Describe the solution you'd like
A fail_point two argument macro with name and enable flag, e.g.: fail_point!("my-fail-point", if: enableFlag)
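
A sketch of how such an arm could be matched. `fail_point_if!`, `record_hit` and `hits` are hypothetical and only record that the point fired; they are not fail-rs APIs. In a macro with several arms, an `if:` arm like this would probably need to come before any arm whose second argument is a plain expression, because `if: flag` does not parse as an expression.

```rust
use std::cell::RefCell;

thread_local! {
    // Names of the fail points hit on this thread (for illustration only).
    static HITS: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

pub fn record_hit(name: &'static str) {
    HITS.with(|h| h.borrow_mut().push(name));
}

pub fn hits() -> Vec<&'static str> {
    HITS.with(|h| h.borrow().clone())
}

/// Hypothetical two-argument form: fires only when the condition holds,
/// with no closure and no return value to invent.
macro_rules! fail_point_if {
    ($name:expr, if: $enabled:expr) => {
        if $enabled {
            record_hit($name);
        }
    };
}

/// Example call site using the syntax proposed in this issue.
pub fn save(enable_flag: bool) {
    fail_point_if!("my-fail-point", if: enable_flag);
}
```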

need a tag for release 0.3

Describe the bug
According to the README, the version has already been bumped to 0.3. But on https://crates.io/crates/fail the version is still 0.2.1. I guess we need a tag for release 0.3?

To Reproduce

Expected behavior

System information

Additional context

Add the global failpoint lock pattern directly to the library

When running failpoint unit tests, one must take a global lock so the failpoint configuration stays consistent during parallel execution. We do this in our own failpoints tests, and it's explained extensively in the fail docs. Since the library is significantly less useful without a global lock, we might add one directly to the library and use it in the tikv failpoints tests.

Just copy the pattern from tikv/tests/failpoints into this library, then test tikv against the new failpoints library. This can be done by temporarily replacing the fail dependency in Cargo.toml with a path dependency to the modified version of fail, then running cargo test --test failpoints.

If it all works, then submit the patch here.
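
For reference, the global lock pattern can be sketched with the standard library alone. `SCENARIO_LOCK`, `ScenarioGuard` and `setup` are hypothetical names, and this is not the code in tikv/tests/failpoints. Each test holds the guard for its whole duration, which serializes all tests that touch the failpoint configuration.

```rust
use std::sync::{Mutex, MutexGuard};

// One process-wide lock shared by every fail point test.
static SCENARIO_LOCK: Mutex<()> = Mutex::new(());

/// Hypothetical guard: holding it means no other fail point test is running.
pub struct ScenarioGuard {
    _lock: MutexGuard<'static, ()>,
}

/// Takes the global lock; the lock is released when the guard is dropped.
pub fn setup() -> ScenarioGuard {
    // A panicking test poisons the mutex; recover so later tests still run.
    let lock = SCENARIO_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner());
    ScenarioGuard { _lock: lock }
}
```

A real version would also reset the failpoint registry on setup and on drop, which this sketch omits.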

