skroutz / rspecq


Distribute and run RSpec suites among parallel workers; for faster CI builds

Home Page: https://rubygems.org/gems/rspecq

License: MIT License

Languages: Ruby 100.00%
Topics: rspec, ci, test-runner, rspec-testing, ci-tools, rspec-suite, rspec-runner, test-runners

rspecq's Introduction

RSpec Queue


RSpec Queue (RSpecQ) distributes and executes RSpec suites among parallel workers. It uses a centralized queue that workers connect to and pop tests from. It ensures optimal scheduling of tests based on their run times, facilitating faster CI builds.

RSpecQ is inspired by test-queue and ci-queue.

Features

  • Run an RSpec suite among many workers (potentially located in different hosts) in a distributed fashion, facilitating faster CI builds.
  • Consolidated, real-time reporting of a build's progress.
  • Optimal scheduling of test execution by using timings statistics from previous runs and automatically scheduling slow spec files as individual examples. See Spec file splitting.
  • Automatic retry of test failures before being considered legit, in order to rule out flakiness. Additionally, flaky tests are detected and provided to the user. See Requeues.
  • Handles intermittent worker failures (e.g. network hiccups, faulty hardware) by detecting non-responsive workers and requeueing their jobs. See Worker failures.
  • Sentry integration for monitoring build-level events. See Sentry integration and #2.
  • Automatic termination of builds after a certain amount of failures. See Fail-fast.

Usage

A worker needs to be given a name and the build it will participate in. Assuming there's a Redis instance listening at localhost, starting a worker is as simple as:

$ rspecq --build=123 --worker=foo1 spec/

To start more workers for the same build, use distinct worker IDs but the same build ID:

$ rspecq --build=123 --worker=foo2

To view the progress of the build use --report:

$ rspecq --build=123 --report

For detailed info use --help:

NAME:
    rspecq - Optimally distribute and run RSpec suites among parallel workers

USAGE:
    rspecq [<options>] [spec files or directories]

OPTIONS:
    -b, --build ID                   A unique identifier for the build. Should be common among workers participating in the same build.
    -w, --worker ID                  An identifier for the worker. Workers participating in the same build should have distinct IDs.
        --seed SEED                  The RSpec seed. Passing the seed can be helpful in many ways i.e reproduction and testing.
    -r, --redis HOST                 --redis is deprecated. Use --redis-host or --redis-url instead. Redis host to connect to (default: 127.0.0.1).
        --redis-host HOST            Redis host to connect to (default: 127.0.0.1).
        --redis-url URL              Redis URL to connect to (e.g.: redis://127.0.0.1:6379/0).
        --update-timings             Update the global job timings key with the timings of this build. Note: This key is used as the basis for job scheduling.
        --file-split-threshold N     Split spec files slower than N seconds and schedule them as individual examples.
        --report                     Enable reporter mode: do not pull tests off the queue; instead print build progress and exit when it's finished.
                                     Exits with a non-zero status code if there were any failures.
        --report-timeout N           Fail if build is not finished after N seconds. Only applicable if --report is enabled (default: 3600).
        --max-requeues N             Retry failed examples up to N times before considering them legit failures (default: 3).
        --queue-wait-timeout N       Time to wait for a queue to be ready before considering it failed (default: 30).
        --fail-fast N                Abort build with a non-zero status code after N failed examples.
        --reproduction               Enable reproduction mode: Publish files and examples in the exact order given in the command. Incompatible with --timings.
        --tag TAG                    Run examples with the specified tag, or exclude examples by adding ~ before the tag.  - e.g. ~slow  - TAG is always converted to a symbol.
    -h, --help                       Show this message.
    -v, --version                    Print the version and exit.

You can set most options using ENV variables:

$ RSPECQ_BUILD=123 RSPECQ_WORKER=foo1 rspecq spec/

Supported ENV variables

Name                                Description
RSPECQ_BUILD                        Build ID
RSPECQ_WORKER                       Worker ID
RSPECQ_SEED                         RSpec seed
RSPECQ_REDIS                        Redis host
RSPECQ_UPDATE_TIMINGS               Update timings
RSPECQ_FILE_SPLIT_THRESHOLD         File split threshold
RSPECQ_REPORT                       Report
RSPECQ_REPORT_TIMEOUT               Report timeout
RSPECQ_MAX_REQUEUES                 Max requeues
RSPECQ_QUEUE_WAIT_TIMEOUT           Queue wait timeout
RSPECQ_REDIS_URL                    Redis URL
RSPECQ_FAIL_FAST                    Fail fast
RSPECQ_REPORTER_RERUN_COMMAND_SKIP  Do not report the rerun command for flaky tests

Sentry integration

RSpecQ can optionally emit build events to a Sentry project by setting the SENTRY_DSN environment variable.

This is convenient for monitoring important warnings/errors that may impact build times, such as the fact that no previous timings were found and therefore job scheduling was effectively random for a particular build.
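
For example, setting the DSN (the value below is a placeholder) when starting a worker is enough to enable the integration:

$ SENTRY_DSN=<your-project-dsn> rspecq --build=123 --worker=foo1 spec/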

How it works

The core design is almost identical to ci-queue so please refer to its README instead.

Terminology

  • Job: the smallest unit of work, which is usually a spec file (e.g. ./spec/models/foo_spec.rb) but can also be an individual example (e.g. ./spec/models/foo_spec.rb[1:2:1]) if the file is too slow.
  • Queue: a collection of Redis-backed structures that hold all the necessary information for an RSpecQ build to run. This includes timing statistics, jobs to be executed, the failure reports and more.
  • Build: a particular test suite run. Each build has its own Queue.
  • Worker: an rspecq process that, given a build ID, consumes jobs off the build's queue and executes them using RSpec.
  • Reporter: an rspecq process that, given a build ID, waits for the build's queue to be drained and prints the build summary report.

Spec file splitting

Particularly slow spec files may set a limit to how fast a build can be. For example, a single file may need 10 minutes to run while all other files finish after 8 minutes. This would leave all but one worker sitting idle for 2 minutes.

To overcome this issue, RSpecQ can split files whose execution time is above a certain threshold (set with the --file-split-threshold option) and schedule them as individual examples instead.
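
For illustration, the following schedules any spec file slower than 60 seconds (an arbitrary threshold) as individual examples:

$ rspecq --build=123 --worker=foo1 --file-split-threshold 60 spec/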

Note: In the future, we'd like for the slow threshold to be calculated and set dynamically (see #3).

Requeues

As a mitigation technique against flaky tests, if an example fails it will be put back to the queue to be picked up by another worker. This will be repeated up to a certain number of times (set with the --max-requeues option), after which the example will be considered a legit failure and printed as such in the final report.
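
For example, to allow up to 5 retries per failed example instead of the default 3 (the value is illustrative):

$ rspecq --build=123 --worker=foo1 --max-requeues 5 spec/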

Flaky tests are also detected and printed as such in the final report. They are also emitted to Sentry (see Sentry integration).

Fail-fast

To prevent large suites from running for a long time when there are many failures, a threshold can be set to control the number of failed examples after which the build is considered unsuccessful. This is on par with RSpec's --fail-fast option.

This feature is disabled by default, and can be controlled via the --fail-fast command line option.
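
For example, to abort the build after 10 failed examples (the value is illustrative):

$ rspecq --build=123 --worker=foo1 --fail-fast 10 spec/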

Worker failures

It's not uncommon for CI processes to encounter unrecoverable failures for various reasons: faulty hardware, network hiccups, segmentation faults in MRI etc.

For resiliency against such issues, workers emit a heartbeat after each example they execute, to signal that they're healthy and performing jobs as expected. If a worker hasn't emitted a heartbeat for a given amount of time (set by WORKER_LIVENESS_SEC) it is considered dead and its reserved job will be put back to the queue, to be picked up by another healthy worker.

Rationale

Why didn't you use ci-queue?

Update: ci-queue deprecated support for RSpec.

While evaluating ci-queue, we experienced slow worker boot times (up to 3 minutes in some cases) combined with disk I/O saturation and increased memory consumption. This is because a worker in ci-queue has to load every spec file on boot. In applications with a large number of spec files this can result in a significant performance hit and, in cloud environments, increased costs.

We also observed slower build times compared to our previous solution which scheduled whole spec files (as opposed to individual examples), due to big differences in runtimes of individual examples, something common in big RSpec suites.

We decided for RSpecQ to use whole spec files as its main unit of work (as opposed to ci-queue which uses individual examples). This means that an RSpecQ worker only loads the files needed and ends up with a subset of all the suite's files. (Note: RSpecQ also schedules individual examples, but only when this is deemed necessary, see Spec file splitting).

This kept boot and test run times considerably fast. As a side benefit, this allows suites to keep using before(:all) hooks (which ci-queue explicitly rejects).

The downside of this design is that it's more complicated, since the scheduling of spec files happens based on timings calculated from previous runs. This means that RSpecQ maintains a key with the timing of each job and updates it on every run (if the --update-timings option was used). Also, RSpecQ has a "slow file threshold" which currently has to be set manually (though this could be improved in the future).

Development

Install the required dependencies:

$ bundle install

Then you can execute the tests after spinning up a Redis instance at 127.0.0.1:6379:

$ bundle exec rake

To enable verbose output in the tests:

$ RSPECQ_DEBUG=1 bundle exec rake

Redis

RSpecQ by design doesn't expire its keys from Redis. It is left to the user to configure the Redis server to do so; see Using Redis as an LRU cache for more info.

You can do this from a configuration file or with redis-cli.
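
As a sketch, the eviction policy described in that guide could be applied with redis-cli (the memory limit here is illustrative):

$ redis-cli CONFIG SET maxmemory 1gb
$ redis-cli CONFIG SET maxmemory-policy allkeys-lru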

License

RSpecQ is licensed under MIT. See LICENSE.

rspecq's People

Contributors

agis, alejandroperea, andrewhampton, danielwestendorf, fragoulis, jorge-wonolo, kpelelis, nerian, nikosgkotsis, olleolleolle, rmsy, royzwambag


rspecq's Issues

High redis commands/second on parallelization > 1

Hi all

I'm experiencing weird redis thrash that's causing the test suite to go very long, and smash redis;

1 worker: (graph omitted)

2 workers: (graph omitted)

8 workers: (graph omitted)

Further information:
1030 tests in spec suite
redis 3.2 running in a docker container
ruby 2.7.4 running in a docker container

gems:
rspec 3.9.0
rspecq 0.7.1
redis 4.1.2
redis-rails 5.0.2

Interestingly, the cmd/s is fine for 1 worker, but with 2 or more workers it jumps orders of magnitude into the tens or hundreds of thousands and saturates the Redis instance, slowing everything down.

I am running the commands with the following:

RSPECQ_BUILD=<8 digit randomly generated string>
RSPECQ_REDIS_URL=redis://redis:6379/8
RSPECQ_MAX_REQUEUES=3

for i in $sequence; do
    echo "Testing $i"
    TEST_ENV_NUMBER=$i bundle exec rspecq --worker=$i "$FILES" > /tmp/rspecq_${RSPECQ_BUILD}_${i} 2>&1 &
    pids[${i}]=$!
done

Fail-fast builds upon too many failures

There should be a way to have builds fail early if, for example, 50% of the executed examples are failures. This should be configurable and toggled on demand (e.g. in some cases a user might really want to see all failures).

A tool for new releases

  • run tests (one last time)
  • rubocop
  • update version (requires some kind of input)
  • update changelog (and allow for review before moving forward)
  • create and publish gem
  • create and push git tag

The tool should be idempotent: if it stops for any reason at any of those steps, it should be able to continue from the last step.

Remove sentry as a required dependency

We use a different bug tracking tool than Sentry. Adding rspecq to the Gemfile and running some tests results in failures. I have traced this back to conflicts between the two bug tracking tools.

Since it doesn't make sense for us to include sentry in our dependency graph, does it make sense for sentry to be an optional dependency?

Provide visibility to flaky tests

Flaky tests, while not causing the build to fail, should still be fixed. Otherwise they can impact build times, since they can silently compound and cause many retries in each build. Consider a test suite that over time has accumulated 30 flaky tests: these could easily cause 90 additional example retries.

We should provide visibility into flaky tests (after the fact we determined they are flaky). Now that #16 is merged, we can report them to Sentry.

  • print to stdout of reporter
  • submit to Sentry

Integrate indirectly using a pub/sub mechanism

Instead of integrating directly with rspecq (like we do with sentry) utilize an easy first choice like active support notifications to integrate indirectly, by publishing events at key points of the flow.

Create an introductory short video

A short video that shows how to:

  • run a few workers in parallel
  • run the reporter to view the build progress

...could help with onboarding.

Improve flaky test reporting in Sentry

Currently, flaky tests are all emitted as a single event, with the same title "Flaky jobs detected". Thus, flaky job events from different CI builds all end up under a single Sentry event. For instance, this is the sole flaky job event as reported in one of our test suites:

(screenshot omitted: sentry_rspecq_grouping)

The problem with this approach is that it's hard to answer questions such as:

  1. when was this particular flaky test introduced?
  2. which file has the most flaky tests?
  3. which files currently contain flaky tests?

Also, it's impossible to set alerts (e.g. using code owners) based on the file in which flaky jobs occur, or to collaborate on specific issues to solve a particular flaky job (since one can't resolve a specific flaky job).

We have to think of a better way to report flaky jobs, whether this involves changing the fingerprint of the events, submitting separate events per flaky job/file, changing the title of events, or a combination of these.

Sentry integration

Exceptions in rspecq are naturally visible and rare, since the whole build will fail. However there are events which may not be errors, but affect QoS:

  • requeued lost job (i.e. worker went faulty); degrades performance
  • no timings found (i.e. jobs will be scheduled randomly); performance killer
  • error while trying to split slow spec files; performance killer

We should emit those to Sentry.

Fail build if published queue is empty

If there are no spec files in a project (i.e. the queue is empty), the build should be considered a failure (i.e. by the reporter), as a safety net against unexpected scenarios.

Support native RSpec example filtering

For test performance, we split our suite into acceptance and unit tests. We do this because acceptance specs require static assets to be built, which takes a decent amount of time, whereas unit specs can run instantly. We accomplish this with RSpec tags. For example, here is how we run our acceptance tests using rspec-queue:

            bin/_rspec-queue \
              --namespace acceptance \
              --tag capybara_feature \
              --tag type:feature \
              --tag type:request \
              --tag js \
              --tag webpack \
              --format=doc \
              --format=RspecJunitFormatter \
              --out="tmp/test_results/rspec_acceptance/results-$CIRCLE_NODE_INDEX.xml" \
              --requeue-tolerance=0.05 \
              --max-requeues="$CIRCLE_NODE_TOTAL"

In order to use rspecq, we need to mimic this functionality. As of now, it does not seem like rspecq supports example filtering, is this right? If not, would it be possible to support this use case?

"Formatter ... unknown" error on startup

I have not gotten rspecq to run successfully. When I execute bundle exec rspecq -b mybuild -w myworker, I get the following error message and stack.

bundler: failed to load command: rspecq (/home/me/workspace/vendor/bundle/bin/rspecq)
ArgumentError: Formatter '#<RSpecQ::Formatters::FailureRecorder:0x00007f07da84beb0>' unknown - maybe you meant 'documentation' or 'progress'?.
  /home/me/workspace/vendor/bundle/gems/rspec-core-3.6.0/lib/rspec/core/formatters.rb:178:in `find_formatter'
  /home/me/workspace/vendor/bundle/gems/rspec-core-3.6.0/lib/rspec/core/formatters.rb:146:in `add'
  /home/me/workspace/vendor/bundle/gems/rspec-core-3.6.0/lib/rspec/core/configuration.rb:876:in `add_formatter'
  /home/me/workspace/vendor/bundle/gems/rspecq-0.7.2/lib/rspecq/worker.rb:122:in `block in work'
  /home/me/workspace/vendor/bundle/gems/rspecq-0.7.2/lib/rspecq/worker.rb:94:in `loop'
  /home/me/workspace/vendor/bundle/gems/rspecq-0.7.2/lib/rspecq/worker.rb:94:in `work'
  /home/me/workspace/vendor/bundle/gems/rspecq-0.7.2/bin/rspecq:182:in `<top (required)>'
  /home/me/workspace/vendor/bundle/bin/rspecq:23:in `load'
  /home/me/workspace/vendor/bundle/bin/rspecq:23:in `<top (required)>'

I'm on rspecq 0.7.2 and rspec-core 3.4.4.

It looks like rspecq/worker.rb is trying to pass formatter instances to RSpec.configuration.add_formatter, but downstream from that function, it appears that RSpec::Core::Formatters::Loader#custom_formatter expects either a string or a class. Am I reading this correctly?

Set TTL to expire keys instead of relying to Redis eviction policy

Right now, we implicitly require administrators to have Redis configured with maxmemory and maxmemory-policy set to allkeys-lru. While this is convenient for us, it's not very bullet-proof since we don't guarantee that the timings key will not be evicted somehow. For example, the Redis instance might be reaching maxmemory for other reasons and cause the timings key to be evicted, if no builds have run for some time and some other app uses the same instance for some reason.

Instead of relying on this configuration, we could explicitly set TTLs on all the keys we use, except for those that should be persisted, i.e. timings.

This probably obsoletes #5
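
A minimal sketch of the idea, assuming the redis-rb client; the key names and TTL below are illustrative:

require "redis"

redis = Redis.new
build_key_ttl = 7 * 24 * 3600 # e.g. one week

# Expire the per-build keys, but leave the global timings key untouched
# so it persists across builds.
%w[queue:unprocessed queue:processed queue:requeues].each do |key|
  redis.expire("build:123:#{key}", build_key_ttl)
end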

Preserve older timings keys

Currently we keep a single timings key (timings). If that is somehow lost, all scheduling is thrown out the window. To avoid mishaps, we could also store older timings keys (e.g. timings:<timestamp>) and use them if the latest one is somehow deleted.

Going a step further, we could also persist the key to disk and use it if no timings keys were found.

StatsD integration

API

Enabling StatsD reporting could be done via a CLI flag, --statsd, that would accept a host/IP. Additionally, we should fall back to the environment variable RSPECQ_STATSD.

Metrics

Metrics we could report (<ns> stands for <namespace>), grouped by type; a brief emission sketch follows the list:

Counters

  • number of builds <ns>.builds.total
  • number of successful builds <ns>.builds.successful
  • number of successful but flaky builds <ns>.builds.successfulFlaky
  • number of failed builds <ns>.builds.failed
  • number of failed-fast builds <ns>.builds.failed_fast
  • number of builds with a non-example error <ns>.builds.errored

Timers

  • [reporter] build total run time <ns>.totalRuntime
  • [worker] queue initialization run time <ns>.queueInitRuntime
  • [reporter] run times of slowest jobs (top 10) <ns>.slowestJobs.<job>

Gauges

  • [reporter] number of examples executed <ns>.examples
  • [queue] number of flaky examples <ns>.flakeyTests
  • [reporter] number of requeues <ns>.requeues
  • [reporter] number of example failures <ns>.failures
  • [reporter] number of non-example errors (e.g. syntax errors) <ns>.errors
  • [worker?] number of worker failures <ns>.workerFailures
  • [worker] total number of spec files <ns>.specFiles
  • [worker] total queue size (aka. number of jobs) <ns>.queueSize
  • [worker] number of spec files split <ns>.filesSplitted
  • [worker] number of jobs generated from the split files <ns>.jobsFromSplit
  • [worker] new (untimed) job received <ns>.untimedJobs
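
A rough sketch of what emitting a few of these could look like, assuming the statsd-ruby gem; the namespace and the values passed are illustrative:

require "statsd-ruby"

statsd = Statsd.new("127.0.0.1", 8125) # host/port taken from --statsd or RSPECQ_STATSD

ns = "rspecq"
statsd.increment("#{ns}.builds.total")
statsd.timing("#{ns}.totalRuntime", 540_000) # milliseconds; illustrative value
statsd.gauge("#{ns}.examples", 1030)         # illustrative value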

Provide info on how to reproduce flaky spec

It would be nice to also include instructions on how to reproduce (locally) the execution order that led to the error.

@agis wrote on #31

We could perhaps submit the N (5-10) jobs that ran prior to the flaky one, as a best-effort approach.

One thing we could also do, but this too is not so straightforward, is to emit the RSpec seed to Sentry. We could do these in next iterations.

RuntimeError: Queue not yet published after 30 seconds

I got the error RuntimeError: Queue not yet published after 30 seconds with a Redis server listening at 127.0.0.1:6379
when running

bundle exec rspecq --build=123 --worker=foo1 spec/models/car_spec.rb

Can you help to fix this error?

Default Worker#file_split_threshold to nil

Currently, we default it to a very big number (999999) to effectively disable it (because no jobs take longer than that). That doesn't make much sense; we should instead be able to set it to nil to disable the splitting mechanism. The default should also be nil.

Automatically handle parallel execution

First, this gem is off to a great start! Thank you for putting your time into an open source library that helps the ruby community move forward!

I'm currently using parallel_tests for a test suite that takes a couple hours to run without parallelization. Two killer features it has are:

  • setup script: bin/rails parallel:setup
  • flag: -n [PROCESSES] How many processes to use, default: available CPUs

If you're not already familiar with these, the main idea here is parallel_tests expects you to run multiple processes on the same machine. Each process will get an env var set, TEST_ENV_NUMBER, which gives each process a distinct ID to use. This ID can be used in the Rails database.yml like database: test_<%= test_env_number %>, which allows each process to use the same DB server, but unique databases during the test run.

Are there plans for rspecq to do something similar?

I think the main benefit of handling this in rspecq is that it solves the problem once, rather than requiring all consumers to do similar setup work. Of course, that setup work isn't necessary if your runners are distributed. But this would give parallel_tests users a convenient on-ramp when switching to rspecq.

Use a proper logger for rspecq-level messages

Currently we use plain puts inside Worker and Reporter to print various rspecq-level events like errors or warnings. However, this mixes rspecq's output with that generated by RSpec and makes it hard to differentiate between the two merely by glancing at the terminal. Ideally we should let RSpec print to stdout (the default) and use stderr for diagnostic messages originating from rspecq itself.

We should also use an actual logger and the appropriate levels for such cases, for a more detailed output. Ruby's Logger from stdlib should be sufficient.
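
A minimal sketch of that direction, using Ruby's stdlib Logger writing to stderr (the messages are illustrative):

require "logger"

# Diagnostics go to stderr so they don't mix with RSpec's own stdout output.
logger = Logger.new($stderr)
logger.level = ENV["RSPECQ_DEBUG"] ? Logger::DEBUG : Logger::INFO

logger.warn("No previous timings found; scheduling will be random")
logger.error("Failed to split slow spec files")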

Using file split to run individual examples as jobs using lots of memory?

Hi all!

Solid library you guys have here, been using it for a CI test optimization that I've been working on and it's been great.
Lately I wanted to try the --file-split-threshold feature to split one of my Ruby on Rails test files which is a bit slow (1 file has ~1000 examples, each taking around 0.5-1s to complete) so that it can be worked on by multiple workers; running the file as 1 job takes about ~400s to complete.

When I try to split the file using --file-split-threshold, it gets split into ~1000 jobs and none of the workers even get past the 50th job. It turns out the worker container ran out of memory (error code 137).

Here's a memory graph of said phenomenon (only 1 worker graph for this example)
(graph omitted)

Now I want to make sure if this is the problem on the tests I was running or a caveat with the library, so I pulled the rspecq repo and added this test

# my_spec.rb
RSpec.describe do
  1000.times do
    it do
      expect(true).to be true
    end
  end
end

Dry run

redis-cli flushdb
bundle exec rspecq --build=0 --seed 1 --worker=1 \
  --update-timings test/sample_suites/timings/spec/my_spec.rb

...then run it for file splitting

bundle exec rspecq --build=1 --seed 1 --worker=1 --update-timings --file-split-threshold 0 test/sample_suites/timings/spec/my_spec.rb

Turns out it's also hogging memory on this dummy test as well, steadily increasing memory usage until it reaches 3.7 GB.
(graph omitted)
(I modified the logging a bit just to see the executed examples better.)

Now I'm pretty sure from the graph that this indicates a memory leak (or bloat?), and I was wondering if there's something I'm missing before using the --file-split-threshold option. Maybe a configuration that I have to specify in spec_helper.rb or something like that. This is the spec_helper.rb that I used in my test:

RSpec.configure do |config|
  # other config here, not really relevant

  config.filter_run_when_matching :focus
end

Ruby version is 2.7.2

Can you help me look into this? I've been dabbling with this out-of-memory problem for a while.

Thanks!

Split spec files into examples programmatically

Right now we resort to shelling out and executing rspec in another process:

cmd = "DISABLE_SPRING=1 bin/rspec --dry-run --format json #{files.join(' ')}"

This is less than ideal since each project might have its own convention of calling into rspec (binstub, bundle exec or others). We should instead do this programmatically like we already do with the other aspects of the worker. I suspect we can call straight into RSpec::Core::Runner and pass the correct arguments (--dry-run etc.)
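
A sketch of what the in-process dry run might look like; it assumes the JSON formatter's output shape and that files holds the list of spec files to split:

require "rspec/core"
require "json"
require "stringio"

out = StringIO.new
# Run the dry run in-process instead of shelling out to `bin/rspec`.
RSpec::Core::Runner.run(["--dry-run", "--format", "json", *files], $stderr, out)
example_ids = JSON.parse(out.string)["examples"].map { |e| e["id"] }

One caveat is that invoking RSpec again in the same process requires resetting its state (e.g. with RSpec.clear_examples), which shelling out avoids.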

Read configuration from environment variables

In addition to the command-line flags we support, every configuration setting should also be optionally settable via an environment variable. This will ease integration with CI servers.

Add tests

  • e2e/worker: passing but flakey suite (requeues)
  • e2e/worker: passing suite
  • e2e/worker: failing suite
  • providing a specific file
  • providing a different spec folder
  • job scheduling + file splitting
  • non-example error (e.g. syntax error)
  • concurrent workers (can we do this transparently for all test cases?)
  • reporter
  • worker liveness
  • CLI generic flags (worker, build, redis)
  • custom .rspec file in project root (not important)
  • find a way to silence stdout of workers

When reproduction flag is passed build/worker ids are not required

Since we can now pass the reproduction flag, which runs rspecq in a kind of test mode, we can also instruct it to auto-generate a build and worker ID.

This way, at least, developers would not have to restart Redis or always change the build ID when testing locally.

Reproducing flaky tests

Currently we requeue failures to guard against flaky tests. We also keep track of flaky tests and report them via Sentry (#21).

However, there is not much use of this feature until we also provide a way to reproduce these tests.

This is an umbrella task to track things we can do to make reproducing flaky tests in development mode possible.

Report dead workers to Sentry

In the event a worker dies (i.e. fails to emit a heartbeat in the specified timeframe) we should emit a warning to Sentry and also print a relevant warning to stdout.

Performance report

Hi! I ported our tests from parallel_test to rspecq to see if it might be a viable alternative for us. The architecture of a central test queue is something I've been looking for in Ruby testing for some time, and I'm optimistic about the future of this gem. So this isn't a bug report, but a performance report that I thought you might find useful.

First, here's a little background info. Our unit tests run in our Kubernetes staging cluster, on 16 core AWS EC2 machines. In parallel_test, it typically takes around 8 minutes and 30 seconds to complete, give or take 15 seconds.

To test, I wrote a runner script, rspecq_runner, and a wrapper script, rspecq, to spin up various threads.

rspecq_runner

#!/bin/bash

echo "Setting up database $TEST_ENV_NUMBER..."
bin/rails db:setup &> /dev/null

echo "Running tests $TEST_ENV_NUMBER..."
bundle exec rspecq --build=$TEST_ID --update-timings --worker=$TEST_ENV_NUMBER spec/

echo "Dropping database $TEST_ENV_NUMBER..."
bin/rails db:drop &> /dev/null 

rspecq

 #!/bin/bash

# By default, run one rspecq runner per CPU thread
CPU_COUNT=$(getconf _NPROCESSORS_ONLN)
TEST_ID=$RANDOM

# Uncomment to hard code how many rspecq runners you want
# CPU_COUNT=14

echo "Starting test $RANDOM"

for i in $(seq $CPU_COUNT); do
   # TEST_ENV_NUMBER is used to ensure each thread connects to its own Redis and DB instance 
   export TEST_ENV_NUMBER=$i
   export TEST_ID
   ./bin/rspecq_runner &
done

bundle exec rspecq --build=$TEST_ID --report

echo "done" 

Results

First, I let the tests run a couple times on each of our test servers. This was done to ensure timings were stored in Redis. I then ran our test suite 8 times to see how long it took. Here's what I found:

Test run time
15:30
15:32
14:55
14:41
16:01
15:13
16:04
15:27

I then wanted to ensure I wasn't overloading the test machine, so I slowly incremented the number of runners:

CPU_COUNT runtime
2 27:07
4 19:26
6 17:18
8 16:25
10 16:15
12 15:47
14 15:15

Final thoughts

I was hoping for a quick port and magically everything would be faster. Turns out, there are no silver bullets.

As you can see from the numbers, it looks like a simple port from parallel_test to rspecq would nearly double our CI time.

One option I do find interesting is spinning up runner pods that connect to a central Redis. This would allow us to distribute the CI load across our k8s cluster more evenly. I assume this would also scale better horizontally, and trade cluster CPU time for test speed without vertically scaling our EC2 instances. However, this seems like quite a bit more effort, and I won't have time to pursue this short term.

Thanks again for working on this gem. As I said before, I'm still optimistic about the future of rspecq. It was very easy to get up and running. If the runtimes were comparable, I would advocate internally for an immediate switch.

worker: files_to_example_ids fails if something is printed to stderr

In 42d20e7 we started redirecting stderr to stdout, so that we display a helpful message in case the split command fails.

However, this introduced a bug. If the dry-run command prints something to stderr but still succeeds (e.g. a deprecation warning coming from some gem or some application initializer), files_to_example_ids fails because the output doesn't contain only JSON.

A failing test case that reproduces the issue can be found in branch gh34-testcase.

As 42d20e7 suggested, we should grab both streams separately instead of redirecting stderr to stdout.
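
A sketch of that fix, capturing the two streams separately with Open3 (the command string mirrors the one currently used):

require "open3"
require "json"

cmd = "DISABLE_SPRING=1 bin/rspec --dry-run --format json #{files.join(' ')}"
stdout, stderr, status = Open3.capture3(cmd)

# stderr output (e.g. deprecation warnings) no longer pollutes the JSON on
# stdout, but it can still be surfaced when the command fails.
abort("Failed to split slow files:\n#{stdout}\n#{stderr}") unless status.success?

JSON.parse(stdout)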
