Right now we use run_start to track the start of the

Provide a better way to group observability data from the same run (hypothesis, CLOSED)

hgoldstein95 commented on August 20, 2024

from hypothesis.

Comments (6)

hgoldstein95 commented on August 20, 2024

Yeah, ok. That makes sense. Since Tyche is about live feedback, I'm tempted to just ignore any data more than a minute or two older than the newest data, but I'll play around with those heuristics.
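
A minimal sketch of that freshness heuristic, assuming each observability record is a dict with a `run_start` value in seconds. The function name `drop_stale` and the two-minute default are illustrative choices, not anything Tyche or Hypothesis is known to provide:

```python
def drop_stale(records, max_age_seconds=120.0):
    """Keep only records whose run_start is close to the newest run_start.

    Assumes every record is a dict with a numeric "run_start" in seconds.
    """
    if not records:
        return []
    newest = max(r["run_start"] for r in records)
    # Age is measured against the newest record, not the wall clock,
    # so an old log file still shows its most recent run.
    return [r for r in records if newest - r["run_start"] <= max_age_seconds]
```

Measuring age relative to the newest record rather than the current time means a log that was written yesterday still displays its last run instead of nothing.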

Zac-HD commented on August 20, 2024

Our schema doesn't have any concept or way to represent an invocation of a test suite, only invocations of individual test functions. This is harder to pin down than it might sound - a short list of options includes:

- Manually calling an @given()-decorated function (hereafter "test"; the "property" terminology made sense for Haskell but not Hypothesis)
  - in a short-lived process / via a script
  - in a long-lived process such as a notebook
- Running via doctest
- Running with unittest
  - default serial configuration
  - custom parallel configuration (OK if we don't support this, but avoiding nonsense would be nice)
- Running with pytest
  - single process/thread
  - multiprocess with pytest-xdist - have to pass around the start timestamp somehow
  - selecting a subset of available tests, by filename or e.g. -k selector
  - (combinations of these)
- Test runs triggered via editor integrations (possibly via pytest or unittest?)
- Overlapping runs, e.g. triggering a specific test in-editor while waiting for a longer run which uses pytest-xdist

So logging the start and end timestamps of a suite (e.g. as information messages) is unsatisfactory; we do actually need to include some kind of suite identifier in the test-case message. Presumably we'll want to use something like an optional test-suite key under metadata, containing timestamp, string identifier, and maybe some other metadata about how it was invoked such as sys.argv or 'via vs code'. With some work we can get all of that for pytest runs via our own pytest plugin, though for others including unittest I don't think we can do much better than sys.argv and maybe the parent process ID (and timestamps? needs a psutil dependency...).
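
As a rough illustration of the best-effort fallback described above (the non-pytest case), the sketch below builds a suite descriptor once per process and attaches it to each test-case record. The `test_suite` key under `metadata` follows the proposal in this thread; the inner key names (`timestamp`, `id`, `argv`, `parent_pid`, `invoked_via`) and both function names are assumptions made for this example, not part of any published schema:

```python
import os
import sys
import time
import uuid


def make_test_suite_metadata(invoked_via=None):
    """Best-effort description of the current test-suite invocation.

    All key names here are hypothetical; only stdlib information is used.
    """
    return {
        "timestamp": time.time(),      # when this process first saw the suite
        "id": uuid.uuid4().hex,        # opaque identifier for grouping
        "argv": list(sys.argv),        # how the process was invoked
        "parent_pid": os.getppid(),    # weak hint for grouping sibling processes
        "invoked_via": invoked_via,    # e.g. "via vs code", if known
    }


# Computed once at import time, so every test in this process shares it.
_SUITE = make_test_suite_metadata()


def attach_suite(test_case_record):
    """Add the suite descriptor under metadata["test_suite"], keeping other keys."""
    metadata = test_case_record.setdefault("metadata", {})
    metadata["test_suite"] = _SUITE
    return test_case_record
```

Because the descriptor is created per process, separate worker processes would each get a different `id` unless the identifier is shared some other way, which is the pytest-xdist difficulty noted in the list above.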

Thoughts?

hgoldstein95 commented on August 20, 2024

Yeah, I suspected this would be the sticking point. But your proposed solution of a best-effort test_suite field sounds good to me, and the heuristics you're proposing for collecting that data also make sense.

I'm not sure if it should be under metadata or lifted to the top level. My perspective has been that metadata contains information that generic tools can't depend on but that users could choose to analyze manually if they're inclined. If we want downstream analyses to be able to use this to group properties, I'd suggest we make it a top-level data field.

Zac-HD commented on August 20, 2024

I lean towards metadata because I don't think tools can depend on it, even from Hypothesis - we can do best-effort, but there are certainly going to be cases and users for whom it doesn't work at all 😢

hgoldstein95 commented on August 20, 2024

Hmm, ok. I buy that.

Then that leaves open the question of what tools should do when they have a pile of data and no way to know which of those lines are fresh enough to show. Should we consider a time horizon (e.g., an hour or a day)? Should we just show the latest data from every property that's available?
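
Both options can be sketched together. The example assumes each record is a dict with a `property` name and a `run_start` in seconds, and that all test cases from one invocation of a test share the same `run_start`; the function name `latest_runs` and its parameters are hypothetical:

```python
import time


def latest_runs(records, horizon_seconds=None, now=None):
    """Keep, for each property, only the records from its most recent run.

    If horizon_seconds is given, properties whose latest run is older than
    that (relative to `now`, default: current time) are dropped entirely.
    """
    newest = {}
    for r in records:
        prop = r["property"]
        newest[prop] = max(newest.get(prop, r["run_start"]), r["run_start"])

    if horizon_seconds is not None:
        now = time.time() if now is None else now
        newest = {p: t for p, t in newest.items() if now - t <= horizon_seconds}

    # A record survives only if it belongs to its property's newest run.
    return [r for r in records if newest.get(r["property"]) == r["run_start"]]
```

With `horizon_seconds=None` this shows the latest data for every property that is available; passing a value such as `3600` or `86400` applies the hour or day cutoff instead.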

Zac-HD commented on August 20, 2024

🤷 depends on the frontend tool I think!

I think it'd be reasonable to show a little ⓘ this test was run {duration} ago if it's older than the user might expect. Heuristic grouping by per-test timestamp probably also works pretty well - look at the distribution of intervals between sorted start times, and your most recent large outlier is probably the gap between suite runs.
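
One possible reading of that gap heuristic is sketched below. It treats a gap as a "large outlier" when it exceeds a multiple of the median gap; that threshold rule, the `outlier_factor` default, and the function name are assumptions for illustration, and other outlier tests would fit the description equally well:

```python
import statistics


def latest_suite_start_times(start_times, outlier_factor=10.0):
    """Return the start times that follow the most recent unusually large gap.

    start_times: per-test start timestamps in seconds, in any order.
    """
    times = sorted(set(start_times))
    if len(times) < 2:
        return times

    # Intervals between consecutive sorted start times.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    threshold = statistics.median(gaps) * outlier_factor

    # Scan backwards so the first outlier found is the most recent one.
    for i in range(len(gaps) - 1, -1, -1):
        if gaps[i] > threshold:
            return times[i + 1:]

    # No outlier gap: treat everything as one suite run.
    return times
```

For example, start times clustered around 0, 1000, and 5000 seconds with one-second spacing inside each cluster would yield only the cluster near 5000. The heuristic can misfire when a single slow test produces a long pause within one run, so a frontend would probably want to tune `outlier_factor`.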
