Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Property test improvements #15358

Merged
merged 1 commit into from
Jan 8, 2025
Merged

Conversation

sharkdp
Copy link
Contributor

@sharkdp sharkdp commented Jan 8, 2025

Summary

Test Plan

export QUICKCHECK_TESTS=100000
while cargo test --release -p red_knot_python_semantic -- \
  --ignored types::property_tests::stable; do :; done

@sharkdp sharkdp added testing Related to testing Ruff itself red-knot Multi-file analysis & type inference labels Jan 8, 2025
cargo test --locked --release --package red_knot_python_semantic -- --ignored types::property_tests::stable
done

create-issue-on-failure:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like the rest of this workflow, this is copied over from daily_fuzz.yaml. It seems like a reasonable idea (how would we notice otherwise?), but I haven't actually seen any "Daily parser fuzz failed on …" in the issue tracker. Did the daily_fuzz workflow never run, or is this issue-creation disabled somehow?

@sharkdp sharkdp force-pushed the david/property-test-updates branch from 2d06d95 to a3f4526 Compare January 8, 2025 20:22
Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

I have no idea how to verify that the cron thing actually works, other than to land this and wait and see? (I don't even know where to go look tomorrow to see if it ran, assuming it doesn't fail and notify us.)

.github/workflows/daily_property_tests.yaml Outdated Show resolved Hide resolved
@sharkdp
Copy link
Contributor Author

sharkdp commented Jan 8, 2025

I have no idea how to verify that the cron thing actually works, other than to land this and wait and see?

The fuzzer tests run daily at 00:00 (UTM?), so I decided to run the property test daily at 12:00, to distribute a bit among timezones, and so I will be awake when they run 😄

https://crontab.guru/#0_12___*

I don't even know where to go look tomorrow to see if it ran

Here: https://github.com/astral-sh/ruff/actions/workflows/daily_property_tests.yaml

Sample run is here: https://github.com/astral-sh/ruff/actions/runs/12678401447/job/35335911931

Copy link
Contributor

github-actions bot commented Jan 8, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

- Mark `assignable_to_is_reflexive` as flaky
- Add new (failing) `intersection_assignable_to_both` test
- Add a workflow to run property tests on a daily basis
@sharkdp sharkdp force-pushed the david/property-test-updates branch from 063b602 to 3c1abce Compare January 8, 2025 21:02
@AlexWaygood
Copy link
Member

AlexWaygood commented Jan 8, 2025

@sharkdp

Like the rest of this workflow, this is copied over from daily_fuzz.yaml. It seems like a reasonable idea (how would we notice otherwise?), but I haven't actually seen any "Daily parser fuzz failed on …" in the issue tracker. Did the daily_fuzz workflow never run, or is this issue-creation disabled somehow?

@carljm

I have no idea how to verify that the cron thing actually works, other than to land this and wait and see? (I don't even know where to go look tomorrow to see if it ran, assuming it doesn't fail and notify us.)

The py-fuzzer script found a number of bugs in our parser immediately after the big parser rewrite in early 2024, but to my knowledge, the daily cron job that we set up to fuzz the parser shortly after the parser rewrite has never failed. @dhruvmanila just wrote a really good parser! 😃 (And also, we haven't made any major updates since the big rewrite.)

However, I'm pretty confident that the issue-creation logic works, because I've used it to great success in multiple other repos. For examples, see:

@sharkdp sharkdp merged commit 4fd82d5 into main Jan 8, 2025
23 checks passed
@sharkdp sharkdp deleted the david/property-test-updates branch January 8, 2025 21:24
repo: "ruff",
title: `Daily property test run failed on ${new Date().toDateString()}`,
body: "Runs listed here: https://github.com/astral-sh/ruff/actions/workflows/daily_property_tests.yaml",
labels: ["bug", "red_knot", "testing"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be

Suggested change
labels: ["bug", "red_knot", "testing"],
labels: ["bug", "red-knot", "testing"],

(https://github.com/astral-sh/ruff/issues?q=is%3Aissue%20state%3Aopen%20label%3Ared-knot)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you: #15361

owner: "astral-sh",
repo: "ruff",
title: `Daily property test run failed on ${new Date().toDateString()}`,
body: "Runs listed here: https://github.com/astral-sh/ruff/actions/workflows/daily_property_tests.yaml",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A typeshed contributor recently made a great improvement to the typeshed version of this workflow so that the issue text links directly to the exact run that failed rather than a list of all the runs that have ever happened: https://github.com/python/typeshed/pull/13210/files

We could probably make the same improvement to this workflow and daily_fuzz.yaml!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference testing Related to testing Ruff itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants