-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add runtime tests #475
Add runtime tests #475
Conversation
4696db6
to
8089980
Compare
8089980
to
941752f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Thanks for implementing this, I think this will be really useful. I had also worked on the same thing, but was pursuing sth slightly different. I have added my proof of concept to this branch. (see also my comments). Feel free to work off of it or remove it. I think having a seperate CI for regressions might be useful in the long run.
Happy to discuss this more.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
5b0281b
to
2855ff7
Compare
eb50f1b
to
8e6de5a
Compare
824a83f
to
b8798c5
Compare
New Baselines
|
@michaeldeistler This should be ready for review. The following is implemented and works:
If you want to run this locally do you can do Some thoughts I had: Even with a 20% tolerance the regression tests frequently fail for some reason. (might need to average across several runs or take the max across several runs for the baseline, but currently the regression tests take quite long. Would be good to merge #489 to speed this up). Lemme know if anything is unclear |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow this is amazing!!!
I don't think I understand when exactly the updating of the times is being run. On a comment to this particular PR (i.e., #475)? Or some other PR/issue? Also, what should the comment be? Anything? Who would the comment have to be by? Anyone? (BTW I am commenting now, so does this trigger the regression tests :D ?)
@@ -55,6 +55,8 @@ coverage.xml | |||
*.py,cover | |||
.hypothesis/ | |||
.pytest_cache/ | |||
tests/regression_test_results.json | |||
tests/regression_test_baselines.json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get this. Why do we first put it in the gitignore just to then force add it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this appears a bit odd, but I want to prevent users from pushing an updated baseline file from their local machine (which is why I have added it to the gitignore). The only way baselines should be updated is through github actions (to ensure regression tests are run in a consistent environment). That is why the github actions uses -f, while for the user changes to these files are ignored.
EDIT: This also makes running regression tests locally easier, since: running NEW_BASELINE=1 pytest -m regression
on main
and then pytest -m regression
on feature
. Ignores the changes to the baseline when switching the branch, so no stashing etc. necessary to run the test on feature
with the baseline file from main
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, got it, yes makes sense!
The event |
I have verified this to work in a seperate repo, if the workflow to update the baselines exists on the default branch. Upon commenting with Since the workflow can only be triggered on the default branch this PR has to be merged first and needs a seperate PR to update the baselines using Side note: The report is always written to the pytest output (above the Failure report), so you can check why the regression test action might have failed in detail. With this, I think this can be merged @michaeldeistler, unless you have any other reseverations. |
Thanks a lot! Feel free to merge (I cannot approve because it is technically "my" PR). |
Note that the tests are disabled in the workflow:
pytest -m "not runtime"