Disable broken and outdated CI #39467
base: develop
Documentation preview for this PR (built with commit 6b05035; changes) is ready! 🎉 |
There's a risk that we just forget to re-enable it. Is there a good way to do this dynamically (i.e. check which tests failed for the last release and skip those)? |
I'll open an issue where we can track which systems are failing, so we don't forget it. The issue tracker might actually be a more visible place in the first place as most people will not check the CI runs.
Not that I know. |
Great, thanks for taking care of this. |
we had a heated argument at some point whether 'minimal' configurations are meaningful beyond someone's curiosity - and they are certainly the longest ones (building gcc, python, and a bunch of equally mainstream packages from sources, that's just a contribution to global warming) |
Removing things in CI may break other things elsewhere. Especially the developer guide. In particular, https://doc-release--sagemath.netlify.app/html/en/developer/portability_testing. Did you check? The docker images created by CI linux are used by developers. The sage binder repo (https://github.com/sagemath/sage-binder-env) is using them (in particular the "minimal" image). The interactive sage doc relies on them. This PR will break the doc. Changes affecting developers widely should be discussed or at least posted on sage-devel. This PR falls in the category.
We do not remove code for a temporary change. Do not invent a new protocol in which code is removed in order to be disabled. Improving CI is good, but the present PR is also destructive. I object to this PR. |
@kwankyu What exactly is broken by this PR? |
And if you think this PR needs work, please label it so, not "disputed". |
Technically you're right in that this merely removes a part of CI that was broken by some other change made months ago (and thus doesn't break anything "in addition"), and reverting this is as simple as a I don't personally use the minimal configurations and have no strong opinion on whether they're actually useful to developers. Of course, having more configurations leads to additional maintenance burden and spent computing power (such as the maintenance cost needed now to make the minimal configuration build again), and the cost may not be justified by the usefulness to developers/users. Anyway, since the intention of this PR is to temporarily disable the run until the bug is fixed, can't we just comment it out? I mean, we're all volunteers; if anyone volunteers to fix the minimal configurations then of course that will supersede this pull request and everyone is happy [1]… but if nobody does, the feature unfortunately will remain broken. [1] maybe except those who prefer the minimal configurations to be deleted anyway, but I think the usual workflow is that deleting a feature requires a 1-year deprecation period and/or an announcement of some sort. |
One doesn't need a deprecation period to remove broken code, and minimal configurations are
Why do we keep killing the project this way, just because a few people find these minimal configs interesting or something? |
Did you see the section of the developer guide that I pointed out? The removed chunks of ci-linux are responsible for pushing various docker images. One of the images, ubuntu-jammy-standard, is used in the CI check workflows for PRs. This PR will collapse the whole CI checks. Other images, ubuntu-jammy-minimal-with-targets and ubuntu-jammy-minimal-with-system-packages, are used in sage-binder-env. This PR will collapse it and the interactive sage doc that is using it. |
Can we not escalate this, thanks. Anyway, I think a good compromise is
Do you think this is reasonable? (anyway I'm not a maintainer and would be happy with anything) p/s. @kwankyu , I don't really completely understand what's going on, but if the particular CI has been failing for months, how come the incremental things etc. still work (this is |
["standard"] | ||
docker_push_repository: ghcr.io/${{ github.repository }}/ | ||
logs_artifact: false | ||
|
Why delete this? What is broken?
# Make sure that all "standard" jobs can start simultaneously, | ||
# so that runners are available by the time that "default" starts. | ||
max_parallel: 50 | ||
|
Why delete this? What is broken?
fi; \\ | ||
rm -rf /sage/src; \\ | ||
mv src /sage/src; \\ | ||
cd /sage && ./bootstrap && ./config.status; \\ |
By the way what is this part supposed to do? As I read retrofit-worktree does
echo >&2 "usage: $0 WORKTREE_NAME WORKTREE_DIRECTORY"
echo >&2 "Ensures that the current working directory is a git repository,"
echo >&2 "then makes WORKTREE_DIRECTORY a git worktree named WORKTREE_NAME."
but I don't understand why. Your commit would make sense if you assume that retrofit-worktree always fails (but why would it?)
max_parallel: 8 | ||
|
||
minimal: | ||
if: ${{ success() || failure() }} |
Why delete this?
There are broken platforms. We may selectively turn them off until fixed.
"gentoo-python3.10" is dead, as Gentoo doesn't ship python 3.10 any more (and 3.11 will soon be gone, too)
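Selectively turning off a single dead platform, rather than dropping a whole job, could look roughly like the sketch below. This is a generic GitHub Actions matrix and not the actual layout of ci-linux.yml: the workflow name, the job name `portability`, and the matrix key `system` are hypothetical, and only the platform names are taken from this thread.

```yaml
# Hypothetical, self-contained workflow illustrating the idea only.
name: portability-sketch
on: workflow_dispatch

jobs:
  portability:                      # hypothetical job name
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false              # one failing platform does not cancel the others
      matrix:
        system:                     # hypothetical matrix key
          - ubuntu-jammy
          # Disabled: Gentoo no longer ships Python 3.10.
          # - gentoo-python3.10
    steps:
      - run: echo "Testing on ${{ matrix.system }}"
```

Keeping the disabled entry as a comment, with the reason next to it, leaves a visible record in the file of what was turned off and why.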
maximal-pre: | ||
if: ${{ success() || failure() }} | ||
needs: [minimal] | ||
uses: ./.github/workflows/docker.yml |
Why delete this?
There are broken platforms. We may selectively turn them off until fixed.
|
||
optional: | ||
if: ${{ success() || failure() }} | ||
needs: [maximal-pre] |
Why delete this?
We may turn this off until fixed.
targets_optional: '$(echo $(export PATH=build/bin:$PATH && sage-package list :optional: --has-file "spkg-install.in|spkg-install|requirements.txt" --no-file "huge|has_nonfree_dependencies" | grep -v sagemath_doc | grep -v ^_))' | ||
|
||
experimental: | ||
if: ${{ success() || failure() }} |
Why delete this?
We may turn this off until fixed.
|
||
needs: [dist] | ||
|
||
runs-on: ${{ matrix.os }} |
Is this also failing? I cannot find a run.
No.
CI-linux runs portability tests on supported platforms. As you can see in https://github.com/sagemath/sage/actions/runs/12979684199 some jobs like "standard" fail for some platforms while some other jobs like "optional" fail for all platforms. We should selectively turn them off until fixed, not remove the jobs.
because the "default" job running tests on ubuntu-jammy-standard succeeds, its docker image is pushed, and the incremental things are using the docker image to test PRs incrementally.
Yes. |
|
Sorry. They are irrelevant questions. The point is that those docker images have been a part of our infrastructure. Removing the infrastructure should not be done without community-wide discussion. |
No, the point is that there is no need to use these particular images. Standard images would do just as well. Please don't tell me that we must keep this |
That's true, but I think that isn't in the scope of this pull request. If you want to migrate the generation of the docker images to a standard image you can open a separate one (which sounds like a good thing because the docker image is available more quickly, right?) For this pull request, @kwankyu 's argument is we should not take advantage of the failing runs to also remove the successful runs. Whether -minimal should be removed can be discussed later. |
Please don't tell me that I chose minimal images just out of my frivolity. For Binder, creating the lightest image was my priority. For the scope of this PR: I agree that portability-test jobs that fail on all platforms should stop wasting energy until fixed. That could be achieved by adding some "if: false" conditions, commenting out, and slight refactoring. Removing code is not a way to disable things temporarily. |
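For illustration, the "if: false" approach mentioned above might look like the fragment below for the `optional` job. It is a sketch only: the job name, `needs`, and `uses` lines are copied from the diff excerpts in this thread, the `with:` inputs are omitted and assumed unchanged, and the comment wording is invented.

```yaml
# Fragment of a workflow's jobs section, not a complete workflow file.
jobs:
  optional:
    # Temporarily disabled: fails on all platforms.
    # To re-enable, delete "if: false" and restore the original condition:
    # if: ${{ success() || failure() }}
    if: false
    needs: [maximal-pre]
    uses: ./.github/workflows/docker.yml
```

The job definition stays in the file, so re-enabling it is a one-line change. One thing that would still need checking is how jobs listing a disabled job under `needs` behave once that job is skipped.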
Many of the CI runs after a new release have been failing for months now, see e.g. https://github.com/sagemath/sage/actions/runs/12979684199/job/36218126145. Some of these failures are genuine (e.g. a certain package cannot be built on a certain system) and some others are due to constraints of the build system (e.g. running out of free space). Since there is very little point in senselessly burning energy, all runs that were failing for the last releases are disabled. Once the underlying issues are fixed, they can easily be re-enabled.
Moreover, the "minimal" runs, where only a couple of system packages are installed and most packages are built using sage, are removed, keeping only the "maximal" runs, where all available system packages are installed.
New test run: https://github.com/tobiasdiez/sage/actions/runs/13199372232/job/36847711005