Cache package build base images more aggressively #26756

Open
wants to merge 14 commits into
base: main

riftEmber
Member

@riftEmber riftEmber commented Feb 21, 2025

Allow package builds to use a cached base Docker image, even when the network connection is too poor to check whether a newer version is available.

Our packaging builds fail frequently, and one of the most common causes is a network timeout while pulling base images. Although Docker is smart about not re-pulling already-cached images, it still must fetch metadata to determine whether a newer image is available for the requested tag, and even this sometimes times out. This PR works around the problem as follows:

  1. Attempt to pull the image on a best-effort basis, ignoring any failure to do so.
  2. Get the SHA256 repo-digest of the image, which is the mechanism Docker uses to find changed images. Note that the distinction between repo-digest and plain old digest is important here: the repo-digest is per-tag, whereas the digest is per-architecture per-tag, and we sometimes do multi-arch builds.
  3. Reference the base image by tag and digest in the Dockerfile to be built, so Docker sees we have that exact image present locally without needing to connect to a remote registry.
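
The three steps above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the helper names (run_docker, best_effort_pull, repo_digest, pinned_reference, pin_base_image) are hypothetical, and the sketch assumes the local image's RepoDigests entries use the same repository name as the image reference in the Dockerfile.

```python
import json
import subprocess


def run_docker(args):
    """Run a docker CLI command and return (exit_code, stdout)."""
    proc = subprocess.run(["docker", *args], capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def split_tag(image):
    """Split 'name:tag' into (name, tag), assuming 'latest' when no tag is given."""
    name, sep, tag = image.rpartition(":")
    if not sep or "/" in tag:  # no colon, or the colon belongs to a registry port
        return image, "latest"
    return name, tag


def best_effort_pull(image, run=run_docker):
    """Step 1: try to pull, but treat failure (e.g. a timeout) as non-fatal."""
    code, _ = run(["pull", image])
    return code == 0


def repo_digest(image, run=run_docker):
    """Step 2: read the repo-digest ('sha256:...') of the local copy."""
    code, out = run(["image", "inspect", "--format", "{{json .RepoDigests}}", image])
    if code != 0:
        raise RuntimeError(f"no local copy of {image}")
    name, _ = split_tag(image)
    # RepoDigests entries look like 'repo@sha256:...'.
    # Assumption: the repo part matches the name used in the Dockerfile.
    for entry in json.loads(out):
        repo, _, digest = entry.partition("@")
        if repo == name:
            return digest
    raise RuntimeError(f"no repo-digest recorded for {name}")


def pinned_reference(image, run=run_docker):
    """Steps 1 and 2 combined: return 'name:tag@sha256:...' for the image."""
    best_effort_pull(image, run=run)  # result deliberately ignored
    name, tag = split_tag(image)
    return f"{name}:{tag}@{repo_digest(image, run=run)}"


def pin_base_image(dockerfile_text, image, pinned):
    """Step 3: rewrite FROM lines that use `image` to use the pinned reference."""
    out = []
    for line in dockerfile_text.splitlines():
        parts = line.split()
        if parts and parts[0].upper() == "FROM" and image in parts:
            line = " ".join(pinned if p == image else p for p in parts)
        out.append(line)
    return "\n".join(out) + "\n"
```

Because the docker invocation is passed in as the run parameter, the digest lookup and the Dockerfile rewrite can be exercised without a Docker daemon or a network connection.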

This workaround may become unnecessary, or could be simplified, in the future if Docker's (version of) BuildKit gains a way to skip checking for a newer image than the local copy; see moby/buildkit#5340, docker/buildx#1889, etc.

An alternative solution would be to create a local (non-pull-through) registry to cache all the images we want, use it for all builds, and update its copies of base images when possible. I didn't go this route because we build on multiple machines, and it would be unwieldy to either 1) share one registry between them (especially if the network changes in the future) or 2) create and update a separate registry on each machine.

While here, also adjust some Dockerfiles to always run apt-get update immediately before, and on the same line as, an upgrade or install. This avoids known caching issues that we have encountered in practice.
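
The caching issue in question is presumably the usual one: when apt-get update sits in its own RUN instruction, Docker can reuse that cached layer, so a later install or upgrade runs against a stale package index. A small check for this pattern might look like the sketch below. It is not part of this PR; the function names are hypothetical, and it only handles simple backslash line continuations.

```python
import re

# apt-get, optional flags such as -y, then the subcommand of interest.
USES_INDEX = re.compile(r"apt-get\s+(?:-\S+\s+)*(?:install|upgrade|dist-upgrade)\b")
UPDATES_INDEX = re.compile(r"apt-get\s+(?:-\S+\s+)*update\b")


def run_instructions(dockerfile_text):
    """Yield each RUN instruction as one logical line, joining '\\' continuations."""
    logical = ""
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if line.endswith("\\"):
            logical += line[:-1] + " "
            continue
        logical += line
        if logical.upper().startswith("RUN "):
            yield logical
        logical = ""


def missing_update(dockerfile_text):
    """Return RUN instructions that install or upgrade without an earlier
    apt-get update in the same instruction."""
    flagged = []
    for instr in run_instructions(dockerfile_text):
        use = USES_INDEX.search(instr)
        if use is None:
            continue
        upd = UPDATES_INDEX.search(instr)
        if upd is None or upd.start() > use.start():
            flagged.append(instr)
    return flagged
```

A Dockerfile written as "RUN apt-get update && apt-get install -y curl" passes this check, while one that splits the update and the install into separate RUN instructions is flagged.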

Resolves https://github.com/Cray/chapel-private/issues/6824.

[reviewer info placeholder]

To be merged together with the corresponding CI config changes PR.

Testing:

  • test version of the package build job succeeds

@riftEmber riftEmber changed the title from "Cache package build base images more aggressive" to "Cache package build base images more aggressively" Feb 21, 2025
@riftEmber riftEmber requested a review from jabraham17 February 21, 2025 20:37