Currently RAPIDS CI jobs spend a significant amount of time constructing environments, whether they be pip or conda.
A meaningful chunk of this time is spent downloading packages from remote sources.
Aside from the inherent wastefulness in time and network bandwidth, these downloads also expose us to more network connectivity issues, which have plagued our CI in general.
We should investigate using GitHub's native dependency caching functionality.
GHA recommends caching for specific package managers via the corresponding setup-* actions, but those are more general tools that also handle installing the package managers themselves.
Since we will have those package managers installed into our base images, we will have to manage the caching directly.
That shouldn't be too difficult though; we simply need to construct a suitable cache key corresponding to the path to each package manager's local cache (e.g. /opt/conda/pkgs for conda).
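A minimal sketch of what managing the cache directly could look like, using the generic actions/cache action against conda's package directory. The key shown is illustrative, not a final choice, and the environment-file path is an assumption about repo layout:

```yaml
# Hypothetical job step: cache conda's local package cache directly,
# instead of letting a setup-* action manage it.
- name: Cache conda packages
  uses: actions/cache@v4
  with:
    path: /opt/conda/pkgs
    # Key on the environment specs so the cache invalidates when
    # dependencies change; fall back to any older conda-pkgs cache.
    key: conda-pkgs-${{ hashFiles('conda/environments/*.yaml') }}
    restore-keys: |
      conda-pkgs-
```

The same pattern would apply to pip by pointing `path` at pip's cache directory instead.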
We will need to figure out what makes the most sense to put into a cache key.
One option would be to use a single cache for all conda packages across our entire matrix of jobs, but that would mean sharing a cache between different architectures and CUDA versions, which may not be ideal.
The opposite alternative would be having a separate cache for every matrix entry in a job (e.g. arch/CUDA version/Python version).
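The two extremes above can be sketched as alternative `key` expressions; the matrix variable names here are assumptions about how the job matrix might be defined:

```yaml
# (a) One shared cache for all conda jobs: maximal reuse, but mixes
#     architectures and CUDA versions in a single cache.
key: conda-pkgs-${{ hashFiles('conda/environments/*.yaml') }}

# (b) One cache per matrix entry: no cross-entry sharing, but each
#     cache stays smaller and exactly matches the job that wrote it.
key: conda-pkgs-${{ runner.arch }}-cuda${{ matrix.cuda }}-py${{ matrix.py }}-${{ hashFiles('conda/environments/*.yaml') }}
```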
In general, we'll need to balance cache size (smaller caches upload and download faster), contention (I don't know how well GHA handles every PR in a repo trying to upload or download the exact same cache simultaneously; hopefully that's well optimized, but we'll have to test), and cache hit rate (if different jobs have partial overlap in their dependencies, then a shared cache will increase the hit rate).
Using GitHub's native caching feature may not work as expected because RAPIDS uses self-hosted runners. When caching is used with self-hosted runners, the cache is stored on GitHub-owned cloud storage, which means the runners still need to download the cache from that storage on every run.
From GH documentation:
We are investigating adding caching at the runner level for package managers like pip and conda.
Thanks Vyas for bringing this issue to my attention.
Jordan's comment is correct. Caching dependencies with GitHub's native solution doesn't really work for self-hosted runners. There is a community issue about it below:
We are working on an NGINX caching proxy that can cache pip and conda packages close to our self-hosted runners. We are still in the testing phase, but we will be sure to broadcast the feature when it's ready.
Until then, I would recommend that no one work on this issue.