AN-360 Fix slurm CI #7680
@@ -31,6 +31,21 @@ cromwell::build::slurm::setup_slurm_environment() {
# Create various directories used by slurm
sudo mkdir -p /var/run/munge
sudo mkdir -p /var/spool/slurmd
sudo chown slurm:slurm /var/spool/slurmd

# Set up an AppArmor profile for Apptainer to allow non-root users to use it.
Aha!
You called it!
The homelab to day job pipeline in action
Nice!
Amazing investigation 🫨
@@ -40,13 +55,14 @@ cromwell::build::slurm::setup_slurm_environment() {
cat <<SLURM_CONF | sudo tee /etc/slurm/slurm.conf >/dev/null
ClusterName=localhost
ControlMachine=localhost
NodeName=localhost
PartitionName=localpartition Nodes=localhost Default=YES
NodeName=localhost CPUs=4 Sockets=1 CoresPerSocket=2 ThreadsPerCore=2
These values were obtained by running lscpu on the GHA runner.
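For anyone who needs to refresh these numbers after a future runner change, here is a rough sketch of how the NodeName line could be derived from lscpu output instead of being copied by hand. The function name slurm_node_line is hypothetical and not part of this PR, and the field labels assume the usual English lscpu output.

```shell
# Hypothetical helper (not part of this PR): read `lscpu` output on stdin
# and print a slurm.conf NodeName line with the matching topology values.
slurm_node_line() {
  awk -F': *' '
    # Newer lscpu versions indent some fields, so allow leading spaces.
    /^ *CPU\(s\):/             { cpus = $2 }
    /^ *Socket\(s\):/          { sockets = $2 }
    /^ *Core\(s\) per socket:/ { cores = $2 }
    /^ *Thread\(s\) per core:/ { threads = $2 }
    END {
      printf "NodeName=localhost CPUs=%d Sockets=%d CoresPerSocket=%d ThreadsPerCore=%d\n",
             cpus, sockets, cores, threads
    }'
}
```

On a runner this would be used as `lscpu | slurm_node_line`. The PR itself hardcodes the values, which keeps the generated slurm.conf predictable.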
Description
Our SLURM CI broke when GHA runners upgraded to Ubuntu 24. Several changes were needed to get SLURM and its containers to be happy, including:
/var/spool/slurmd dir
SelectType
Release Notes Confirmation
CHANGELOG.md
CHANGELOG.md in this PR
CHANGELOG.md because it doesn't impact community users
Terra Release Notes