Dynamic partitions #1855
Comments
cc @ltalirz
I'm not sure about the exact scenario. It adds a lot of complexity, and I'm not sure what value it provides.
I think what Matt is saying here is: for those VM series where Azure offers several sizes (e.g. NC24ads A100 v4, NC48ads A100 v4, NC96ads A100 v4), bundle them in one partition and then, based on the number of CPUs/GPUs requested, have Slurm request the smallest size that fulfils the job's requirements. It does not really apply to the HB series, since the smaller sizes there are just the same node with restricted CPUs at the same price, but it would also apply to e.g. the F series.
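To make the "smallest one that fulfils the requirements" idea concrete, here is a rough Python sketch of the selection logic only. It is not how CycleCloud or Slurm implement it; `VmSize`, `NC_A100_V4` and `smallest_fitting_size` are hypothetical names, and the CPU/GPU counts are illustrative figures that should be checked against the Azure documentation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class VmSize:
    """Hypothetical description of one VM size inside a shared partition."""
    name: str
    cpus: int
    gpus: int


# Illustrative figures for the sizes named in this thread (assumption: verify
# the vCPU and GPU counts against the Azure VM size documentation).
NC_A100_V4 = (
    VmSize("NC24ads A100 v4", cpus=24, gpus=1),
    VmSize("NC48ads A100 v4", cpus=48, gpus=2),
    VmSize("NC96ads A100 v4", cpus=96, gpus=4),
)


def smallest_fitting_size(
    sizes: Sequence[VmSize], cpus: int, gpus: int = 0
) -> Optional[VmSize]:
    """Return the smallest size that satisfies the request, or None if none does."""
    candidates = [s for s in sizes if s.cpus >= cpus and s.gpus >= gpus]
    if not candidates:
        return None
    # "Smallest" here means fewest CPUs, then fewest GPUs; a real scheduler
    # could rank by price instead.
    return min(candidates, key=lambda s: (s.cpus, s.gpus))
```

For example, `smallest_fitting_size(NC_A100_V4, cpus=30, gpus=1)` picks the NC48ads entry, because 30 CPUs do not fit into the 24-CPU size.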
Ah, I forgot that the HB series carries the same price across all sizes. Yes, for scenarios where you only want part of a node, I think this might be useful, although under heavy load I expect the cost savings to shrink or disappear. It can still provide better isolation between jobs, though (e.g. one bad job can no longer fill up /tmp).
In what area(s)?
Describe the feature
Do we expose the dynamic partitions that CC adds in 8.4? I think it would be useful if we could allocate smaller nodes when the job is smaller, e.g. running a 4-CPU job on an HB16 rather than an HB120.