[Feature] Support ResourceRequirement with preemptive knowledge of available server resources #138
Labels
project/DACCS
Related to DACCS project (https://github.com/orgs/DACCS-Climate)
triage/enhancement
New feature or request
triage/feature
New requested feature.
Description
CWL supports the definition of ResourceRequirement for specifying min/max CPU cores, RAM, tmp/out directory sizes, which indicates known requirements to run the app.
This allows CWL to fail the processes preemptively if the machine doesn't fulfill the requirements to avoid running it with guaranteed failure, but this is not considered/supported at the moment for ADES/EMS.
It would be very convenient if a workflow could pre-fetch this kind of information from a remote ADES, so it can validate that the process step will not fail at some later stage. This would require the ADES to provide the details via some HTTP route, and to consider where/how the applications are executed (e.g., celery running docker, what subset of the full server resources is actually available to them, and what happens if those resources are shared across jobs...).
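As a rough illustration of the pre-flight validation described above, the sketch below compares the minimum fields of a CWL ResourceRequirement against a resource report that an ADES might expose. The report keys (`cores`, `ram_mib`, `tmpdir_mib`, `outdir_mib`) and the function name are hypothetical; no such route or format exists at the moment, and fetching the report over HTTP is left out.

```python
from typing import Any, List, Mapping

# Maps CWL ResourceRequirement minimum fields to the keys of a
# HYPOTHETICAL ADES resource report (this report format is an assumption).
# CWL expresses ramMin, tmpdirMin and outdirMin in mebibytes.
REQUIREMENT_TO_REPORT = {
    "coresMin": "cores",
    "ramMin": "ram_mib",
    "tmpdirMin": "tmpdir_mib",
    "outdirMin": "outdir_mib",
}


def unmet_requirements(requirement: Mapping[str, Any],
                       available: Mapping[str, Any]) -> List[str]:
    """List every minimum requirement the reported resources cannot satisfy."""
    problems = []
    for req_key, report_key in REQUIREMENT_TO_REPORT.items():
        needed = requirement.get(req_key)
        if needed is None:
            continue  # the requirement does not constrain this resource
        offered = available.get(report_key)
        if offered is None:
            problems.append(f"{req_key}: ADES does not report '{report_key}'")
        elif offered < needed:
            problems.append(f"{req_key}: need {needed}, ADES offers {offered}")
    return problems


if __name__ == "__main__":
    requirement = {"coresMin": 4, "ramMin": 8192}
    available = {"cores": 2, "ram_mib": 16384}  # hypothetical ADES report
    for problem in unmet_requirements(requirement, available):
        print(problem)
```

An empty list would mean the step can be submitted; a non-empty list would let the workflow fail early instead of at execution time. How the reported values account for celery/docker limits or resources shared across jobs remains an open question.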
Given multiple ADES DataSources, the runtimeContext.select_resources hook could be used to select where to send an execution given a set of minimum requirements.
References
Some schema validation tests are added, but effective resolution of ResourceRequirements is not all working/considered. Some executions seem to go over the limits, but execution is not aborted/failed.
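The idea from the description of choosing among multiple ADES DataSources by minimum requirements could look roughly like the sketch below. It does not call cwltool; it only assumes a request dictionary with keys such as `coresMin` and `ramMin`, similar to what a `select_resources` hook receives. The `select_ades` function, the per-ADES resource dictionaries, and the names `ades-a`/`ades-b` are hypothetical.

```python
from typing import Dict, Optional


def select_ades(request: Dict[str, float],
                ades_resources: Dict[str, Dict[str, float]]) -> Optional[str]:
    """Return the first ADES whose reported resources cover the request minimums.

    Resources that an ADES does not report are treated as 0 (an assumption).
    """
    minimums = {
        "cores": request.get("coresMin", 0),
        "ram": request.get("ramMin", 0),
        "tmpdirSize": request.get("tmpdirMin", 0),
        "outdirSize": request.get("outdirMin", 0),
    }
    for name, offered in ades_resources.items():
        if all(offered.get(key, 0) >= needed for key, needed in minimums.items()):
            return name
    return None  # no known ADES satisfies the minimum requirements


if __name__ == "__main__":
    # Hypothetical resource reports, keyed by data source name.
    sources = {
        "ades-a": {"cores": 2, "ram": 4096},
        "ades-b": {"cores": 8, "ram": 32768},
    }
    print(select_ades({"coresMin": 4, "ramMin": 8192}, sources))
```

Returning `None` gives the caller a chance to fail the job before dispatch rather than letting it run over the limits.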