feat(sdk/backend): Add support for placeholders in resource limits #11501
Note that I labeled this as a feature, but it is also a bug fix, as mentioned in the Slack thread. Additionally, it seems that a recent unreleased commit added placeholder support to the wrong accelerator type field, so it would be nice to merge this PR before the next KFP release.
I've tested the SDK portion, as we're eagerly awaiting the capability of setting cpu/ram/accelerator type/count dynamically in our projects. I also see the following in the compiled pipeline configuration for a step using those dynamic properties:

resources:
  accelerator:
    resourceCount: '{{$.inputs.parameters[''pipelinechannel--some-step-accelerator_count'']}}'
    resourceType: '{{$.inputs.parameters[''pipelinechannel--some-step-accelerator_type'']}}'
  resourceCpuLimit: '{{$.inputs.parameters[''pipelinechannel--some-step-cpu'']}}'
  resourceMemoryLimit: '{{$.inputs.parameters[''pipelinechannel--some-step-ram'']}}'
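
To illustrate what those placeholder strings imply at run time, here is a minimal Python sketch of how a value such as `{{$.inputs.parameters['pipelinechannel--some-step-cpu']}}` could be substituted from a map of input parameters. This is not the actual Driver code (the Driver is not written in Python); the function name `resolve_placeholders` and the sample parameter values are hypothetical and chosen only for illustration.

```python
import re

# Matches the placeholder form seen in the compiled YAML after parsing,
# e.g. {{$.inputs.parameters['pipelinechannel--some-step-cpu']}}
_PLACEHOLDER = re.compile(r"\{\{\$\.inputs\.parameters\['([^']+)'\]\}\}")


def resolve_placeholders(value: str, parameters: dict) -> str:
    """Replace every input-parameter placeholder in `value` (hypothetical helper)."""

    def _substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in parameters:
            raise KeyError(f"no input parameter named {name!r}")
        return str(parameters[name])

    return _PLACEHOLDER.sub(_substitute, value)


# Resource fields as they would look once the YAML is parsed
# (the doubled single quotes in the YAML become one quote each).
resources = {
    "resourceCpuLimit": "{{$.inputs.parameters['pipelinechannel--some-step-cpu']}}",
    "resourceMemoryLimit": "{{$.inputs.parameters['pipelinechannel--some-step-ram']}}",
}

# Hypothetical runtime values for the pipeline inputs.
parameters = {
    "pipelinechannel--some-step-cpu": "2",
    "pipelinechannel--some-step-ram": "8G",
}

resolved = {key: resolve_placeholders(val, parameters) for key, val in resources.items()}
```

Strings without a placeholder pass through unchanged, so literal limits and parameterized limits can share the same code path in this sketch.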
Note that this also resolves: #11375

Looks like a safe change. Folks on KFP versions before and after 2.4 will have continued support of the old fields when submitting, even with pipelines rendered before SDK 2.10, so removing them from the pipeline render makes sense. Recompiling will remove these old fields, which does seem API-changing (the API is implicit here in the YAML even when it is not defined in the proto), but it looks backwards compatible since we'll continue to handle it on the driver side. Since the initial change completely removed it from the Vertex side, I do not think this is likely to have downstream effects there, but regardless, FYI @chensun.

I tested this with deprecated and new values, and with parameter values, which are correctly resolved. Which is great, thanks @mprahl!

Some nit comments above, @mprahl. It's up to you whether to address them; let me know, otherwise I can approve.

/lgtm
/lgtm Thanks!
The API introduced new fields prefixed with Resource (e.g. ResourceCpuLimit) to replace the old fields without the prefix. The Driver hadn't been updated to honor those fields, but the SDK started using them, which led to unexpected behavior.

The Driver now honors both fields but prioritizes the new fields. The SDK now only sets the new fields.

The outcome is that resource limits/requests can now use input parameters.

Note that pipeline_spec_builder.py was doing some validation on the limits/requests being set, but that's already handled in the user-facing method (e.g. set_cpu_limit).

Resolves: kubeflow#11500

Signed-off-by: mprahl <[email protected]>
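
The "honor both fields but prioritize the new ones" rule described above can be sketched in a few lines of Python. This is only an illustration of the precedence logic, not the Driver's implementation; the helper name `effective_limit` and the unprefixed key `cpuLimit` are assumptions made for the example, and treating an empty string or zero as "unset" is likewise an assumption.

```python
from typing import Optional


def effective_limit(spec: dict, new_key: str, old_key: str) -> Optional[str]:
    """Pick the new Resource-prefixed field if set, else fall back to the old one.

    Hypothetical helper: key names and the 'unset' convention are assumptions.
    """
    new_value = spec.get(new_key)
    if new_value not in (None, ""):
        # New field wins whenever it is present and non-empty.
        return str(new_value)

    old_value = spec.get(old_key)
    if old_value:
        # Old field is still honored for pipelines compiled by older SDKs.
        # A value of 0 is treated as unset in this sketch.
        return str(old_value)

    return None


# Spec compiled by a newer SDK: only the new field is set.
new_style = {"resourceCpuLimit": "2"}
# Spec compiled by an older SDK: only the old field is set.
old_style = {"cpuLimit": 1.0}
# Both set: the new field takes priority.
mixed = {"resourceCpuLimit": "2", "cpuLimit": 1.0}

cpu_new = effective_limit(new_style, "resourceCpuLimit", "cpuLimit")
cpu_old = effective_limit(old_style, "resourceCpuLimit", "cpuLimit")
cpu_mixed = effective_limit(mixed, "resourceCpuLimit", "cpuLimit")
```

Keeping the fallback means pipelines rendered before the SDK change keep working when resubmitted, while recompiled pipelines only carry the new fields.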
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: HumairAK