feat(backend): mount EmptyDir volumes for launcher write locations #10538

gregsheremeta · 2024-03-04T21:49:34Z

Launcher writes input artifacts to root paths /gcs, /minio, and /s3. These paths are not accessible by non-root users by default, which is problematic in locked-down Kubernetes installations and/or OpenShift. /gcs is currently a contract for KFP v2 python component wrappers, so the path cannot be changed.

Mount an EmptyDir scratch volume to these paths to work around this.

Additionally, /.local and /.cache are written to by pip, so add EmptyDir mounts for those too.

Fixes: #5673
Fixes: #7345
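
For illustration only, the approach described above can be sketched as a function that adds one emptyDir volume and a matching mount per write location. This is not the PR's Go implementation: the helper names (`volume_name`, `add_scratch_mounts`), the `-scratch` naming scheme, the container name `main`, and the image are all assumptions made for the example. Only the five paths come from the description.

```python
# Paths named in the PR description: launcher artifact roots plus the
# per-user directories that pip writes to.
WRITE_PATHS = ["/gcs", "/minio", "/s3", "/.local", "/.cache"]


def volume_name(path):
    """Derive a volume name from a path (hypothetical naming scheme)."""
    return path.strip("/").lstrip(".") + "-scratch"


def add_scratch_mounts(pod_spec, container_name, paths=WRITE_PATHS):
    """Add one emptyDir volume and a matching volumeMount for each path."""
    volumes = pod_spec.setdefault("volumes", [])
    container = next(
        c for c in pod_spec["containers"] if c["name"] == container_name
    )
    mounts = container.setdefault("volumeMounts", [])
    for path in paths:
        name = volume_name(path)
        volumes.append({"name": name, "emptyDir": {}})
        mounts.append({"name": name, "mountPath": path})
    return pod_spec


# Hypothetical pod spec fragment; container name and image are placeholders.
pod_spec = {"containers": [{"name": "main", "image": "python:3.9"}]}
add_scratch_mounts(pod_spec, "main")
```

Each path costs one volume entry plus one mount entry in the pod spec, which is the per-container overhead raised in review.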

google-oss-prow · 2024-03-04T21:49:59Z

Hi @gregsheremeta. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

google-oss-prow · 2024-03-04T21:50:26Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign chensun for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

backend/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Tomcli

Thanks @gregsheremeta, just wondering: is there any way we can reduce the size of the volume mount definitions? Because of the k8s YAML size limit, we won't be able to fit as many tasks if each individual container definition is too big.

Tomcli · 2024-03-04T23:02:35Z

/ok-to-test

Tomcli

@gregsheremeta You also need to update the unit tests because you modified the container spec in this PR.

google-oss-prow · 2024-03-04T23:19:38Z

@gregsheremeta: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| kubeflow-pipelines-samples-v2 | `e46440b` | link | false | `/test kubeflow-pipelines-samples-v2` |
| kubeflow-pipeline-backend-test | `e46440b` | link | true | `/test kubeflow-pipeline-backend-test` |

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

rimolive · 2024-03-12T20:52:00Z

@Tomcli Maybe add a condition so that only one of the three possible paths is mounted?
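
A rough sketch of what such a condition might look like: pick the mount paths from the URI schemes of the artifacts a task actually uses. The scheme-to-path mapping is an assumption inferred from the paths named in the PR description, and `mounts_needed` is a hypothetical helper, not existing KFP code.

```python
# Assumed mapping from artifact URI scheme to the launcher's local root path.
MOUNT_FOR_SCHEME = {"gs": "/gcs", "minio": "/minio", "s3": "/s3"}


def mounts_needed(artifact_uris):
    """Return the sorted mount paths required by the given artifact URIs."""
    needed = set()
    for uri in artifact_uris:
        scheme, sep, _ = uri.partition("://")
        # Ignore strings without a scheme and schemes with no known root.
        if sep and scheme in MOUNT_FOR_SCHEME:
            needed.add(MOUNT_FOR_SCHEME[scheme])
    return sorted(needed)
```

A task whose artifacts all live under `minio://` would then get a single `/minio` mount instead of three.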

andraxin · 2024-04-14T11:28:20Z

This will not solve the Permission denied error reported in #10397,
unless a version of the v2 launcher that includes this fix is used.

BTW: where/how do I tell Kubeflow which launcher image to use?

That's because the root cause of the original error is that the launcher creates
(or, now that the fix exists, used to create) directories with mode 0644, i.e. drw-r--r--.
Without the execute (search) bit, unprivileged users cannot create subdirectories
inside them, while root is not affected...

Brief Demo

$ docker run --rm -it alpine
/ # adduser -D demo
/ # su - demo
05decfa077fd:~$ mkdir -m 0644 demo
05decfa077fd:~$ mkdir demo/test
mkdir: can't create directory 'demo/test': Permission denied
05decfa077fd:~$ 
/ # mkdir /home/demo/demo/test
/ # ls -lR /home/demo
/home/demo:
total 4
drw-r--r--    3 demo     demo          4096 Apr 14 11:16 demo

/home/demo/demo:
total 4
drwxr-xr-x    2 root     root          4096 Apr 14 11:16 test

/home/demo/demo/test:
total 0
/ #

BTW, that also explains why creating alternate images with world-read-write
/minio pre-created (i.e. RUN mkdir -m 1777 /minio) doesn't help.

I tracked this down by mounting a PVC at /minio and observing permissions
on the first level (that does get created).
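
The permission rule behind the demo can be checked without creating anything on disk. The sketch below only inspects the owner's mode bits; `can_create_entries` is a hypothetical helper, and it deliberately ignores root, which bypasses these checks (as the demo shows).

```python
import stat


def can_create_entries(mode):
    """True if the owner may add entries to a directory with this mode.

    Creating a file or subdirectory requires both write and search
    (execute) permission on the parent directory.
    """
    return bool(mode & stat.S_IWUSR) and bool(mode & stat.S_IXUSR)


# Print the ls-style mode string next to the result for a few modes.
for mode in (0o644, 0o755, 0o1777):
    print(stat.filemode(stat.S_IFDIR | mode), can_create_entries(mode))
```

Mode 0644 has the write bit but not the search bit, so the check fails for the owner even though the directory looks writable.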

rimolive · 2024-04-15T20:07:37Z

> unless a version of the v2 launcher that includes this fix is used.

This fix was introduced in KFP 2.0.2

andraxin · 2024-04-16T23:25:20Z

> > unless a version of the v2 launcher that includes this fix is used.
>
> This fix was introduced in KFP 2.0.2

Yes, I'm well aware. GitHub shows that quite prominently.

fix(backend): OutPutPath dir creation mode Fixes https://github.com/kubeflow/pipelines/issues/7629 (https://github.com/kubeflow/pipelines/pull/9946)
[master](https://github.com/kubeflow/pipelines) ([#9946](https://github.com/kubeflow/pipelines/pull/9946))
[sdk-2.7.0](https://github.com/kubeflow/pipelines/releases/tag/sdk-2.7.0)
[2.0.2](https://github.com/kubeflow/pipelines/releases/tag/2.0.2)

BUT, that doesn't really help me or answer my question.
The launcher that's spawned by default in the Kubeflow v1.7.x
deployment I have to use apparently pre-dates that; and,
I have no idea how/if Kubeflow can be instructed to grab
a newer launcher with the fix.

rimolive · 2024-04-17T01:17:11Z

Kubeflow 1.7 comes with an alpha release of KFP. You should either fully upgrade your Kubeflow installation to 1.8, or upgrade KFP individually. Just a reminder that if you choose the latter, you need to use the multi-user manifests, as you probably deployed Kubeflow with multi-user support.

andraxin · 2024-04-17T02:08:46Z

> Kubeflow 1.7 comes with an alpha release of KFP. You should either fully upgrade your Kubeflow installation to 1.8, or upgrade KFP individually. Just a reminder that if you choose the latter, you need to use the multi-user manifests, as you probably deployed Kubeflow with multi-user support.

I guess I wasn't sufficiently clear, when I said I have to use Kubeflow 1.7.
I cannot upgrade the installation nor apply any manifests at the cluster level.
Applying manifests in my profile's namespace is about as far as I can reach.

My re-stated question becomes: as a Kubeflow user, can I somehow instruct
Kubeflow to use a newer launcher image to run my pipelines?

BTW, I can use custom images for my pipeline stages with world-read-write
/gcs, /minio, and /s3, which renders these emptyDir volume mounts moot.
If I could only tell Kubeflow to use a launcher with the fix, that is.

rimolive · 2024-04-17T11:11:48Z

> My re-stated question becomes: as a Kubeflow user, can I somehow instruct
> Kubeflow to use a newer launcher image to run my pipelines?

If, and only if, you upgrade your KFP installation to at least the 2.0.5 release, you can parametrize the launcher and driver images.

andraxin · 2024-04-17T17:10:51Z

> > My re-stated question becomes: as a Kubeflow user, can I somehow instruct
> > Kubeflow to use a newer launcher image to run my pipelines?
>
> If, and only if, you upgrade your KFP installation to at least the 2.0.5 release, you can parametrize the launcher and driver images.

I see. Thanks for the info, and your patience! 😏

Given that the code changes you linked appear to be
in backend Golang code, what you call "KFP installation"
probably includes more than just pip install kfp==2.0.2
inside a notebook in the existing Kubeflow 1.7 cluster.
Is that correct?

rimolive · 2024-04-17T17:38:31Z

Yes, this is not a simple SDK upgrade or code change. You will need to upgrade your entire KFP installation.

majuss · 2024-05-29T12:13:59Z

@rimolive can you please describe a hotfix for using this patch with Kubeflow 1.8.1? For now, KFP is not usable :(

rimolive · 2024-05-29T16:54:39Z

> @rimolive can you please describe a hotfix for using this patch with Kubeflow 1.8.1? For now, KFP is not usable :(

KFP 2.0.5 should already have this fix, but this issue is different (mount emptyDir volumes in launcher) and we don't have the PR merged yet.

thesuperzapper · 2024-05-29T20:35:46Z

@chensun @HumairAK @Tomcli we definitely need to prioritize fixing this issue, because it's pretty bad to have a hard requirement on root container images; see #10397 for an example of a user having this problem.

droctothorpe · 2024-05-30T19:47:17Z

FYI, this comment describes a short term solution (that depends on kyverno): #10397 (comment)

HumairAK · 2024-05-31T19:06:28Z

@gregsheremeta is currently out,

due to the expressed urgency and prioritization of this, I'm picking this PR up here: #10857

please forward any further discussions to the pr above

I'll bring this up in the next KFP call as well if it is not merged by then

/close

google-oss-prow · 2024-05-31T19:06:32Z

@HumairAK: Closed this PR.

In response to this:

> @gregsheremeta is currently out,
>
> due to the expressed urgency and prioritization of this, I'm picking this PR up here: #10857
>
> please forward any further discussions to the pr above
>
> /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

google-oss-prow bot added size/L needs-ok-to-test labels Mar 4, 2024

google-oss-prow bot requested review from chensun and Tomcli March 4, 2024 21:50

gregsheremeta mentioned this pull request Mar 4, 2024

[sdk] Can't use create_component_from_func with pip packages when running as non-root #7345

Closed

Tomcli reviewed Mar 4, 2024

google-oss-prow bot added ok-to-test and removed needs-ok-to-test labels Mar 4, 2024

Tomcli suggested changes Mar 4, 2024

This was referenced Mar 6, 2024

[backend] Unable to create directory in Minio when using Artifacts: Permission denied #10397

Closed

[feature] Configuring securityContext at pod level #9783

Open

HumairAK mentioned this pull request May 31, 2024

feat(backend): mount EmptyDir volumes for launcher write locations #10857

Merged

google-oss-prow bot closed this May 31, 2024
