[feature] Seek alternative for object store solution other than MinIO #7878
Comments
I would clarify it a bit more
|
@zijianjoy , is the MinIO license change to AGPLv3 the only motivation for looking for alternatives? As you wrote,
|
MinIO is only used as a storage gateway. This feature has been deprecated in MinIO for over a year: https://blog.min.io/deprecation-of-the-minio-gateway/ To me, the simplest replacement is to use an actual S3 client (hopefully an S3-compatible one, because not everyone is on AWS). |
No, your assessment is not correct. MinIO usage differs by distribution, and the most important default one used by Kubeflow is NOT the gateway mode. We need a reliable storage and gateway replacement. For further information, read the whole conversation in #7725. |
My bad, I wasn't clear. |
I agree with @streamnsight - the |
This issue here is about the server, not the client. |
"For the vanilla distro that still requires a in-cluster storage solution, Minio S3 compat still does the job, and it's only a matter of pointing the S3 client to it." No, that is the reason for this issue. We need a server side replacement. |
Currently there is a dependency between the server and the client: the server cannot be changed easily without client modifications. The first step is to replace the client with a generic S3-compatible client, without changing the server. After that, we can replace the server with any object store that supports the S3 API. Why do you think that the MinIO server needs to be replaced? According to the issue description,
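The two-step plan above can be sketched as follows: if the client side depends only on generic S3 settings, swapping the server reduces to a configuration change. This is a minimal illustration, not KFP's actual code. `ObjectStoreConfig`, `make_client`, and the example endpoint are hypothetical names; the only real API assumed is `boto3.client("s3", ...)` with its `endpoint_url`, credential, and `region_name` arguments, plus botocore's `addressing_style` option.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class ObjectStoreConfig:
    """Hypothetical settings for any S3-compatible server (not KFP's schema)."""
    endpoint: str             # e.g. "http://object-store.example:9000" (made up)
    access_key: str
    secret_key: str
    bucket: str
    region: str = "us-east-1"
    path_style: bool = True   # in-cluster servers often lack per-bucket DNS names

    def client_kwargs(self) -> dict:
        """Keyword arguments in the shape boto3.client("s3", ...) accepts."""
        return {
            "endpoint_url": self.endpoint,
            "aws_access_key_id": self.access_key,
            "aws_secret_access_key": self.secret_key,
            "region_name": self.region,
        }

    def object_url(self, key: str) -> str:
        """Object URL under path-style or virtual-hosted-style addressing."""
        scheme, host = self.endpoint.split("://", 1)
        host = host.rstrip("/")
        if self.path_style:
            return f"{scheme}://{host}/{self.bucket}/{quote(key)}"
        return f"{scheme}://{self.bucket}.{host}/{quote(key)}"


def make_client(cfg: ObjectStoreConfig):
    """Build an S3 client for cfg; needs boto3, imported lazily on purpose."""
    import boto3
    from botocore.config import Config

    style = "path" if cfg.path_style else "virtual"
    return boto3.client(
        "s3",
        config=Config(s3={"addressing_style": style}),
        **cfg.client_kwargs(),
    )
```

With this shape, moving from an in-cluster server to another S3-compatible backend would mean constructing a different `ObjectStoreConfig`. Code that calls `make_client` would stay unchanged, assuming the backend implements the subset of the S3 API that the pipeline actually uses.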
@tzstoyanov Please also read everything from #7725 to fully understand the long-term goals for the artifact storage within Kubeflow. You can already use the minio server with most S3 clients, but yes, maybe it is problematic to use the minio client for other S3 servers. We are always looking for volunteers. I already mentored several people that now have their first commits to the Kubeflow project. |
@juliusvonkohout, I looked at the comments on #7725. Thanks for pointing me to that; now I have a better idea of the problem and the work you did in the context of that PR. I have a few questions and would be happy if you or someone from the community could answer:
I can contribute to that, implement the |
@tzstoyanov please reach out on LinkedIn or Slack for discussion. I am already working with one of your colleagues, @difince: https://kccnceu2023.sched.com/event/1HyY8/hardening-kubeflow-security-for-enterprise-environments-julius-von-kohout-dhl-deutsche-telekom-diana-dimitrova-atanasova-vmware Maybe these are lower-hanging fruit: kubeflow/kubeflow#7032 (comment). We really need to focus on what to work on first, because getting stuff into KFP is difficult. |
Do we have any update on this? :) |
cubefs looks promising. |
@gsoec can you articulate why cubefs looks promising? I think we need an assessment similar to what @juliusvonkohout did in #7878 (comment). Based on that, comparing with ceph-rook appears to be most relevant. From what I see here, the latter appears to be the favorable solution... |
@lehrig please take a look at the last comments of #7725. I think we need to use Istio or something else for the authentication part, since only a few S3 providers fully support enterprise-level user management and authorization. Furthermore, we could then plug in any basic S3-compatible storage backend and get rid of passwords and the necessary rotation altogether. |
So that you all know, sticking to this release has three different security vulnerabilities that are high or critical. So, anyone running this release would be exposed and vulnerable to these vulnerabilities, potentially causing data loss. What does it take for Kubeflow to upgrade the MinIO container version? - I can help. |
AGPL is not allowed, so Google denied an update. Please join the meetings listed at https://www.kubeflow.org/docs/about/community/ to discuss it, especially the KFP meeting. |
Another alternative could be https://ozone.apache.org |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
not stale |
it might be suitable for the multi bucket approach https://ozone.apache.org/docs/1.3.0/feature/s3-multi-tenancy-access-control.html |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
/lifecycle frozen We are actively working on this. |
Feature Area
/area backend
/area frontend
What is the use case or pain point?
Currently, KFP uses MinIO as the default object store for Artifact payloads and Pipeline templates. However, MinIO fully changed its license to AGPLv3 in 2021. This change has prevented KFP from upgrading to the latest version of MinIO: using AGPL software requires that anything it links to must also be licensed under the AGPL. Since we are not able to adopt the latest changes from MinIO, we are seeking alternatives to replace it in future KFP deployments.
What feature would you like to see?
This new object store solution should have the following properties:
Exploration
We are currently considering the following options, and we would like to hear your opinions as well.
cc @chensun @IronPan @james-jwu
Love this idea? Give it a 👍. We prioritize fulfilling features with the most 👍.