Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Question] Motivation behind combining requirements.txt files for GenAIComps 1.2 #1260

Open
aMahanna opened this issue Feb 5, 2025 · 3 comments

Comments

@aMahanna
Copy link

aMahanna commented Feb 5, 2025

Hello,

In [email protected], microservice development allowed each service vendor the ability to define a custom requirements.txt to specify desired packages.

Now with [email protected], it looks like this is no longer the case: https://github.com/opea-project/GenAIComps/tree/v1.2/comps/dataprep/src

  • i.e dependencies from both Milvus and QDrant are added to the same requirements.txt file.

Can someone shed light on the motivation behind this design decision?

AFAIK, this approach could involve risk of encountering conflicting dependencies, and prevents individual vendors to provide a custom dependency list.

@eero-t
Copy link
Contributor

eero-t commented Feb 7, 2025

OPEA integrates enough SW that having nearly identical copies of everything (Python code, Docker files, integration scripts, docs etc) for every variant of every component, and each of them requiring its own image everywhere (DockerHub, CIs for OPEA), and corresponding Helm chart, was becoming a maintenance problem.

Same image/service supports now multiple backends, which allows reducing that duplication to more manageable amount.

AFAIK, this approach could involve risk of encountering conflicting dependencies,

If there's a component that does not work together with the other components, I would assume it can still be split.

and prevents individual vendors to provide a custom dependency list.

Why they could not still do, especially if they're using only one of the backends? Especially if service would import only [1] the configured backend...

[1] Currently service imports all (9) of them, although only one of them is used. While that can be useful for checking that the dep versions specified in requirements.txt can be imported together, that's a bit of a corner-case, and can significantly slow down the service startup. Maybe you'd like to file a separate bug about that?

PS. While combining requirements together increases size of individual images somewhat, reduction in number of images more than compensates for that. In most cases the additional deps should not increase they image size that much (large part of the size should come from the shared deps).

@aMahanna
Copy link
Author

aMahanna commented Feb 7, 2025

@eero-t Thank you for your reply.

So just to confirm:

Assume a new Dataprep component is introduced, and package X must be added to the central requirements.txt file.

This package X depends on numpy < 2.0.0.

However, other packages that already exist in requirements.txt depend on numpy > 2.0.0

How would you recommend we proceed in this case? Create a separate requirements.txt file and include it as part of the PR?

@eero-t
Copy link
Contributor

eero-t commented Feb 7, 2025

While I saw the problems that the OPEA changes try to address, I'm not OPEA developer myself, just a contributor, so these are just my own thoughts on the matter...

If new component requires significantly older version of a major component like numpy, I think there would some questions about whether that component is still maintained. If it's not maintained, it's not that likely to get in. If it is, there would be question of why it's still at so old version / are there plans to update it.

If the reasons are valid and not likely to go away in near future, there indeed needs to a split if it's merged. New requirements file is not enough, there would need to Dockerfile to build image with the different set of deps, CI integration for testing it etc, which is quite a bit longer process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants