Create huggingface integration test example for v0.12.1 #235

Open
misohu opened this issue May 22, 2024 · 2 comments · May be fixed by #298
Labels
enhancement New feature or request

Comments

@misohu
Member

misohu commented May 22, 2024

Context

With v0.12.1, the huggingface ClusterServingRuntime was introduced. We should add an integration test for this runtime.

What needs to get done

Create an example YAML for the huggingfaceserver serving runtime, similar to the other examples.

Definition of Done

  1. The huggingfaceserver example is part of the integration tests.
@misohu misohu added the enhancement New feature or request label May 22, 2024

Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5727.

This message was autogenerated

@misohu
Member Author

misohu commented Jan 29, 2025

First I followed the KServe documentation and tried to deploy llama3 with the huggingface inference service. This model, though, requires 25 GB of space plus a GPU.

After that I played around and found a smaller model on Hugging Face, which I ran successfully locally. Here is the PR: #298. The problem with this test is that, to run it, we need to deploy a pod with the huggingface image to the runner, and that image is almost 5 GB (almost 8 GB for the rock). This is a problem for the runner, as the image can't fit.
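
For reference, an example for this runtime might look roughly like the sketch below. This is not the manifest from #298. It assumes the `huggingface` model format is the one registered by the ClusterServingRuntime; the resource name, `--model_name` value, model id, and resource requests are all placeholders chosen for illustration.

```yaml
# Minimal sketch of an InferenceService for the huggingface runtime.
# All names and values below are illustrative placeholders.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: huggingface-example            # hypothetical name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface              # assumed to match the ClusterServingRuntime
      args:
        - --model_name=example         # hypothetical served model name
        - --model_id=<small-model-id>  # placeholder: a small Hugging Face model id
      resources:
        requests:
          cpu: "1"                     # assumed values for a CPU-only runner
          memory: 2Gi
        limits:
          cpu: "1"
          memory: 2Gi
```

Note that a small model only reduces the model download. The server image itself still has to be pulled onto the runner, which is the size constraint described above.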
