VESSL AI LLMProvider integration #17414

Open · wants to merge 15 commits into main

Conversation

@nsd9696 commented Jan 3, 2025

Description

  • Integrates the VESSL AI LLMProvider for use in LlamaIndex.
  • The VESSL AI provider is served through vLLM and exposes an OpenAI-compatible API.
  • Users can serve their own Hugging Face model by providing 1) a model name or 2) a VESSL YAML file, or 3) connect to a pre-served VESSL LLM service endpoint.
  • Example
from llama_index.llms.vesslai import VesslAILLM

llm = VesslAILLM()

# 1. Serve with an HF model name
llm.serve(
    service_name="llama-index-vesslai",
    model_name="mistralai/Mistral-7B-Instruct-v0.3",
    hf_token="HF_TOKEN",
    api_key="openai-api-key",
)

# 2. Serve with a YAML file
llm.serve(
    service_name="llama-index-vesslai",
    yaml_path="/users/own/vessl/service.yaml",
    api_key="openai-api-key",
)

# 3. Connect to a pre-served endpoint
llm.connect(
    served_model_name="mistralai/Mistral-7B-Instruct-v0.3",
    endpoint="https://model-service-gateway-abc.oregon.google-cluster.vessl.ai/v1",
)

resp = llm.complete("Who is Paul Graham?")

Fixes # (issue)

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

  • Yes
  • No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

  • Yes
  • No

Type of Change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Your pull request will likely not be merged unless it is covered by some form of impactful unit testing.

  • I added new unit tests to cover this change
  • I believe this change is already covered by existing unit tests

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran make format; make lint to appease the lint gods


@dosubot (bot) added the size:XL label (This PR changes 500-999 lines, ignoring generated files) on Jan 3, 2025
Collaborator

no need to commit this file

Collaborator

should write an actual readme (see other llms, should show the install and basic usage)

Author

I checked the sources of other llms, and most of them seem to be the same as "python_sources()". What specific details need to be included?

Collaborator

Sorry, I left the comment on the wrong file, this was intended for the README.md file 😅


        self.organization_name = organization_name

    def serve(
Collaborator

Curious about the decision to do serve and connect outside of the __init__() function? Do your users often switch this after the llm object is created? In most llama-index LLMs, you would just do llm = VesslAILLM(...) and from there you can use it directly.

Collaborator

It's fine either way tbh, was just curious

Author

Thank you for the feedback. Using VESSL requires authentication through configure, so I wanted to handle that during initialization and explicitly separate serving and connecting the llm_provider afterward. We have discussed this flow internally at VESSL, and it seems fine.
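
For illustration, the flow described above might look roughly like this (a simplified standalone sketch, not the PR's actual code; it assumes the vessl SDK exposes a configure() entry point for authentication):

import vessl  # assumption: the VESSL Python SDK


class VesslAILLM:
    def __init__(self, organization_name: str | None = None, **kwargs) -> None:
        # Authenticate once at construction time; serving/connecting is
        # deferred to explicit serve()/connect() calls.
        vessl.configure()  # assumed auth entry point in the vessl SDK
        self.organization_name = organization_name

    def serve(self, service_name: str, **kwargs) -> None:
        """Provision a vLLM-backed service, then target its endpoint."""
        ...

    def connect(self, served_model_name: str, endpoint: str) -> None:
        """Attach to an already-running OpenAI-compatible endpoint."""
        ...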

llm = VesslAILLM()

#1 Serve hf model name
llm.serve(
Collaborator

Thoughts on making serve and connect async? Seems like this could be a blocking operation with wait_for_gateway_enabled?
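
For illustration, one low-effort way to get an async variant without reworking the internals would be to push the blocking call onto a worker thread (a sketch; aserve is a hypothetical name, not part of this PR):

import asyncio


async def aserve(llm, **serve_kwargs):
    # Run the blocking serve() (including wait_for_gateway_enabled)
    # in a worker thread so the event loop stays responsive.
    await asyncio.to_thread(llm.serve, **serve_kwargs)

# Usage:
#   await aserve(llm, service_name="llama-index-vesslai", yaml_path="service.yaml")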

Collaborator

Let's put example notebooks in docs/docs/examples/llms/

print(f"The service {service_name} is currently rolling out.")
if _request_abort_rollout(service_name):
print("Waiting for the existing rollout to be aborted...")
time.sleep(30)
Collaborator

ouch. Another vote to have async imo
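
If serve did become a coroutine, the fixed 30-second sleep above could turn into a non-blocking poll, roughly like this (a sketch; _is_rollout_aborted is a hypothetical status check, not part of this PR):

import asyncio


async def _wait_for_abort(service_name: str, interval: float = 5.0, timeout: float = 300.0) -> None:
    # Poll the rollout status instead of sleeping a fixed 30 seconds,
    # yielding to the event loop between checks.
    deadline = asyncio.get_running_loop().time() + timeout
    while not _is_rollout_aborted(service_name):  # hypothetical helper
        if asyncio.get_running_loop().time() > deadline:
            raise TimeoutError(f"Abort of rollout for {service_name} timed out")
        await asyncio.sleep(interval)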

Collaborator

There's quite a lot of code; is any of it testable? (You'd have to mock out the API calls, though.)
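
For example, the constructor and connect path could be exercised with the network mocked out, along these lines (a sketch; the patch target and attribute names are assumptions about the package layout, not its actual API):

from unittest import mock

from llama_index.llms.vesslai import VesslAILLM


def test_connect_sets_endpoint():
    # Patch the SDK auth call so the constructor needs no real credentials.
    with mock.patch("llama_index.llms.vesslai.base.vessl.configure"):  # assumed target
        llm = VesslAILLM()
        llm.connect(
            served_model_name="mistralai/Mistral-7B-Instruct-v0.3",
            endpoint="https://example.vessl.ai/v1",
        )
        # Attribute name below is illustrative.
        assert llm.served_model_name == "mistralai/Mistral-7B-Instruct-v0.3"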
