fix lmi/vllm virtual envs, update to vllm 0.7.1 #2703
Conversation
@@ -21,16 +21,11 @@
    resolve_chat_template_content_format)

def is_chat_completions_request(inputs: Dict) -> bool:
deleted because it's not used
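For context, the hunk only shows the signature, but helpers like this typically just detect the OpenAI-style payload shape. A minimal sketch of what the removed function presumably did (a reconstruction, not the original body):

from typing import Dict

# Hypothetical reconstruction: chat completions requests are usually
# identified by the presence of a "messages" field, which plain
# completions-style requests do not carry.
def is_chat_completions_request(inputs: Dict) -> bool:
    return "messages" in inputs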
@@ -41,12 +36,6 @@ def parse_chat_completions_request_vllm(
    "You must enable rolling batch to use the chat completions format."
)

if not is_mistral_tokenizer and not hasattr(tokenizer,
deleted because the vllm utils do this validation for us already
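For reference, this is roughly the check vLLM's chat utils now perform for us (a paraphrase, not vLLM's actual source; the function name here is illustrative):

# Sketch of the validation vLLM performs when resolving a chat template:
# it raises if the tokenizer defines no template and no override is
# given, which is what made the hasattr check here redundant.
def resolve_template(tokenizer, chat_template=None):
    template = chat_template or getattr(tokenizer, "chat_template", None)
    if template is None:
        raise ValueError(
            "Cannot apply chat template: the tokenizer does not define "
            "one and no override was provided.")
    return template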
git reset --hard 4b2092c
$venv_pip install .
cd ..
rm -rf AutoFP8
Do we not need FP8 installation?
not anymore! we're using llm compressor now #2701
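For anyone landing here later: with llm-compressor, FP8 checkpoints are produced through its one-shot API instead of installing AutoFP8 from source. A sketch based on llm-compressor's documented usage (the model name and save path are examples only):

from transformers import AutoModelForCausalLM
from llmcompressor.modifiers.quantization import QuantizationModifier
from llmcompressor.transformers import oneshot

# Example model; any HF causal LM is handled the same way.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct", torch_dtype="auto")

# Dynamic FP8 on all Linear layers, skipping the LM head -- the same
# scheme AutoFP8 used to provide.
recipe = QuantizationModifier(
    targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"])

oneshot(model=model, recipe=recipe)
model.save_pretrained("Meta-Llama-3-8B-Instruct-FP8-Dynamic")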
Force-pushed from d43e822 to 16cc16a
Description
This change updates to vLLM 0.7.1, which involves shuffling some dependencies around and being less strict with dependency versions.
Additionally, it updates the chat processing for vLLM so that it is functional. There is still a good amount we need to implement for chat processing, which I'll take up in a follow-up PR:
I have tested this with (single test for each):
I also added a chat test for Mistral with vLLM.
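For context, the chat completions format referenced above is the OpenAI-style payload shape; an illustrative request (field values are examples only):

# Illustrative payload; this is the general shape the chat processing
# path (parse_chat_completions_request_vllm) consumes.
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is Deep Java Library?"},
    ],
    "max_tokens": 256,
    "temperature": 0.7,
}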
Type of change
Checklist:
pytest tests.py -k "TestCorrectnessLmiDist" -m "lmi_dist"
Feature/Issue validation/testing
Please describe the Unit or Integration tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.