✨ Deepsparse Backend implementation #29
Open
parfeniukink wants to merge 24 commits into main from parfeniukink/features/deepsparse-backend
24 commits
c3abc8d  WIP
a6d9a05  ✅ Tests are fixed
d116c0c  📌 deepsparse is added to dependencies
c000dbf  ✨ deepsparse backend integration is added
52e1d3b  deepsparse package limitations are applied
7218795  ⚰️ removed `pytest.mark.asyncio()` due to pytest-asyncio module
a5357ca  📝 fixed class example
68381a5  🧵 rollback `pytest.mark.asyncio` fixtures
5acb3a8  ✨ Deepsparse Backend integration first implementation
45e07d0  code quality is provided
1753469  Merge branch 'main' into parfeniukink/features/deepsparse-backend
1f1e038  fit Deepsparse Backend to work with new Backend abstraction
ce1c3ba  🔧 `GUIDELLM__LLM_MODEL` shared across all the backends
8e88bae  Test emulated data source constant -> settings value
75e708b  💄 mdformat is happy
3c03961  Merge branch 'main' into parfeniukink/features/deepsparse-backend
913253f  ✅ Tests are fixed according to a new Backend base implementation
e376ed9  🔨 tox tests include `deepsparse` dependency
3a2c6c1  🏷️ Type annotations are added
74a6dfd  🐛 Assert with config values instead of constants
1a53951  📌 .[deepsparse] dependency is skipped if Python>3.11
39ffcb3  🚚 DeepsparseBackend is moved to a another module
29e38e4  ✅ Deepsparse tests are ignored if Python>=3.12
4b3b4b5  💚 Linters are happy
@@ -0,0 +1,155 @@
from typing import Any, Dict, Generator, Optional, Type

import pytest
from pydantic import BaseModel

from guidellm.backend import Backend, DeepsparseBackend
from guidellm.config import reload_settings
from guidellm.core import TextGenerationRequest
from guidellm.utils import random_strings


class TestDeepsparseTextGeneration(BaseModel):
    """The representation of a deepsparse data structure."""

    text: str


class TestTextGenerationPipeline:
    """Deepsparse TextGeneration test interface.

    By default this class generates 10 text responses.

    This class includes additional development information
    for a better testing experience.

    The `__call__` method mocks the result object that comes from
    `deepsparse.pipeline.Pipeline()`, so everything is encapsulated here.

    :param self._generations: dynamic representation of generated responses
        from the deepsparse interface.
    """

    def __init__(self):
        self._generations: list[TestDeepsparseTextGeneration] = []
        self._prompt: Optional[str] = None
        self._max_new_tokens: Optional[int] = None

    def __call__(
        self, *_, prompt: str, max_new_tokens: Optional[int] = None, **kwargs
    ) -> Any:
        """Mock the result from `deepsparse.pipeline.Pipeline()()`.

        Sets the reserved request arguments on call.
        """

        self._prompt = prompt
        self._max_new_tokens = max_new_tokens

        return self

    @property
    def generations(self) -> Generator[TestDeepsparseTextGeneration, None, None]:
        for text in random_strings(
            min=10, max=50, n=self._max_new_tokens if self._max_new_tokens else 10
        ):
            generation = TestDeepsparseTextGeneration(text=text)
            self._generations.append(generation)
            yield generation


@pytest.fixture(autouse=True)
def mock_deepsparse_pipeline(mocker):
    return mocker.patch(
        "deepsparse.Pipeline.create",
        return_value=TestTextGenerationPipeline(),
    )


@pytest.mark.smoke()
@pytest.mark.parametrize(
    "create_payload",
    [
        {},
        {"model": "test/custom_llm"},
    ],
)
def test_backend_creation(create_payload: Dict):
    """Test the Deepsparse Backend class
    with default and custom input parameters.
    """

    backends: list[DeepsparseBackend] = [
        Backend.create("deepsparse", **create_payload),
        DeepsparseBackend(**create_payload),
    ]

    for backend in backends:
        assert backend.pipeline
        if custom_model := create_payload.get("model"):
            assert backend.model == custom_model
        else:
            assert backend.model == backend.default_model


@pytest.mark.smoke()
def test_backend_model_from_env(mocker):
    mocker.patch.dict(
        "os.environ",
        {"GUIDELLM__DEEPSPRASE__MODEL": "test_backend_model_from_env"},
    )

    reload_settings()

    backends: list[DeepsparseBackend] = [
        Backend.create("deepsparse"),
        DeepsparseBackend(),
    ]

    for backend in backends:
        assert backend.model == "test_backend_model_from_env"


@pytest.mark.smoke()
@pytest.mark.parametrize(
    "text_generation_request_create_payload",
    [
        {"prompt": "Test prompt"},
        {"prompt": "Test prompt", "output_token_count": 20},
    ],
)
@pytest.mark.asyncio()
async def test_make_request(text_generation_request_create_payload: Dict):
    backend = DeepsparseBackend()

    output_tokens: list[str] = []
    async for response in backend.make_request(
        request=TextGenerationRequest(**text_generation_request_create_payload)
    ):
        if response.add_token:
            output_tokens.append(response.add_token)
    assert "".join(output_tokens) == "".join(
        generation.text for generation in backend.pipeline._generations
    )

    if max_tokens := text_generation_request_create_payload.get("output_token_count"):
        assert len(backend.pipeline._generations) == max_tokens


@pytest.mark.smoke()
@pytest.mark.parametrize(
    "text_generation_request_create_payload,error",
    [
        ({"prompt": "Test prompt", "output_token_count": -1}, ValueError),
    ],
)
@pytest.mark.asyncio()
async def test_make_request_invalid_request_payload(
    text_generation_request_create_payload: Dict, error: Type[Exception]
):
    backend = DeepsparseBackend()
    with pytest.raises(error):
        async for _ in backend.make_request(
            request=TextGenerationRequest(**text_generation_request_create_payload)
        ):
            pass
Does this mean deepsparse is a dep of guidellm? We should keep it optional at most IMO, so could this be in a try catch with an informational message to install?
@mgoin Could you please check this file? Is this the kind of validation you are talking about?
So, since `backend/deepsparse/__init__.py` exists, we treat it as a module, so before you reach `deepsparse/backend.py` you are going to run this validation.

Also, all the imports look like `from guidellm.backend import Backend, BackendEngine`, but if you need deepsparse specifically you use `from .guidellm.backend.deepsparse import DeepsparseBackend`, which runs this validation.

Also, `deepsparse` is an optional dependency; at least it is in the optional section.