add dynamic clients for all APIs #348
Conversation
```diff
@@ -239,7 +247,9 @@ async def chat_completion(
     response_format: Optional[ResponseFormat] = None,
     stream: Optional[bool] = False,
     logprobs: Optional[LogProbConfig] = None,
-) -> Union[ChatCompletionResponse, ChatCompletionResponseStreamChunk]: ...
+) -> Union[
```
FINALLY! We actually type-hint it the way it is supposed to be.
This might become an issue with our OpenAPI generator, but we will fix that downstream. Our source must always be correct.
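For reference, here is a sketch of what the fully typed protocol method could look like (my reconstruction: the import paths and the continuation of the truncated `Union[` in the hunk above are assumptions, not the PR's actual code):

```python
from typing import AsyncIterator, Optional, Protocol, Union

# Assumption: these types are importable from llama_stack.apis.inference;
# the exact module paths may differ.
from llama_stack.apis.inference import (
    ChatCompletionResponse,
    ChatCompletionResponseStreamChunk,
    LogProbConfig,
    ResponseFormat,
)


class Inference(Protocol):
    async def chat_completion(
        self,
        # earlier parameters of the full signature elided, as in the hunk above
        response_format: Optional[ResponseFormat] = None,
        stream: Optional[bool] = False,
        logprobs: Optional[LogProbConfig] = None,
    ) -> Union[
        ChatCompletionResponse,
        # Assumption: the truncated `Union[` continues with the streaming
        # iterator type, matching the "type-hint it properly" comment.
        AsyncIterator[ChatCompletionResponseStreamChunk],
    ]: ...
```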
```python
return APIClient
```

```python
async def example(model: str = None):
```
I will be deleting this code later since someone is going to stumble on this and use it inadvertently again
(force-pushed from 18ae0d9 to 386372d)
docs/resources/llama-stack-spec.html (Outdated)
"content": { | ||
"text/event-stream": { | ||
"schema": { | ||
"$ref": "#/components/schemas/AgentTurnResponseStreamChunk" | ||
"$ref": "#/components/schemas/Turn" |
this looks bad though, uh oh, need to check
What does this PR do?
We have frequently bit-rotten (`apis/<api>/client.py`) files. They have two drawbacks:
This PR is the first step towards killing these hand-written implementations. It dynamically creates Client classes for each API protocol and registers appropriate methods based on type introspection.
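A rough sketch of the idea (my illustration, with assumed names such as `create_api_client`; not the PR's actual code): walk a protocol's methods with `inspect` and attach a generated async HTTP stub for each one to a dynamically created client class.

```python
import inspect

import httpx


def create_api_client(protocol: type, base_url: str) -> type:
    """Build a client class for an API protocol by introspecting its methods."""

    def make_method(name: str, sig: inspect.Signature):
        async def method(self, *args, **kwargs):
            # Bind the caller's arguments against the protocol signature so
            # the request payload uses the declared parameter names.
            bound = sig.bind(self, *args, **kwargs)
            bound.apply_defaults()
            payload = {k: v for k, v in bound.arguments.items() if k != "self"}
            async with httpx.AsyncClient(base_url=self.base_url) as http:
                resp = await http.post(f"/{name}", json=payload)
                resp.raise_for_status()
                # A real implementation would deserialize into the method's
                # annotated return type and handle `text/event-stream` replies.
                return resp.json()

        method.__name__ = name
        return method

    def __init__(self):
        self.base_url = base_url

    namespace = {"__init__": __init__}
    for name, member in inspect.getmembers(protocol, predicate=inspect.isfunction):
        if not name.startswith("_"):
            namespace[name] = make_method(name, inspect.signature(member))

    return type(f"{protocol.__name__}Client", (), namespace)
```

The real implementation also has to honor the return annotations, including the streaming `AsyncIterator` case, which is where the stricter type hints above start paying off.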
Test Plan
First, I ran an ollama server (`ollama run llama3.2:3b-instruct-fp16`) and then started a Llama Stack server using `--template ollama` on port 5003.

Then I set up the following yaml for testing:
Then I ran the following set of tests:
Then I modified the config.yaml to be:
And ran the following tests:
This test did not fully pass, due to an unexpected model response from the ollama 3b-instruct llama model w.r.t. tool calling, but the tests which didn't exercise tool calling passed.
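As a closing illustration (hypothetical code building on the `create_api_client` sketch above; the actual test yaml and commands are not shown in this excerpt), a dynamically generated client could be pointed at the stack from the test plan like so:

```python
import asyncio

from llama_stack.apis.inference import Inference  # assumed import path


async def main() -> None:
    # Port 5003 matches the Llama Stack server from the test plan.
    InferenceClient = create_api_client(Inference, "http://localhost:5003")
    client = InferenceClient()
    response = await client.chat_completion(
        model="llama3.2:3b-instruct-fp16",
        messages=[{"role": "user", "content": "What is 2 + 2?"}],
        stream=False,
    )
    print(response)


asyncio.run(main())
```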