tool-call: refactor common chat / tool-call api (+ tests / fixes) #11900
Conversation
struct common_chat_tool {
    std::string name;
    std::string description;
    std::string parameters;
Working through the MCP implementation I added a similar structure to this one. It would be good to normalize the two. Here's what I ended up adding:
struct tool {
    struct param {
        std::string name;
        std::string type;
        std::string description;
    };
    std::string tool_name;
    std::string tool_description;
    std::vector<param> params;
    std::vector<std::string> required_params;
};
I think instead of having `parameters` as an opaque JSON object it would be worthwhile to expand it out into a vector of parameters (or something similar), so that the entire structure is accessible without JSON, making the conversion from MCP to openai-compat more seamless.
However, it is not that difficult to convert the structure above to the JSON parameters, if that change would require significant alterations to the template logic in this PR. I am happy to make the change in #11556 😊
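For illustration, a rough sketch of that conversion (using nlohmann::json and the field names from the `tool` struct above; illustrative only, not code from either PR):

```cpp
#include <nlohmann/json.hpp>
#include <string>

using json = nlohmann::ordered_json;

// Illustrative only: flatten the MCP-style param list (the `tool` struct above)
// into the JSON schema string that common_chat_tool::parameters expects.
static std::string tool_params_to_schema(const tool & t) {
    json properties = json::object();
    for (const auto & p : t.params) {
        properties[p.name] = {
            {"type",        p.type},
            {"description", p.description},
        };
    }
    const json schema = {
        {"type",       "object"},
        {"properties", properties},
        {"required",   t.required_params},
    };
    return schema.dump();
}
```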
We should not assume `parameters` is an object schema, it could technically be `{"type": "string"}`. Llama 3.2 seems to handle object, array and string for the `ipython` tool, for instance.
See the message loop of Llama 3.2's template:
{%- for message in messages %}
{%- if not (message.role == 'ipython' or message.role == 'tool' or 'tool_calls' in message) %}
{{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' }}
{%- elif 'tool_calls' in message %}
{%- if not message.tool_calls|length == 1 %}
{{- raise_exception("This model only supports single tool-calls at once!") }}
{%- endif %}
{%- set tool_call = message.tool_calls[0].function %}
{{- '<|start_header_id|>assistant<|end_header_id|>\n\n' -}}
{{- '{"name": "' + tool_call.name + '", ' }}
{{- '"parameters": ' }}
{{- tool_call.arguments | tojson }}
{{- "}" }}
{{- "<|eot_id|>" }}
{%- elif message.role == "tool" or message.role == "ipython" %}
{{- "<|start_header_id|>ipython<|end_header_id|>\n\n" }}
{%- if message.content is mapping or message.content is iterable %}
{{- message.content | tojson }}
{%- else %}
{{- message.content }}
{%- endif %}
{{- "<|eot_id|>" }}
{%- endif %}
{%- endfor %}
I'm also not sure we should try to describe every possible JSON schema as a fixed data structure, although that would be quite fun and might guide improvements of the native JSON schema conversion coverage.
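For example, keeping `parameters` as a raw JSON-schema string also covers the non-object case directly (fields as in the `common_chat_tool` diff above; the description text is illustrative, not from the PR):

```cpp
// Keeping parameters as a raw JSON-schema string also covers non-object
// schemas, e.g. an ipython-style tool that takes a bare string.
// (description text is illustrative only)
common_chat_tool ipython_tool {
    /* .name        = */ "ipython",
    /* .description = */ "Runs the given Python code and returns its output",
    /* .parameters  = */ R"({"type": "string", "description": "The code to run"})",
};
```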
struct common_chat_templates {
    bool has_explicit_template; // Model had builtin template or a template override was specified.
    std::unique_ptr<common_chat_template> template_default; // always set (defaults to chatml)
I'm wondering, is it possible (after this refactoring) to finally remove this std::unique_ptr?
Good point, probably; I'll look into it as a follow-up.
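For reference, a hypothetical follow-up sketch (not part of this PR) of what the struct could look like if `common_chat_template` (from minja) turns out to be default-constructible and movable:

```cpp
// Hypothetical follow-up (not part of this PR): hold the default template by
// value instead of behind std::unique_ptr, assuming common_chat_template is
// default-constructible and movable.
struct common_chat_templates {
    bool                 has_explicit_template = false; // builtin template or template override was specified
    common_chat_template template_default;              // always set (defaults to chatml)
};
```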
            msg_part.text = part.at("text");
            msg.content_parts.push_back(msg_part);
        }
    } else if (!content.is_null()) {
This check is redundant, right?
Suggested change:
-    } else if (!content.is_null()) {
+    } else {
It's not: `content` can be set to a string, an array or null (anything else makes that branch throw).
(OAI returns `content: null` when there are `tool_calls`)
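For context, a simplified sketch of the three cases being discussed (nlohmann::json; the content-part type name here is illustrative, not verbatim from the PR):

```cpp
// Simplified sketch of the branching discussed above: content may be a plain
// string, an array of parts, or null (as OpenAI-compatible APIs return when
// tool_calls are present). Part type name is illustrative.
const json & content = message.at("content");
if (content.is_string()) {
    msg.content = content.get<std::string>();   // plain string content
} else if (content.is_array()) {
    for (const auto & part : content) {         // multipart content
        common_chat_msg_content_part msg_part;
        msg_part.type = part.at("type");
        msg_part.text = part.at("text");
        msg.content_parts.push_back(msg_part);
    }
} else if (!content.is_null()) {                // null is valid alongside tool_calls
    throw std::runtime_error("Expected 'content' to be a string, an array of parts, or null");
}
```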
        throw std::runtime_error("Expected 'content' (ref: https://github.com/ggml-org/llama.cpp/issues/8367)");
    }
    if (message.contains("reasoning_content")) {
        msg.reasoning_content = message.at("reasoning_content");
Not 100% sure about this, but at least for DeepSeek models, the `reasoning_content` will be ignored for input messages.
Indeed, and actually even if we were, say, to turn `reasoning_content` back into `<think>` tags inside the content, their template would explicitly filter it out:
{%- set content = message['content'] -%}
{%- if '</think>' in content -%}
{%- set content = content.split('</think>')[-1] -%}
{%- endif -%}
{{- '<|Assistant|>' + content + '<|end▁of▁sentence|>' -}}
(but anyone can write a template that behaves differently, for instance if we wanted to avoid invalidating the KV cache)
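To illustrate, a hypothetical sketch (not something this PR does, and assuming `msg.content` is a plain string here) of folding `reasoning_content` back into the content as `<think>` tags, which DeepSeek's template above would then strip again:

```cpp
// Hypothetical only: fold reasoning_content back into the message content as
// <think> tags before templating. DeepSeek's own template would strip it
// again, since it keeps only the text after the last '</think>'.
if (!msg.reasoning_content.empty()) {
    msg.content = "<think>" + msg.reasoning_content + "</think>" + msg.content;
    msg.reasoning_content.clear();
}
```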
Refactoring of chat / tool-call logic (follow up to #11016) along the lines of @ggerganov's suggestions (ref):

- Moved `common_chat_*` from `common.*` to `common/chat.*`
- `common/minja/*` is now only imported from `chat.cpp`
- Removed the `json.hpp` dep from `chat.hpp` (and `test-chat.cpp` only uses it to normalize arguments during comparisons)
- Added a `common_chat_tool` struct + refined `common_chat_msg` to accept multipart content, tool name and tool call id, to stop depending on `json`
- `common_chat_apply_template` becomes `common_chat_templates_apply` (rough usage sketch below)
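The json-free surface this ends up exposing looks roughly like the sketch below; the `common_chat_templates_inputs` name, its field names and the exact `common_chat_templates_apply` signature are assumptions here, not verbatim from this PR:

```cpp
// Illustrative sketch only: inputs struct name, field names and the exact
// common_chat_templates_apply signature are assumptions, not verbatim code.
common_chat_msg user_msg;
user_msg.role    = "user";
user_msg.content = "What is the weather in Paris?";

common_chat_tool weather_tool;
weather_tool.name        = "get_weather";
weather_tool.description = "Get the current weather for a city";
weather_tool.parameters  = R"({"type":"object","properties":{"city":{"type":"string"}},"required":["city"]})";

common_chat_templates_inputs inputs;   // assumed struct name
inputs.messages = { user_msg };
inputs.tools    = { weather_tool };

// tmpls: templates previously loaded from the model or a --chat-template override
auto prompt_params = common_chat_templates_apply(tmpls, inputs);
```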
Also some fixes:

- `tool-call`: allow `--chat-template chatml` w/ `--jinja`, default to chatml upon parsing issue, avoid double bos #11616 (was preventing insertion of bos/eos tags inside the template)
- `--jinja` w/o tool call w/ grammar or json_schema

cc/ @bandoti
TODO: