
tool-call: refactor common chat / tool-call api (+ tests / fixes) #11900

Merged 43 commits into ggml-org:master on Feb 18, 2025

Conversation

ochafik (Collaborator) commented Feb 16, 2025

Refactoring of the chat / tool-call logic (follow-up to #11016), along the lines of @ggerganov's suggestions (ref):

  • Moved all common_chat_* from common.* to common/chat.*
  • Moved minja headers to common/minja/*, now only imported from chat.cpp
  • Removed the json.hpp dep from chat.hpp (test-chat.cpp only uses it to normalize arguments during comparisons)
    • Added a common_chat_tool struct + refined common_chat_msg to accept multipart content, a tool name and a tool call id, to stop depending on json (see the sketch after this list)
  • common_chat_apply_template becomes common_chat_templates_apply
  • Added common helpers to convert oaicompat JSON to / from common_chat_tool / common_chat_msg vectors
    • Templated signature for the oaicompat converters so they can be called w/ strings... or preparsed json (from utils.hpp, to avoid a needless round-trip of json dump / parse)
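
As a rough illustration of the json-free types described above (a sketch only; exact field names and signatures in the PR may differ):

    // Sketch of the json-free chat structures; field names beyond those
    // mentioned in the description are assumptions.
    #include <string>
    #include <vector>

    struct common_chat_msg_content_part {
        std::string type;   // e.g. "text"
        std::string text;
    };

    struct common_chat_msg {
        std::string role;
        std::string content;                                      // plain string content...
        std::vector<common_chat_msg_content_part> content_parts;  // ...or multipart content
        std::string reasoning_content;
        std::string tool_name;      // for tool-result messages
        std::string tool_call_id;   // for tool-result messages
    };

    struct common_chat_tool {
        std::string name;
        std::string description;
        std::string parameters;     // JSON schema kept as a string
    };

    // Templated oaicompat converters (shape assumed): callable with a std::string
    // or an already-parsed json value, avoiding a dump/parse round-trip.
    template <typename T>
    std::vector<common_chat_msg>  common_chat_msgs_parse_oaicompat(const T & messages);
    template <typename T>
    std::vector<common_chat_tool> common_chat_tools_parse_oaicompat(const T & tools);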

Also some fixes:

cc/ @bandoti

TODO:

  • Add some missing tests (json<->tools converter, array message.content)
  • Wire some logs / debug info I've commented out
  • Ensure no regression in legacy array message.content

@github-actions github-actions bot added the script, testing, examples, python and server labels Feb 16, 2025
@ochafik ochafik changed the title tool-call: refactor common chat / tool-call api tool-call: refactor common chat / tool-call api (+ tests / fixes) Feb 16, 2025
@ochafik ochafik marked this pull request as ready for review February 16, 2025 23:35
@ochafik ochafik requested a review from ngxson as a code owner February 16, 2025 23:35
struct common_chat_tool {
    std::string name;
    std::string description;
    std::string parameters;
Collaborator:

Working through the MCP implementation, I added a similar structure to this one. It would be good to normalize the two. Here's what I ended up adding:

    struct tool {
        struct param {
            std::string name;
            std::string type;
            std::string description;
        };
        std::string tool_name;
        std::string tool_description;
        std::vector<param> params;
        std::vector<std::string> required_params;
    };

I think instead of having parameters as an opaque JSON object it would be worthwhile to expand it into a vector of parameters (or something similar), so that the entire structure is accessible without JSON, making the conversion from MCP to openai-compat more seamless.

Collaborator:

However, it is not that difficult to build the JSON parameters from the structure above, if the change would otherwise require significant alterations to the template logic in this PR. I am happy to make the change in #11556 😊
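
For illustration, producing the oai-compat JSON parameters from that struct could look roughly like this (a sketch only, using nlohmann::json and the tool struct quoted above):

    #include <nlohmann/json.hpp>

    // Sketch: flattens the param vector back into an object schema with a
    // "required" list, assuming each param has a simple scalar type.
    static nlohmann::json tool_params_to_json(const tool & t) {
        nlohmann::json props = nlohmann::json::object();
        for (const auto & p : t.params) {
            props[p.name] = {
                {"type",        p.type},
                {"description", p.description},
            };
        }
        return {
            {"type",       "object"},
            {"properties", props},
            {"required",   t.required_params},
        };
    }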

Collaborator Author:

We should not assume parameters is an object schema; it could technically be a {"type": "string"}. Llama 3.2 seems to handle object, array and string for the ipython tool, for instance.
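
For example (hypothetical tool definition, written with nlohmann::json purely for illustration):

    // A tool whose "parameters" schema is a bare string rather than an object
    // schema - valid, so the converter can't assume {"type": "object", ...}.
    nlohmann::json ipython_tool = {
        {"type", "function"},
        {"function", {
            {"name", "ipython"},
            {"description", "Runs the given Python code"},
            {"parameters", {
                {"type", "string"},
                {"description", "Python code to execute"},
            }},
        }},
    };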

See the message loop of Llama 3.2's template:
{%- for message in messages %}
    {%- if not (message.role == 'ipython' or message.role == 'tool' or 'tool_calls' in message) %}
        {{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' }}
    {%- elif 'tool_calls' in message %}
        {%- if not message.tool_calls|length == 1 %}
            {{- raise_exception("This model only supports single tool-calls at once!") }}
        {%- endif %}
        {%- set tool_call = message.tool_calls[0].function %}
        {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' -}}
        {{- '{"name": "' + tool_call.name + '", ' }}
        {{- '"parameters": ' }}
        {{- tool_call.arguments | tojson }}
        {{- "}" }}
        {{- "<|eot_id|>" }}
    {%- elif message.role == "tool" or message.role == "ipython" %}
        {{- "<|start_header_id|>ipython<|end_header_id|>\n\n" }}
        {%- if message.content is mapping or message.content is iterable %}
            {{- message.content | tojson }}
        {%- else %}
            {{- message.content }}
        {%- endif %}
        {{- "<|eot_id|>" }}
    {%- endif %}
{%- endfor %}

I'm also not sure we should try to describe every possible JSON schema as a fixed data structure, although that would be quite fun and might guide improvements to the native JSON schema conversion coverage.
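
(For reference, a minimal recursive sketch of such a fixed structure might look like the following; covering the full JSON schema spec this way gets unwieldy quickly.)

    // Rough sketch, not proposed for this PR: a recursive node covering a
    // small subset of JSON schema as plain data (needs <map>, <memory>,
    // <string>, <vector>).
    struct json_schema_node {
        std::string type;                         // "object", "array", "string", ...
        std::string description;
        std::map<std::string, std::unique_ptr<json_schema_node>> properties; // when type == "object"
        std::vector<std::string> required;
        std::unique_ptr<json_schema_node> items;  // when type == "array"
        std::vector<std::string> enum_values;     // when constrained to an enum
    };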


struct common_chat_templates {
    bool has_explicit_template; // Model had builtin template or template override was specified.
    std::unique_ptr<common_chat_template> template_default; // always set (defaults to chatml)
Collaborator:

I'm wondering, is it possible (after this refactoring) to finally remove this std::unique_ptr?

Collaborator Author:

Good point, probably. I'll look into it as a follow-up.
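
(For context, a hypothetical shape without the indirection; since the minja headers are now only included from chat.cpp, holding the template by value would require common_chat_template to be a complete type in the public header.)

    // Hypothetical follow-up shape, not what this PR does:
    struct common_chat_templates {
        bool has_explicit_template;
        common_chat_template template_default; // by value instead of std::unique_ptr
    };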

        msg_part.text = part.at("text");
        msg.content_parts.push_back(msg_part);
    }
} else if (!content.is_null()) {
Collaborator:

This check is redundant, right?

Suggested change:
-} else if (!content.is_null()) {
+} else {

Collaborator Author:

It's not: content can be a string, an array or null (otherwise that branch throws).

Collaborator Author:

(The OpenAI API returns content: null when there are tool_calls.)
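
Roughly, the branch in question handles the three accepted shapes of content (a sketch, not the exact PR code):

    // content may be a string, an array of parts, or null (e.g. assistant
    // messages that only carry tool_calls); anything else is rejected.
    if (content.is_string()) {
        msg.content = content;
    } else if (content.is_array()) {
        for (const auto & part : content) {
            common_chat_msg_content_part msg_part;
            msg_part.text = part.at("text");
            msg.content_parts.push_back(msg_part);
        }
    } else if (!content.is_null()) {
        throw std::runtime_error("Invalid 'content' type: expected string, array or null");
    }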

    throw std::runtime_error("Expected 'content' (ref: https://github.com/ggml-org/llama.cpp/issues/8367)");
}
if (message.contains("reasoning_content")) {
    msg.reasoning_content = message.at("reasoning_content");
Collaborator:

Not 100% sure about this, but at least for DeepSeek models, the reasoning_content will be ignored for input messages.

Collaborator Author:

Indeed, and actually even if we were, say, to turn reasoning_content back to <think> tags inside the content, their template would explicitly filter it out:

        {%- set content = message['content'] -%}
        {%- if '</think>' in content -%}
            {%- set content = content.split('</think>')[-1] -%}
        {%- endif -%}
        {{- '<|Assistant|>' + content + '<|end▁of▁sentence|>' -}}
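
For example, if content were "<think>reasoning goes here</think>The answer is 42.", the split('</think>')[-1] above keeps only "The answer is 42.", so any reasoning re-injected into the content would be dropped by the template anyway.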

(But anyone can write a template that behaves differently, for instance if we wanted to avoid invalidating the KV cache.)

@ochafik ochafik merged commit 63e489c into ggml-org:master Feb 18, 2025
47 checks passed