
Determine which models exist with API calls rather than hard coding them. #567

Open · pmetzger opened this issue on Jan 13, 2025 · 2 comments
Labels: enhancement (New feature or request)

@pmetzger

OpenAI at least seems to have an API call to determine what models are available. Rather than hard-coding the available models, the code should learn what they are at start time.
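(For reference, the endpoint in question is GET https://api.openai.com/v1/models. A minimal elisp sketch of querying it, outside gptel: the function name my/openai-list-models is hypothetical, and it assumes Emacs 27+ with JSON support.)

(require 'url)
(require 'json)

(defun my/openai-list-models (api-key)
  "Return the ids of the models available to API-KEY."
  (let ((url-request-extra-headers
         `(("Authorization" . ,(concat "Bearer " api-key)))))
    (with-current-buffer
        (url-retrieve-synchronously "https://api.openai.com/v1/models")
      (goto-char url-http-end-of-headers) ; skip the HTTP response headers
      ;; The response is of the form {"data": [{"id": "gpt-4o", ...}, ...]}
      (mapcar (lambda (model) (alist-get 'id model))
              (alist-get 'data (json-parse-buffer :object-type 'alist))))))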

pmetzger added the enhancement label on Jan 13, 2025
@karthink (Owner) commented Jan 13, 2025

> learn what they are at start time.

  • I don't know what "start time" means; please see the discussion in #447 (Enumerate Ollama models on the server pointed to by :host directive).

    • I don't want to make network requests when the user's gptel configuration in their init file is loaded. I would be very annoyed if an Emacs package exhibited web-browser-like behavior like this.
    • Similarly, I don't want to make a network request when the user runs M-x gptel -- this should just open up a buffer.
    • Moreover, dedicated chat buffers are just one way to use gptel, and for many users not even the preferred way, so there is no guarantee M-x gptel will be called at all. When should this API call be made?
  • Model info fetched from most APIs (including OpenAI) does not have all the information we include, such as costs, cutoff dates, capabilities and context window sizes. Only some of this information is provided. We show this information as annotations when selecting a model:

[Screenshot: gptel's model selection menu, with each model annotated with its description, context window and cost]

  • In contrast, it is a simple matter to add a model + metadata when OpenAI releases one; usually an interested user makes a pull request to gptel.

  • If it's not yet in gptel, you can add models to any gptel-backend yourself:

;; Add the model to the backend's list of models:
(push 'gemini-2.0-flash-thinking-exp (gptel-backend-models gptel-backend))
;; Attach model metadata to the symbol (OPTIONAL):
(put 'gemini-2.0-flash-thinking-exp :description "Gemini model that produces...")
(put 'gemini-2.0-flash-thinking-exp :context-window 32) ; in thousands of tokens

where gptel-backend is the active gptel backend (Gemini in this example). In #529, we're trying to provide a more user-friendly way to do this -- although I think the above is pretty standard if you've used elisp.
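After registering it this way, the model can be selected like any other, for example:

;; Select the newly added model (or pick it from gptel's transient menu):
(setq gptel-model 'gemini-2.0-flash-thinking-exp)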

(Finding available models automatically is more of a requirement for Ollama, where there is no standard list of models that gptel can track.)
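(A sketch of what that could look like, assuming a local Ollama server and its /api/tags endpoint; the function name my/ollama-list-models is hypothetical and not part of gptel:)

(require 'url)
(require 'json)

(defun my/ollama-list-models (&optional host)
  "Return the model names reported by the Ollama server at HOST."
  (with-current-buffer
      (url-retrieve-synchronously
       (format "http://%s/api/tags" (or host "localhost:11434")))
    (goto-char url-http-end-of-headers) ; skip the HTTP response headers
    ;; The response is of the form {"models": [{"name": "llama3:latest", ...}, ...]}
    (mapcar (lambda (m) (alist-get 'name m))
            (alist-get 'models (json-parse-buffer :object-type 'alist)))))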

@pmetzger (Author) commented

> I don't want to make network requests when the user's gptel configuration in their init file is loaded.

Fine, then do it when the user requests it. It seems unreasonable to maintain these lists by hand; it means that as new models are deployed, gptel can't automatically keep up.
