Merge branch 'develop' into mtk/compatibility-updates
Showing 40 changed files with 696 additions and 424 deletions.
104 changes: 104 additions & 0 deletions
...acy_docs/edb-postgres-ai/ai-accelerator/models/supported-models/completions.mdx
@@ -0,0 +1,104 @@
---
title: "Completions"
navTitle: "Completions"
description: "Completions is a text completion model that enables use of any OpenAI API compatible text generation model."
---

Model name: `completions`

Model aliases:

* `openai_completions`
* `nim_completions`

## About Completions

Completions enables the use of any OpenAI API compatible text generation model.

It is suitable for chat/text transforms, text completion, and other text generation tasks.

The model provider sets its defaults according to the name used to invoke it.

When invoked as `completions` or `openai_completions`, the model provider defaults to the OpenAI API.

When invoked as `nim_completions`, the model provider defaults to the NVIDIA NIM API.

## Supported aidb operations

* decode_text
* decode_text_batch

## Supported models

* Any text generation model that is supported by the provider.

## Supported OpenAI models

For the list of supported OpenAI models, see the [OpenAI models overview](https://platform.openai.com/docs/models#models-overview).

## Supported NIM models

* [ibm/granite-guardian-3.0-8b](https://build.nvidia.com/ibm/granite-guardian-3_0-8b)
* [ibm/granite-3.0-8b-instruct](https://build.nvidia.com/ibm/granite-3_0-8b-instruct)
* [ibm/granite-3.0-3b-a800m-instruct](https://build.nvidia.com/ibm/granite-3_0-3b-a800m-instruct)
* [meta/llama-3.3-70b-instruct](https://build.nvidia.com/meta/llama-3_3-70b-instruct)
* [meta/llama-3.2-3b-instruct](https://build.nvidia.com/meta/llama-3.2-3b-instruct)
* [meta/llama-3.2-1b-instruct](https://build.nvidia.com/meta/llama-3.2-1b-instruct)
* [meta/llama-3.1-405b-instruct](https://build.nvidia.com/meta/llama-3_1-405b-instruct)
* [meta/llama-3.1-70b-instruct](https://build.nvidia.com/meta/llama-3_1-70b-instruct)
* [meta/llama-3.1-8b-instruct](https://build.nvidia.com/meta/llama-3_1-8b-instruct)
* [meta/llama3-70b-instruct](https://build.nvidia.com/meta/llama3-70b)
* [meta/llama3-8b-instruct](https://build.nvidia.com/meta/llama3-8b)
* [nvidia/llama-3.1-nemotron-70b-instruct](https://build.nvidia.com/nvidia/llama-3_1-nemotron-70b-instruct)
* [nvidia/llama-3.1-nemotron-51b-instruct](https://build.nvidia.com/nvidia/llama-3_1-nemotron-51b-instruct)
* [nvidia/nemotron-mini-4b-instruct](https://build.nvidia.com/nvidia/nemotron-mini-4b-instruct)
* [nvidia/nemotron-4-340b-instruct](https://build.nvidia.com/nvidia/nemotron-4-340b-instruct)
* [google/shieldgemma-9b](https://build.nvidia.com/google/shieldgemma-9b)
* [google/gemma-7b](https://build.nvidia.com/google/gemma-7b)
* [google/codegemma-7b](https://build.nvidia.com/google/codegemma-7b)

## Creating the default model

There is no default model for Completions. You can create any supported model using the `aidb.create_model` function.

## Creating an OpenAI model

You can create any supported OpenAI model using the `aidb.create_model` function.

This example creates a GPT-4o model with the name `my_openai_model`:

```sql
SELECT aidb.create_model(
  'my_openai_model',                                            -- name of the new model
  'openai_completions',                                         -- model provider
  '{"model": "gpt-4o"}'::JSONB,                                 -- configuration
  '{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB   -- credentials
);
```

## Creating a NIM model

```sql
SELECT aidb.create_model(
  'my_nim_completions',
  'nim_completions',
  '{"model": "meta/llama-3.2-1b-instruct"}'::JSONB,
  credentials=>'{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB
);
```

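Once a model exists, it can be called through the operations listed under "Supported aidb operations". The following is a minimal sketch, assuming `aidb.decode_text` takes the model name followed by the input text and that `my_nim_completions` was created as in the example above. The prompt string is purely illustrative.

```sql
-- Assumed call shape: aidb.decode_text(<model name>, <input text>).
-- 'my_nim_completions' is the model created in the previous example.
SELECT aidb.decode_text(
  'my_nim_completions',
  'Summarize in one sentence: PostgreSQL is an open source relational database.'
);
```

The same call shape should apply to `my_openai_model`, since both are Completions models.
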
## Model configuration settings

The following configuration settings are available for Completions models:

* `model` - The model to use.
* `url` - The URL of the model to use. This is optional and can be used to specify a custom model URL.
    * If the model provider is `openai_completions` (or `completions`), `url` defaults to `https://api.openai.com/v1/chat/completions`.
    * If the model provider is `nim_completions`, `url` defaults to `https://integrate.api.nvidia.com/v1/chat/completions`.
* `max_concurrent_requests` - The maximum number of concurrent requests to make to the model. Defaults to `25`.

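As an illustration of these settings, the sketch below points a Completions model at a self-hosted OpenAI API compatible endpoint and lowers the request concurrency. The model name `my_custom_completions`, the URL, and the model identifier `my-local-model` are hypothetical placeholders, not values from EDB or any provider.

```sql
SELECT aidb.create_model(
  'my_custom_completions',   -- hypothetical model name
  'completions',             -- provider name; sets the OpenAI API defaults
  '{
    "model": "my-local-model",
    "url": "http://localhost:8000/v1/chat/completions",
    "max_concurrent_requests": 5
  }'::JSONB,                 -- overrides the default url and concurrency
  '{"api_key": "<API_KEY_HERE>"}'::JSONB
);
```

Any setting left out of the configuration object falls back to the defaults listed above.
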
## Model credentials

The following credentials are required for these models:

* `api_key` - The API key to use for authentication.
54 changes: 54 additions & 0 deletions
advocacy_docs/edb-postgres-ai/ai-accelerator/models/supported-models/nim_clip.mdx
@@ -0,0 +1,54 @@
---
title: "CLIP"
navTitle: "CLIP"
description: "CLIP (Contrastive Language-Image Pre-training) is a model that learns visual concepts from natural language supervision."
---

Model name: `nim_clip`

## About CLIP

CLIP (Contrastive Language-Image Pre-training) is a model that learns visual concepts from natural language supervision. It is a zero-shot learning model that can be used for a wide range of vision and language tasks.

This specific model runs on NVIDIA NIM. For more information, see [CLIP on NIM](https://build.nvidia.com/nvidia/nvclip).

## Supported aidb operations

* encode_text
* encode_text_batch
* encode_image
* encode_image_batch

## Supported models

### NVIDIA NGC

* nvidia/nvclip (default)

## Creating the default model

```sql
SELECT aidb.create_model(
  'my_nim_clip_model',
  'nim_clip',
  credentials=>'{"api_key": "<API_KEY_HERE>"}'::JSONB
);
```

There is only one model, the default `nvidia/nvclip`, so you do not need to specify the model in the configuration.

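With the model in place, text can be embedded through the `encode_text` operation listed above. This is a minimal sketch, assuming `aidb.encode_text` takes the model name followed by the input text and that `my_nim_clip_model` was created as in the example above. The input string is illustrative only.

```sql
-- Assumed call shape: aidb.encode_text(<model name>, <input text>).
-- Expected to return the embedding vector for the given text.
SELECT aidb.encode_text(
  'my_nim_clip_model',
  'a photo of a red bicycle leaning against a wall'
);
```
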
## Model configuration settings

The following configuration settings are available for CLIP models:

* `model` - The NIM model to use. The default is `nvidia/nvclip` and is the only model available.
* `url` - The URL of the model to use. This is optional and can be used to specify a custom model URL. Defaults to `https://integrate.api.nvidia.com/v1/embeddings`.
* `dimensions` - The size of the model's output vector. Defaults to `1024`.

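To show how these settings fit together, the sketch below creates a CLIP model against a self-hosted NIM endpoint. The model name `my_local_clip_model` and the URL are hypothetical placeholders; `dimensions` is set to its documented default only to show where the setting goes.

```sql
SELECT aidb.create_model(
  'my_local_clip_model',     -- hypothetical model name
  'nim_clip',
  '{
    "model": "nvidia/nvclip",
    "url": "http://localhost:8000/v1/embeddings",
    "dimensions": 1024
  }'::JSONB,                 -- overrides the default url
  credentials=>'{"api_key": "<API_KEY_HERE>"}'::JSONB
);
```
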
## Model credentials

The following credentials are required if executing inside NVIDIA NGC:

* `api_key` - The NVIDIA Cloud API key to use for authentication.