Merge branch 'develop' into mtk/compatibility-updates

EnterpriseDB · Feb 4, 2025 · b3c0505 · b3c0505
2 parents d463dbe + e629011
commit b3c0505

Showing 40 changed files with 696 additions and 424 deletions.
diff --git a/advocacy_docs/edb-postgres-ai/ai-accelerator/installing/complete.mdx b/advocacy_docs/edb-postgres-ai/ai-accelerator/installing/complete.mdx
@@ -34,8 +34,8 @@ __OUTPUT__
                                      List of installed extensions
        Name       | Version |   Schema   |                        Description
 ------------------+---------+------------+------------------------------------------------------------
- aidb             | 1.0.7   | aidb       | aidb: makes it easy to build AI applications with postgres
- pgfs             | 1.0.4   | pgfs       | pgfs: enables access to filesystem-like storage locations
+ aidb             | 2.1.1   | aidb       | aidb: makes it easy to build AI applications with postgres
+ pgfs             | 1.0.6   | pgfs       | pgfs: enables access to filesystem-like storage locations
  vector           | 0.8.0   | public     | vector data type and ivfflat and hnsw access methods
 ```
 

diff --git a/advocacy_docs/edb-postgres-ai/ai-accelerator/installing/index.mdx b/advocacy_docs/edb-postgres-ai/ai-accelerator/installing/index.mdx
@@ -12,4 +12,3 @@ Pipelines is delivered as a set of extensions. Depending on how you are deployin
 - [Manually installing pipelines packages](packages)
 
 Once the packages are installed, you can [complete the installation](complete) by activating the extensions within Postgres.
-
diff --git a/advocacy_docs/edb-postgres-ai/ai-accelerator/limitations.mdx b/advocacy_docs/edb-postgres-ai/ai-accelerator/limitations.mdx
@@ -32,3 +32,14 @@ The impact of this depends on what type of embedding is being performed.
 ### Data Formats
 
 * Pipelines currently only supports Text and Image formats. Other formats, including structured data, video, and audio, are not currently supported.
+
+### Upgrading
+
+When upgrading the aidb and pgfs extensions, there is currently no support for Postgres extension upgrades. You must therefore drop and recreate the extensions when upgrading to a new version.
+
+```sql
+DROP EXTENSION aidb CASCADE;
+DROP EXTENSION pgfs CASCADE;
+CREATE EXTENSION aidb CASCADE;
+CREATE EXTENSION pgfs CASCADE;
+```
diff --git a/advocacy_docs/edb-postgres-ai/ai-accelerator/models/openai-api-compatibility.mdx b/advocacy_docs/edb-postgres-ai/ai-accelerator/models/openai-api-compatibility.mdx
@@ -1,10 +1,10 @@
 ---
 title: "Using an OpenAI compatible API with Pipelines"
-navTitle: "OpenAI Compatible Models"
+navTitle: "OpenAI compatible Models"
 description: "Using an OpenAI compatible API with Pipelines by setting options and credentials."
 ---
 
-To make use of an OpenAI compliant API, you can use the openai_embeddings or openai_completions model providers. Note that a retriever will need to encode first so you can only use the embeddings model provider with a retriever.
+To make use of an OpenAI compliant API, you can use the embeddings or completions model providers. Note that a retriever will need to encode first so you can only use the embeddings model provider with a retriever.
 
 ## Why use an OpenAI compatible API?
 
@@ -21,23 +21,23 @@ The starting point for this process is creating a model. When you create a model
 ```sql
 select aidb.create_model(
 'my_local_ollama',
-'openai_embeddings',
-'{"model":"llama3.3", "url":"http://llama.local:11434/v1/embeddings", "dimensions":8192}'::JSONB,
+'embeddings',
+'{"model":"llama3.1", "url":"http://llama.local:11434/v1/embeddings", "dimensions":2000}'::JSONB,
 '{"api_key":""}'::JSONB);
 ```
 
 ### Model name and model provider
 
 The model name is the first parameter and set to “my_local_ollama” which we will use later.
 
-We specify the model provider as “openai_embeddings” which is the provider that defaults to using OpenAI servers, but can be overridden by the configuration (the next parameter), to talk to any compliant server.
+We specify the model provider as “embeddings” which is the provider that defaults to using OpenAI servers, but can be overridden by the configuration (the next parameter), to talk to any compliant server.
 
 ### Configuration
 
 The next parameter is the configuration. This is a JSON string, which when expanded has three parameters, the model, the url and the dimensions.
 
 ```json
-'{"model":"llama3.3", "url":"http://llama.local:11434/v1/embeddings", "dimensions":8192}'::JSONB
+'{"model":"llama3.1", "url":"http://llama.local:11434/v1/embeddings", "dimensions":2000}'::JSONB
 ```
 
 In this case, we are setting the model to [“llama3.3”](https://ollama.com/library/llama3.3), a relatively new and powerful model. Remember to run `ollama run llama3.3` to pull and start the model on the server.
@@ -48,15 +48,15 @@ The next json setting is the important one, overriding the endpoint that the aid
 * It has port 11434 (the default port for Ollama) open to service requests over HTTP (not HTTPS in this case).
 * The path to the endpoint on the server `/v1/embeddings`; the same as OpenAI.
 
-Putting those components together we get `[`http://llama.local:11434/v1/embeddings`](http://art.local:11434/v1/embeddings","api_key":"","dimensions":8192}'::JSONB)` as our end point.
+Putting those components together we get `http://llama.local:11434/v1/embeddings` as our endpoint.
 
-The last JSON parameter in this example is “dimensions” which is a hint to the system about how many vector values to expect from the model. If we [look up llama3.3’s properties](https://ollama.com/library/llama3.3/blobs/4824460d29f2) we can see the `llama.embedding_length` value is 8192\. The provider defaults to 1536 (with some hard-wired exceptions depending on model) but it doesn’t know about llama3.3, so we have to pass the dimension value of 8192 in the configuration.
+The last JSON parameter in this example is “dimensions” which is a hint to the system about how many vector values to expect from the model. If we [look up llama3.3’s properties](https://ollama.com/library/llama3.3/blobs/4824460d29f2) we can see the `llama.embedding_length` value is 8192. The provider defaults to 1536 (with some hard-wired exceptions depending on model) but it doesn’t know about llama3.3's maximum. Another factor is that [pgvector indexes are limited to 2000 dimensions](https://github.com/pgvector/pgvector?tab=readme-ov-file#what-if-i-want-to-index-vectors-with-more-than-2000-dimensions). So we pass a dimension value of 2000 in the configuration, which is the largest number of dimensions that pgvector can index.
 
 That completes the configuration parameter.
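
The reasoning behind the `dimensions` value can be sketched in a few lines of Python. This is only an illustration of the rule described above (take the model's embedding length, capped at the 2000-dimension pgvector limit cited in the text); the helper name `pick_dimensions` and the constant name are hypothetical and are not part of aidb.

```python
# Hypothetical helper, not part of aidb: pick the "dimensions" setting.
# The 2000 cap is the pgvector limit referenced in the text above.
PGVECTOR_MAX_INDEXED_DIMENSIONS = 2000


def pick_dimensions(model_embedding_length: int,
                    limit: int = PGVECTOR_MAX_INDEXED_DIMENSIONS) -> int:
    """Return the largest dimension count allowed by both the model and the limit."""
    if model_embedding_length <= 0:
        raise ValueError("embedding length must be positive")
    return min(model_embedding_length, limit)


# 8192 is the llama.embedding_length value quoted above.
print(pick_dimensions(8192))
```

With the 8192 value quoted above, the helper returns the capped value that the example configuration uses. A model whose embedding length is already below the cap keeps its own length.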
 
 ### Credentials
 
-The last parameter is the credentials parameter, which is another JSON string. It’s usually used for carrying the `api_key` for the OpenAI service and any other necessary credential information. It is not part of the configuration and by being separate, it can be securely hidden from users with lesser permissions. For our ollama connection, we don’t need an api\_key, but the model provider currently requires that one is specified. We can specify an empty string for the api\_key to satisfy this requirement.
+The last parameter is the credentials parameter, which is another JSON string. It’s usually used for carrying the `api_key` for the OpenAI service and any other necessary credential information. It is not part of the configuration and by being separate, it can be securely hidden from users with lesser permissions. For our ollama connection, we don’t need an `api_key`, but the model provider currently requires that one is specified. We can specify an empty string for the `api_key` to satisfy this requirement.
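
Because the configuration and credentials are both JSON strings, hand-written literals are easy to get wrong (for example, by dropping a closing brace). One way to avoid that, assuming the statement is generated from client code, is to serialize both objects with a JSON library. The sketch below is a hypothetical Python helper, not an aidb API; only `aidb.create_model` and its four arguments come from the example above, and real applications would normally prefer parameterized queries over building SQL text.

```python
import json


def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def create_model_sql(name: str, provider: str,
                     config: dict, credentials: dict) -> str:
    """Build an aidb.create_model statement with well-formed JSONB arguments."""
    # json.dumps guarantees balanced braces and correctly quoted keys.
    config_json = json.dumps(config)
    credentials_json = json.dumps(credentials)
    return (
        "select aidb.create_model(\n"
        f"{sql_literal(name)},\n"
        f"{sql_literal(provider)},\n"
        f"{sql_literal(config_json)}::JSONB,\n"
        f"{sql_literal(credentials_json)}::JSONB);"
    )


statement = create_model_sql(
    "my_local_ollama",
    "embeddings",
    {"model": "llama3.1",
     "url": "http://llama.local:11434/v1/embeddings",
     "dimensions": 2000},
    {"api_key": ""},  # empty key, as described above for Ollama
)
print(statement)
```

The generated statement has the same shape as the hand-written example: model name, model provider, configuration, and credentials, in that order.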
 
 ## Using the model
 

diff --git a/advocacy_docs/edb-postgres-ai/ai-accelerator/models/primitives.mdx b/advocacy_docs/edb-postgres-ai/ai-accelerator/models/primitives.mdx
@@ -65,3 +65,17 @@ select aidb.decode_text_batch('my_bert_model', ARRAY[
     'summarize: The missile knows where it is at all times. It knows this because it knows where it isn''t. By subtracting where it is from where it isn''t, or where it isn''t from where it is (whichever is greater), it obtains a difference, or deviation. The guidance subsystem uses deviations to generate corrective commands to drive the missile from a position where it is to a position where it isn''t, and arriving at a position where it wasn''t, it now is.'
 ]);
 ```
+
+## Rerank Text
+
+Call `aidb.rerank_text` to get text reranking logits.
+
+```sql
+SELECT aidb.rerank_text('my_reranking_model',
+    'What is the best open source database?',
+    ARRAY[
+        'PostgreSQL',
+        'The quick brown fox jumps over the lazy dog.',
+        'Hercule Poirot'
+    ]);
+```
diff --git a/...acy_docs/edb-postgres-ai/ai-accelerator/models/supported-models/completions.mdx b/...acy_docs/edb-postgres-ai/ai-accelerator/models/supported-models/completions.mdx
@@ -0,0 +1,104 @@
+---
+title: "Completions"
+navTitle: "Completions"
+description: "Completions is a text completion model that enables use of any OpenAI API compatible text generation model."
+---
+
+Model name: `completions`
+
+Model aliases:
+
+* `openai_completions`
+* `nim_completions`
+
+## About Completions
+
+Completions enables the use of any OpenAI API compatible text generation model.
+
+It is suitable for chat/text transforms, text completion, and other text generation tasks.
+
+Depending on the name of the model, the model provider will set defaults accordingly.
+
+When invoked as `completions` or `openai_completions`, the model provider will default to using the OpenAI API.
+
+When invoked as `nim_completions`, the model provider will default to using the NVIDIA NIM API.
+
+
+## Supported aidb operations
+
+* decode_text
+* decode_text_batch
+
+## Supported models
+
+* Any text generation model that is supported by the provider.
+
+## Supported OpenAI models
+
+See a list of supported OpenAI models [here](https://platform.openai.com/docs/models#models-overview).
+
+## Supported NIM models
+
+* [ibm/granite-guardian-3.0-8b](https://build.nvidia.com/ibm/granite-guardian-3_0-8b)  
+* [ibm/granite-3.0-8b-instruct](https://build.nvidia.com/ibm/granite-3_0-8b-instruct)  
+* [ibm/granite-3.0-3b-a800m-instruct](https://build.nvidia.com/ibm/granite-3_0-3b-a800m-instruct)  
+* [meta/llama-3.3-70b-instruct](https://build.nvidia.com/meta/llama-3_3-70b-instruct)  
+* [meta/llama-3.2-3b-instruct](https://build.nvidia.com/meta/llama-3.2-3b-instruct)  
+* [meta/llama-3.2-1b-instruct](https://build.nvidia.com/meta/llama-3.2-1b-instruct)  
+* [meta/llama-3.1-405b-instruct](https://build.nvidia.com/meta/llama-3_1-405b-instruct)  
+* [meta/llama-3.1-70b-instruct](https://build.nvidia.com/meta/llama-3_1-70b-instruct)  
+* [meta/llama-3.1-8b-instruct](https://build.nvidia.com/meta/llama-3_1-8b-instruct)  
+* [meta/llama3-70b-instruct](https://build.nvidia.com/meta/llama3-70b)  
+* [meta/llama3-8b-instruct](https://build.nvidia.com/meta/llama3-8b)  
+* [nvidia/llama-3.1-nemotron-70b-instruct](https://build.nvidia.com/nvidia/llama-3_1-nemotron-70b-instruct)  
+* [nvidia/llama-3.1-nemotron-51b-instruct](https://build.nvidia.com/nvidia/llama-3_1-nemotron-51b-instruct)  
+* [nvidia/nemotron-mini-4b-instruct](https://build.nvidia.com/nvidia/nemotron-mini-4b-instruct)  
+* [nvidia/nemotron-4-340b-instruct](https://build.nvidia.com/nvidia/nemotron-4-340b-instruct)  
+* [google/shieldgemma-9b](https://build.nvidia.com/google/shieldgemma-9b)  
+* [google/gemma-7b](https://build.nvidia.com/google/gemma-7b)  
+* [google/codegemma-7b](https://build.nvidia.com/google/codegemma-7b)
+
+## Creating the default model
+
+There is no default model for Completions. You can create any supported model using the `aidb.create_model` function.
+
+## Creating an OpenAI model
+
+You can create any supported OpenAI model using the `aidb.create_model` function.
+
+In this example, we are creating a GPT-4o model with the name `my_openai_model`:
+
+```sql
+SELECT aidb.create_model(
+  'my_openai_model',
+  'openai_completions',
+  '{"model": "gpt-4o"}'::JSONB,
+  '{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB 
+);
+```
+
+## Creating a NIM model
+
+```sql
+SELECT aidb.create_model(
+          'my_nim_completions', 
+          'nim_completions',
+          '{"model": "meta/llama-3.2-1b-instruct"}'::JSONB,
+          credentials=>'{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB);
+```
+
+## Model configuration settings
+
+The following configuration settings are available for OpenAI models:
+
+* `model` - The model to use.
+* `url` - The URL of the model to use. This is optional and can be used to specify a custom model URL.  
+  * If the model provider is `openai_completions` (or `completions`), `url` defaults to `https://api.openai.com/v1/chat/completions`. 
+  * If the model provider is `nim_completions`, `url` defaults to `https://integrate.api.nvidia.com/v1/chat/completions`.
+* `max_concurrent_requests` - The maximum number of concurrent requests to make to the OpenAI model. Defaults to `25`.
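
The `url` defaulting rule listed above can be illustrated with a small lookup. This is a sketch of the documented behavior only, not the provider's actual implementation; the names `DEFAULT_COMPLETIONS_URLS` and `resolve_url` are hypothetical, while the provider names and URLs are the ones given in the list.

```python
# Default endpoints per provider name, as listed above.
# The dictionary and function names are hypothetical, for illustration only.
DEFAULT_COMPLETIONS_URLS = {
    "completions": "https://api.openai.com/v1/chat/completions",
    "openai_completions": "https://api.openai.com/v1/chat/completions",
    "nim_completions": "https://integrate.api.nvidia.com/v1/chat/completions",
}


def resolve_url(provider: str, config: dict) -> str:
    """Return the explicit `url` from the config, else the provider's default."""
    if "url" in config:
        return config["url"]
    try:
        return DEFAULT_COMPLETIONS_URLS[provider]
    except KeyError:
        raise ValueError(f"unknown completions provider: {provider!r}") from None


# No url in the config: the NIM default applies.
print(resolve_url("nim_completions", {"model": "meta/llama-3.2-1b-instruct"}))
# An explicit url overrides the default for any provider.
print(resolve_url("completions", {"url": "http://llama.local:11434/v1/chat/completions"}))
```

An explicit `url` always wins, which is what allows the same provider to talk to any OpenAI API compatible server.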
+
+## Model credentials
+
+The following credentials are required for these models:
+
+* `api_key` - The API key to use for authentication.
diff --git a/...ls/supported-models/openai-embeddings.mdx → ...or/models/supported-models/embeddings.mdx b/...ls/supported-models/openai-embeddings.mdx → ...or/models/supported-models/embeddings.mdx
@@ -1,16 +1,25 @@
 ---
-title: "OpenAI Embeddings"
-navTitle: "OpenAI Embeddings"
-description: "OpenAI Embeddings is a text embedding model that enables use of any OpenAI text embedding model."
+title: "Embeddings"
+navTitle: "Embeddings"
+description: "Embeddings is a text embedding model that enables use of any OpenAI API compatible text embedding model."
 ---
 
-Model name: `openai_embeddings`
+Model name: `embeddings`
 
-## About OpenAI Embeddings
+Model aliases:
 
-OpenAI Embeddings is a text embedding model that enables use of any supported OpenAI text embedding model. It is suitable for text classification, clustering, and other text embedding tasks.
+* `openai_embeddings`
+* `nim_embeddings`
 
-See a list of supported OpenAI models [here](https://platform.openai.com/docs/guides/embeddings#embedding-models).
+## About Embeddings
+
+Embeddings is a text embedding model that enables use of any OpenAI API compatible text embedding model. It is suitable for text classification, clustering, and other text embedding tasks.
+
+Depending on the name of the model, the model provider will set defaults accordingly.
+
+When invoked as `embeddings` or `openai_embeddings`, the model provider will default to using the OpenAI API.
+
+When invoked as `nim_embeddings`, the model provider will default to using the NVIDIA NIM API.
 
 ## Supported aidb operations
 
@@ -19,10 +28,18 @@ See a list of supported OpenAI models [here](https://platform.openai.com/docs/gu
 
 ## Supported models
 
-* Any text embedding model that is supported by OpenAI. This includes `text-embedding-3-small`, `text-embedding-3-large`, and `text-embedding-ada-002`.
+* Any text embedding model that is supported by the provider.
+
+### Supported OpenAI models
+
+* Any text embedding model that is supported by OpenAI. This includes `text-embedding-3-small`, `text-embedding-3-large`, and `text-embedding-ada-002`. See a list of supported OpenAI models [here](https://platform.openai.com/docs/guides/embeddings#embedding-models).
 * Defaults to `text-embedding-3-small`.
 
-## Creating the default model
+### Supported NIM models
+
+* [nvidia/nv-embedqa-e5-v5](https://build.nvidia.com/nvidia/nv-embedqa-e5-v5) (default)
+
+## Creating the default model with OpenAI
 
 ```sql
 SELECT aidb.create_model('my_openai_embeddings', 
@@ -52,23 +69,11 @@ Because we are passing the configuration options and the credentials, unlike the
 The following configuration settings are available for OpenAI models:
 
 * `model` - The OpenAI model to use.
-* `url` - The URL of the OpenAI model to use. This is optional and can be used to specify a custom model URL. Defaults to `https://api.openai.com/v1/chat/completions`.
+* `url` - The URL of the model to use. This is optional and can be used to specify a custom model URL.  
+  * If `openai_completions` (or `completions`) is the `model`, `url` defaults to `https://api.openai.com/v1/chat/completions`. 
+  * If `nim_completions` is the `model`, `url` defaults to `https://integrate.api.nvidia.com/v1/chat/completions`.
 * `max_concurrent_requests` - The maximum number of concurrent requests to make to the OpenAI model. Defaults to `25`.
 
-## Available OpenAI Embeddings models
-
-* sentence-transformers/all-MiniLM-L6-v2 (default)
-* sentence-transformers/all-MiniLM-L6-v1
-* sentence-transformers/all-MiniLM-L12-v1
-* sentence-transformers/msmarco-bert-base-dot-v5
-* sentence-transformers/multi-qa-MiniLM-L6-dot-v1
-* sentence-transformers/paraphrase-TinyBERT-L6-v2
-* sentence-transformers/all-distilroberta-v1
-* sentence-transformers/all-MiniLM-L6-v2
-* sentence-transformers/multi-qa-MiniLM-L6-cos-v1
-* sentence-transformers/paraphrase-multilingual-mpnet-base-v2
-* sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
-
 ## Model credentials
 
 The following credentials are required for OpenAI models:

diff --git a/advocacy_docs/edb-postgres-ai/ai-accelerator/models/supported-models/index.mdx b/advocacy_docs/edb-postgres-ai/ai-accelerator/models/supported-models/index.mdx
@@ -12,8 +12,11 @@ navigation:
 
 This section provides details of the supported models in EDB Postgres AI - AI Accelerator - Pipelines and their capabilities.
 
-* [T5](t5)
-* [OpenAI Embeddings](openai-embeddings)
-* [OpenAI Completions](openai-completions)
-* [BERT](bert)
-* [CLIP](clip)
+* [T5](t5).
+* [Embeddings](embeddings), including openai-embeddings and nim-embeddings.
+* [Completions](completions), including openai-completions and nim-completions.
+* [BERT](bert).
+* [CLIP](clip).
+* [NIM_CLIP](nim_clip).
+* [NIM_RERANKING](nim_reranking).
+
diff --git a/advocacy_docs/edb-postgres-ai/ai-accelerator/models/supported-models/nim_clip.mdx b/advocacy_docs/edb-postgres-ai/ai-accelerator/models/supported-models/nim_clip.mdx
@@ -0,0 +1,54 @@
+---
+title: "CLIP"
+navTitle: "CLIP"
+description: "CLIP (Contrastive Language-Image Pre-training) is a model that learns visual concepts from natural language supervision."
+---
+
+Model name: `nim_clip`
+
+## About CLIP
+
+CLIP (Contrastive Language-Image Pre-training) is a model that learns visual concepts from natural language supervision. It is a zero-shot learning model that can be used for a wide range of vision and language tasks.
+
+This specific model runs on NVIDIA NIM. More information about CLIP on NIM can be found [here](https://build.nvidia.com/nvidia/nvclip).
+
+
+## Supported aidb operations
+
+* encode_text
+* encode_text_batch
+* encode_image
+* encode_image_batch
+
+## Supported models
+
+### NVIDIA NGC
+
+* nvidia/nvclip (default)
+
+
+## Creating the default model
+
+```sql
+SELECT aidb.create_model(
+    'my_nim_clip_model', 
+    'nim_clip',
+    credentials=>'{"api_key": "<API_KEY_HERE>"}'::JSONB
+);
+```
+
+There is only one model, the default `nvidia/nvclip`, so we do not need to specify the model in the configuration. 
+
+## Model configuration settings
+
+The following configuration settings are available for CLIP models:
+
+* `model` - The NIM model to use. The default is `nvidia/nvclip` and is the only model available.
+* `url` - The URL of the model to use. This is optional and can be used to specify a custom model URL. Defaults to `https://integrate.api.nvidia.com/v1/embeddings`.
+* `dimensions` - Model output vector size, defaults to 1024
+
+## Model credentials
+
+The following credentials are required if executing inside NVIDIA NGC:
+
+* `api_key` - The NVIDIA Cloud API key to use for authentication.