Add support for Dimension parameter for embeddings #144

Merged
merged 3 commits into from
Jan 29, 2024
52 changes: 50 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -136,7 +136,7 @@ We can also set ChatGPT parameters for chat completion at startup. Check the [of

The configuration can be automatically read from [IConfiguration](https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.configuration.iconfiguration), using for example a _ChatGPT_ section in the _appsettings.json_ file:

```yaml
```
"ChatGPT": {
"Provider": "OpenAI", // Optional. Allowed values: OpenAI (default) or Azure
"ApiKey": "", // Required
Expand All @@ -159,6 +159,9 @@ The configuration can be automatically read from [IConfiguration](https://learn.
// "FrequencyPenalty": 0,
// "ResponseFormat": { "Type": "text" }, // Allowed values for Type: text (default) or json_object
// "Seed": 42 // Optional (any integer value)
//},
//"DefaultEmbeddingParameters": {
// "Dimensions": 1536
//}
}
```
Expand Down Expand Up @@ -550,7 +553,52 @@ var response = await chatGptClient.GenerateEmbeddingAsync(message);
var embeddings = response.GetEmbedding();
```

This code will give you a float array containing all the embeddings for the specified message. The length of the array depends on the model used. For example, if we use the _text-embedding-ada-002_ model, the array will contain 1536 elements.
This code will give you a float array containing all the embeddings for the specified message. The length of the array depends on the model used:

| Model | Output dimension |
| - | - |
| text-embedding-ada-002 | 1536 |
| text-embedding-3-small | 1536 |
| text-embedding-3-large | 3072 |

Newer models like _text-embedding-3-small_ and _text-embedding-3-large_ allow developers to trade off performance against the cost of using embeddings. Specifically, developers can shorten embeddings without them losing their concept-representing properties.
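To make the idea of shortening concrete, here is a minimal, self-contained sketch of what a shorter embedding looks like when it is produced on the client side: keep the first components and rescale the result to unit length. This is only an illustration under stated assumptions. `EmbeddingShortener` and the sample vector are hypothetical and not part of ChatGptNet, and in normal use the `Dimensions` parameter lets the service do the shortening for you.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for a full-size embedding returned by the API.
float[] full = { 3f, 4f, 12f, 84f };

// Keep only the first two components and re-normalize them.
float[] shortened = EmbeddingShortener.Shorten(full, 2);
Console.WriteLine(string.Join(" ", shortened)); // two values close to 0.6 and 0.8

// Illustrative helper, not part of ChatGptNet.
public static class EmbeddingShortener
{
    // Keeps the first `dimensions` components and rescales them to unit length.
    public static float[] Shorten(float[] embedding, int dimensions)
    {
        if (dimensions <= 0 || dimensions > embedding.Length)
        {
            throw new ArgumentOutOfRangeException(nameof(dimensions));
        }

        var truncated = embedding.Take(dimensions).ToArray();
        var norm = MathF.Sqrt(truncated.Sum(x => x * x));

        // A zero vector cannot be normalized, so it is returned unchanged.
        return norm == 0f ? truncated : truncated.Select(x => x / norm).ToArray();
    }
}
```

Re-normalizing matters because similarity measures such as the dot product assume unit-length vectors; a truncated vector that is not rescaled would give misleadingly small scores.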

As with the ChatGPT parameters, this setting can be configured in several ways:

- Via code:

```csharp
builder.Services.AddChatGpt(options =>
{
// ...

options.DefaultEmbeddingParameters = new EmbeddingParameters
{
Dimensions = 256
};
});
```

- Using the _appsettings.json_ file:

```
"ChatGPT": {
"DefaultEmbeddingParameters": {
"Dimensions": 256
}
}
```

Then, if you want to change the dimension for a particular request, you can specify the *EmbeddingParameters* argument in the **GenerateEmbeddingAsync** invocation:

```csharp
var response = await chatGptClient.GenerateEmbeddingAsync(request.Message, new EmbeddingParameters
{
Dimensions = 512
});

var embeddings = response.GetEmbedding(); // The length of the array is 512
```
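
The relationship between the default parameters and a per-request override can be pictured as a simple fallback. The sketch below is an assumption about the general idea, not the library's actual implementation: `SketchEmbeddingParameters` and `ParameterResolver` are hypothetical stand-ins, and it assumes that a per-request object with no `Dimensions` value falls back to the default.

```csharp
#nullable enable
using System;

var defaults = new SketchEmbeddingParameters { Dimensions = 256 };
var perRequest = new SketchEmbeddingParameters { Dimensions = 512 };

Console.WriteLine(ParameterResolver.ResolveDimensions(defaults, null));       // 256
Console.WriteLine(ParameterResolver.ResolveDimensions(defaults, perRequest)); // 512

// Hypothetical stand-in for the library's EmbeddingParameters class.
public class SketchEmbeddingParameters
{
    public int? Dimensions { get; set; }
}

// Illustrative helper, not part of ChatGptNet.
public static class ParameterResolver
{
    // The per-request value wins when present; otherwise the default is used.
    public static int? ResolveDimensions(
        SketchEmbeddingParameters defaults,
        SketchEmbeddingParameters? perRequest)
        => perRequest?.Dimensions ?? defaults.Dimensions;
}
```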

If you need to calculate the [cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity) between two embeddings, you can use the **EmbeddingUtility.CosineSimilarity** method.
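
For readers who want to see what the calculation involves, the following is a standalone sketch of cosine similarity: the dot product of the two vectors divided by the product of their lengths. `CosineSketch` is a hypothetical helper written for illustration; it is not the source of **EmbeddingUtility.CosineSimilarity** and may differ from it in details such as how zero vectors are handled.

```csharp
using System;

float[] first = { 1f, 0f };
float[] second = { 0f, 1f };

Console.WriteLine(CosineSketch.Similarity(first, first));  // 1
Console.WriteLine(CosineSketch.Similarity(first, second)); // 0

// Illustrative helper, not part of ChatGptNet.
public static class CosineSketch
{
    public static double Similarity(float[] x, float[] y)
    {
        if (x.Length != y.Length)
        {
            throw new ArgumentException("Vectors must have the same length.");
        }

        double dot = 0, normX = 0, normY = 0;
        for (var i = 0; i < x.Length; i++)
        {
            dot += x[i] * y[i];
            normX += x[i] * x[i];
            normY += y[i] * y[i];
        }

        // Assumption: a zero vector is treated as having similarity 0.
        if (normX == 0 || normY == 0)
        {
            return 0;
        }

        return dot / (Math.Sqrt(normX) * Math.Sqrt(normY));
    }
}
```

Because both embeddings must have the same length, two texts should be embedded with the same model and the same `Dimensions` value before they are compared.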

Expand Down
25 changes: 25 additions & 0 deletions docs/ChatGptNet.Models.Embeddings/EmbeddingParameters.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
# EmbeddingParameters class

Represents embeddings parameters.

```csharp
public class EmbeddingParameters
```

## Public Members

| name | description |
| --- | --- |
| [EmbeddingParameters](EmbeddingParameters/EmbeddingParameters.md)() | The default constructor. |
| [Dimensions](EmbeddingParameters/Dimensions.md) { get; set; } | The number of dimensions the resulting output embeddings should have. Only supported in `text-embedding-3` and later models. |

## Remarks

See [Create embeddings](https://platform.openai.com/docs/api-reference/embeddings/create) for more information.

## See Also

* namespace [ChatGptNet.Models.Embeddings](../ChatGptNet.md)
* [EmbeddingParameters.cs](https://github.com/marcominerva/ChatGptNet/tree/master/src/ChatGptNet/Models/Embeddings/EmbeddingParameters.cs)

<!-- DO NOT EDIT: generated by xmldocmd for ChatGptNet.dll -->
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
# EmbeddingParameters.Dimensions property

The number of dimensions the resulting output embeddings should have. Only supported in `text-embedding-3` and later models.

```csharp
public int? Dimensions { get; set; }
```

## See Also

* class [EmbeddingParameters](../EmbeddingParameters.md)
* namespace [ChatGptNet.Models.Embeddings](../../ChatGptNet.md)

<!-- DO NOT EDIT: generated by xmldocmd for ChatGptNet.dll -->
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
# EmbeddingParameters constructor

The default constructor.

```csharp
public EmbeddingParameters()
```

## See Also

* class [EmbeddingParameters](../EmbeddingParameters.md)
* namespace [ChatGptNet.Models.Embeddings](../../ChatGptNet.md)

<!-- DO NOT EDIT: generated by xmldocmd for ChatGptNet.dll -->
4 changes: 3 additions & 1 deletion docs/ChatGptNet.Models.Embeddings/OpenAIEmbeddingModels.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,9 @@ public static class OpenAIEmbeddingModels

| name | description |
| --- | --- |
| const [TextEmbeddingAda002](OpenAIEmbeddingModels/TextEmbeddingAda002.md) | The second generation embedding model provided by OpenAI. |
| const [TextEmbedding3Large](OpenAIEmbeddingModels/TextEmbedding3Large.md) | Most capable embedding model for both English and non-English tasks. It uses a 3072 output dimension. |
| const [TextEmbedding3Small](OpenAIEmbeddingModels/TextEmbedding3Small.md) | Increased performance over 2nd generation ada embedding model. It uses a 1536 output dimension. |
| const [TextEmbeddingAda002](OpenAIEmbeddingModels/TextEmbeddingAda002.md) | The second generation embedding model provided by OpenAI. It uses a 1536 output dimension. |

## Remarks

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
# OpenAIEmbeddingModels.TextEmbedding3Large field

Most capable embedding model for both English and non-English tasks. It uses a 3072 output dimension.

```csharp
public const string TextEmbedding3Large;
```

## See Also

* class [OpenAIEmbeddingModels](../OpenAIEmbeddingModels.md)
* namespace [ChatGptNet.Models.Embeddings](../../ChatGptNet.md)

<!-- DO NOT EDIT: generated by xmldocmd for ChatGptNet.dll -->
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
# OpenAIEmbeddingModels.TextEmbedding3Small field

Increased performance over 2nd generation ada embedding model. It uses a 1536 output dimension.

```csharp
public const string TextEmbedding3Small;
```

## See Also

* class [OpenAIEmbeddingModels](../OpenAIEmbeddingModels.md)
* namespace [ChatGptNet.Models.Embeddings](../../ChatGptNet.md)

<!-- DO NOT EDIT: generated by xmldocmd for ChatGptNet.dll -->
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# OpenAIEmbeddingModels.TextEmbeddingAda002 field

The second generation embedding model provided by OpenAI.
The second generation embedding model provided by OpenAI. It uses a 1536 output dimension.

```csharp
public const string TextEmbeddingAda002;
Expand Down
1 change: 1 addition & 0 deletions docs/ChatGptNet/ChatGptOptions.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ public class ChatGptOptions
| --- | --- |
| [ChatGptOptions](ChatGptOptions/ChatGptOptions.md)() | The default constructor. |
| [DefaultEmbeddingModel](ChatGptOptions/DefaultEmbeddingModel.md) { get; set; } | Gets or sets the default model for embedding. (default: [`TextEmbeddingAda002`](../ChatGptNet.Models.Embeddings/OpenAIEmbeddingModels/TextEmbeddingAda002.md) when the provider is OpenAI). |
| [DefaultEmbeddingParameters](ChatGptOptions/DefaultEmbeddingParameters.md) { get; } | Gets or sets the default parameters for embeddings. |
| [DefaultModel](ChatGptOptions/DefaultModel.md) { get; set; } | Gets or sets the default model for chat completion. (default: [`Gpt35Turbo`](../ChatGptNet.Models/OpenAIChatGptModels/Gpt35Turbo.md) when the provider is OpenAI). |
| [DefaultParameters](ChatGptOptions/DefaultParameters.md) { get; } | Gets or sets the default parameters for chat completion. |
| [MessageExpiration](ChatGptOptions/MessageExpiration.md) { get; set; } | Gets or sets the expiration for cached conversation messages (default: 1 hour). |
Expand Down
15 changes: 15 additions & 0 deletions docs/ChatGptNet/ChatGptOptions/DefaultEmbeddingParameters.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
# ChatGptOptions.DefaultEmbeddingParameters property

Gets or sets the default parameters for embeddings.

```csharp
public EmbeddingParameters DefaultEmbeddingParameters { get; }
```

## See Also

* class [EmbeddingParameters](../../ChatGptNet.Models.Embeddings/EmbeddingParameters.md)
* class [ChatGptOptions](../ChatGptOptions.md)
* namespace [ChatGptNet](../../ChatGptNet.md)

<!-- DO NOT EDIT: generated by xmldocmd for ChatGptNet.dll -->
3 changes: 2 additions & 1 deletion docs/ChatGptNet/ChatGptOptionsBuilder.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,8 @@ public class ChatGptOptionsBuilder
| name | description |
| --- | --- |
| [ChatGptOptionsBuilder](ChatGptOptionsBuilder/ChatGptOptionsBuilder.md)() | The default constructor. |
| [DefaultEmbeddingModel](ChatGptOptionsBuilder/DefaultEmbeddingModel.md) { get; set; } | Gets or sets the default model for embedding. (default: [`TextEmbeddingAda002`](../ChatGptNet.Models.Embeddings/OpenAIEmbeddingModels/TextEmbeddingAda002.md) when the provider is OpenAI). |
| [DefaultEmbeddingModel](ChatGptOptionsBuilder/DefaultEmbeddingModel.md) { get; set; } | Gets or sets the default model for embeddings. (default: [`TextEmbeddingAda002`](../ChatGptNet.Models.Embeddings/OpenAIEmbeddingModels/TextEmbeddingAda002.md) when the provider is OpenAI). |
| [DefaultEmbeddingParameters](ChatGptOptionsBuilder/DefaultEmbeddingParameters.md) { get; } | Gets or sets the default parameters for embeddings. |
| [DefaultModel](ChatGptOptionsBuilder/DefaultModel.md) { get; set; } | Gets or sets the default model for chat completion. (default: [`Gpt35Turbo`](../ChatGptNet.Models/OpenAIChatGptModels/Gpt35Turbo.md) when the provider is OpenAI). |
| [DefaultParameters](ChatGptOptionsBuilder/DefaultParameters.md) { get; set; } | Gets or sets the default parameters for chat completion. |
| [MessageExpiration](ChatGptOptionsBuilder/MessageExpiration.md) { get; set; } | Gets or sets the expiration for cached conversation messages (default: 1 hour). |
Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# ChatGptOptionsBuilder.DefaultEmbeddingModel property

Gets or sets the default model for embedding. (default: [`TextEmbeddingAda002`](../../ChatGptNet.Models.Embeddings/OpenAIEmbeddingModels/TextEmbeddingAda002.md) when the provider is OpenAI).
Gets or sets the default model for embeddings. (default: [`TextEmbeddingAda002`](../../ChatGptNet.Models.Embeddings/OpenAIEmbeddingModels/TextEmbeddingAda002.md) when the provider is OpenAI).

```csharp
public string? DefaultEmbeddingModel { get; set; }
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
# ChatGptOptionsBuilder.DefaultEmbeddingParameters property

Gets or sets the default parameters for embeddings.

```csharp
public EmbeddingParameters DefaultEmbeddingParameters { get; }
```

## See Also

* class [EmbeddingParameters](../../ChatGptNet.Models.Embeddings/EmbeddingParameters.md)
* class [ChatGptOptionsBuilder](../ChatGptOptionsBuilder.md)
* namespace [ChatGptNet](../../ChatGptNet.md)

<!-- DO NOT EDIT: generated by xmldocmd for ChatGptNet.dll -->
2 changes: 1 addition & 1 deletion docs/ChatGptNet/IChatGptClient.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ public interface IChatGptClient
| [AskStreamAsync](IChatGptClient/AskStreamAsync.md)(…) | Requests a new chat interaction with streaming response, like in ChatGPT. (2 methods) |
| [ConversationExistsAsync](IChatGptClient/ConversationExistsAsync.md)(…) | Checks if a chat conversation exists. |
| [DeleteConversationAsync](IChatGptClient/DeleteConversationAsync.md)(…) | Deletes a chat conversation, clearing all the history. |
| [GenerateEmbeddingAsync](IChatGptClient/GenerateEmbeddingAsync.md)(…) | Generates embeddings for a message. (2 methods) |
| [GenerateEmbeddingAsync](IChatGptClient/GenerateEmbeddingAsync.md)(…) | Generates embeddings for a text. (2 methods) |
| [GetConversationAsync](IChatGptClient/GetConversationAsync.md)(…) | Retrieves a chat conversation from the cache. |
| [LoadConversationAsync](IChatGptClient/LoadConversationAsync.md)(…) | Loads messages into a new conversation. (2 methods) |
| [SetupAsync](IChatGptClient/SetupAsync.md)(…) | Sets up a new conversation with a system message, that is used to influence assistant behavior. (2 methods) |
Expand Down
20 changes: 13 additions & 7 deletions docs/ChatGptNet/IChatGptClient/GenerateEmbeddingAsync.md
Original file line number Diff line number Diff line change
@@ -1,15 +1,17 @@
# IChatGptClient.GenerateEmbeddingAsync method (1 of 2)

Generates embeddings for a list of messages.
Generates embeddings for a list of texts.

```csharp
public Task<EmbeddingResponse> GenerateEmbeddingAsync(IEnumerable<string> messages,
string? model = null, CancellationToken cancellationToken = default)
public Task<EmbeddingResponse> GenerateEmbeddingAsync(IEnumerable<string> texts,
EmbeddingParameters? parameters = null, string? model = null,
CancellationToken cancellationToken = default)
```

| parameter | description |
| --- | --- |
| messages | The messages to use for generating embeddings. |
| texts | The texts to use for generating embeddings. |
| parameters | An [`EmbeddingParameters`](../../ChatGptNet.Models.Embeddings/EmbeddingParameters.md) object used to override the default embedding parameters in the [`DefaultEmbeddingParameters`](../ChatGptOptions/DefaultEmbeddingParameters.md) property. |
| model | The name of the embedding model. If *model* is `null`, then the one specified in the [`DefaultEmbeddingModel`](../ChatGptOptions/DefaultEmbeddingModel.md) property will be used. |
| cancellationToken | The token to monitor for cancellation requests. |

Expand All @@ -26,23 +28,26 @@ The embeddings for the provided messages.
## See Also

* class [EmbeddingResponse](../../ChatGptNet.Models.Embeddings/EmbeddingResponse.md)
* class [EmbeddingParameters](../../ChatGptNet.Models.Embeddings/EmbeddingParameters.md)
* interface [IChatGptClient](../IChatGptClient.md)
* namespace [ChatGptNet](../../ChatGptNet.md)

---

# IChatGptClient.GenerateEmbeddingAsync method (2 of 2)

Generates embeddings for a message.
Generates embeddings for a text.

```csharp
public Task<EmbeddingResponse> GenerateEmbeddingAsync(string message, string? model = null,
public Task<EmbeddingResponse> GenerateEmbeddingAsync(string text,
EmbeddingParameters? parameters = null, string? model = null,
CancellationToken cancellationToken = default)
```

| parameter | description |
| --- | --- |
| message | The message to use for generating embeddings. |
| text | The text to use for generating embeddings. |
| parameters | An [`EmbeddingParameters`](../../ChatGptNet.Models.Embeddings/EmbeddingParameters.md) object used to override the default embedding parameters in the [`DefaultEmbeddingParameters`](../ChatGptOptions/DefaultEmbeddingParameters.md) property. |
| model | The name of the embedding model. If *model* is `null`, then the one specified in the [`DefaultEmbeddingModel`](../ChatGptOptions/DefaultEmbeddingModel.md) property will be used. |
| cancellationToken | The token to monitor for cancellation requests. |

Expand All @@ -59,6 +64,7 @@ The embeddings for the provided message.
## See Also

* class [EmbeddingResponse](../../ChatGptNet.Models.Embeddings/EmbeddingResponse.md)
* class [EmbeddingParameters](../../ChatGptNet.Models.Embeddings/EmbeddingParameters.md)
* interface [IChatGptClient](../IChatGptClient.md)
* namespace [ChatGptNet](../../ChatGptNet.md)

Expand Down
1 change: 1 addition & 0 deletions docs/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,7 @@
| public type | description |
| --- | --- |
| class [EmbeddingData](./ChatGptNet.Models.Embeddings/EmbeddingData.md) | Represents an embedding. |
| class [EmbeddingParameters](./ChatGptNet.Models.Embeddings/EmbeddingParameters.md) | Represents embeddings parameters. |
| class [EmbeddingResponse](./ChatGptNet.Models.Embeddings/EmbeddingResponse.md) | Represents an embedding response. |
| static class [OpenAIEmbeddingModels](./ChatGptNet.Models.Embeddings/OpenAIEmbeddingModels.md) | Contains all the embedding models that are currently supported by OpenAI. |

Expand Down
2 changes: 1 addition & 1 deletion samples/ChatGptApi/Program.cs
Original file line number Diff line number Diff line change
Expand Up @@ -117,7 +117,7 @@
})
.WithOpenApi();

app.MapPost("/api/embeddings/CosineSimilarity", async (CosineSimilarityRequest request, IChatGptClient chatGptClient) =>
app.MapPost("/api/embeddings/cosine-similarity", async (CosineSimilarityRequest request, IChatGptClient chatGptClient) =>
{
var firstEmbeddingResponse = await chatGptClient.GenerateEmbeddingAsync(request.FirstMessage);
var secondEmbeddingResponse = await chatGptClient.GenerateEmbeddingAsync(request.SecondMessage);
Expand Down
11 changes: 7 additions & 4 deletions samples/ChatGptApi/appsettings.json
Original file line number Diff line number Diff line change
@@ -1,17 +1,17 @@
{
"ChatGPT": {
"Provider": "OpenAI", // Optional. Allowed values: OpenAI (default) or Azure
"Provider": "OpenaI", // Optional. Allowed values: OpenAI (default) or Azure
"ApiKey": "", // Required
//"Organization": "", // Optional, used only by OpenAI
//"Organization": "", // Optional, used only by OpenAI
"ResourceName": "", // Required when using Azure OpenAI Service
"ApiVersion": "2023-12-01-preview", // Optional, used only by Azure OpenAI Service (default: 2023-12-01-preview)
"ApiVersion": "2023-12-01-preview", // Optional, used only by Azure OpenAI Service (default: 2023-08-01-preview)
"AuthenticationType": "ApiKey", // Optional, used only by Azure OpenAI Service. Allowed values: ApiKey (default) or ActiveDirectory

"DefaultModel": "my-model",
"DefaultEmbeddingModel": "text-embedding-ada-002", // Optional, set it if you want to use embeddings
"MessageLimit": 20,
"MessageExpiration": "00:30:00",
"ThrowExceptionOnError": true, // Optional, default: true
"ThrowExceptionOnError": true // Optional, default: true
//"User": "UserName",
//"DefaultParameters": {
// "Temperature": 0.8,
Expand All @@ -21,6 +21,9 @@
// "FrequencyPenalty": 0,
// "ResponseFormat": { "Type": "text" }, // Allowed values for Type: text (default) or json_object
// "Seed": 42 // Optional (any integer value)
//},
//"DefaultEmbeddingParameters": {
// "Dimensions": 1536
//}
},
"Logging": {
Expand Down
5 changes: 4 additions & 1 deletion samples/ChatGptConsole/appsettings.json
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@
"DefaultEmbeddingModel": "text-embedding-ada-002", // Optional, set it if you want to use embeddings
"MessageLimit": 20,
"MessageExpiration": "00:30:00",
"ThrowExceptionOnError": true,
"ThrowExceptionOnError": true
//"User": "UserName",
//"DefaultParameters": {
// "Temperature": 0.8,
Expand All @@ -21,6 +21,9 @@
// "FrequencyPenalty": 0,
// "ResponseFormat": { "Type": "text" }, // Allowed values for Type: text (default) or json_object
// "Seed": 42 // Optional (any integer value)
//},
//"DefaultEmbeddingParameters": {
// "Dimensions": 1536
//}
},
"Logging": {
Expand Down
5 changes: 4 additions & 1 deletion samples/ChatGptFunctionCallingConsole/appsettings.json
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@
"DefaultEmbeddingModel": "text-embedding-ada-002", // Optional, set it if you want to use embeddings
"MessageLimit": 20,
"MessageExpiration": "00:30:00",
"ThrowExceptionOnError": true, // Optional, default: true
"ThrowExceptionOnError": true // Optional, default: true
//"User": "UserName",
//"DefaultParameters": {
// "Temperature": 0.8,
Expand All @@ -21,6 +21,9 @@
// "FrequencyPenalty": 0,
// "ResponseFormat": { "Type": "text" }, // Allowed values for Type: text (default) or json_object
// "Seed": 42 // Optional (any integer value)
//},
//"DefaultEmbeddingParameters": {
// "Dimensions": 1536
//}
},
"Logging": {
Expand Down