Commit 0578900: update docs

liqul committed Dec 20, 2023
1 parent 4458d0f commit 0578900
Showing 10 changed files with 13,644 additions and 3,341 deletions.
5 changes: 5 additions & 0 deletions README.md
@@ -60,6 +60,11 @@ data analytics tasks
### Installation
You can install TaskWeaver by running the following command:
```bash
# [optional] create a conda environment to isolate the dependencies
# conda create -n taskweaver python=3.10
# conda activate taskweaver

# clone the repository
git clone https://github.com/microsoft/TaskWeaver.git
cd TaskWeaver
# install the requirements
```
2 changes: 1 addition & 1 deletion taskweaver/planner/compression_prompt.yaml
@@ -18,7 +18,7 @@ content: |-
The summary should be organized in the following format:
```json
{{
"ConversationHistory": "This part summarizes all the conversation rounds",
"ConversationSummary": "This part summarizes all the conversation rounds",
}}
```
76 changes: 76 additions & 0 deletions website/docs/compression.md
@@ -0,0 +1,76 @@
# Prompt Compression

After a few rounds of conversation, the chat history can become quite long, especially when it contains code and execution results.
This can exceed the context window of the LLM.
One way to solve this problem is to summarize the older rounds of the chat history
and keep only the latest rounds verbatim.

Another way is to store the chat history entries in a vector database and retrieve only the entries relevant
to the current user request. However, in TaskWeaver, code is also part of the chat history.
Skipping intermediate code and execution results would prevent the LLM from correctly
generating code for the current user request. Therefore, we take the first approach.

The following figure shows the idea of chat history summarization, where the chat history is divided into two parts:
- Rounds to compress: this part is summarized, and only the summary is kept in the chat history. If a ConversationSummary
already exists, a new summary is generated from the previous summary plus the rounds being compressed.
- Rounds to retain: this part is kept in the chat history without summarization.

```mermaid
flowchart LR
ConversationSummary-->Round1
subgraph Rounds to compress
Round1-->Round2
end
subgraph Rounds to retain
Round2-->Round3-->Round4-->Round5
end
```
At the beginning, the ConversationSummary is empty.
Once the chat history reaches `rounds_to_compress` (default 2) plus `rounds_to_retain` (default 3) rounds,
the ConversationSummary is generated from the oldest `rounds_to_compress` rounds, and the `rounds_to_retain` rounds are kept in the chat history.
After that, only `rounds_to_retain` rounds remain in the chat history.
The next time the chat history again reaches `rounds_to_compress` plus `rounds_to_retain` rounds,
a new ConversationSummary is generated from the oldest `rounds_to_compress` rounds and the previous ConversationSummary.
These two parameters control the frequency of chat history summarization.
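The following is a minimal sketch of this trigger logic. The names (`summarize`, `maybe_compress`) are hypothetical, and `summarize` is a stand-in for the actual LLM call; this illustrates the mechanism rather than the actual TaskWeaver implementation.

```python
from typing import List, Tuple

ROUNDS_TO_COMPRESS = 2  # round_compressor.rounds_to_compress
ROUNDS_TO_RETAIN = 3    # round_compressor.rounds_to_retain


def summarize(previous_summary: str, rounds: List[str]) -> str:
    # Stand-in for the LLM call that folds the compressed rounds
    # into the previous ConversationSummary.
    return f"{previous_summary} {' '.join(rounds)}".strip()


def maybe_compress(summary: str, rounds: List[str]) -> Tuple[str, List[str]]:
    # Compression triggers once the history holds
    # rounds_to_compress + rounds_to_retain rounds.
    if len(rounds) >= ROUNDS_TO_COMPRESS + ROUNDS_TO_RETAIN:
        summary = summarize(summary, rounds[:ROUNDS_TO_COMPRESS])
        rounds = rounds[ROUNDS_TO_COMPRESS:]
    return summary, rounds


# Example: with 5 rounds in the history, the oldest 2 are folded
# into the summary and the latest 3 are retained.
summary, rounds = maybe_compress("", ["r1", "r2", "r3", "r4", "r5"])
assert summary == "r1 r2" and rounds == ["r3", "r4", "r5"]
```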

An example of the chat history summarization in the Code Generator is shown below:

```json
{
"ConversationSummary": "The user requested the generation of 100 random numbers, which was successfully executed. Then, the user asked to show the top 5 largest numbers from the generated random numbers. The assistant provided a code snippet to sort the generated random numbers in descending order and select the top 5 largest numbers, which was also successfully executed. After that, the user requested to plot the distribution of the 100 numbers, which was successfully executed. The user then asked to count the frequency of numbers in each bin of the histogram and identify the bin with the most numbers for the 0.1 bin width, which was also successfully executed.",
"Variables": [
{
"name": "random_numbers_100",
"type": "numpy array",
"description": "An array containing 100 random numbers generated using np.random.rand()"
},
{
"name": "top_5_largest",
"type": "numpy array",
"description": "An array containing the top 5 largest numbers from the generated random numbers"
}
]
}
```
The JSON object has two fields:
- ConversationSummary: the summary of the chat history.
- Variables: the variables in the chat history that could be used in the current user request.

The chat history summary of the Planner has only the ConversationSummary field.

The actual code generated in the summarized rounds is ignored, and only the variables are kept in the summary
so that the LLM can still refer to these variables in future code generation.
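For instance, a later round could generate code that reuses `top_5_largest` from the summary's `Variables` list. The sketch below is hypothetical; the first two assignments recreate the state described in the summary above so that the snippet is self-contained.

```python
import numpy as np

# State described in the summary's Variables list.
random_numbers_100 = np.random.rand(100)
top_5_largest = np.sort(random_numbers_100)[::-1][:5]

# A later round can build on these variables without the LLM
# re-generating the code from the compressed rounds.
mean_of_top_5 = float(np.mean(top_5_largest))
print(f"Mean of the top 5 largest numbers: {mean_of_top_5:.4f}")
```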

Note that chat history summarization requires calling the LLM, which incurs additional latency and cost.
The prompts for chat history summarization can be found for the [planner](../taskweaver/planner/compression_prompt.yaml)
and the [code generator](../taskweaver/code_interpreter/code_generator/compression_prompt.yaml).

## Configurations
As explained above, two parameters control the chat history summarization:
`round_compressor.rounds_to_compress` (default 2) and `round_compressor.rounds_to_retain` (default 3).
To enable chat history summarization, set `planner.prompt_compression`
and `code_generator.prompt_compression` to `true`.
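For example, a `project/taskweaver_config.json` with compression enabled for both roles could contain the following entries (the two compressor values shown are the defaults and can be omitted):

```json
{
  "planner.prompt_compression": true,
  "code_generator.prompt_compression": true,
  "round_compressor.rounds_to_compress": 2,
  "round_compressor.rounds_to_retain": 3
}
```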




51 changes: 28 additions & 23 deletions website/docs/configurations.md
@@ -1,35 +1,40 @@
# Configuration File
The configuration file is located at `project/taskweaver_config.json`.
You can edit this file to configure TaskWeaver.
The configuration file is in JSON format, so boolean values must be `true` or `false` (not `True` or `False`),
and null values must be `null` (not `None` or `"null"`). All other values should be strings in double quotes.
A sample configuration is shown at the end of this section.
The following table lists the parameters in the configuration file:

| Parameter | Description | Default Value |
|-----------|-------------|---------------|
| `llm.model` | The model name used by the language model. | `gpt-4` |
| `llm.backup_model` | The model name used for self-correction purposes. | `null` |
| `llm.api_base` | The base URL of the OpenAI API. | `https://api.openai.com/v1` |
| `llm.api_key` | The API key of the OpenAI API. | `null` |
| `llm.api_type` | The type of the OpenAI API; can be `openai` or `azure`. | `openai` |
| `llm.api_version` | The version of the OpenAI API. | `2023-07-01-preview` |
| `llm.response_format` | The response format of the OpenAI API; can be `json_object`, `text`, or `null`. | `json_object` |
| `code_interpreter.code_verification_on` | Whether to enable code verification. | `false` |
| `code_interpreter.plugin_only` | Whether to turn on the plugin-only mode. | `false` |
| `code_interpreter.allowed_modules` | The list of allowed modules to import in code generation. | `"pandas", "matplotlib", "numpy", "sklearn", "scipy", "seaborn", "datetime", "typing"` |
| `logging.log_file` | The name of the log file. | `taskweaver.log` |
| `logging.log_folder` | The folder to store the log file. | `logs` |
| `plugin.base_path` | The folder to store plugins. | `${AppBaseDir}/plugins` |
| `planner.example_base_path` | The folder to store planner examples. | `${AppBaseDir}/planner_examples` |
| `planner.prompt_compression` | Whether to compress the chat history for the planner. | `false` |
| `planner.skip_planning` | Whether to skip the LLM planning process and use the default plan. | `false` |
| `code_generator.example_base_path` | The folder to store code interpreter examples. | `${AppBaseDir}/codeinterpreter_examples` |
| `code_generator.prompt_compression` | Whether to compress the chat history for the code interpreter. | `false` |
| `code_generator.enable_auto_plugin_selection` | Whether to enable auto plugin selection. | `false` |
| `code_generator.auto_plugin_selection_topk` | The number of auto-selected plugins in each round. | `3` |
| `session.max_internal_chat_round_num` | The maximum number of internal chat rounds between the Planner and the Code Interpreter. | `10` |
| `session.code_interpreter_only` | Allow users to directly communicate with the Code Interpreter. | `false` |
| `round_compressor.rounds_to_compress` | The number of rounds to compress. | `2` |
| `round_compressor.rounds_to_retain` | The number of rounds to retain. | `3` |


> 💡 `${AppBaseDir}` is the project directory.
> 💡 As of 11/30/2023, the `json_object` and `text` options of `llm.response_format` are only supported by OpenAI models 1106 or later. If you are using an earlier model, set `llm.response_format` to `null`.
> 💡 Read [this](compression.md) for more information about `planner.prompt_compression` and `code_generator.prompt_compression`.
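For example, a minimal `taskweaver_config.json` following these rules could look like the snippet below; the API key is a placeholder, and the values should be adjusted for your deployment.

```json
{
  "llm.api_type": "openai",
  "llm.api_base": "https://api.openai.com/v1",
  "llm.api_key": "YOUR_API_KEY",
  "llm.model": "gpt-4",
  "llm.response_format": null,
  "planner.prompt_compression": false
}
```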
5 changes: 3 additions & 2 deletions website/docs/example/example.md → website/docs/example.md
@@ -1,9 +1,10 @@
# Customizing Examples

There are two types of examples: (1) planning examples and (2) code interpreter examples.
Planning examples demonstrate how to use TaskWeaver to plan for a specific task.
Code interpreter examples demonstrate how to generate code or orchestrate plugins to perform a specific task.

## Planning Examples

A planning example tells LLMs how to plan for a specific query from the user; talk to the code interpreter;
receive the execution result from the code interpreter; and summarize the execution result.
@@ -75,7 +76,7 @@ rounds:

```yaml
content: 2. report the result to the user
```
## Code Interpreter Examples
A code interpreter example tells LLMs how to generate code or orchestrate plugins to perform a specific task.
The task comes from the planner. Before constructing the code interpreter example, we strongly encourage you to
2 changes: 1 addition & 1 deletion website/docs/run_pytest.md
@@ -1,4 +1,4 @@
# Running pytest

## Quickstart

9 changes: 7 additions & 2 deletions website/docusaurus.config.js
@@ -35,6 +35,10 @@ const config = {
locales: ['en'],
},

markdown: {
mermaid: true,
},

presets: [
[
'classic',
@@ -134,8 +138,8 @@ const config = {
copyright: `Copyright © ${new Date().getFullYear()} TaskWeaver`,
},
prism: {
darkTheme: prismThemes.github,
theme: prismThemes.dracula,
},
}),
themes: [
@@ -161,6 +165,7 @@ const config = {
hideSearchBarWithNoSearchContext: true,
}),
],
'@docusaurus/theme-mermaid'
],
};
export default config;