From 9c598a758b69a95b214815ce0c5b9cf9be6c6fa6 Mon Sep 17 00:00:00 2001
From: Yaliang Wu
Date: Wed, 21 Feb 2024 14:54:26 -0800
Subject: [PATCH] add tutorial for chatbot with rag (#2141)

* add tutorial for chatbot with rag

Signed-off-by: Yaliang Wu

* Apply suggestions from code review

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Yaliang Wu

* address comment

Signed-off-by: Yaliang Wu

---------

Signed-off-by: Yaliang Wu
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
---
 .../agent_framework/Chatbot_with_RAG.md | 317 ++++++++++++++++++
 1 file changed, 317 insertions(+)
 create mode 100644 docs/tutorials/agent_framework/Chatbot_with_RAG.md

diff --git a/docs/tutorials/agent_framework/Chatbot_with_RAG.md b/docs/tutorials/agent_framework/Chatbot_with_RAG.md
new file mode 100644
index 0000000000..e697ea2ee5
--- /dev/null
+++ b/docs/tutorials/agent_framework/Chatbot_with_RAG.md
@@ -0,0 +1,317 @@
# Topic

> Agent Framework is an experimental feature released in OpenSearch 2.12 and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).

> This tutorial doesn't explain what retrieval-augmented generation (RAG) is.

One of the known limitations of large language models (LLMs) is that their knowledge base only contains information up to the time when they were trained.
LLMs have no knowledge of recent events or your internal data.
You can augment the LLM knowledge base by using retrieval-augmented generation (RAG).

This tutorial explains how to build your own chatbot using the Agent Framework and RAG, supplementing the LLM knowledge with information contained in OpenSearch indexes.

Note: Replace the placeholders that start with `your_` with your own values.

# Steps

## 0. Preparation

Follow step 0 and step 1 of [RAG_with_conversational_flow_agent](./RAG_with_conversational_flow_agent.md) to set up
the `test_population_data` index, which provides supplementary information to the LLM. The index contains population data for US cities.

Note the embedding model ID, which you'll use in the next steps.

Create an ingest pipeline:
```
PUT /_ingest/pipeline/test_tech_news_pipeline
{
  "description": "text embedding pipeline for tech news",
  "processors": [
    {
      "text_embedding": {
        "model_id": "your_text_embedding_model_id",
        "field_map": {
          "passage": "passage_embedding"
        }
      }
    }
  ]
}
```

Next, create another index named `test_tech_news`, which contains recent tech news:

```
PUT test_tech_news
{
  "mappings": {
    "properties": {
      "passage": {
        "type": "text"
      },
      "passage_embedding": {
        "type": "knn_vector",
        "dimension": 384
      }
    }
  },
  "settings": {
    "index": {
      "knn.space_type": "cosinesimil",
      "default_pipeline": "test_tech_news_pipeline",
      "knn": "true"
    }
  }
}
```

Ingest data:
- The first two documents are from Wikipedia ([Apple Vision Pro](https://en.wikipedia.org/wiki/Apple_Vision_Pro) and [LLaMA](https://en.wikipedia.org/wiki/LLaMA)).
- The third document is from the [Amazon Bedrock documentation](https://aws.amazon.com/bedrock/).
```
POST _bulk
{"index":{"_index":"test_tech_news"}}
{"passage":"Apple Vision Pro is a mixed-reality headset developed by Apple Inc. It was announced on June 5, 2023, at Apple's Worldwide Developers Conference, and pre-orders began on January 19, 2024. It became available for purchase on February 2, 2024, in the United States.[10] A worldwide launch has yet to be scheduled. The Vision Pro is Apple's first new major product category since the release of the Apple Watch in 2015.[11]\n\nApple markets the Vision Pro as a \"spatial computer\" where digital media is integrated with the real world. Physical inputs—such as motion gestures, eye tracking, and speech recognition—can be used to interact with the system.[10] Apple has avoided marketing the device as a virtual reality headset, along with the use of the terms \"virtual reality\" and \"augmented reality\" when discussing the product in presentations and marketing.[12]\n\nThe device runs visionOS,[13] a mixed-reality operating system derived from iOS frameworks using a 3D user interface; it supports multitasking via windows that appear to float within the user's surroundings,[14] as seen by cameras built into the headset. A dial on the top of the headset can be used to mask the camera feed with a virtual environment to increase immersion. The OS supports avatars (officially called \"Personas\"), which are generated by scanning the user's face; a screen on the front of the headset displays a rendering of the avatar's eyes (\"EyeSight\"), which are used to indicate the user's level of immersion to bystanders, and assist in communication.[15]"}
{"index":{"_index":"test_tech_news"}}
{"passage":"LLaMA (Large Language Model Meta AI) is a family of autoregressive large language models (LLMs), released by Meta AI starting in February 2023.\n\nFor the first version of LLaMA, four model sizes were trained: 7, 13, 33, and 65 billion parameters. LLaMA's developers reported that the 13B parameter model's performance on most NLP benchmarks exceeded that of the much larger GPT-3 (with 175B parameters) and that the largest model was competitive with state of the art models such as PaLM and Chinchilla.[1] Whereas the most powerful LLMs have generally been accessible only through limited APIs (if at all), Meta released LLaMA's model weights to the research community under a noncommercial license.[2] Within a week of LLaMA's release, its weights were leaked to the public on 4chan via BitTorrent.[3]\n\nIn July 2023, Meta released several models as Llama 2, using 7, 13 and 70 billion parameters.\n\nLLaMA-2\n\nOn July 18, 2023, in partnership with Microsoft, Meta announced LLaMA-2, the next generation of LLaMA. Meta trained and released LLaMA-2 in three model sizes: 7, 13, and 70 billion parameters.[4] The model architecture remains largely unchanged from that of LLaMA-1 models, but 40% more data was used to train the foundational models.[5] The accompanying preprint[5] also mentions a model with 34B parameters that might be released in the future upon satisfying safety targets.\n\nLLaMA-2 includes both foundational models and models fine-tuned for dialog, called LLaMA-2 Chat. In further departure from LLaMA-1, all models are released with weights, and are free for many commercial use cases. However, due to some remaining restrictions, the description of LLaMA as open source has been disputed by the Open Source Initiative (known for maintaining the Open Source Definition).[6]\n\nIn November 2023, research conducted by Patronus AI, an artificial intelligence startup company, compared performance of LLaMA-2, OpenAI's GPT-4 and GPT-4-Turbo, and Anthropic's Claude2 on two versions of a 150-question test about information in SEC filings (e.g. Form 10-K, Form 10-Q, Form 8-K, earnings reports, earnings call transcripts) submitted by public companies to the agency where one version of the test required the generative AI models to use a retrieval system to locate the specific SEC filing to answer the questions while the other version provided the specific SEC filing to the models to answer the question (i.e. in a long context window). On the retrieval system version, GPT-4-Turbo and LLaMA-2 both failed to produce correct answers to 81% of the questions, while on the long context window version, GPT-4-Turbo and Claude-2 failed to produce correct answers to 21% and 24% of the questions respectively.[7][8]"}
{"index":{"_index":"test_tech_news"}}
{"passage":"Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI. Using Amazon Bedrock, you can easily experiment with and evaluate top FMs for your use case, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that execute tasks using your enterprise systems and data sources. Since Amazon Bedrock is serverless, you don't have to manage any infrastructure, and you can securely integrate and deploy generative AI capabilities into your applications using the AWS services you are already familiar with."}

```
## 1. Create LLM

Follow step 2 (Prepare LLM) of [RAG_with_conversational_flow_agent](./RAG_with_conversational_flow_agent.md) to set up the Bedrock Claude model.

Note the model ID; you will use it in the following steps.

## 2. Create Agent
Create an agent of the `conversational` type.

Both `conversational_flow` and `conversational` agents support conversation history.

The `conversational_flow` and `conversational` agents differ in the following ways:
- The `conversational_flow` agent runs tools sequentially, in a predefined order.
- The `conversational` agent dynamically chooses which tool to run next.

In this tutorial, the agent includes two tools: one provides population data of US cities, and the other provides recent tech news.
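Each tool in this agent is a `VectorDBTool`, which retrieves supporting documents by running a vector search against its configured index with the embedding model. Before registering the agent, you can preview what a tool will retrieve by running a similar query yourself. The following is a minimal sketch, assuming the `neural` query type provided by the Neural Search plugin; replace `your_text_embedding_model_id` with the embedding model ID you noted earlier:

```
GET test_tech_news/_search
{
  "_source": "passage",
  "query": {
    "neural": {
      "passage_embedding": {
        "query_text": "What's vision pro",
        "model_id": "your_text_embedding_model_id",
        "k": 2
      }
    }
  }
}
```

If the ingest pipeline and embedding model are working, the Apple Vision Pro passage should be the top hit, which mirrors what the `tech_news_knowledge_base` tool will pass to the LLM.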
Explanation of the registration request that follows:

- `"max_iteration": 5`: The agent runs the LLM a maximum of 5 times.
- `"response_filter": "$.completion"`: Retrieves the LLM answer from the Bedrock Claude model response.
- `"doc_size": 3` (in `population_data_knowledge_base`): Specifies that the top 3 documents are returned.

```
POST _plugins/_ml/agents/_register
{
  "name": "Chat Agent with RAG",
  "type": "conversational",
  "description": "this is a test agent",
  "llm": {
    "model_id": "your_llm_model_id",
    "parameters": {
      "max_iteration": 5,
      "response_filter": "$.completion"
    }
  },
  "memory": {
    "type": "conversation_index"
  },
  "tools": [
    {
      "type": "VectorDBTool",
      "name": "population_data_knowledge_base",
      "description": "This tool provides population data of US cities.",
      "parameters": {
        "input": "${parameters.question}",
        "index": "test_population_data",
        "source_field": [
          "population_description"
        ],
        "model_id": "your_text_embedding_model_id",
        "embedding_field": "population_description_embedding",
        "doc_size": 3
      }
    },
    {
      "type": "VectorDBTool",
      "name": "tech_news_knowledge_base",
      "description": "This tool provides recent tech news.",
      "parameters": {
        "input": "${parameters.question}",
        "index": "test_tech_news",
        "source_field": [
          "passage"
        ],
        "model_id": "your_text_embedding_model_id",
        "embedding_field": "passage_embedding",
        "doc_size": 2
      }
    }
  ],
  "app_type": "chat_with_rag"
}
```
Note the agent ID; you will use it in the next step.

## 3. Test Agent

The `conversational` agent supports a `verbose` option. You can set `verbose` to `true` to obtain detailed steps.

Alternatively, you can use the Get Trace Data API:
```
GET _plugins/_ml/memory/message/your_message_id/traces
```

### 3.1 Start a new conversation
- Example 1: Ask a question related to tech news:
```
POST _plugins/_ml/agents/your_agent_id/_execute
{
  "parameters": {
    "question": "What's vision pro",
    "verbose": true
  }
}
```
In the response, note that the agent runs the `tech_news_knowledge_base` tool to obtain the top 2 documents. The agent then passes these documents as context to the LLM.
The LLM uses the context to answer the question correctly.
```
{
  "inference_results": [
    {
      "output": [
        {
          "name": "memory_id",
          "result": "eLVSxI0B8vrNLhb9nxto"
        },
        {
          "name": "parent_interaction_id",
          "result": "ebVSxI0B8vrNLhb9nxty"
        },
        {
          "name": "response",
          "result": """{
  "thought": "I don't have enough context to answer the question directly. Let me check the tech_news_knowledge_base tool to see if it can provide more information.",
  "action": "tech_news_knowledge_base",
  "action_input": "{\"query\":\"What's vision pro\"}"
}"""
        },
        {
          "name": "response",
          "result": """{"_index":"test_tech_news","_source":{"passage":"Apple Vision Pro is a mixed-reality headset developed by Apple Inc. It was announced on June 5, 2023, at Apple\u0027s Worldwide Developers Conference, and pre-orders began on January 19, 2024. It became available for purchase on February 2, 2024, in the United States.[10] A worldwide launch has yet to be scheduled. The Vision Pro is Apple\u0027s first new major product category since the release of the Apple Watch in 2015.[11]\n\nApple markets the Vision Pro as a \"spatial computer\" where digital media is integrated with the real world. Physical inputs—such as motion gestures, eye tracking, and speech recognition—can be used to interact with the system.[10] Apple has avoided marketing the device as a virtual reality headset, along with the use of the terms \"virtual reality\" and \"augmented reality\" when discussing the product in presentations and marketing.[12]\n\nThe device runs visionOS,[13] a mixed-reality operating system derived from iOS frameworks using a 3D user interface; it supports multitasking via windows that appear to float within the user\u0027s surroundings,[14] as seen by cameras built into the headset. A dial on the top of the headset can be used to mask the camera feed with a virtual environment to increase immersion. The OS supports avatars (officially called \"Personas\"), which are generated by scanning the user\u0027s face; a screen on the front of the headset displays a rendering of the avatar\u0027s eyes (\"EyeSight\"), which are used to indicate the user\u0027s level of immersion to bystanders, and assist in communication.[15]"},"_id":"lrU8xI0B8vrNLhb9yBpV","_score":0.6700683}
{"_index":"test_tech_news","_source":{"passage":"Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI. Using Amazon Bedrock, you can easily experiment with and evaluate top FMs for your use case, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that execute tasks using your enterprise systems and data sources. Since Amazon Bedrock is serverless, you don\u0027t have to manage any infrastructure, and you can securely integrate and deploy generative AI capabilities into your applications using the AWS services you are already familiar with."},"_id":"mLU8xI0B8vrNLhb9yBpV","_score":0.5604863}
"""
        },
        {
          "name": "response",
          "result": "Vision Pro is a mixed-reality headset developed by Apple that was announced in 2023. It uses cameras and sensors to overlay digital objects and information on the real world. The device runs an operating system called visionOS that allows users to interact with windows and apps in a 3D environment using gestures, eye tracking, and voice commands."
        }
      ]
    }
  ]
}
```
Alternatively, you can check the detailed steps in the trace data:
```
GET _plugins/_ml/memory/message/ebVSxI0B8vrNLhb9nxty/traces
```

- Example 2: Ask a question related to city population:
```
POST _plugins/_ml/agents/your_agent_id/_execute
{
  "parameters": {
    "question": "What's the population of Seattle 2023",
    "verbose": true
  }
}
```
In the response, note that the agent runs the `population_data_knowledge_base` tool to obtain the top 3 documents. The agent then passes these documents as context to the LLM.
The LLM then uses the context to answer the question correctly.
```
{
  "inference_results": [
    {
      "output": [
        {
          "name": "memory_id",
          "result": "l7VUxI0B8vrNLhb9sRuQ"
        },
        {
          "name": "parent_interaction_id",
          "result": "mLVUxI0B8vrNLhb9sRub"
        },
        {
          "name": "response",
          "result": """{
  "thought": "Let me check the population data tool to find the most recent population estimate for Seattle",
  "action": "population_data_knowledge_base",
  "action_input": "{\"city\":\"Seattle\"}"
}"""
        },
        {
          "name": "response",
          "result": """{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the Seattle metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Seattle in 2023 is 3,519,000, a 0.86% increase from 2022.\\nThe metro area population of Seattle in 2022 was 3,489,000, a 0.81% increase from 2021.\\nThe metro area population of Seattle in 2021 was 3,461,000, a 0.82% increase from 2020.\\nThe metro area population of Seattle in 2020 was 3,433,000, a 0.79% increase from 2019."},"_id":"BxF5vo0BubpYKX5ER0fT","_score":0.65775126}
{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the Seattle metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Seattle in 2023 is 3,519,000, a 0.86% increase from 2022.\\nThe metro area population of Seattle in 2022 was 3,489,000, a 0.81% increase from 2021.\\nThe metro area population of Seattle in 2021 was 3,461,000, a 0.82% increase from 2020.\\nThe metro area population of Seattle in 2020 was 3,433,000, a 0.79% increase from 2019."},"_id":"7DrZvo0BVR2NrurbRIAE","_score":0.65775126}
{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."},"_id":"AxF5vo0BubpYKX5ER0fT","_score":0.56461215}
"""
        },
        {
          "name": "response",
          "result": "According to the population data tool, the population of Seattle in 2023 is approximately 3,519,000 people, a 0.86% increase from 2022."
        }
      ]
    }
  ]
}
```

### 3.2 Continue a conversation
To continue a conversation, provide its `memory_id` in the request:
```
POST _plugins/_ml/agents/your_agent_id/_execute
{
  "parameters": {
    "question": "What's the population of Austin 2023, compare with Seattle",
    "memory_id": "l7VUxI0B8vrNLhb9sRuQ",
    "verbose": true
  }
}
```
In the response, note that the `population_data_knowledge_base` tool doesn't return the population of Seattle.
Instead, the agent learns the population of Seattle from historical messages: +``` +{ + "inference_results": [ + { + "output": [ + { + "name": "memory_id", + "result": "l7VUxI0B8vrNLhb9sRuQ" + }, + { + "name": "parent_interaction_id", + "result": "B7VkxI0B8vrNLhb9mxy0" + }, + { + "name": "response", + "result": """{ + "thought": "Let me check the population data tool first", + "action": "population_data_knowledge_base", + "action_input": "{\"city\":\"Austin\",\"year\":2023}" +}""" + }, + { + "name": "response", + "result": """{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the Austin metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Austin in 2023 is 2,228,000, a 2.39% increase from 2022.\\nThe metro area population of Austin in 2022 was 2,176,000, a 2.79% increase from 2021.\\nThe metro area population of Austin in 2021 was 2,117,000, a 3.12% increase from 2020.\\nThe metro area population of Austin in 2020 was 2,053,000, a 3.43% increase from 2019."},"_id":"BhF5vo0BubpYKX5ER0fT","_score":0.69129956} +{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the Austin metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Austin in 2023 is 2,228,000, a 2.39% increase from 2022.\\nThe metro area population of Austin in 2022 was 2,176,000, a 2.79% increase from 2021.\\nThe metro area population of Austin in 2021 was 2,117,000, a 3.12% increase from 2020.\\nThe metro area population of Austin in 2020 was 2,053,000, a 3.43% increase from 2019."},"_id":"6zrZvo0BVR2NrurbRIAE","_score":0.69129956} +{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."},"_id":"AxF5vo0BubpYKX5ER0fT","_score":0.61015373} +""" + }, + { + "name": "response", + "result": "According to the population data tool, the population of Austin in 2023 is approximately 2,228,000 people, a 2.39% increase from 2022. This is lower than the population of Seattle in 2023 which is approximately 3,519,000 people, a 0.86% increase from 2022." + } + ] + } + ] +} +``` \ No newline at end of file
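To review the conversation history that the agent stored, you can retrieve the messages in the memory. The following is a minimal sketch using the Get Messages API with the `memory_id` from the preceding responses; the exact message format may vary by OpenSearch version:

```
GET _plugins/_ml/memory/l7VUxI0B8vrNLhb9sRuQ/messages
```

Each returned message contains an input question and the final response. This is the conversation history that the agent consults when you continue a conversation by passing a `memory_id`.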