Skip to content

Commit

Permalink
Merge pull request #103 from ymcui/privategpt
Browse files Browse the repository at this point in the history
Add privateGPT support
  • Loading branch information
ymcui authored Aug 9, 2023
2 parents 8c8506f + cbd188a commit b6aee5b
Show file tree
Hide file tree
Showing 5 changed files with 241 additions and 6 deletions.
5 changes: 2 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@
- 🚀 针对Llama-2模型扩充了**新版中文词表**,开源了中文LLaMA-2和Alpaca-2大模型
- 🚀 开源了预训练脚本、指令精调脚本,用户可根据需要进一步训练模型
- 🚀 使用个人电脑的CPU/GPU快速在本地进行大模型量化和部署体验
- 🚀 支持[🤗transformers](https://github.com/huggingface/transformers), [llama.cpp](https://github.com/ggerganov/llama.cpp), [text-generation-webui](https://github.com/oobabooga/text-generation-webui), [LangChain](https://github.com/hwchase17/langchain), [vLLM](https://github.com/vllm-project/vllm)等LLaMA生态
- 🚀 支持[🤗transformers](https://github.com/huggingface/transformers), [llama.cpp](https://github.com/ggerganov/llama.cpp), [text-generation-webui](https://github.com/oobabooga/text-generation-webui), [LangChain](https://github.com/hwchase17/langchain), [privateGPT](https://github.com/imartinez/privateGPT), [vLLM](https://github.com/vllm-project/vllm)等LLaMA生态
- 目前已开源的模型:Chinese-LLaMA-2-7B, Chinese-Alpaca-2-7B (更大的模型可先参考[一期项目](https://github.com/ymcui/Chinese-LLaMA-Alpaca))

![](./pics/screencast.gif)
Expand Down Expand Up @@ -133,11 +133,10 @@
| [**仿OpenAI API调用**](https://platform.openai.com/docs/api-reference) | 仿OpenAI API接口的服务器Demo ||||||| [link](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/api_calls_zh) |
| [**text-generation-webui**](https://github.com/oobabooga/text-generation-webui) | 前端Web UI界面的部署方式 ||||||| [link](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/text-generation-webui_zh) |
| [**LangChain**](https://github.com/hwchase17/langchain) | 适合二次开发的大模型应用开源框架 | ✅<sup>†</sup> || ✅<sup>†</sup> |||| [link](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/langchain_zh) |
| [**privateGPT**](https://github.com/imartinez/privateGPT) | 基于LangChain的多文档本地问答框架 ||||||| [link](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/privategpt_zh) |

<sup>†</sup>: LangChain框架支持,但教程中未实现;详细说明请参考LangChain官方文档。

⚠️ 一代模型相关推理与部署支持将陆续迁移到本项目,届时将同步更新相关教程。


## 系统效果

Expand Down
5 changes: 2 additions & 3 deletions README_EN.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ This project is based on the Llama-2, released by Meta, and it is the second gen
- 🚀 New extended Chinese vocabulary beyond Llama-2, open-sourcing the Chinese LLaMA-2 and Alpaca-2 LLMs.
- 🚀 Open-sourced the pre-training and instruction finetuning (SFT) scripts for further tuning on user's data
- 🚀 Quickly deploy and experience the quantized LLMs on CPU/GPU of personal PC
- 🚀 Support for LLaMA ecosystems like [🤗transformers](https://github.com/huggingface/transformers), [llama.cpp](https://github.com/ggerganov/llama.cpp), [text-generation-webui](https://github.com/oobabooga/text-generation-webui), [LangChain](https://github.com/hwchase17/langchain), [vLLM](https://github.com/vllm-project/vllm) etc.
- 🚀 Support for LLaMA ecosystems like [🤗transformers](https://github.com/huggingface/transformers), [llama.cpp](https://github.com/ggerganov/llama.cpp), [text-generation-webui](https://github.com/oobabooga/text-generation-webui), [LangChain](https://github.com/hwchase17/langchain), [privateGPT](https://github.com/imartinez/privateGPT), [vLLM](https://github.com/vllm-project/vllm) etc.
- The currently open-source models are Chinese-LLaMA-2-7B and Chinese-Alpaca-2-7B (check our [first-gen project](https://github.com/ymcui/Chinese-LLaMA-Alpaca) for more models).

![](./pics/screencast.gif)
Expand Down Expand Up @@ -127,11 +127,10 @@ The models in this project mainly support the following quantization, inference,
| [**OpenAI API Calls**](https://platform.openai.com/docs/api-reference) | A server that implements OpenAI API ||||||| [link](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/api_calls_en) |
| [**text-generation-webui**](https://github.com/oobabooga/text-generation-webui) | A tool for deploying model as a web UI ||||||| [link](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/text-generation-webui_en) |
| [**LangChain**](https://github.com/hwchase17/langchain) | LLM application development framework, suitable for secondary development | ✅<sup>†</sup> || ✅<sup>†</sup> |||| [link](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/langchain_en) |
| [**privateGPT**](https://github.com/imartinez/privateGPT) | LangChain-based multi-document QA framework ||||||| [link](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/privategpt_en) |

<sup>†</sup>: Supported by LangChain, but not implemented in the tutorial. Please refer to the official LangChain Documentation for details.

⚠️ Inference and deployment support related to the first-generation model will be gradually migrated to this project, and relevant tutorials will be updated later.

## System Performance

### Generation Performance Evaluation
Expand Down
19 changes: 19 additions & 0 deletions scripts/privategpt/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
## privateGPT相关示例脚本

具体使用方法参考:https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/privategpt_zh

Detailed usage: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/privategpt_en

The following codes are adapted from https://github.com/imartinez/privateGPT/blob/main/privateGPT.py

### privateGPT.py

嵌套Alpaca-2指令模板的主程序入口示例代码。由于第三方库更新频繁,请勿直接使用。建议对照教程自行修改。

Example with Alpaca-2 template. Please do not use this script directly, as third-party library may change over time. Please follow our wiki to adapt to new code.

### privateGPT_refine.py

使用`refine`策略的主程序入口示例代码。

Example that uses `refine` strategy.
99 changes: 99 additions & 0 deletions scripts/privategpt/privateGPT.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
#!/usr/bin/env python3
from dotenv import load_dotenv
from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.vectorstores import Chroma
from langchain.llms import GPT4All, LlamaCpp
import os
import argparse
import time

load_dotenv()

embeddings_model_name = os.environ.get("EMBEDDINGS_MODEL_NAME")
persist_directory = os.environ.get('PERSIST_DIRECTORY')

model_type = os.environ.get('MODEL_TYPE')
model_path = os.environ.get('MODEL_PATH')
model_n_ctx = os.environ.get('MODEL_N_CTX')
model_n_batch = int(os.environ.get('MODEL_N_BATCH', 8))
target_source_chunks = int(os.environ.get('TARGET_SOURCE_CHUNKS', 4))

from constants import CHROMA_SETTINGS

def main():
# Parse the command line arguments
args = parse_arguments()
embeddings = HuggingFaceEmbeddings(model_name=embeddings_model_name)
db = Chroma(persist_directory=persist_directory, embedding_function=embeddings, client_settings=CHROMA_SETTINGS)
retriever = db.as_retriever(search_kwargs={"k": target_source_chunks})
# activate/deactivate the streaming StdOut callback for LLMs
callbacks = [] if args.mute_stream else [StreamingStdOutCallbackHandler()]
# Prepare the LLM
match model_type:
case "LlamaCpp":
llm = LlamaCpp(model_path=model_path, max_tokens=model_n_ctx, n_ctx=model_n_ctx,
n_gpu_layers=1, n_batch=model_n_batch, callbacks=callbacks, n_threads=8, verbose=False)
case "GPT4All":
llm = GPT4All(model=model_path, max_tokens=model_n_ctx, backend='gptj', n_batch=model_n_batch, callbacks=callbacks, verbose=False)
case _default:
# raise exception if model_type is not supported
raise Exception(f"Model type {model_type} is not supported. Please choose one of the following: LlamaCpp, GPT4All")

# The followings are specifically designed for Chinese-Alpaca-2
# For detailed usage: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/privategpt_en
alpaca2_prompt_template = (
"[INST] <<SYS>>\n"
"You are a helpful assistant. 你是一个乐于助人的助手。\n"
"<</SYS>>\n\n"
"{context}\n\n{question} [/INST]"
)
from langchain import PromptTemplate
input_with_prompt = PromptTemplate(template=alpaca2_prompt_template, input_variables=["context", "question"])

qa = RetrievalQA.from_chain_type(
llm=llm, chain_type="stuff", retriever=retriever,
return_source_documents= not args.hide_source,
chain_type_kwargs={"prompt": input_with_prompt})

# Interactive questions and answers
while True:
query = input("\nEnter a query: ")
if query == "exit":
break
if query.strip() == "":
continue

# Get the answer from the chain
start = time.time()
res = qa(query)
answer, docs = res['result'], [] if args.hide_source else res['source_documents']
end = time.time()

# Print the result
print("\n\n> Question:")
print(query)
print(f"\n> Answer (took {round(end - start, 2)} s.):")
print(answer)

# Print the relevant sources used for the answer
for document in docs:
print("\n> " + document.metadata["source"] + ":")
print(document.page_content)

def parse_arguments():
parser = argparse.ArgumentParser(description='privateGPT: Ask questions to your documents without an internet connection, '
'using the power of LLMs.')
parser.add_argument("--hide-source", "-S", action='store_true',
help='Use this flag to disable printing of source documents used for answers.')

parser.add_argument("--mute-stream", "-M",
action='store_true',
help='Use this flag to disable the streaming StdOut callback for LLMs.')

return parser.parse_args()


if __name__ == "__main__":
main()
119 changes: 119 additions & 0 deletions scripts/privategpt/privateGPT_refine.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,119 @@
#!/usr/bin/env python3
from dotenv import load_dotenv
from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.vectorstores import Chroma
from langchain.llms import GPT4All, LlamaCpp
import os
import argparse
import time

load_dotenv()

embeddings_model_name = os.environ.get("EMBEDDINGS_MODEL_NAME")
persist_directory = os.environ.get('PERSIST_DIRECTORY')

model_type = os.environ.get('MODEL_TYPE')
model_path = os.environ.get('MODEL_PATH')
model_n_ctx = os.environ.get('MODEL_N_CTX')
model_n_batch = int(os.environ.get('MODEL_N_BATCH', 8))
target_source_chunks = int(os.environ.get('TARGET_SOURCE_CHUNKS', 4))

from constants import CHROMA_SETTINGS

def main():
# Parse the command line arguments
args = parse_arguments()
embeddings = HuggingFaceEmbeddings(model_name=embeddings_model_name)
db = Chroma(persist_directory=persist_directory, embedding_function=embeddings, client_settings=CHROMA_SETTINGS)
retriever = db.as_retriever(search_kwargs={"k": target_source_chunks})
# activate/deactivate the streaming StdOut callback for LLMs
callbacks = [] if args.mute_stream else [StreamingStdOutCallbackHandler()]
# Prepare the LLM
match model_type:
case "LlamaCpp":
llm = LlamaCpp(model_path=model_path, max_tokens=model_n_ctx, n_ctx=model_n_ctx,
n_gpu_layers=1, n_batch=model_n_batch, callbacks=callbacks, n_threads=8, verbose=False)
case "GPT4All":
llm = GPT4All(model=model_path, max_tokens=model_n_ctx, backend='gptj', n_batch=model_n_batch, callbacks=callbacks, verbose=False)
case _default:
# raise exception if model_type is not supported
raise Exception(f"Model type {model_type} is not supported. Please choose one of the following: LlamaCpp, GPT4All")

# The followings are specifically designed for Chinese-Alpaca-2
# For detailed usage: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/privategpt_en
alpaca2_refine_prompt_template = (
"[INST] <<SYS>>\n"
"You are a helpful assistant. 你是一个乐于助人的助手。\n"
"<</SYS>>\n\n"
"这是原始问题:{question}\n"
"已有的回答: {existing_answer}\n"
"现在还有一些文字,(如果有需要)你可以根据它们完善现有的回答。"
"\n\n{context_str}\n\n"
"请根据新的文段,进一步完善你的回答。 [/INST]"
)

alpaca2_initial_prompt_template = (
"[INST] <<SYS>>\n"
"You are a helpful assistant. 你是一个乐于助人的助手。\n"
"<</SYS>>\n\n"
"以下为背景知识:\n{context_str}\n"
"请根据以上背景知识,回答这个问题:{question} [/INST]"
)

from langchain import PromptTemplate
refine_prompt = PromptTemplate(
input_variables=["question", "existing_answer", "context_str"],
template=alpaca2_refine_prompt_template,
)
initial_qa_prompt = PromptTemplate(
input_variables=["context_str", "question"],
template=alpaca2_initial_prompt_template,
)
chain_type_kwargs = {"question_prompt": initial_qa_prompt, "refine_prompt": refine_prompt}
qa = RetrievalQA.from_chain_type(
llm=llm, chain_type="refine",
retriever=retriever, return_source_documents= not args.hide_source,
chain_type_kwargs=chain_type_kwargs)

# Interactive questions and answers
while True:
query = input("\nEnter a query: ")
if query == "exit":
break
if query.strip() == "":
continue

# Get the answer from the chain
start = time.time()
res = qa(query)
answer, docs = res['result'], [] if args.hide_source else res['source_documents']
end = time.time()

# Print the result
print("\n\n> Question:")
print(query)
print(f"\n> Answer (took {round(end - start, 2)} s.):")
print(answer)

# Print the relevant sources used for the answer
for document in docs:
print("\n> " + document.metadata["source"] + ":")
print(document.page_content)

def parse_arguments():
parser = argparse.ArgumentParser(description='privateGPT: Ask questions to your documents without an internet connection, '
'using the power of LLMs.')
parser.add_argument("--hide-source", "-S", action='store_true',
help='Use this flag to disable printing of source documents used for answers.')

parser.add_argument("--mute-stream", "-M",
action='store_true',
help='Use this flag to disable the streaming StdOut callback for LLMs.')

return parser.parse_args()


if __name__ == "__main__":
main()

0 comments on commit b6aee5b

Please sign in to comment.