Update Retrievers README (#1233)
* Update Retrievers README

Add a main README at comps/retrievers/README.md that links to the 9
READMEs, one per vector DB.

Signed-off-by: letonghan <[email protected]>
letonghan authored Jan 24, 2025
1 parent f6d4601 commit c94020e
Showing 11 changed files with 1,135 additions and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -40,7 +40,7 @@ The initially supported `Microservices` are described in the below table. More `
| ------------------------------------------------- | ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------- | ------ | ------------------------------------- |
| [Embedding](./comps/embeddings/src/README.md) | [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | [TEI-Gaudi](https://github.com/huggingface/tei-gaudi) | Gaudi2 | Embedding on Gaudi2 |
| [Embedding](./comps/embeddings/src/README.md) | [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon | Embedding on Xeon CPU |
- | [Retriever](./comps/retrievers/src/README.md) | [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon | Retriever on Xeon CPU |
+ | [Retriever](./comps/retrievers/README.md) | [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon | Retriever on Xeon CPU |
| [Reranking](./comps/rerankings/src/README.md) | [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | [TEI-Gaudi](https://github.com/huggingface/tei-gaudi) | Gaudi2 | Reranking on Gaudi2 |
| [Reranking](./comps/rerankings/src/README.md) | [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon | Reranking on Xeon CPU |
| [ASR](./comps/asr/src/README.md) | NA | [openai/whisper-small](https://huggingface.co/openai/whisper-small) | NA | Gaudi2 | Audio-Speech-Recognition on Gaudi2 |
36 changes: 36 additions & 0 deletions comps/retrievers/src/README.md → comps/retrievers/README.md
@@ -5,3 +5,39 @@ This retriever microservice is a highly efficient search service designed for ha
The service primarily utilizes similarity measures in vector space to rapidly retrieve contextually similar documents. The vector-based retrieval approach is particularly suited for handling large datasets, offering fast and accurate search results that significantly enhance the efficiency and quality of information retrieval.

Overall, this microservice provides robust backend support for applications requiring efficient similarity searches, playing a vital role in scenarios such as recommendation systems, information retrieval, or any other context where precise measurement of document similarity is crucial.
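
Every backend below exposes the same retrieval API: the service listens on port 7000 and accepts a POST to `/v1/retrieval` carrying the query text and its embedding vector. As a quick orientation, a minimal request looks like this (a sketch based on the per-backend guides below; the mock 768-dimensional embedding matches the output size of `BAAI/bge-base-en-v1.5`):

```bash
# Generate a mock 768-dimensional embedding and send it to the retriever.
export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
    -X POST \
    -d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding}}" \
    -H 'Content-Type: application/json'
```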

## Retriever Microservice with Redis

For details, please refer to this [readme](src/README_redis.md)

## Retriever Microservice with Milvus

For details, please refer to this [readme](src/README_milvus.md)

## Retriever Microservice with Qdrant

For details, please refer to this [readme](src/README_qdrant.md)

## Retriever Microservice with PGVector

For details, please refer to this [readme](src/README_pgvector.md)

## Retriever Microservice with VDMS

For details, please refer to this [readme](src/README_vdms.md)

## Retriever Microservice with Elasticsearch

For details, please refer to this [readme](src/README_elasticsearch.md)

## Retriever Microservice with OpenSearch

For details, please refer to this [readme](src/README_opensearch.md)

## Retriever Microservice with Neo4j

For details, please refer to this [readme](src/README_neo4j.md)

## Retriever Microservice with Pathway

For details, please refer to this [readme](src/README_pathway.md)
117 changes: 117 additions & 0 deletions comps/retrievers/src/README_elasticsearch.md
@@ -0,0 +1,117 @@
# Retriever Microservice

This retriever microservice is a highly efficient search service designed for handling and retrieving embedding vectors.
It operates by receiving an embedding vector as input and conducting a similarity search against vectors stored in a
VectorDB database. Users must specify the VectorDB's URL and the index name, and the service searches within that index
to find documents with the highest similarity to the input vector.

The service primarily utilizes similarity measures in vector space to rapidly retrieve contextually similar documents.
The vector-based retrieval approach is particularly suited for handling large datasets, offering fast and accurate
search results that significantly enhance the efficiency and quality of information retrieval.

Overall, this microservice provides robust backend support for applications requiring efficient similarity searches,
playing a vital role in scenarios such as recommendation systems, information retrieval, or any other context where
precise measurement of document similarity is crucial.

## 🚀1. Start Microservice with Python (Option 1)

To start the retriever microservice, you must first install the required Python packages.

### 1.1 Install Requirements

```bash
pip install -r requirements.txt
```

### 1.2 Start TEI Service

```bash
model=BAAI/bge-base-en-v1.5
volume=$PWD/data
docker run -d -p 6060:80 -v $volume:/data -e http_proxy=$http_proxy -e https_proxy=$https_proxy --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 --model-id $model
```

### 1.3 Verify the TEI Service

Run a health check on the embedding service with:

```bash
curl 127.0.0.1:6060/embed \
-X POST \
-d '{"inputs":"What is Deep Learning?"}' \
-H 'Content-Type: application/json'
```
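
If the service is up, the request returns a JSON array containing one 768-dimensional embedding vector for the input text.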

### 1.4 Setup VectorDB Service

Please refer to this [readme](../../third_parties/elasticsearch/src/README.md).
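
If you only need a throwaway instance for local testing, a minimal single-node Elasticsearch container can be started as sketched below (the image tag and disabled security are assumptions for a quick test; follow the linked readme for the supported setup):

```bash
# Minimal single-node Elasticsearch for local testing only (security disabled).
# The 8.11.1 tag is an assumption; use the version the linked readme specifies.
docker run -d --name elasticsearch-test -p 9200:9200 \
    -e "discovery.type=single-node" \
    -e "xpack.security.enabled=false" \
    docker.elastic.co/elasticsearch/elasticsearch:8.11.1
```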

### 1.5 Start Retriever Service

```bash
export TEI_EMBEDDING_ENDPOINT="http://${your_ip}:6060"
export RETRIEVER_COMPONENT_NAME="OPEA_RETRIEVER_ELASTICSEARCH"
python opea_retrievers_microservice.py
```

## 🚀2. Start Microservice with Docker (Option 2)

### 2.1 Setup Environment Variables

```bash
export EMBED_MODEL="BAAI/bge-base-en-v1.5"
export ES_CONNECTION_STRING="http://localhost:9200"
export INDEX_NAME=${your_index_name}
export TEI_EMBEDDING_ENDPOINT="http://${your_ip}:6060"
export RETRIEVER_COMPONENT_NAME="OPEA_RETRIEVER_ELASTICSEARCH"
```

### 2.2 Build Docker Image

```bash
cd ../../../
docker build -t opea/retriever:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/src/Dockerfile .
```

To start a Docker container, you have two options:

- A. Run Docker with CLI
- B. Run Docker with Docker Compose

You can choose one as needed.

### 2.3 Run Docker with CLI (Option A)

```bash
docker run -d --name="retriever-elasticsearch" -p 7000:7000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e ES_CONNECTION_STRING=$ES_CONNECTION_STRING -e INDEX_NAME=$INDEX_NAME -e TEI_ENDPOINT=$TEI_ENDPOINT opea/retriever:latest
```

### 2.4 Run Docker with Docker Compose (Option B)

```bash
cd ../deployment/docker_compose
export service_name="retriever-elasticsearch"
docker compose -f compose.yaml up ${service_name} -d
```

## 🚀3. Consume Retriever Service

### 3.1 Check Service Status

```bash
curl http://localhost:7000/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

### 3.2 Consume Retriever Service

To consume the Retriever Microservice, you can generate a mock embedding vector of length 768 (the output dimension of `BAAI/bge-base-en-v1.5`) with Python.

```bash
export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://${your_ip}:7000/v1/retrieval \
-X POST \
-d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding}}" \
-H 'Content-Type: application/json'
```
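
The request body also accepts optional search parameters. Assuming this backend honors the same fields shown in the Milvus guide (`search_type`, `k`, and the related thresholds), a parameterized query would look like:

```bash
# Hedged example: top-4 similarity search, assuming the shared retrieval API
# accepts search_type/k for the Elasticsearch backend as it does for Milvus.
curl http://${your_ip}:7000/v1/retrieval \
    -X POST \
    -d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity\", \"k\":4}" \
    -H 'Content-Type: application/json'
```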
112 changes: 112 additions & 0 deletions comps/retrievers/src/README_milvus.md
@@ -0,0 +1,112 @@
# Retriever Microservice with Milvus

## 🚀1. Start Microservice with Python

### 1.1 Install Requirements

```bash
pip install -r requirements.txt
```

### 1.2 Start Milvus Server

Please refer to this [readme](../../third_parties/milvus/src/README.md).

### 1.3 Setup Environment Variables

```bash
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_https_proxy}
export MILVUS_HOST=${your_milvus_host_ip}
export MILVUS_PORT=19530
export COLLECTION_NAME=${your_collection_name}
export TEI_EMBEDDING_ENDPOINT=${your_embedding_endpoint}
```

### 1.4 Start Retriever Service

```bash
export TEI_EMBEDDING_ENDPOINT="http://${your_ip}:6060"
export RETRIEVER_COMPONENT_NAME="OPEA_RETRIEVER_MILVUS"
python opea_retrievers_microservice.py
```

## 🚀2. Start Microservice with Docker

### 2.1 Build Docker Image

```bash
cd ../../../
docker build -t opea/retriever:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/src/Dockerfile .
```

### 2.2 Run Docker with CLI (Option A)

```bash
docker run -d --name="retriever-milvus-server" -p 7000:7000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e TEI_EMBEDDING_ENDPOINT=${your_emdding_endpoint} -e MILVUS_HOST=${your_milvus_host_ip} -e RETRIEVER_COMPONENT_NAME=$RETRIEVER_COMPONENT_NAME opea/retriever:latest
```

### 2.3 Run Docker with Docker Compose (Option B)

```bash
cd ../deployment/docker_compose
export service_name="retriever-milvus"
docker compose -f compose.yaml up ${service_name} -d
```

## 🚀3. Consume Retriever Service

### 3.1 Check Service Status

```bash
curl http://${your_ip}:7000/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

### 3.2 Consume Retriever Service

To consume the Retriever Microservice, you can generate a mock embedding vector of length 768 with Python.

```bash
export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://${your_ip}:7000/v1/retrieval \
-X POST \
-d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding}}" \
-H 'Content-Type: application/json'
```

You can also set search parameters for the retriever. The requests below show the four supported search types in turn: `similarity`, `similarity_distance_threshold`, `similarity_score_threshold`, and `mmr` (maximal marginal relevance).

```bash
export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
-X POST \
-d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity\", \"k\":4}" \
-H 'Content-Type: application/json'
```

```bash
export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
-X POST \
-d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity_distance_threshold\", \"k\":4, \"distance_threshold\":1.0}" \
-H 'Content-Type: application/json'
```

```bash
export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
-X POST \
-d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity_score_threshold\", \"k\":4, \"score_threshold\":0.2}" \
-H 'Content-Type: application/json'
```

```bash
export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
-X POST \
-d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"mmr\", \"k\":4, \"fetch_k\":20, \"lambda_mult\":0.5}" \
-H 'Content-Type: application/json'
```
94 changes: 94 additions & 0 deletions comps/retrievers/src/README_neo4j.md
@@ -0,0 +1,94 @@
# Retriever Microservice with Neo4J

This retrieval microservice is intended for use in a GraphRAG pipeline and assumes that a GraphRAGStore containing the document graph, entity_info, and community summaries already exists. Please refer to the GenAIExamples/GraphRAG example.

Retrieval follows these steps:

- Uses similarity search to find the entities most relevant to the input query. Retrieval is done over the Neo4j index, which natively supports embeddings.
- Uses Cypher queries to retrieve the community summaries for all the communities the entities belong to (a rough sketch follows this list).
- Generates a partial answer to the query for each community summary. These partial answers are later used as context to generate the final query response. Please refer to [GenAIExamples/GraphRAG](https://github.com/opea-project/GenAIExamples).
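
As a rough illustration of the Cypher step, a query of the following shape collects the community summaries for a set of matched entities, run here through `cypher-shell` inside the `neo4j-apoc` container started below. The `__Entity__` and `Community` labels and the `IN_COMMUNITY` relationship are hypothetical placeholders; the actual schema is defined by the GraphRAG dataprep step:

```bash
# Hypothetical sketch: node labels and the relationship name are placeholders,
# not the actual GraphRAG schema.
docker exec neo4j-apoc cypher-shell -u neo4j -p password \
  "MATCH (e:__Entity__)-[:IN_COMMUNITY]->(c:Community)
   WHERE e.name IN ['entity_a', 'entity_b']
   RETURN DISTINCT c.id, c.summary"
```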

## 🚀Start Microservice with Docker

### 1. Build Docker Image

```bash
cd ../../../
docker build -t opea/retriever:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/src/Dockerfile .
```

### 2. Install Requirements

```bash
pip install -r requirements.txt
```

### 3. Start Neo4j VectorDB Service

```bash
docker run \
-p 7474:7474 -p 7687:7687 \
-v $PWD/data:/data -v $PWD/plugins:/plugins \
--name neo4j-apoc \
-d \
-e NEO4J_AUTH=neo4j/password \
-e NEO4J_PLUGINS=\[\"apoc\"\] \
neo4j:latest
```
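
Once the container is up, you can confirm the database is reachable by opening the Neo4j Browser at `http://localhost:7474` (the HTTP port mapped above) and logging in with the `neo4j/password` credentials set via `NEO4J_AUTH`.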

### 4. Setup Environment Variables

```bash
# Set private environment settings
export host_ip=${your_host_ip} # local IP
export no_proxy=$no_proxy,${host_ip} # important to add ${host_ip} for container communication
export http_proxy=${your_http_proxy}
export https_proxy=${your_https_proxy}
export NEO4J_URI=${your_neo4j_url}
export NEO4J_USERNAME=${your_neo4j_username}
export NEO4J_PASSWORD=${your_neo4j_password}
export PYTHONPATH=${path_to_comps}
export OPENAI_KEY=${your_openai_api_key} # optional; when not provided, smaller models served via TGI/TEI are used
export HUGGINGFACEHUB_API_TOKEN=${your_hf_token}
# set additional environment settings
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
export OPENAI_EMBEDDING_MODEL="text-embedding-3-small"
export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
export OPENAI_LLM_MODEL="gpt-4o"
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:6006"
export TGI_LLM_ENDPOINT="http://${host_ip}:6005"
export NEO4J_URL="bolt://${host_ip}:7687"
export NEO4J_USERNAME=neo4j
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6004/v1/dataprep"
export LOGFLAG=True
```

### 5. Run Docker with Docker Compose

Docker Compose will start five microservices: retriever-neo4j-llamaindex, dataprep-neo4j-llamaindex, neo4j-apoc, tgi-gaudi-service, and tei-embedding-service. The Neo4j database supports embeddings natively, so we do not need a separate vector store. Check out the blog [Introducing the Property Graph Index: A Powerful New Way to Build Knowledge Graphs with LLMs](https://www.llamaindex.ai/blog/introducing-the-property-graph-index-a-powerful-new-way-to-build-knowledge-graphs-with-llms) for a better understanding of the Property Graph Store and Index.

```bash
cd ../deployment/docker_compose
export service_name="retriever-neo4j"
docker compose -f compose.yaml up ${service_name} -d
```

## Invoke Microservice

### 1. Check Service Status

```bash
curl http://${host_ip}:7000/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

### 2. Consume Retriever Service

If `OPENAI_KEY` is provided, the service uses OpenAI endpoints for the LLM and embeddings; otherwise it falls back to the TGI and TEI endpoints. If a model name is not provided in the request, the defaults specified by the set_env.sh script are used.

```bash
curl -X POST http://${host_ip}:7000/v1/retrieval \
-H "Content-Type: application/json" \
-d '{"model": "gpt-3.5-turbo","messages": [{"role": "user","content": "Who is John Brady and has he had any confrontations?"}]}'
```
