
Commit

update llm endpoint validation commands (opea-project#869)
Signed-off-by: letonghan <[email protected]>
letonghan authored Nov 8, 2024
1 parent ca6a4e3 commit 75eb864
Showing 5 changed files with 24 additions and 36 deletions.
8 changes: 4 additions & 4 deletions comps/llms/summarization/tgi/langchain/README.md
@@ -23,10 +23,10 @@ docker run -p 8008:80 -v ./data:/data --name llm-docsum-tgi --shm-size 1g ghcr.i
### 1.3 Verify the TGI Service

```bash
-curl http://${your_ip}:8008/generate \
-  -X POST \
-  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
-  -H 'Content-Type: application/json'
+curl http://${your_ip}:8008/v1/chat/completions \
+  -X POST \
+  -d '{"model": ${your_hf_llm_model}, "messages": [{"role": "user", "content": "What is Deep Learning?"}], "max_tokens":17}' \
+  -H 'Content-Type: application/json'
```
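Note on the new payload: the shell does not expand `${your_hf_llm_model}` inside single quotes, and the model value would also need its own double quotes to be valid JSON, so the command fails as written unless the placeholder is replaced by hand. A minimal runnable variant — assuming the model name is exported first (the `Intel/neural-chat-7b-v3-3` value below is a hypothetical placeholder, not something this commit specifies) — could look like:

```bash
# Hypothetical model name; use whatever model the TGI container actually serves.
export your_hf_llm_model="Intel/neural-chat-7b-v3-3"

# Drop out of the single quotes around the variable so the shell expands it,
# and wrap the expansion in double quotes to keep the payload valid JSON.
curl http://${your_ip}:8008/v1/chat/completions \
  -X POST \
  -d '{"model": "'"${your_hf_llm_model}"'", "messages": [{"role": "user", "content": "What is Deep Learning?"}], "max_tokens": 17}' \
  -H 'Content-Type: application/json'
```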

### 1.4 Start LLM Service with Python Script
20 changes: 8 additions & 12 deletions comps/llms/text-generation/README.md
@@ -270,23 +270,19 @@ curl http://${your_ip}:9000/v1/health_check\
#### 3.2.1 Verify the TGI Service

```bash
-curl http://${your_ip}:8008/generate \
-  -X POST \
-  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
-  -H 'Content-Type: application/json'
+curl http://${your_ip}:8008/v1/chat/completions \
+  -X POST \
+  -d '{"model": ${your_hf_llm_model}, "messages": [{"role": "user", "content": "What is Deep Learning?"}], "max_tokens":17}' \
+  -H 'Content-Type: application/json'
```

#### 3.2.2 Verify the vLLM Service

```bash
-curl http://${your_ip}:8008/v1/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-  "model": ${your_hf_llm_model},
-  "prompt": "What is Deep Learning?",
-  "max_tokens": 32,
-  "temperature": 0
-  }'
+curl http://${host_ip}:8008/v1/chat/completions \
+  -X POST \
+  -H "Content-Type: application/json" \
+  -d '{"model": ${your_hf_llm_model}, "messages": [{"role": "user", "content": "What is Deep Learning?"}]}'
```
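Two details worth checking when running the new vLLM command: it reads `${host_ip}` where the TGI check above reads `${your_ip}`, so both variables must point at the same host, and `${your_hf_llm_model}` has the same single-quote expansion issue noted earlier. Assuming `jq` is installed (an assumption, not a documented prerequisite), the assistant's reply can be pulled straight out of the OpenAI-style response:

```bash
# -s silences curl's progress output; jq extracts the first choice's message
# text from the standard chat-completions response shape.
curl -s http://${host_ip}:8008/v1/chat/completions \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"model": "'"${your_hf_llm_model}"'", "messages": [{"role": "user", "content": "What is Deep Learning?"}]}' \
  | jq -r '.choices[0].message.content'
```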

### 3.3 Consume LLM Service
8 changes: 4 additions & 4 deletions comps/llms/text-generation/tgi/README.md
@@ -22,10 +22,10 @@ docker run -p 8008:80 -v ./data:/data --name tgi_service --shm-size 1g ghcr.io/h
### 1.3 Verify the TGI Service

```bash
-curl http://${your_ip}:8008/generate \
-  -X POST \
-  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
-  -H 'Content-Type: application/json'
+curl http://${your_ip}:8008/v1/chat/completions \
+  -X POST \
+  -d '{"model": ${your_hf_llm_model}, "messages": [{"role": "user", "content": "What is Deep Learning?"}], "max_tokens":17}' \
+  -H 'Content-Type: application/json'
```
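If this request hangs or errors, a liveness probe helps separate "container still loading the model" from "wrong endpoint". Recent TGI releases expose `GET /health` (200 once the model is ready) and `GET /info` (model metadata); verify both routes against the TGI version you actually run:

```bash
# -f makes curl exit nonzero on HTTP errors, so the echo only fires on a 200.
curl -fsS http://${your_ip}:8008/health && echo "TGI is up"
# Model id, dtype, and token limits, as reported by the server itself.
curl -s http://${your_ip}:8008/info
```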

### 1.4 Start LLM Service with Python Script
12 changes: 4 additions & 8 deletions comps/llms/text-generation/vllm/langchain/README.md
@@ -186,14 +186,10 @@ OpenVINO best known configuration for GPU is:
And then you can make requests like below to check the service status:

```bash
-curl http://${your_ip}:8008/v1/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-  "model": "meta-llama/Meta-Llama-3-8B-Instruct",
-  "prompt": "What is Deep Learning?",
-  "max_tokens": 32,
-  "temperature": 0
-  }'
+curl http://${host_ip}:9009/v1/chat/completions \
+  -X POST \
+  -H "Content-Type: application/json" \
+  -d '{"model": "meta-llama/Meta-Llama-3-8B-Instruct", "messages": [{"role": "user", "content": "What is Deep Learning?"}]}'
```
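Note that this command targets port 9009, while the other files in this commit use 8008; the port has to match whatever the vLLM container publishes. vLLM's OpenAI-compatible server also implements `GET /v1/models`, a quick way to confirm the exact model id the `"model"` field must carry (the filter below assumes `jq` is available):

```bash
# Lists the model ids the server is serving; the chat request's "model"
# field must match one of them exactly.
curl -s http://${host_ip}:9009/v1/models | jq -r '.data[].id'
```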

## 🚀3. Set up LLM microservice
12 changes: 4 additions & 8 deletions comps/llms/text-generation/vllm/llama_index/README.md
@@ -153,14 +153,10 @@ OpenVINO best known configuration is:
And then you can make requests like below to check the service status:

```bash
-curl http://${your_ip}:8008/v1/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-  "model": "meta-llama/Meta-Llama-3-8B-Instruct",
-  "prompt": "What is Deep Learning?",
-  "max_tokens": 32,
-  "temperature": 0
-  }'
+curl http://${host_ip}:8008/v1/chat/completions \
+  -X POST \
+  -H "Content-Type: application/json" \
+  -d '{"model": "meta-llama/Meta-Llama-3-8B-Instruct", "messages": [{"role": "user", "content": "What is Deep Learning?"}]}'
```
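For long generations it can be useful to watch tokens arrive rather than wait for one JSON body. The OpenAI-compatible endpoint that vLLM exposes should accept `"stream": true` and respond with server-sent events — a sketch, not a form this commit documents:

```bash
# -N disables curl's output buffering so each "data: {...}" SSE chunk is
# printed as soon as the server emits it.
curl -N http://${host_ip}:8008/v1/chat/completions \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Meta-Llama-3-8B-Instruct", "messages": [{"role": "user", "content": "What is Deep Learning?"}], "stream": true}'
```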

## 🚀3. Set up LLM microservice
