
Commit

Document Sync by Tina
Chivier committed Nov 19, 2024
1 parent 51bb18e commit b36dea7
Showing 2 changed files with 14 additions and 7 deletions.
5 changes: 5 additions & 0 deletions docs/stable/intro.md
@@ -9,6 +9,10 @@ sidebar_position: 1

ServerlessLLM is a **fast** and **easy-to-use** serving system designed for **affordable** multi-LLM serving, also known as LLM-as-a-Service. ServerlessLLM is ideal for environments with multiple LLMs that need to be served on limited GPU resources, as it enables efficient dynamic loading of LLMs onto GPUs. By elastically scaling model instances and multiplexing GPUs, ServerlessLLM can significantly reduce costs compared to traditional GPU-dedicated serving systems while still providing low-latency (Time-to-First-Token, TTFT) LLM completions.

ServerlessLLM now supports NVIDIA and AMD GPUs, including the following hardware:
* NVIDIA GPUs: Compute Capability 7.0+ (e.g., V100, A100, RTX A6000, GeForce RTX 3060)
* AMD GPUs: ROCm 6.2.0+ (tested on MI100s and MI200s)
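
Whether an NVIDIA device clears the Compute Capability 7.0 floor can be checked programmatically. A minimal sketch (the `meets_capability` helper is hypothetical, not part of ServerlessLLM; the optional runtime check assumes a CUDA-enabled PyTorch install):

```python
# Sketch: verify an NVIDIA GPU meets ServerlessLLM's compute-capability floor.
# The (7, 0) minimum comes from the docs above; the helper itself is hypothetical.

def meets_capability(capability: tuple, minimum: tuple = (7, 0)) -> bool:
    """Return True if a device's (major, minor) compute capability is sufficient."""
    return capability >= minimum

if __name__ == "__main__":
    # Guarded so the sketch still runs on machines without PyTorch or a GPU.
    try:
        import torch
        if torch.cuda.is_available():
            # get_device_capability returns a (major, minor) tuple, e.g. (8, 0) for A100.
            print(meets_capability(torch.cuda.get_device_capability(0)))
    except ImportError:
        pass
```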

## Documentation

### Getting Started
@@ -25,6 +29,7 @@ ServerlessLLM is a **fast** and **easy-to-use** serving system designed for **af
### ServerlessLLM Store

- [Quickstart](./store/quickstart.md)
- [ROCm Installation (Experimental)](./store/installation_with_rocm.md)

### ServerlessLLM CLI

16 changes: 9 additions & 7 deletions docs/stable/store/installation_with_rocm.md
@@ -4,6 +4,15 @@ sidebar_position: 1

# Installation with ROCm (Experimental)

## Latest Tested Version
+ v0.5.1

## Tested Hardware
+ OS: Ubuntu 22.04
+ ROCm: 6.2
+ PyTorch: 2.3.0
+ GPU: MI100s (gfx908), MI200s (gfx90a)

## Build the wheel from source and install
ServerlessLLM Store (`sllm-store`) currently provides experimental support for the ROCm platform. Due to an internal bug in ROCm, `sllm-store` may experience a GPU memory leak on ROCm versions earlier than 6.2.0, as noted in [this issue](https://github.com/ROCm/HIP/issues/3580).

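Before building, it is worth confirming the installed ROCm release is at or past 6.2.0 to avoid the memory leak above. A minimal sketch (the helpers are hypothetical; reading `/opt/rocm/.info/version` is a common but not guaranteed way to find the version):

```python
# Sketch: warn if the installed ROCm release predates 6.2.0, which has a known
# GPU memory leak affecting sllm-store (see ROCm/HIP issue 3580).
from pathlib import Path

LEAK_FIXED_IN = (6, 2, 0)

def parse_rocm_version(text: str) -> tuple:
    """Parse a version string like '6.2.0-66' into an integer tuple (6, 2, 0)."""
    core = text.strip().split("-")[0]
    return tuple(int(part) for part in core.split("."))

def rocm_is_safe(version: tuple) -> bool:
    """True if this ROCm release includes the memory-leak fix."""
    return version >= LEAK_FIXED_IN

if __name__ == "__main__":
    info = Path("/opt/rocm/.info/version")  # assumed location; absent on non-ROCm hosts
    if info.exists():
        version = parse_rocm_version(info.read_text())
        if not rocm_is_safe(version):
            print("ROCm predates 6.2.0: GPU memory-leak risk with sllm-store")
```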
@@ -162,13 +171,6 @@ cd ServerlessLLM/sllm_store/build
ctest --output-on-failure
```
## Tested Hardware
+ OS: Ubuntu 22.04
+ ROCm: 6.2
+ PyTorch: 2.3.0
+ GPU: MI100s (gfx908), MI200s (gfx90a)
## Known issues
1. GPU memory leak in ROCm before version 6.2.0.
