diff --git a/distributions/meta-reference-gpu/README.md b/distributions/meta-reference-gpu/README.md
new file mode 100644
index 0000000000..951120da58
--- /dev/null
+++ b/distributions/meta-reference-gpu/README.md
@@ -0,0 +1,33 @@
+# Meta Reference Distribution
+
+The `llamastack/distribution-meta-reference-gpu` distribution consists of the following provider configurations.
+
+
+| **API** | **Inference** | **Agents** | **Memory** | **Safety** | **Telemetry** |
+|----------------- |--------------- |---------------- |-------------------------------------------------- |---------------- |---------------- |
+| **Provider(s)** | meta-reference | meta-reference | meta-reference, remote::pgvector, remote::chroma | meta-reference | meta-reference |
+
+
+### Start the Distribution (Single Node GPU)
+
+> [!NOTE]
+> This assumes you have access to a GPU on the host for running inference with the meta-reference provider.
+
+> [!NOTE]
+> For GPU inference, set the following environment variable to point at the local directory containing your model checkpoints, and enable GPU access when starting the docker container.
+```
+export LLAMA_CHECKPOINT_DIR=~/.llama
+```
+
+> [!NOTE]
+> `~/.llama` should be the path containing the downloaded weights of the Llama models.
+
+
+To download and start running a pre-built docker container, you may use the following command:
+
+```
+docker run -it -p 5000:5000 -v ~/.llama:/root/.llama --gpus=all llamastack/llamastack-local-gpu
+```
+
+### Alternative (Build and start distribution locally via conda)
+- You may check out the [Getting Started](../../docs/getting_started.md) guide for more details on starting up a meta-reference distribution.
diff --git a/docs/getting_started.md b/docs/getting_started.md
index 599331e8bc..e3db908a74 100644
--- a/docs/getting_started.md
+++ b/docs/getting_started.md
@@ -49,7 +49,7 @@ docker run -it -p 5000:5000 -v ~/.llama:/root/.llama --gpus=all llamastack/llama
 ```
 
 > [!TIP]
-> Pro Tip: We may use `docker compose up` for starting up a distribution with remote providers (e.g. TGI) using [llamastack-local-cpu](https://hub.docker.com/repository/docker/llamastack/llamastack-local-cpu/general). You can checkout [these scripts](../llama_stack/distribution/docker/README.md) to help you get started.
+> Pro Tip: You may use `docker compose up` to start a distribution with remote providers (e.g. TGI) using [llamastack-local-cpu](https://hub.docker.com/repository/docker/llamastack/llamastack-local-cpu/general). You can check out [these scripts](../distributions/) to help you get started.
 
 #### Build->Configure->Run Llama Stack server via conda
 You may also build a LlamaStack distribution from scratch, configure it, and start running the distribution. This is useful for developing on LlamaStack.
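
Usage note on the new README: the `LLAMA_CHECKPOINT_DIR` variable and the `docker run` command can be combined so the mounted path follows the variable rather than being hardcoded to `~/.llama`. A minimal sketch, assuming your checkpoints live somewhere else (the path below is hypothetical):

```
# Point LLAMA_CHECKPOINT_DIR at the directory holding your downloaded Llama weights
# (hypothetical path; the README's default is ~/.llama)
export LLAMA_CHECKPOINT_DIR=/path/to/checkpoints

# Mount that directory into the container and expose the server on port 5000
docker run -it -p 5000:5000 -v $LLAMA_CHECKPOINT_DIR:/root/.llama --gpus=all llamastack/llamastack-local-gpu
```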
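
For the conda alternative, the README defers to the Getting Started guide's Build->Configure->Run flow. A rough sketch of that flow, assuming the `llama stack build`/`configure`/`run` subcommands described in the guide (the distribution name below is hypothetical, chosen during the build step):

```
# Build a distribution from scratch; interactive prompts let you pick providers
llama stack build

# Configure the distribution you just built ("my-local-stack" is a hypothetical name)
llama stack configure my-local-stack

# Start the server for that distribution
llama stack run my-local-stack
```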