From 0a62ffc9f52c45824745c4ffb30a186f688d7026 Mon Sep 17 00:00:00 2001
From: Domenic Barbuzzi
Date: Mon, 5 Aug 2024 10:42:32 -0400
Subject: [PATCH] Update examples/quantization_24_sparse_w4a16 README (#52)

Fix filenames/paths
---
 examples/quantization_24_sparse_w4a16/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/quantization_24_sparse_w4a16/README.md b/examples/quantization_24_sparse_w4a16/README.md
index 38535d170..6e006d9db 100644
--- a/examples/quantization_24_sparse_w4a16/README.md
+++ b/examples/quantization_24_sparse_w4a16/README.md
@@ -19,7 +19,7 @@ pip install -e .
 The example includes an end-to-end script for applying the quantization algorithm.
 
 ```bash
-python3 llama2_24sparse_example.py
+python3 llama7b_sparse_w4a16.py
 ```
 
 
@@ -29,7 +29,7 @@ This example uses LLMCompressor and Compressed-Tensors to create a 2:4 sparse an
 The model is calibrated and trained with the ultachat200k dataset.
 At least 75GB of GPU memory is required to run this example.
 
-Follow the steps below, or to run the example as `python examples/llama7b_sparse_quantized/llama7b_sparse_w4a16.py`
+Follow the steps below, or to run the example as `python examples/quantization_24_sparse_w4a16/llama7b_sparse_w4a16.py`
 
 ## Step 1: Select a model, dataset, and recipe
 In this step, we select which model to use as a baseline for sparsification, a dataset to