From 37f6b3f4d10348c5a4daae0b1f52c54cf836c390 Mon Sep 17 00:00:00 2001
From: tpoisonooo
Date: Thu, 17 Aug 2023 20:39:47 +0800
Subject: [PATCH] docs(README): add lmdeploy

---
 README.md    | 1 +
 README_EN.md | 1 +
 2 files changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 3aa8f48..776a895 100644
--- a/README.md
+++ b/README.md
@@ -140,6 +140,7 @@
 | 工具 | 特点 | CPU | GPU | 量化 | GUI | API | vLLM | 教程 |
 | :----------------------------------------------------------- | ---------------------------- | :--: | :--: | :--: | :--: | :--: | :--: | :----------------------------------------------------------: |
+| [**lmdeploy**](https://github.com/internlm/lmdeploy) | GPU 服务端极致优化，支持多 batch、w4 和 kv8 量化 | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | [link](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/lmdeploy_zh) |
 | [**llama.cpp**](https://github.com/ggerganov/llama.cpp) | 丰富的量化选项和高效本地推理 | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | [link](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/llamacpp_zh) |
 | [**🤗Transformers**](https://github.com/huggingface/transformers) | 原生transformers推理接口 | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | [link](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/inference_with_transformers_zh) |
 | [**Colab Demo**](https://colab.research.google.com/drive/1yu0eZ3a66by8Zqm883LLtRQrguBAb9MR?usp=sharing) | 在Colab中启动交互界面 | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | [link](https://colab.research.google.com/drive/1yu0eZ3a66by8Zqm883LLtRQrguBAb9MR?usp=sharing) |
diff --git a/README_EN.md b/README_EN.md
index 60a3a27..8b62207 100644
--- a/README_EN.md
+++ b/README_EN.md
@@ -134,6 +134,7 @@ The models in this project mainly support the following quantization, inference,
 | Tool | Features | CPU | GPU | Quant | GUI | API | vLLM | Tutorial |
 | :----------------------------------------------------------- | ------------------------------------------------------- | :--: | :--: | :---: | :--: | :--: | :--: | :----------------------------------------------------------: |
+| [**lmdeploy**](https://github.com/internlm/lmdeploy) | Extremely optimized GPU serving; supports multi-batch, w4 and kv8 quantization | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | [link](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/lmdeploy_zh) |
 | [**llama.cpp**](https://github.com/ggerganov/llama.cpp) | Rich quantization options and efficient local inference | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | [link](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/llamacpp_en) |
 | [**🤗Transformers**](https://github.com/huggingface/transformers) | Native transformers inference interface | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | [link](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/inference_with_transformers_en) |
 | [**Colab Demo**](https://colab.research.google.com/drive/1yu0eZ3a66by8Zqm883LLtRQrguBAb9MR?usp=sharing) | Running a Gradio web demo in Colab | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | [link](https://colab.research.google.com/drive/1yu0eZ3a66by8Zqm883LLtRQrguBAb9MR?usp=sharing) |