From 561f646ded33d8a23fc44d258d3beb6658ea6f5a Mon Sep 17 00:00:00 2001
From: Zihao Ye
Date: Thu, 2 Jan 2025 22:47:46 -0800
Subject: [PATCH] misc: add bibtex reference (#712)

This pull request updates `README.md` with a new "Citation" section: a
BibTeX entry that users who find FlashInfer helpful in their projects or
research can use to cite the FlashInfer paper.

Documentation update:

* [`README.md`](diffhunk://#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R144-R169): Added a new "Citation" section with a BibTeX entry for citing the FlashInfer paper.
---
 README.md | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/README.md b/README.md
index 1f70dd38..a453b5f8 100644
--- a/README.md
+++ b/README.md
@@ -141,3 +141,29 @@ We are thrilled to share that FlashInfer is being adopted by many cutting-edge p
 ## Acknowledgement
 
 FlashInfer is inspired by [FlashAttention 1&2](https://github.com/dao-AILab/flash-attention/), [vLLM](https://github.com/vllm-project/vllm), [stream-K](https://arxiv.org/abs/2301.03598), [cutlass](https://github.com/nvidia/cutlass) and [AITemplate](https://github.com/facebookincubator/AITemplate) projects.
+
+## Citation
+
+If you find FlashInfer helpful in your project or research, please consider citing our [paper](https://arxiv.org/abs/2501.01005):
+
+```bibtex
+@article{ye2025flashinfer,
+  title = {FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving},
+  author = {
+    Ye, Zihao and
+    Chen, Lequn and
+    Lai, Ruihang and
+    Lin, Wuwei and
+    Zhang, Yineng and
+    Wang, Stephanie and
+    Chen, Tianqi and
+    Kasikci, Baris and
+    Grover, Vinod and
+    Krishnamurthy, Arvind and
+    Ceze, Luis
+  },
+  journal = {arXiv preprint arXiv:2501.01005},
+  year = {2025},
+  url = {https://arxiv.org/abs/2501.01005}
+}
+```
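
For illustration only (not part of the patch above): a minimal sketch of how a downstream user might cite this entry from a LaTeX document, assuming the BibTeX entry added by the patch has been saved to a hypothetical local file named `references.bib`:

```latex
% Hypothetical usage sketch: assumes the BibTeX entry from the patch
% above has been copied into a local file named references.bib.
\documentclass{article}

\begin{document}

Our serving stack builds on FlashInfer~\cite{ye2025flashinfer}.

% "plain" is an arbitrary choice here; any bibliography style works.
\bibliographystyle{plain}
\bibliography{references}  % resolves \cite keys against references.bib

\end{document}
```

Compiling with the usual `pdflatex` / `bibtex` / `pdflatex` / `pdflatex` cycle resolves the `ye2025flashinfer` key into a numbered reference.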