v0.0.5
What's new in 0.0.5 (2023-07-19)
These are the changes in inference v0.0.5.
New features
- FEAT: support pytorch models by @pangyoki in #157 (see the sketch after this list)
- FEAT: support vicuna-v1.3 33B by @Bojun-Feng in #192
- FEAT: support baichuan-chat pytorch model by @pangyoki in #190
- FEAT: support the MPS backend for pytorch models by @pangyoki in #198
- FEAT: Embedding by @jiayini1119 in #194
- FEAT: support LLaMA-2 by @UranusSeven in #203
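For the PyTorch model support above, the following is a minimal sketch of launching a pytorch-format model and running a single chat turn through the Python client. The client import, the default endpoint, and the launch_model()/chat() signatures shown here are assumptions and may differ from what v0.0.5 actually ships.

```python
# Minimal, hypothetical sketch of the new PyTorch model support:
# launch a pytorch-format model, then run one chat turn via the Python client.
# Endpoint, launch_model() parameters, and chat() signature are assumptions.
from xinference.client import Client

client = Client("http://localhost:9997")  # assumed default endpoint

# "pytorch" selects the new PyTorch backend; available quantizations such as
# "none", "8-bit", or "4-bit" depend on the model and device (e.g. MPS).
model_uid = client.launch_model(
    model_name="baichuan-chat",
    model_format="pytorch",
    quantization="none",
)

model = client.get_model(model_uid)
print(model.chat("Briefly introduce yourself."))
```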
Enhancements
- ENH: Implement RESTful API stream generate by @jiayini1119 in #171
- ENH: Set default device to `mps` on macOS by @pangyoki in #205
- ENH: Set default mlock to true and mmap to false by @RayJi01 in #206
- ENH: add Gradio ChatInterface chatbot to example by @Bojun-Feng in #208
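To give a feel for the Gradio ChatInterface example mentioned just above, here is a rough sketch of wiring a launched chat model into gr.ChatInterface. The model handle API and the OpenAI-style response dict are assumptions rather than code taken from #208.

```python
# Sketch of a Gradio ChatInterface chatbot backed by a launched model,
# in the spirit of the new example. The model handle API and the
# OpenAI-style response shape are assumptions, not code from #208.
import gradio as gr
from xinference.client import Client

client = Client("http://localhost:9997")   # assumed endpoint
model = client.get_model("my-model-uid")   # placeholder model UID

def respond(message, history):
    # gr.ChatInterface calls this with the latest user message and the
    # accumulated chat history; only the latest message is forwarded here.
    result = model.chat(message)
    return result["choices"][0]["message"]["content"]

gr.ChatInterface(respond).launch()
```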
Bug fixes
- BUG: fix pytorch int8 by @pangyoki in #197
- BUG: fix RuntimeError when launching a model with kwargs whose values are of type int by @jiayini1119 in #209
- BUG: Fix some gradio issues by @aresnow1 in #200
Documentation
- DOC: sphinx init by @UranusSeven in #189
- DOC: chinese readme by @UranusSeven in #191
Full Changelog: v0.0.4...v0.0.5