What's new in 1.2.2 (2025-02-08)
These are the changes in inference v1.2.2.
New features
- FEAT: support qwen2.5-vl-instruct by @qinxuye in #2788
- FEAT: Support internlm3 by @Jun-Howie in #2789
- FEAT: support deepseek-r1-distill-llama by @qinxuye in #2811
- FEAT: Support Kokoro-82M by @codingl2k1 in #2790
- FEAT: vllm support for qwen2.5-vl-instruct by @qinxuye in #2821
Bug fixes
- BUG: fix llama-cpp when some quantizations have multiple parts by @qinxuye in #2786
- BUG: Use `Cache` class instead of raw `tuple` for transformers continuous batching, compatible with latest `transformers` by @ChengjieLi28 in #2820
Documentation
- DOC: Update multimodal doc by @codingl2k1 in #2785
- DOC: update model docs by @qinxuye in #2792
- DOC: fix docs by @qinxuye in #2793
- DOC: Fix a couple of typos by @Paleski in #2817
New Contributors
Full Changelog: v1.2.1...v1.2.2