Releases: ggerganov/llama.cpp
Releases · ggerganov/llama.cpp
b4623
b4621
HIP: fix flash_attn_stream_k_fixup warning (#11604)
b4620
CUDA/HIP: add support for selectable warp size to mmv (#11519) CUDA/HIP: add support for selectable warp size to mmv
b4619
HIP: add GGML_CUDA_CC_IS_* for amd familys as increasing cc archtectu…
b4618
nit: more informative crash when grammar sampler fails (#11593)
b4617
CUDA: use mma PTX instructions for FlashAttention (#11583) * CUDA: use mma PTX instructions for FlashAttention * __shfl_sync workaround for movmatrix * add __shfl_sync to HIP Co-authored-by: Diego Devesa <[email protected]>
b4616
Name colors (#11573) It's more descriptive, use #define's so we can use compile-time concatenations. Signed-off-by: Eric Curtin <[email protected]>
b4615
`tool-call`: support Command R7B (+ return tool_plan "thoughts" in AP…
b4614
Fix exotic ci env that lacks ostringstream::str (#11581)
b4613
sampling : support for llguidance grammars (#10224) * initial porting of previous LLG patch * update for new APIs * build: integrate llguidance as an external project * use '%llguidance' as marker to enable llg lark syntax * add some docs * clarify docs * code style fixes * remove llguidance.h from .gitignore * fix tests when llg is enabled * pass vocab not model to llama_sampler_init_llg() * copy test-grammar-integration.cpp to test-llguidance.cpp * clang fmt * fix ref-count bug * build and run test * gbnf -> lark syntax * conditionally include llguidance test based on LLAMA_LLGUIDANCE flag * rename llguidance test file to test-grammar-llguidance.cpp * add gh action for llg test * align tests with LLG grammar syntax and JSON Schema spec * llama_tokenizer() in fact requires valid utf8 * update llg * format file * add $LLGUIDANCE_LOG_LEVEL support * fix whitespace * fix warning * include <cmath> for INFINITY * add final newline * fail llama_sampler_init_llg() at runtime * Link gbnf_to_lark.py script; fix links; refer to llg docs for lexemes * simplify #includes * improve doc string for LLAMA_LLGUIDANCE * typo in merge * bump llguidance to 0.6.12