Releases: ggerganov/llama.cpp
b2950
rpc : track allocated buffers (#7411)
* rpc : track allocated buffers (ref: #7407)
* rpc : pack rpc_tensor tightly
b2949
server : fix temperature + disable some tests (#7409)
* server : fix temperature
* server : disable tests relying on parallel determinism
* ci : change server Debug -> RelWithDebInfo
b2948
[SYCL] Update SYCL upscale operation (#7321)
* Update SYCL upscale operation
* Formatting
* Remove messages
b2946
ggml-opencl, llama: using reserve() if count already known (#7272)
b2945
ggml : add loongarch lsx and lasx support (#6454)
* add LoongArch LSX and LASX optimized code
* add LoongArch compilation support to Makefile
* revert stb_image.h
* optimize bytes_from_nibbles_32 and sum_i16_pairs_float
* fix undeclared
* format code
* update
* update 2
Co-authored-by: Jinyang He <[email protected]>
b2943
server : return error on too large embedding input (#7389)
b2941
Add provisions for Windows support for BF16 code including CMake prov…
b2940
llama : remove MPI backend (#7395)
b2939
quantize : fix --keep-split check (#7374)
b2938
Vulkan Embedding Fix (#7360)
* Fix empty Vulkan host buffers
* Add fp32 fp16 matmul shader
* Fix matmul shader alignment
* Remove deprecated tensor->backend uses
* Fix Vulkan validation errors on embedding models with no offloaded layers
* Fix Vulkan llava segfault when not offloading layers