Update llama_cpp: Sync LLAMA_API names with llama.cpp mainline. Needs more testing #1901
Conversation
Fix deprecated llama.cpp function call [llama_token_is_eog]
fix llama-cpp-python[server] issues
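For context on the llama_token_is_eog commit above: llama.cpp mainline deprecated llama_token_is_eog (which took a model handle) in favor of llama_vocab_is_eog (which takes the vocab handle returned by llama_model_get_vocab). A minimal sketch of the replacement, assuming the synced low-level bindings; the model path is a placeholder:

```python
import llama_cpp

# Load a model with the mainline-synced names (path is a placeholder).
params = llama_cpp.llama_model_default_params()
model = llama_cpp.llama_model_load_from_file(b"./model.gguf", params)
vocab = llama_cpp.llama_model_get_vocab(model)

# Deprecated: llama_cpp.llama_token_is_eog(model, token)
eos = llama_cpp.llama_vocab_eos(vocab)           # end-of-sequence token id
print(llama_cpp.llama_vocab_is_eog(vocab, eos))  # True: EOS counts as end-of-generation

llama_cpp.llama_model_free(model)
```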
Can't use GPU: I have tested and built your branch and tried to run it with an NVIDIA RTX GPU, but it seems it can't use the GPU. (code and log attached)
@Kar-Su Did you use CMAKE_ARGS="-DGGML_CUDA=on" when compiling the pip wheel? Maybe you just missed the CMake params.
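For reference, a typical CUDA-enabled build of the wheel looks like this (flags per the llama-cpp-python README; --force-reinstall and --no-cache-dir prevent pip from reusing a previously built CPU-only wheel; when building a checked-out branch, replace the package name with the path to the checkout):

```shell
CMAKE_ARGS="-DGGML_CUDA=on" pip install --upgrade --force-reinstall --no-cache-dir llama-cpp-python
```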
[FIX] llama_chat_format.py: Update llama.llama_model_get_vocab -> llama_cpp.llama_model_get_vocab
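For context, this commit is a module-qualification fix: llama_model_get_vocab is a function in the llama_cpp binding module, not an attribute of a llama object. A minimal sketch of the corrected call pattern, assuming the mainline-synced names; the model path is a placeholder:

```python
import llama_cpp

params = llama_cpp.llama_model_default_params()
model = llama_cpp.llama_model_load_from_file(b"./model.gguf", params)

# Before: llama.llama_model_get_vocab(model)      (wrong module prefix)
# After:  llama_cpp.llama_model_get_vocab(model)  (call the binding module)
vocab = llama_cpp.llama_model_get_vocab(model)
print(llama_cpp.llama_vocab_n_tokens(vocab))  # vocab size, mainline name

llama_cpp.llama_model_free(model)
```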
Yes, I tried that. I have tried twice with different methods. Was my method wrong? (first and second attempts attached)
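One quick way to rule out a CPU-only wheel is llama_supports_gpu_offload from the llama.cpp C API, which the bindings expose:

```python
import llama_cpp

# False means the wheel was compiled without a GPU backend, so no runtime
# setting will enable CUDA; the wheel itself has to be rebuilt.
print(llama_cpp.llama_supports_gpu_offload())
```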
@Kar-Su Sometimes it can get "stuck".
Hi @abetlen, could you check whether this pull request is eligible for merging into the mainline?
It still doesn't work; maybe I need to wait for an official update that merges this pull request. Thank you, guys, for helping me. I hope this pull request gets approved.
ggerganov/llama.cpp#11381 llama: refactor llama_decode_imp
abetlen has adapted to the new version of llama.cpp, which is good. This temporary-fix submission is therefore being closed for now.
The temporary fix code has moved to https://github.com/JamePeng/llama-cpp-python/tree/1091-branch