Replies: 1 comment
I found the issue here: it was a problem with the conversion to GGUF.
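For anyone hitting the same thing: a minimal sketch of what re-running the conversion looks like, assuming you still have the original (pre-GGUF) checkpoint and an up-to-date llama.cpp checkout, whose convert.py writes the grouped-query-attention metadata the loader needs. The paths and output name below are hypothetical; adjust them to your setup.

```shell
# Hypothetical paths -- point these at your own checkpoint and output location.
MODEL_DIR=models/llama-2-70b           # original (pre-GGUF) checkpoint directory
OUT=models/llama-2-70b.f16.gguf

# Re-convert with a current llama.cpp tree so the GGUF header carries
# the attention head counts (including the KV head count) the loader expects.
python convert.py "$MODEL_DIR" --outtype f16 --outfile "$OUT"
```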
Hi, I have a Mac with an M2. I installed the latest llama-cpp-python (0.1.83) using:
CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir
When I run it against a GGUF-formatted LLaMA 2 model with:
export N_GQA=8 && python3 -m llama_cpp.server --model $MODEL --n_gpu_layers 38
I get the error:
error loading model: create_tensor: tensor 'blk.0.attn_k.weight' has wrong shape; expected 8192, 8192, got 8192, 1024, 1, 1
llama_load_model_from_file: failed to load model
Anyone know how to resolve this?
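For context on the numbers in that error: the shape found in the file is what grouped-query attention (GQA) produces for LLaMA-2 70B. The K projection maps the 8192-wide embedding down to only n_head_kv · head_dim = 1024 columns, so 8192 × 1024 is the correct on-disk shape, and the "expected 8192, 8192" side means the loader was not applying GQA (which is what the N_GQA=8 override works around). A quick sketch of the arithmetic, using the published LLaMA-2 70B dimensions:

```shell
# LLaMA-2 70B attention dimensions (from the published model config)
n_embd=8192       # embedding width
n_head=64         # query heads
n_head_kv=8       # key/value heads under GQA -- this is what N_GQA=8 encodes

head_dim=$((n_embd / n_head))       # 8192 / 64 = 128 per head
kv_dim=$((head_dim * n_head_kv))    # 128 * 8 = 1024, matching "got 8192, 1024"
echo "attn_k weight shape: ${n_embd} x ${kv_dim}"
```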