-
I'm facing the same issue, and I was using the ggml-model-f16.bin model, which is indeed in FP16 format. I'm not sure where the problem is coming from.
-
I am running the Llama 2 llama-2-7b-chat-codeCherryPop.ggmlv3.q2_K.bin model for embeddings, using `LlamaCppEmbeddings` to embed documents and store them in a FAISS vector store. I am on Ubuntu with the latest llama-cpp-python and other libraries:

```python
embedding = LlamaCppEmbeddings(model_path=model_path, n_gpu_layers=50, n_batch=256, n_threads=96, n_ctx=4096)
vectordb = FAISS.from_documents(docs, embedding=embedding)
```

When I start the program it begins consuming GPU power, but after 10 to 15 minutes it aborts the process. Please find the last log line. Does anyone have an idea about this issue?
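One way to narrow this down: since `FAISS.from_documents` embeds the whole corpus in a single pass, a crash 10-15 minutes in gives no hint of which document or batch triggered it. A minimal sketch of an incremental alternative, assuming the same `LlamaCppEmbeddings`, `FAISS`, `model_path`, and `docs` objects as above (the batching helper itself is plain Python; `merge_from` is the LangChain FAISS method for combining stores):

```python
def batched(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def build_index_incrementally(docs, embedding, faiss_cls, batch_size=64):
    """Build a FAISS store batch by batch, logging progress so that if the
    process aborts, the last printed line identifies the failing batch.
    `faiss_cls` would be LangChain's FAISS class in the setup above."""
    vectordb = None
    for i, batch in enumerate(batched(docs, batch_size)):
        print(f"embedding batch {i} ({len(batch)} docs)")
        part = faiss_cls.from_documents(batch, embedding=embedding)
        if vectordb is None:
            vectordb = part
        else:
            vectordb.merge_from(part)
    return vectordb
```

If the abort always happens on the same batch, the problem is likely a specific document (e.g. one longer than `n_ctx`); if it happens at a varying point, it points more toward memory exhaustion, where lowering `n_batch` or `n_gpu_layers` may help.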