Error when using LlamaCppServerProvider #80

PredyDaddy opened this issue Sep 18, 2024 · 0 comments

Hello,
I'm having trouble getting started with llama-cpp-agent.
I start the server with the following command:

python3 -m llama_cpp.server --model /app/vlm_weights/Qwen2-0___5B-Instruct-GGUF/qwen2-0_5b-instruct-fp16.gguf --n_gpu_layers -1
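To confirm the server is reachable (assuming the default host and port and the OpenAI-compatible routes), a quick sanity check against the models endpoint can be done with:

curl http://127.0.0.1:8000/v1/models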

Then I can send a request to it with the following code:

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="sk-xxx")
response = client.chat.completions.create(
    model="qwen2",  
    messages=[
        {
            "role": "user",
            "content": "hello"  # 这是文本内容
        }
    ],
)
print(response)

and get the expected output:

root@ubuntu:/app/inside_container/llama_python_demo# python3 demo.py 
ChatCompletion(id='chatcmpl-64ea1c64-a6fb-46b2-ba36-8942e9a17540', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Hello! How can I assist you today?', refusal=None, role='assistant', function_call=None, tool_calls=None))], created=1726667242, model='qwen2', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=9, prompt_tokens=20, total_tokens=29, completion_tokens_details=None))
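Since (as far as I can tell from the error further down) the provider consumes a streamed response, a streaming variant of the same request through the OpenAI client may help confirm that the server's streamed chunks themselves are well-formed. This is just a sketch of that check:

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="sk-xxx")
# stream=True consumes the same server-sent-events stream the provider parses
stream = client.chat.completions.create(
    model="qwen2",
    messages=[{"role": "user", "content": "hello"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print()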

Then, when I use the following code to try llama-cpp-agent,

from llama_cpp_agent import LlamaCppAgent
from llama_cpp_agent import MessagesFormatterType
from llama_cpp_agent.providers import LlamaCppServerProvider

provider = LlamaCppServerProvider("http://127.0.0.1:8000", llama_cpp_python_server=True)

agent = LlamaCppAgent(
    provider,
    system_prompt="You are a helpful assistant.",
    predefined_messages_formatter_type=MessagesFormatterType.CHATML,
)

settings = provider.get_provider_default_settings()
settings.n_predict = 512
settings.temperature = 0.65

while True:
    user_input = input(">")
    if user_input == "exit":
        break
    agent_output = agent.get_chat_response(user_input, llm_sampling_settings=settings)
    print(f"Agent: {agent_output.strip()}")

I see this error:

root@ubuntu:/app/inside_container/llama_python_demo# python3 demo.py 
>hello
Traceback (most recent call last):
  File "/app/inside_container/llama_python_demo/demo.py", line 21, in <module>
    agent_output = agent.get_chat_response(user_input, llm_sampling_settings=settings)
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp_agent/llm_agent.py", line 334, in get_chat_response
    for out in completion:
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp_agent/providers/llama_cpp_server.py", line 279, in generate_text_chunks
    new_data = json.loads(decoded_chunk.replace("data:", ""))
  File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 3 (char 2)
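From the traceback, the failure is in generate_text_chunks, where the decoded chunk has its "data:" prefix stripped and is passed straight to json.loads. My guess (unverified) is that the chunk contains a blank keep-alive line or the final "data: [DONE]" sentinel, which is not valid JSON. A more tolerant parse along these lines, just a sketch and not a patch against the real provider code (parse_sse_chunk is only an illustrative name), would skip those lines:

import json

def parse_sse_chunk(decoded_chunk: str) -> list:
    """Parse one decoded SSE chunk, ignoring blank lines and the [DONE] sentinel."""
    events = []
    for line in decoded_chunk.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip keep-alives and non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            continue  # end-of-stream marker, not JSON
        events.append(json.loads(payload))
    return events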

Could you help me look at it?

Best wishes
