[Issue]: Warning: Error decoding faulty json, attempting repair #1078

Open · 1 of 3 tasks
240839785 opened this issue Sep 3, 2024 · 2 comments
Labels
triage Default label assignment, indicates new issue needs reviewed by a maintainer

Comments

@240839785
Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.

Describe the issue

I am encountering a JSON decoding error when using the glm-4-flash model to generate knowledge graphs. The warning is logged repeatedly during community report generation; see Logs and screenshots below.

Steps to reproduce

No response

GraphRAG Config Used

# Paste your config here

encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: $$$
  type: openai_chat # or azure_openai_chat
  model: glm-4-flash
  model_supports_json: true # recommended if this is available for your model.
  max_tokens: 4000
  request_timeout: 600.0
  api_base: https://open.bigmodel.cn/api/paas/v4
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  tokens_per_minute: 150_000 # set a leaky bucket throttle
  requests_per_minute: 10_000 # set a leaky bucket throttle
  max_retries: 10
  max_retry_wait: 10.0
  sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  concurrent_requests: 5 # the number of parallel inflight requests that may be made
  temperature: 0 # temperature for sampling
  top_p: 1 # top-p sampling
  n: 1 # Number of completions to generate

parallelization:
  stagger: 0.3
  num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  target: required # or all
  batch_size: 16 # the number of documents to send in a single request
  batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
  llm:
    api_key: $$$
    type: openai_embedding # or azure_openai_embedding
    # model: ChristianAzinn/mxbai-embed-large-v1-gguf
    model: embedding-2
    api_base: https://open.bigmodel.cn/api/paas/v4
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    concurrent_requests: 5 # the number of parallel inflight requests that may be made

chunks:
  size: 600
  overlap: 150
  group_by_columns: [id] # by default, we don't allow chunks to cross documents

input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\.txt$"

cache:
  type: file # or blob
  base_dir: "cache"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

storage:
  type: file # or blob
  base_dir: "output/${timestamp}/artifacts"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

reporting:
  type: file # or console, blob
  base_dir: "output/${timestamp}/reports"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

entity_extraction:
  ## strategy: fully override the entity extraction strategy.
  ##   type: one of graph_intelligence, graph_intelligence_json and nltk
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 1

summarize_descriptions:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  enabled: true
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 1

community_reports:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes
  num_walks: 10
  walk_length: 40
  window_size: 2
  iterations: 3
  random_seed: 597832

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes

snapshots:
  graphml: false
  raw_entities: false
  top_level_nodes: false

local_search:
  text_unit_prop: 0.5
  community_prop: 0.1
  conversation_history_max_turns: 5
  top_k_mapped_entities: 10
  top_k_relationships: 10
  llm_temperature: 0 # temperature for sampling
  llm_top_p: 1 # top-p sampling
  llm_n: 1 # Number of completions to generate
  max_tokens: 12000

global_search:
  llm_temperature: 0 # temperature for sampling
  llm_top_p: 1 # top-p sampling
  llm_n: 1 # Number of completions to generate
  max_tokens: 12000
  data_max_tokens: 12000
  map_max_tokens: 1000
  reduce_max_tokens: 2000
  concurrency: 32
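
A note on the config above: model_supports_json: true tells GraphRAG it may ask the endpoint for native JSON output on JSON-expecting prompts (community reports, for example), and the warning in the logs below fires when the text that comes back still fails a strict parse. A quick standalone probe of whether the configured endpoint honors OpenAI-style JSON mode could look like the following sketch (hypothetical code, not part of GraphRAG; the model and base URL mirror the config above, the API key and prompt are placeholders, and some providers ignore or reject the response_format parameter):

import json
from openai import OpenAI

# Hypothetical probe of the endpoint configured above (not GraphRAG code).
client = OpenAI(api_key="sk-...", base_url="https://open.bigmodel.cn/api/paas/v4")

resp = client.chat.completions.create(
    model="glm-4-flash",
    messages=[{"role": "user", "content": "Return a JSON object with keys 'title' and 'summary' describing GraphRAG."}],
    response_format={"type": "json_object"},  # endpoints without JSON mode may ignore or reject this
)

text = resp.choices[0].message.content
try:
    json.loads(text)
    print("endpoint returned strict JSON")
except json.JSONDecodeError as exc:
    print(f"not strict JSON ({exc}); this is the case where GraphRAG logs the repair warning")

If the probe fails consistently, trying model_supports_json: false so GraphRAG falls back to prompt-based JSON formatting is a reasonable experiment.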

Logs and screenshots

09:49:42,176 graphrag.llm.openai.utils INFO Warning: Error decoding faulty json, attempting repair
09:49:42,177 graphrag.llm.base.rate_limiting_llm INFO perf - llm.chat "create_community_report" with 0 retries took 27.0612059480045. input_tokens=3804, output_tokens=908
09:49:45,152 httpx INFO HTTP Request: POST https://open.bigmodel.cn/api/paas/v4/chat/completions "HTTP/1.1 200 OK"
09:49:45,154 graphrag.llm.openai.utils INFO Warning: Error decoding faulty json, attempting repair
09:49:45,154 graphrag.llm.base.rate_limiting_llm INFO perf - llm.chat "create_community_report" with 0 retries took 30.045606669002154. input_tokens=4367, output_tokens=725
09:49:49,146 httpx INFO HTTP Request: POST https://open.bigmodel.cn/api/paas/v4/chat/completions "HTTP/1.1 200 OK"
09:49:49,151 graphrag.llm.openai.utils INFO Warning: Error decoding faulty json, attempting repair
09:49:49,152 graphrag.llm.base.rate_limiting_llm INFO perf - llm.chat "create_community_report" with 0 retries took 18.514149362999888. input_tokens=3017, output_tokens=740
09:49:49,876 httpx INFO HTTP Request: POST https://open.bigmodel.cn/api/paas/v4/chat/completions "HTTP/1.1 200 OK"
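
The warning itself is emitted by GraphRAG's JSON parsing path: each response is parsed strictly first, and only when that fails is this message logged and a repair attempted, which is why the run can still finish. A minimal sketch of that decode-then-repair pattern, for illustration only (this is not GraphRAG's actual implementation, and the cleanup steps shown are assumptions about typical faulty model output):

import json
import logging
import re

log = logging.getLogger(__name__)

def parse_with_repair(text: str) -> dict:
    # Strict parse first; on failure, warn and retry after stripping common
    # LLM artifacts such as markdown fences and surrounding chatter.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        log.info("Warning: Error decoding faulty json, attempting repair")
        cleaned = re.sub(r"^```(?:json)?\s*|```\s*$", "", text.strip(), flags=re.MULTILINE)
        start, end = cleaned.find("{"), cleaned.rfind("}")
        if start != -1 and end != -1:
            cleaned = cleaned[start : end + 1]
        return json.loads(cleaned)  # may still raise if the output is unrecoverable

When the repair succeeds the warning is harmless noise, which is consistent with the reporter's follow-up below that all workflows completed.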

Additional Information

  • GraphRAG Version:
  • Operating System:
  • Python Version:
  • Related Issues:
240839785 added the triage label on Sep 3, 2024
@240839785 (Author)

Despite the JSON decoding error warnings, all workflows completed successfully. The final output indicates that the knowledge graph generation was successful.

@madlogos commented Oct 9, 2024

Refer to this post #575. It works for me.

Quoting @awaescher from #575:

Small update: I found out that my model always returned nonsense like this:

python -m graphrag.query --root ./myfolder --method global "What are the main topics"
"The main topic is 'What are the main topics'"

It turned out that my local Ollama instance (0.3.0) seemed to ignore the system prompt, and I got it working by manually stitching the two prompts together into one:

File: /graphrag/query/structured_search/global_search/search.py, method: _map_response_single_batch

# search_messages = [
#   {"role": "system", "content": search_prompt},
#   {"role": "user", "content": query},
# ]
# Fold the system prompt into the single user message so Ollama can't drop it.
search_messages = [
    {"role": "user", "content": search_prompt + "\n\n### USER QUESTION ### \n\n" + query},
]
