Describe the bug
I have a directory containing several hundred HTML files. When I parse them with load_data from the Python llama_parse package, the code fails with an UNSUPPORTED_FILE_TYPE exception.
Each time, it happens with different files.
If I check the status in the web UI within a few seconds, I see "failed" and the error. But if I refresh the page, it lists success. The results seem correct.
One can also see the successful job status using the API:
import llama_parse

lp = llama_parse.LlamaParse(
    api_key=lp,
    result_type='markdown',
    verbose=False,
    use_vendor_multimodal_model=True,
    vendor_multimodal_model_name='openai-gpt4o',
    [... other options..]
)
lp.load_data(files)  # files is a list of paths
File ".../llama_parse/base.py", line 654, in _get_job_result
raise Exception(exception_str)
Exception: Job ID: 3e3b9c18-8059-4c4d-bf84-a138d73e2208 failed with status: ERROR, Error code: UNSUPPORTED_FILE_TYPE, Error message: Unsupported file type.
[...]
% curl -X 'GET' \
'https://api.cloud.llamaindex.ai/api/parsing/job/3e3b9c18-8059-4c4d-bf84-a138d73e2208' \
-H 'accept: application/json' \
-H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"
{"id":"3e3b9c18-8059-4c4d-bf84-a138d73e2208","status":"SUCCESS"}%
Job ID
3e3b9c18-8059-4c4d-bf84-a138d73e2208
Client:
Python Library
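For anyone who wants to repeat the status check from Python instead of curl, here is a minimal sketch using only the standard library. It calls the same endpoint and sends the same headers as the curl command above. The function name `fetch_job_status` and its `opener` parameter are hypothetical, added for illustration, and are not part of llama_parse.

```python
import json
import urllib.request

# Same endpoint as the curl command in this report.
JOB_URL = "https://api.cloud.llamaindex.ai/api/parsing/job/{job_id}"


def fetch_job_status(job_id, api_key, opener=urllib.request.urlopen):
    """Return the "status" field of a parsing job, e.g. "SUCCESS".

    `opener` is a hypothetical hook so the request can be stubbed in tests;
    by default it performs a real HTTP GET.
    """
    request = urllib.request.Request(
        JOB_URL.format(job_id=job_id),
        headers={
            "accept": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with opener(request) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["status"]
```

Calling `fetch_job_status(job_id, os.environ["LLAMA_CLOUD_API_KEY"])` (after `import os`) with the job ID from the exception message should mirror what the curl call returns.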
The workaround was to keep retrying while relying on cached results. I ran it a few more times. Each time, a different file would cause the error and then be marked as successful. After this, the process completed.
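A rough sketch of that retry loop, in case it helps others. It assumes the parse call raises a plain Exception on failure (as in the traceback above) and that re-running the same files is cheap because already-parsed results are served from cache. The helper name `load_with_retries` and its parameters are hypothetical, not part of llama_parse.

```python
import time


def load_with_retries(parse_fn, files, max_attempts=5, delay_seconds=2.0):
    """Call parse_fn(files), retrying the whole batch if it raises.

    parse_fn is assumed to be something like lp.load_data. This only
    re-runs the call; it does not control any caching on the service side.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return parse_fn(files)
        except Exception as exc:  # the traceback shows a bare Exception being raised
            last_error = exc
            if attempt < max_attempts:
                time.sleep(delay_seconds)  # short pause before the next attempt
    raise RuntimeError(
        f"parsing still failing after {max_attempts} attempts"
    ) from last_error
```

Usage would look like `documents = load_with_retries(lp.load_data, files)`. Since a different file failed on each run, the number of attempts needed is not predictable, so `max_attempts` may need to be raised for larger batches.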