In-progress data is not cached. #57
Comments
You are correct, at the moment we do not support any LLM caching. It is something we should definitely implement.
I see. What does the README mean when it says "The next time you initialize fast-graphrag from the same working directory, it will retain all the knowledge automatically"?
It means that the graph you create is saved to the working directory: once you insert document A, it is persisted into the graph (but, as you pointed out, if something goes wrong during insertion you need to start over for that document, since we do not cache any LLM computation), so if you query the graph on a later occasion the knowledge is retained. Hope that answers your question :)
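To make that concrete, here is a rough sketch of the intended usage, following the README pattern (the domain, example query, and entity types below are just placeholders):

```python
from fast_graphrag import GraphRAG

# First run: insert document A; the graph is persisted under working_dir.
grag = GraphRAG(
    working_dir="./book_example",
    domain="Analyze this story and identify its characters.",
    example_queries="Who is the main character?",
    entity_types=["Character", "Place", "Event"],
)
with open("./book.txt") as f:
    grag.insert(f.read())

# Second run (a new process pointed at the same working_dir): the previously
# built graph is loaded from disk, so you can query without re-inserting.
grag = GraphRAG(
    working_dir="./book_example",
    domain="Analyze this story and identify its characters.",
    example_queries="Who is the main character?",
    entity_types=["Character", "Place", "Event"],
)
print(grag.query("Who is the main character?").response)
```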
Yes, thank you! I would like to try to fix this, as the tool seems very useful, but I haven't yet been able to get it to run successfully. I usually start getting error responses from my LLM due to rate limiting, and I don't think they are retried. Then the embedding computation seems to hang; it gets stuck, so I would like to make that part more robust to failures and retries. Can you point to roughly where you'd want this functionality implemented? I will try to take a look when I get time.
Sure, I can draft something. In the meantime, can you elaborate on the LLM/embedding problems? Which services are you using?
Ideally, the BaseLLMService and BaseEmbeddingService should implement a "cache" function within them (see the sketch below).
Then the state_manager should do the following:
At the end of state_manager._insert_done there should be a "clear cache files" step, and the whole insertion process should be wrapped in a try/catch that, on catch, does something along the lines of the sketch below.
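Roughly something like this (just a sketch; ResponseCache, get_or_call, and the file-per-request layout are hypothetical and not existing code):

```python
import hashlib
import json
from pathlib import Path
from typing import Awaitable, Callable


class ResponseCache:
    """File-backed cache for LLM/embedding responses, one file per request."""

    def __init__(self, cache_dir: str):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, prompt: str) -> Path:
        return self.cache_dir / hashlib.sha256(prompt.encode()).hexdigest()

    async def get_or_call(self, prompt: str, call: Callable[[str], Awaitable[str]]) -> str:
        """Return a cached response if present, otherwise perform the call and cache it."""
        path = self._path(prompt)
        if path.exists():
            return json.loads(path.read_text())
        response = await call(prompt)  # the real LLM / embedding request
        path.write_text(json.dumps(response))
        return response

    def clear(self) -> None:
        """Delete all cache files once insertion has completed successfully."""
        for f in self.cache_dir.iterdir():
            f.unlink()


# In the state manager, insertion would then roughly be:
#
#   try:
#       ...  # extraction + embedding, all LLM calls going through the cache
#       await self._insert_done()  # on success, also call cache.clear()
#   except Exception:
#       # leave the cache files on disk so a re-run can resume from them
#       raise
```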
I am using AWS Bedrock, but the models I have access to are severely rate-limited for non-production use, so I have set the concurrency limit to 2. What happens is that I get around halfway through the extraction phase and start getting lots of 500s (which are actually rate limits; I don't know why they aren't 429s). Fast-graphrag seems to skip past the failed requests and then starts doing the embeddings. These consistently get to 43% and then hang. Perhaps it is getting stuck when trying to embed the output of the failed extraction requests? I left it running overnight and it didn't progress, so I know it is hanging and not just very slow. I will try with concurrent requests = 1 and see how that works. I may also try async batch requests (see the AWS documentation here, OpenAI here), as they seem a good fit for this workload.
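For reference, this is the kind of retry handling I have in mind for the rate-limit errors, independent of fast-graphrag itself (the base_url and model ID below are placeholders for my Bedrock Access Gateway setup):

```python
from openai import AsyncOpenAI, RateLimitError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

# Placeholder endpoint/key for the OpenAI-compatible Bedrock proxy.
client = AsyncOpenAI(base_url="http://localhost:8000/api/v1", api_key="proxy-key")


@retry(
    retry=retry_if_exception_type(RateLimitError),  # only retry genuine 429s
    wait=wait_exponential(multiplier=1, min=2, max=60),  # exponential backoff
    stop=stop_after_attempt(6),
)
async def chat(prompt: str) -> str:
    response = await client.chat.completions.create(
        model="anthropic.claude-3-sonnet-20240229-v1:0",  # placeholder model ID
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```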
That could be. Another user reported that the process gets stuck at 43%; I will investigate this further.
The 500 errors were coming from the Bedrock proxy I am using, and updating it to return 429s seems to fix the retry issue. Regarding getting stuck at 43%, the issue is here:
And I get a response from my test above!
I see, nice catch! I'd appreciate it if you could open a pull request for that since you spotted it (if possible, can you make the
I can't do a PR, sorry, as this is my work machine. I haven't changed any config apart from the concurrency settings. |
Describe the bug
The README says "The next time you initialize fast-graphrag from the same working directory, it will retain all the knowledge automatically". However, this does not seem to work.
To Reproduce
Steps to reproduce the behavior:
1. pyproject.toml add:
2. fast_graphrag/test.py add:
3. poetry install
4. poetry run test-app
Expected behaviour
If extraction is killed mid-progress, I would expect it to restart roughly where it left off. If the extraction completes and the app hangs during the embedding computation, I would expect it to skip extraction and restart the embedding computation.
Example app run
In this test run I left it running and it froze during the embedding computation. I killed the process and restarted, but I can see from the network traffic that it is redoing the entire extraction. The book_example directory has pickle files that are only a few hundred bytes in size.

Note that I am using Bedrock Access Gateway to proxy requests to AWS Bedrock. This doesn't seem to be related, but I include it for completeness.
Note that the same occurs if my LLM starts returning 429 or 500 errors. I would expect the failed requests to be cached and incrementally completed if I kill and restart the process.