Remote repository degraded performance #93
Comments
I haven't looked at the code just now, but IIRC the find call is there to support overwriting data that is already present while the nesting depth is being changed (e.g. from nesting depth 1 to nesting depth 2).
I think a solution needs to be found for this, because at the moment it prevents proper parallelization of the upload process. By bypassing self.find (since I know none of those blobs exist yet) and adding some parallelization code for the rclone backend, I am able to upload at upwards of 150 Mbps to a server with a latency of ~50 ms. Consider that before every upload a 50 ms pre-request is needed to check the path: with chunks of 1 MiB you end up with at most 10 chunks per second, and once compression and deduplication are taken into account the upload speed ends up well below 10 MiB/s.
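For reference, the estimate above can be written down as a small back-of-the-envelope calculation. This is only a sketch: it assumes the ~50 ms figure is the cost of one request, that the upload itself costs one more such request after the find, and it ignores transfer time entirely. The function name and its parameters are made up for illustration and are not part of borgstore.

```python
def max_throughput_mib_s(chunk_mib, request_s, requests_per_chunk, parallel=1):
    """Upper bound on throughput when every chunk costs a fixed number of
    sequential requests and `parallel` chunks are in flight at once.
    Bandwidth limits and transfer time are ignored (assumption)."""
    chunks_per_second = parallel / (request_s * requests_per_chunk)
    return chunks_per_second * chunk_mib


# find + store: two sequential 50 ms requests per 1 MiB chunk
serial = max_throughput_mib_s(1.0, 0.050, 2)

# store only (find bypassed): one 50 ms request per chunk
no_find = max_throughput_mib_s(1.0, 0.050, 1)

print(f"with find:    {serial:.1f} MiB/s")   # 10.0 MiB/s
print(f"without find: {no_find:.1f} MiB/s")  # 20.0 MiB/s
```

Under these assumptions the serial case is capped at 10 chunks per second, which matches the number above; anything beyond that has to come from dropping the pre-request or from keeping several chunks in flight.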
Can borgstore somehow benefit from the list of chunk hashes that each archive has? Another issue I am running into: if the borg create/import-tar process does not complete, the chunk list is not known, so the next create/import-tar behaves as if none of the blobs exist. A possible solution would be to upload a temporary chunk list every n new chunks or every x minutes, so that the process can continue from where it left off, or close enough.
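To make the "every n new chunks or every x minutes" idea concrete, here is a minimal sketch. Everything in it is hypothetical: the class name, the `save` callback (which stands in for whatever would upload the temporary chunk list) and the default thresholds are illustrations only and say nothing about how borg actually handles this.

```python
import time


class ChunkListCheckpointer:
    """Collects new chunk IDs and hands them to `save` every
    `every_chunks` IDs or every `every_seconds` seconds, whichever
    comes first. `save` is a caller-supplied callback (assumption)."""

    def __init__(self, save, every_chunks=1000, every_seconds=300.0,
                 clock=time.monotonic):
        self.save = save
        self.every_chunks = every_chunks
        self.every_seconds = every_seconds
        self.clock = clock
        self.pending = []
        self.last_flush = clock()

    def add(self, chunk_id):
        self.pending.append(chunk_id)
        due_by_count = len(self.pending) >= self.every_chunks
        due_by_time = self.clock() - self.last_flush >= self.every_seconds
        if due_by_count or due_by_time:
            self.flush()

    def flush(self):
        # Write out whatever has accumulated, then restart the timer.
        if self.pending:
            self.save(list(self.pending))
            self.pending.clear()
        self.last_flush = self.clock()
```

An interrupted run would then lose at most the chunks added since the last flush, at the cost of one extra upload per checkpoint.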
Please avoid putting different issues into the same GitHub issue. :-) About your "other issue": I guess you are using the latest beta. IIRC that is already fixed in the master branch (at least for borg create, not sure about import-tar) by uploading increments of the chunks cache.
Hello,
Due to how borgstore::store works, there is a huge performance hit if the repository is not on the local network.
borgstore/src/borgstore/store.py, line 208 in a38a1a7
Basically, before we store the file, we first do a find to figure out its path, I believe.
Using a local cache for the "find" call would speed up the process, at the cost of possibly uploading a file twice(?) or maybe to the wrong place(?). Honestly, I am not entirely sure what find has to do with store.
What would the worst case be if we used a local cache to check whether a blob already exists? That we write it twice?
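To illustrate what I mean by a local cache, here is a rough, self-contained sketch. It is not borgstore code: `FakeRemote`, `default_path` and `PathCachingStore` are invented for this example, the nesting scheme is only a guess at the general idea, and round trips are counted instead of actually going over a network.

```python
class FakeRemote:
    """Stand-in for a high-latency backend; counts round trips."""

    def __init__(self, files=None):
        self.files = dict(files or {})
        self.round_trips = 0

    def list_paths(self):
        self.round_trips += 1
        return list(self.files)

    def write(self, path, value):
        self.round_trips += 1
        self.files[path] = value


def default_path(name, levels=1):
    # Illustrative nesting: one directory per 2-character prefix,
    # e.g. "ab/abcdef" for levels=1 (assumed scheme, not borgstore's).
    parts = [name[2 * i:2 * i + 2] for i in range(levels)]
    return "/".join(parts + [name])


class PathCachingStore:
    """Replaces the per-store find with one listing done up front."""

    def __init__(self, remote, levels=1):
        self.remote = remote
        self.levels = levels
        # name -> path where the blob currently lives on the remote
        self.paths = {p.rsplit("/", 1)[-1]: p for p in remote.list_paths()}

    def store(self, name, value):
        # Reuse the known path if the blob exists, else use the default.
        path = self.paths.get(name) or default_path(name, self.levels)
        self.remote.write(path, value)
        self.paths[name] = path
        return path
```

In this sketch each store costs one round trip instead of two. If the cache is stale (someone else wrote the blob at a different nesting depth after the listing), the blob ends up stored a second time at the default path, which seems to be the "write it twice" worst case.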