Support for resuming the download #28

Open
RootLUG opened this issue Aug 16, 2024 · 0 comments
Labels
enhancement New feature or request


RootLUG commented Aug 16, 2024

Is your feature request related to a problem? Please describe.
dsnap should support resuming a partial download. This is a big problem in coldsnap, where a large snapshot download can fail at the last moment with only a few blocks left, or hang when it is unable to download the last remaining blocks (related to bug #27).

Describe the solution you'd like
dsnap should store metadata alongside the snapshot download file that records which blocks have already been downloaded and which arguments were used to launch dsnap. If dsnap is relaunched with the same arguments, it would read this metadata file, skip the blocks marked as completed, and resume with those still remaining.
As the number of blocks can get quite large, the list may be compacted: instead of listing every block number that finished downloading, it may be collapsed into ranges such as 0-1000 combined with individual block numbers.
dsnap should commit this list to the filesystem periodically (such as every minute) or after X blocks have completed, to ensure a consistent recovery point.
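
To make the idea more concrete, here is a rough sketch of how the completed-block list could be compacted into ranges and committed to disk. Everything in it is an assumption for illustration: the JSON layout, the `args` and `completed` keys, and the function names are hypothetical and not part of dsnap today.

```python
import json
import os
import tempfile


def compact_blocks(blocks):
    """Collapse block indexes into sorted, inclusive [start, end] ranges."""
    ranges = []
    for block in sorted(set(blocks)):
        if ranges and block == ranges[-1][1] + 1:
            ranges[-1][1] = block  # extend the current run
        else:
            ranges.append([block, block])  # start a new run
    return ranges


def expand_ranges(ranges):
    """Inverse of compact_blocks: return the set of block indexes."""
    done = set()
    for start, end in ranges:
        done.update(range(start, end + 1))
    return done


def save_metadata(path, args, completed):
    """Write the metadata to a temp file, fsync it, then rename over the target.

    The rename means a crash mid-write leaves the previous metadata intact.
    """
    payload = {"args": args, "completed": compact_blocks(completed)}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(payload, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_completed(path, args):
    """Return completed blocks, or an empty set if the file is missing
    or was written by a run with different arguments."""
    if not os.path.exists(path):
        return set()
    with open(path) as fh:
        payload = json.load(fh)
    if payload.get("args") != args:
        return set()
    return expand_ranges(payload.get("completed", []))
```

The download loop would call `save_metadata` on a timer or every X completed blocks, and `load_completed` once at startup to decide which blocks to skip.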

The current behaviour is to error out when the destination file already exists. Instead, dsnap should ask the user whether they wish to resume the partial download, or perhaps resume automatically when a partial download is detected.
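
A possible shape for the automatic variant, as a sketch only. It assumes a hypothetical JSON metadata file next to the output (keys `args` and `completed`, the latter a list of inclusive `[start, end]` ranges); none of these names exist in dsnap yet.

```python
import json
import os


def blocks_to_fetch(output_path, total_blocks, args):
    """Decide which block indexes still need downloading.

    - no output file: fresh download, fetch everything
    - output file without metadata: refuse, as dsnap does today
    - output file with matching metadata: resume with the remaining blocks
    """
    meta_path = output_path + ".dsnap_metadata"  # hypothetical naming
    if not os.path.exists(output_path):
        return list(range(total_blocks))
    if not os.path.exists(meta_path):
        raise FileExistsError(f"{output_path} exists and has no resume metadata")
    with open(meta_path) as fh:
        meta = json.load(fh)
    if meta.get("args") != args:
        raise ValueError("metadata was written by a run with different arguments")
    done = set()
    for start, end in meta.get("completed", []):
        done.update(range(start, end + 1))
    return [b for b in range(total_blocks) if b not in done]
```

An interactive prompt could wrap the resume branch if silently resuming turns out to be too surprising.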

Describe alternatives you've considered
There aren't many alternatives available. coldsnap is the only other tool AFAIK that allows direct snapshot download, and its biggest problem is how it handles network failures: it fails at the last moment even if the block that failed to download is near the beginning, and the whole snapshot then has to be re-downloaded. Simply put, it is very inconvenient to repeat this for half a day to finally get an uncorrupted snapshot.

Additional context
Additionally, the EBS block access API returns a SHA checksum of the block being downloaded in the HTTP response headers. This can be used for strong consistency during the download: a block is committed to the log as completed only after it has been written to its location in the file and its checksum has been verified.
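
Roughly, the verify-then-commit order could look like the sketch below. It assumes the checksum arrives as a Base64-encoded SHA256 digest, which is my understanding of what the EBS direct API `GetSnapshotBlock` response provides; the helper names and the in-memory `completed` set are hypothetical.

```python
import base64
import hashlib
import os


def verify_block(data, checksum_b64, algorithm="SHA256"):
    """Compare the block's SHA256 digest with the Base64 checksum from the response."""
    if algorithm != "SHA256":
        raise ValueError(f"unsupported checksum algorithm: {algorithm}")
    digest = hashlib.sha256(data).digest()
    return base64.b64encode(digest).decode("ascii") == checksum_b64


def write_verified_block(fh, block_index, block_size, data, checksum_b64, completed):
    """Write a block at its offset and mark it completed only if the checksum matches.

    Returns False, leaving the file and the completed set untouched, on mismatch
    so the caller can retry the block.
    """
    if not verify_block(data, checksum_b64):
        return False
    fh.seek(block_index * block_size)
    fh.write(data)
    fh.flush()
    os.fsync(fh.fileno())  # make sure the data is on disk before logging it
    completed.add(block_index)
    return True
```

With this ordering, a block listed in the metadata is always one that was both verified and flushed to the output file.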

For the metadata file, a name such as <output_file>.dsnap_metadata may be used. The ddrescue logfile can serve as inspiration, as it fulfils the same purpose when imaging a drive.

@RootLUG RootLUG added the enhancement New feature or request label Aug 16, 2024