borg2: it's coming! #6602
Comments
I do not mind breaking for the better at all, but some of the outlined details do not qualify for that IMHO. When it comes to crypto, breakage should not occur to replace one algorithm with a limited life span with another one with a limited life span, thus planning for breakage every few years. Instead, breakage should be done to end up with a repo format that supports multiple algorithms and easy and feasible changing of keys as well as of the algorithms used. That could e.g. be done by at least temporarily allowing multiple algorithms to be "active" in a repo at the same time. When it comes to repo format, a breakage should not be the excuse to just dump a bit of code that still supports reading PUTs besides PUT2s, but a reason to question the format as a whole and try to address issues such as the current limitations of append-only as well as secure multi-client usage and infeasible (with huge repos) compaction.
When it comes to compression, what really should go is the auto mode - or be reimplemented with useful parameters, which IMO are hard to come up with in the light of ZSTD performance. About "scp syntax":
Crypto: AES-CTR does not have a limited timespan. Why we are doing this is to get rid of the fundamental counter management issues:
There's also a slight ugliness of only storing a part of the IV within the old format, but that is just a minor detail. The new AEAD algorithms with session keys solve that. We could have all 3 crypto algorithms in parallel in the borg code (but currently not in the same repo), but there are other things on the above list that are best solved with tar-export/import or borg transfer and a new repo, and if one does that anyway, one can as well go for the better crypto in one go (instead of having to do the export/import again some time later). I don't think it would be a good idea to use different encryption algorithms in the same repo and especially not with the same key - so if we went for the complexity of supporting repos with that, we would need multiple (master) keys for one repo, making it more complex for borg and also for the users. You also can't just "change the keys / algos" in the same repo. Due to dedup, a lot of data would still be encrypted by the old key and old algorithm. To really get rid of it you'd need some global migration, touching a lot of data and needing some management for the case of interruptions of that process. That's about as much I/O and time as the export/import needs, just with much more complexity.
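To illustrate why counter management matters for counter-mode ciphers, here is a minimal toy sketch. It does not use AES and is not borg's code: the keystream is built from SHA-256 purely for demonstration, and `toy_keystream`, `toy_encrypt` and `derive_session_key` are hypothetical names. It shows that reusing the same key and counter start leaks the XOR of the plaintexts, and that a fresh per-session key (derived here with HMAC, as an assumption) avoids having to persist a counter at all.

```python
import hashlib
import hmac
import os


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream: SHA-256(key || nonce || counter). NOT AES."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Encrypt by XORing the plaintext with the toy keystream."""
    return xor(plaintext, toy_keystream(key, nonce, len(plaintext)))


def derive_session_key(master_key: bytes, session_id: bytes) -> bytes:
    """Hypothetical per-session key derivation (HMAC-SHA256), for illustration."""
    return hmac.new(master_key, b"session" + session_id, hashlib.sha256).digest()


master_key = bytes(32)  # fixed demo key, never do this for real data
nonce = bytes(8)        # the same starting counter is used every time below
p1 = b"attack at dawn!!"
p2 = b"retreat at dusk!"

# Same key + same counter start: the keystream cancels out of c1 XOR c2.
c1 = toy_encrypt(master_key, nonce, p1)
c2 = toy_encrypt(master_key, nonce, p2)
leak = xor(c1, c2) == xor(p1, p2)

# Fresh random session id per session: different keys, so no shared keystream.
k1 = derive_session_key(master_key, os.urandom(24))
k2 = derive_session_key(master_key, os.urandom(24))
no_leak = xor(toy_encrypt(k1, nonce, p1), toy_encrypt(k2, nonce, p2)) != xor(p1, p2)
```

With session keys, the counter can restart from zero in every session without reusing a (key, counter) pair, which is the property the comment above is after.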
Repository: It's not just about the "reading PUTs" - it is in quite a few places, including borg check (which is already quite complex). I can imagine doing some more and even radical changes to the repo format if we re-start with new repos and require export/import anyway. I am not too happy with the complexities of segment file handling either. In the end this will depend on some developers architecting and implementing it, and we should try not to make the scope too big or it'll never get releasable. Repos: interesting ideas. Needs more analysis I guess, esp. since we likely want to keep the transactional behaviour and maybe also the LOG-like behaviour. Segmentless repos: if everybody had a great repo filesystem and enough storage, I guess that could be done (but it would mean that if the source has a million files, the repo could have XX million chunks). Super simple for borg, but a huge load on the repo fs (did that within my zborg experiment back then). It could also be quite a bit slower due to more random accesses and more file opening, and use a lot more space due to fs allocation overheads if one has a significant amount of small files. Cloud storage: I don't want to maintain such code myself, that's just a rabbit hole I don't want to get into. So, for me it is "local directory" as the repo (plus some method of remoting that, not necessarily the hard-to-debug current remote.py code).
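The fs allocation overhead mentioned for a one-file-per-chunk layout can be estimated with simple arithmetic. This is a rough sketch only: the 4096-byte block size and the chunk sizes are assumptions, and real filesystems add inode and directory overhead or may pack small files differently.

```python
def allocated_size(file_size: int, block_size: int = 4096) -> int:
    """Bytes occupied on disk if space is handed out in whole blocks (assumed model)."""
    if file_size == 0:
        return 0
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


# Hypothetical chunk sizes in bytes: a few tiny chunks and one large one.
chunk_sizes = [100, 700, 5000, 2_000_000]

payload = sum(chunk_sizes)
on_disk = sum(allocated_size(size) for size in chunk_sizes)
overhead = on_disk - payload

print(f"payload={payload} on_disk={on_disk} overhead={overhead}")
```

Under this model the relative overhead is dominated by the small chunks: a 100-byte chunk still occupies a full block.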
Compression: auto mode should go? do we have a ticket about that? |
@elho thanks for the detailed feedback btw! This ticket is primarily meant for the to-break-or-not-to-break decision. Once we decide to do a breaking release, requiring new repos, key, export/import, we can do a lot of changes and need to discuss the details in more specific tickets. We should somehow try to limit the scope though, so it won't take forever.
@ThomasWaldmann if instead of segments something like git pack's could be used, then with the new encryption session stuff it may even turn feasible to push packs instead of archives between repos without necessarily requiring de/encryption |
Potentially this would also enable potentially dumb remotes like s3, sshfs, with the caveat of having more pain with post prune gc and repacking |
@RonnyPfannschmidt encrypted chunks can be transferred between related repos using the same key material, there is a ticket about that already. I don't know the git pack format, so not sure how that is relevant for (re-)encrypting. But if we want to transfer a full "pack", there might be requirements due to that (as opposed to just transferring a single chunk).
I would be happy with a borg1.3 that on first use of serve on (or direct local access to) a v1 repo would start out (maybe after some confirmation) by iterating over all segments, creating a new replacement segment file for each, filling it with the same content except for using PUT2 whenever a PUT is read from the old one, doing some sort of verify pass to make sure the new segment, as it arrived on disk, has the same data as the old one, and only then atomically mv the new one over the old one. Having done the last segment file without being interrupted, it would switch the repo version from v1 to v2.
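The write-verify-rename pattern described above can be sketched as follows. This is a generic illustration, not borg's segment code: `rewrite_segment` is a hypothetical helper, and `convert` stands in for whatever would turn PUT entries into PUT2 entries (the real segment format is not modelled here).

```python
import os
import tempfile
from typing import Callable


def rewrite_segment(path: str, convert: Callable[[bytes], bytes]) -> None:
    """Replace `path` with convert(old content) via temp file, verify, atomic rename."""
    with open(path, "rb") as f:
        old = f.read()
    new = convert(old)

    # Create the temp file in the same directory so the rename stays on one filesystem.
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, prefix=".rewrite-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to push the data to stable storage

        # Verify pass: re-read the temp file and compare with what we meant to write.
        with open(tmp_path, "rb") as check:
            if check.read() != new:
                raise IOError(f"verification failed for {tmp_path}")

        # Atomic replacement: readers see either the old or the new file, never a mix.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure the original file is untouched; just drop the temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process is interrupted before `os.replace`, the old segment is still in place, so the conversion could simply be restarted for that file.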
Note: I updated the topmost post with feedback from you all (thanks!) and also with new insights. I also edited some other posts to remove duplicate / outdated information to keep this issue short. |
I think especially SBCs will stay 32bit for a while, because the savings in having a smaller pointer width are relevant on low-memory platforms. Aren't there clock system calls which return a 64-bit wide integer even on 32-bit ABIs? |
Well, it's not just like borg needs to get the 64bit time by doing a call, it rather is the whole system of kernel / libc / python needing to work with timestamps of reasonable length. E.g. timestamps in |
Changing the module name from Both, to eventually play with potential (meanwhile obsoleted already) export/import tar magic, but also to be able to test 1.2 in parallel with 1.1 in production across all my systems in a sane manner, I went on the surprisingly painful adventure to create myself a variant of the distribution's package that can be installed and used in parallel with the stock 1.1 one. In a hackish manner, one could install borg below a different path, but that is nothing any distribution would do, I went the painful way to do such a rename in there. For the original idea of export-import migration this would be a requirement, here it is not, but in practice, for people backing up to multiple repos, scenarios like migrating the local one to 2.0 while still waiting an undefined time for the borg storage provider the external one resides on to support 2.0 could be very common. |
Guess it is not just about the module name, but also the cli cmd name. OTOH, I'd dislike putting the version number into the cli cmd name. For testing, one could also use the fat binary and rename that to borg2.
It is, but the command name is something that can just be changed without requiring any modification of the command itself to keep it working, and on the other hand is something distributions have support for. Aware wrappers that consequently have an idea of the configured repo(s) being version 1 or 2 would know to invoke the corresponding versioned command name in all cases.
Testing as in "is this for me" or "does this work at all", yes. But not for testing as in "let me run this in parallel to 1.1 for a couple months and see whether any issues arise before ditching 1.1", ie. a point where 1.2 can be regarded to be at currently. |
The So the kernel can (probably; I saw patches for utimes64, not sure if those have been applied, it hasn't been mentioned in that post above) do it. I'm not sure what the current status is on the glibc side of things (the page looks a bit unclear on progress), but it may be worth pushing python on 32bit architectures to use it if glibc is ready. All I'm saying: don't drop support for 32-bit architectures, but go for dropping support for 32-bit timestamps, which don't have to be the same thing anymore in this day and age.
Note: I updated the top post with the current progress and also released 2.0.0a3 - if no one is holding me back with negative testing results, I'll soon merge the
@RubenKelevra well, I see what you mean, but that is not how "borg create" works. But maybe check the issue tracker if we have a ticket about this and if not, create a new one, so we can collect ideas there. |
Interesting, can you elaborate or point me to the part which is different than I think, so I can take a look? 🤔
Will do |
There shouldn't be any need to drop 32-bit support to be y2038-clean. 32-bit platforms can still have a 64-bit time_t, and most of them do, and have done for 5-10 years at least. |
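For reference, the y2038 limit is about the width of the timestamp, not the width of pointers. A small sketch using only the Python standard library (`fits_signed` is a hypothetical helper name) shows where a signed 32-bit timestamp ends and that a 64-bit field holds the next second without trouble:

```python
import struct
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1

# Last moment representable as a signed 32-bit count of seconds since the epoch.
last_32bit_moment = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)


def fits_signed(value: int, fmt: str) -> bool:
    """Return True if `value` can be packed with the given struct format."""
    try:
        struct.pack(fmt, value)
        return True
    except struct.error:
        return False


one_second_later = INT32_MAX + 1
fits_32 = fits_signed(one_second_later, "<i")  # signed 32-bit field
fits_64 = fits_signed(one_second_later, "<q")  # signed 64-bit field

print(last_32bit_moment.isoformat(), fits_32, fits_64)
```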
Have there been any major complaints / pain points with the JSON API? The only things I've found are (a) (largely hypothetical) encoding woes when involving file names (obviously file names don't have to be representable in unicode regardless of locale) and on weird systems (#2273) and |
@enkore the json encoding issues for e.g.
Especially on samba servers this is not at all hypothetical, but a very practical issue, because the servers existing since some decades already collected all sorts of historical |
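To make the file name encoding issue concrete: a name stored in a legacy encoding is not valid UTF-8, so it cannot be turned into a normal unicode string. The sketch below shows Python's general `surrogateescape` mechanism for carrying such bytes through JSON; it is an illustration with a made-up file name, not a statement about how borg's JSON output encodes paths.

```python
import json

# A Latin-1 encoded name as it might exist on an old file server (hypothetical).
raw_name = b"caf\xe9.txt"

# surrogateescape maps the undecodable byte 0xE9 to the lone surrogate U+DCE9.
name = raw_name.decode("utf-8", errors="surrogateescape")

# With the default ensure_ascii=True, json.dumps emits the surrogate as an escape.
encoded_json = json.dumps({"path": name})

# A consumer that knows the convention can recover the original bytes.
round_trip = json.loads(encoded_json)["path"].encode("utf-8", errors="surrogateescape")

# A consumer that encodes strictly fails on the lone surrogate.
strict_fails = False
try:
    name.encode("utf-8")
except UnicodeEncodeError:
    strict_fails = True
```

The round trip only works if both sides agree on the escaping convention, which is why such names are a practical pain point for JSON consumers written in other languages.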
@ThomasWaldmann my vote is on delaying the release and only doing one breaking change. Otherwise, your users will have to migrate v1-v2 with breaking changes, and then within a "short" time (6-12 months?) have to migrate v2-v3. Some users will be on v1, so you'd also have to build out v1-v3 upgrade paths and checks. Borg v1 works great, we've waited this long, we can wait a little longer to just have to pay the pain of migration once. Everyone, feel free to thumbs up / thumbs down this comment to express your opinion. |
I think it is all a matter of timing. How much is "a lot"? If it is 6 months, merge them. If it is a fundamental rewrite and will take 2 or more years to be stable, do two separate releases.
my two cents: if it's ready - it's ready. some changes will happen eventually, there is no problem doing small updates in scripts. New version has a lot of benefits, why wait to use it?
@tmm360 What I have in mind is a big change (not even sure how big), my and other contributors' free time is a bit hard to predict, so it makes the overall time needed somewhat unpredictable. Maybe forking off some new borg-ng branch from master and just starting that development there, while fixing bugs and missing stuff in master branch would be an option. Depending on more insights developing over time, a release could be made from either branch.
@ThomasWaldmann at this point I have no doubt it should be another release, taking the time to develop it without needing to hurry. It looks like something huge, and if borg2 is ready, my view is that it should be released as is.
Speaking of pros and cons, I'd also add that migration might have the desirable side effect of backup verification. However, I understand that the process might require twice as much storage space under certain conditions.
The commit history tells it is actively worked on: https://github.com/borgbackup/borg/commits/master/ |
#8332 has some more radical changes needing review. @elho maybe have a look, close to your 3rd item in #6602 (comment) . The borg 2.0 code will still need to deal with reading borg 1.x archives for borg transfer to migrate them into borg 2 repo, thus we have to be a bit careful not to tear down some stuff we still need. |
Can it be released in two steps?
#8332 experiment was successful (AFAIK) and was merged into master, I will update top post here accordingly. everybody can help beta testing this huge change in 2.0.0b10+. |
Via the borgstore rclone backend, borg just got cloud storage support (for 100+ cloud storage providers). |
update: as there was no negative feedback from alpha testing, borg2 branch was merged into master, thus that big change in form of a major / breaking borg 2.0 release is coming.
read below about what's planned and what's already done.
what could be done if we decide to make a breaking release (2.0) that:
putting all the breaking stuff into 1 release is good for users (1 time effort), but will take quite some time to test and release.
After borg 2.0, we'll make a N+1 release (2.1? 3.0?) that drops all the legacy stuff from the codebase, including the converter for borg < 2.0 repos.
borg 2.0 general comments
DONE: offer a borg transfer command, #6663, that transforms old stuff only to stuff that will still be supported by borg N+1.
N+1 general comments
much of the stuff described here has own tickets, see "breaking" label / add issue links here.
2.0 crypto
N+1 crypto
borg transfer (not needing to re-hash)
2.0 repo
borgstore, use borgstore and other big changes #8332
borg transfer, and/or
borg transfer (in that case the old repo would be served by an old borg version)
N+1 repo
borg transfer, not needing to re-chunk content!
2.0 indexes / cache
legacy_cleanup function
N+1 indexes / cache
2.0 msgpack
N+1 msgpack
2.0 archive / item
borg transfer) (note: Archive.save() adds that)
size
N+1 archive / item
size
2.0 or N+1 checksums
2.0 compression
N+1 compression
2.0 upgrade
N+1 archiver
2.0 remote
borg serve. implement unix domain socket support #7615
2.0 cli
2.0 locking
y2038 and requiring 64bit
stuff that is out of scope
as you see above, there is already a huge scope of what should be done.
to not grow the scope even further, some stuff shall not be done (now):