compact: free space per pack with --threshold #9801
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9801      +/-   ##
==========================================
- Coverage   84.78%   84.77%   -0.01%
==========================================
  Files          92       92
  Lines       15251    15292      +41
  Branches     2286     2298      +12
==========================================
+ Hits        12930    12964      +34
- Misses       1621     1629       +8
+ Partials      700      699       -1
Also please rebase on current master (I have merged #9800 now).
Have a look at that, maybe you can implement
I did a test run with slightly modified code (N=2) and got this. So I guess the 100% display issue is not fixed yet? Also: even without -v, it is very verbose. I suggest you try this manually to get a feel for it and find such issues yourself.
Group objects by their pack and act per pack: drop fully-unused packs, and rewrite mixed packs whose unused bytes reach --threshold (default 40%) by copying the survivors forward with compact_pack. Two index scans keep the memory use bounded. refs borgbackup#8572 borgbackup#8514
Rewrite a pack only when its unused bytes reach --threshold percent; --dry-run reports the space compact would free without changing the repository (borgbackup#9379).
464c91a to b1862f3
ThomasWaldmann left a comment
no manifest talk please, we do not have a manifest (in the original meaning) any more.
    self.manifest.archives.nuke_by_id(id)
except self.repository.ObjectNotFound:
    logger.warning(f"Soft-deleted archive {name} {hex_id} not found.")
if not self.dry_run:  # nuking soft-deleted archives mutates the manifest; skip on a dry run
I already mentioned it before: the manifest does not have the list of archives any more, that was a borg 1.x thing. Stop talking about it in documentation as if that was still the case.
About my feedback that it is verbose even without -v: my fault, I guess I forgot that I had tweaked borg's default via a jsonargparse settings file. So, ignore that.
I have seen that Is there a difference now with and without it? IIRC, I introduced that back then because it was much slower with stats than without.
Description
Moves borg compact from deleting single objects to compacting whole packs, so it keeps working once a pack holds more than one object (N>1).
For each pack: a fully unused pack is dropped, and a mixed pack is rewritten once its unused bytes reach --threshold percent (default 40), copying the survivors into a new pack via Repository.compact_pack and dropping the old one. Below the threshold the pack is left alone, so a large pack is not rewritten just to reclaim a few bytes.
The chunk index is scanned twice to keep memory bounded: first only per-pack byte counts to decide each pack's fate, then the object ids of just the packs that change. The #9748 crash-safety order is preserved: cached chunk indexes are invalidated before the first store change.
At N=1 every pack holds one object, so mixed packs never occur and the behavior is unchanged. The rewrite path is covered by a test that forces max_count > 1.
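For readers who want to see the per-pack decision in isolation, here is a minimal sketch of the two-scan idea. It is not the code in this PR: the Entry record, the plan_compaction function and the in-memory list standing in for the chunk index are all hypothetical names made up for illustration, and the real index layout may differ.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Sequence, Set, Tuple


class Entry(NamedTuple):
    """Hypothetical index entry (an assumption, not borg's real structure)."""
    obj_id: str
    pack_id: str
    size: int
    used: bool  # True if the object is still referenced


def plan_compaction(
    index: Sequence[Entry], threshold: int = 40
) -> Tuple[Set[str], Dict[str, List[str]], int]:
    """Return (packs to drop, packs to rewrite -> surviving ids, bytes freed)."""
    # Scan 1: keep only two byte counters per pack, no object ids.
    totals = defaultdict(lambda: [0, 0])  # pack_id -> [used_bytes, unused_bytes]
    for e in index:
        totals[e.pack_id][0 if e.used else 1] += e.size

    drop: Set[str] = set()
    rewrite: Dict[str, List[str]] = {}
    freed = 0
    for pack_id, (used, unused) in totals.items():
        if unused == 0:
            continue  # fully used pack: nothing to reclaim
        if used == 0:
            drop.add(pack_id)  # fully unused pack: delete it whole
            freed += unused
        elif unused * 100 >= threshold * (used + unused):
            rewrite[pack_id] = []  # mixed pack at or above the threshold
            freed += unused
        # mixed pack below the threshold: left alone

    # Scan 2: collect survivor ids only for the packs that will be rewritten.
    for e in index:
        if e.used and e.pack_id in rewrite:
            rewrite[e.pack_id].append(e.obj_id)
    return drop, rewrite, freed


if __name__ == "__main__":
    demo = [
        Entry("a", "p1", 100, True),
        Entry("b", "p1", 100, False),  # p1: 50% unused
        Entry("c", "p2", 50, False),   # p2: fully unused
        Entry("d", "p3", 90, True),
        Entry("e", "p3", 10, False),   # p3: 10% unused
    ]
    print(plan_compaction(demo))
```

With one object per pack, every pack is either fully used or fully unused, so the rewrite dictionary stays empty, which matches the N=1 statement above. The freed byte total is the kind of figure a dry run could report without touching the store.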
This recycles the approach from #9777, which can be closed.
refs #8572 #8514