Commit Graph

2687 Commits

Author SHA1 Message Date
Michael Eischer 78b3411076 check: consider split pack index entries as repository damage 2026-05-31 15:58:33 +02:00
Michael Eischer 5b39ad861e repository: repair index: correctly handle split index entries
In restic <0.10.0, it was possible that the blobs of a pack file were
split across multiple indexes. `MasterIndex.Rewrite` however assumed
that each index always contains the full description of a pack file.
Therefore, further index entries for a pack were filtered out as
duplicates. Now, the code also checks the blobs contained in the index
entry while filtering out duplicates.
2026-05-31 15:58:29 +02:00
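To illustrate the fix described in 5b39ad861e, here is a minimal sketch of duplicate filtering that looks at blobs and not only at the pack ID. The types and names (`packEntry`, `dedupEntries`) are hypothetical simplifications for this example; this is not restic's actual `MasterIndex.Rewrite` code.

```go
package main

import "fmt"

// packEntry is a hypothetical, simplified index entry: one pack ID plus
// the IDs of the blobs that one particular index lists for that pack.
type packEntry struct {
	PackID string
	Blobs  []string
}

// dedupEntries keeps the not-yet-seen blobs of each entry and drops an
// entry once it contributes nothing new. Filtering on the pack ID alone
// would discard the rest of a pack whose blobs are split across indexes.
func dedupEntries(entries []packEntry) []packEntry {
	seen := make(map[string]map[string]struct{}) // pack ID -> blob IDs
	var out []packEntry
	for _, e := range entries {
		blobs := seen[e.PackID]
		if blobs == nil {
			blobs = make(map[string]struct{})
			seen[e.PackID] = blobs
		}
		var fresh []string
		for _, b := range e.Blobs {
			if _, ok := blobs[b]; !ok {
				blobs[b] = struct{}{}
				fresh = append(fresh, b)
			}
		}
		if len(fresh) > 0 {
			out = append(out, packEntry{PackID: e.PackID, Blobs: fresh})
		}
	}
	return out
}

func main() {
	entries := []packEntry{
		{PackID: "p1", Blobs: []string{"a", "b"}}, // entry from a first index
		{PackID: "p1", Blobs: []string{"c"}},      // split entry from a second index
		{PackID: "p1", Blobs: []string{"a", "b"}}, // true duplicate
	}
	for _, e := range dedupEntries(entries) {
		fmt.Println(e.PackID, e.Blobs)
	}
}
```

In this sketch the split entry for blob `c` survives, while the repeated `a`/`b` entry is dropped.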
Michael Eischer f86307d223 Merge pull request #21827 from MichaelEischer/fix-pack-repair
repair packs: correctly handle packs with missing/incomplete index entry
2026-05-31 15:57:50 +02:00
Michael Eischer 77a6bf3bb7 Merge pull request #21797 from MichaelEischer/always-include-explicit-targets
backup: prevent exclude of backup targets
2026-05-31 15:42:26 +02:00
Michael Eischer c95ef18afb repository: fix error handling in repair pack if blob upload fails 2026-05-31 15:40:15 +02:00
Michael Eischer 640b2489f6 repository: repair pack: test handling of not indexed packs 2026-05-31 15:40:15 +02:00
Michael Eischer 6a3f447327 repository: repair pack: correctly handle incomplete index 2026-05-31 15:40:15 +02:00
Michael Eischer ce24640d75 backup: prevent hang using --stdin-from-command if upload fails (#21829) 2026-05-31 15:27:05 +02:00
Michael Eischer 32bcd92f60 archiver: test that explicit backup paths ignore excludes 2026-05-30 22:30:30 +02:00
Michael Eischer 02cf8e5f23 backup: prevent exclude of backup targets
Track backup targets explicitly specified by the user and prevent
excluding them. This for example ensures that `restic backup
--exclude-if-present .git /home/user/data` backs up the `data` folder
even if there is a `.git` folder in `/home/user`.

Note that this does not suffice for commands like `restic backup --exclude data /home/user/data`
as the exclude pattern will still match every single file within `data`.
2026-05-30 22:30:30 +02:00
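The behavior described in 02cf8e5f23 can be sketched roughly as a wrapper around the exclude check. Everything named here (`rejectFunc`, `protectTargets`, `rejectComponent`) is a hypothetical stand-in chosen for the example, not restic's archiver API.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// rejectFunc reports whether a path should be excluded from the backup.
type rejectFunc func(path string) bool

// protectTargets wraps reject so that paths explicitly named as backup
// targets are never excluded themselves. Paths below a target are still
// passed to the wrapped function.
func protectTargets(targets []string, reject rejectFunc) rejectFunc {
	explicit := make(map[string]struct{}, len(targets))
	for _, t := range targets {
		explicit[filepath.Clean(t)] = struct{}{}
	}
	return func(path string) bool {
		if _, ok := explicit[filepath.Clean(path)]; ok {
			return false
		}
		return reject(path)
	}
}

// rejectComponent mimics an exclude pattern such as `data` by rejecting
// every path that has a component with the given name (an assumption
// made for this sketch).
func rejectComponent(name string) rejectFunc {
	return func(path string) bool {
		for _, part := range strings.Split(filepath.ToSlash(path), "/") {
			if part == name {
				return true
			}
		}
		return false
	}
}

func main() {
	reject := protectTargets([]string{"/home/user/data"}, rejectComponent("data"))
	fmt.Println(reject("/home/user/data"))          // false: the target itself is kept
	fmt.Println(reject("/home/user/data/file.txt")) // true: files inside still match
}
```

The second call mirrors the caveat in the commit message: protecting the target does not help when the pattern also matches everything inside it.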
Yaroslav Halchenko 451cc6c048 Add codespell support with configuration and fixes (#21807)
Co-authored-by: Claude Code 2.1.142 / Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 20:15:13 +00:00
Michael Eischer c221cd06ad repository: forget id of unreadable packs on index rebuild (#21826) 2026-05-30 22:09:33 +02:00
Michael Eischer e7d3a4ae51 repository: repair pack: move pack reupload to helper 2026-05-30 10:15:16 +02:00
Michael Eischer 2a95591b48 index: replace custom EachByPackResult datatype 2026-05-30 10:11:18 +02:00
Michael Eischer c669cc7a7d introduce restic.Blobs type with sort method 2026-05-30 10:10:39 +02:00
Michael Eischer f000da3b35 Return helpful error if subfolder syntax fails on Windows (#21813) 2026-05-20 22:55:01 +02:00
Michael Eischer ccfb31b5fa filter: correctly stop pattern validation on first invalid part (#21812) 2026-05-20 22:42:26 +02:00
Michael Eischer a639b8d711 Merge pull request #21811 from MichaelEischer/misc-fixes
Address various code smells, outdated comments and nits
2026-05-20 22:38:36 +02:00
Winfried Plappert 990329013e prune more aggressively (#21803)
Co-authored-by: Michael Eischer <michael.eischer@fau.de>
2026-05-16 15:49:08 +00:00
Michael Eischer 10645ccd2a fix comment and variable name typos 2026-05-16 17:05:33 +02:00
Michael Eischer fa1a318780 fs: drop outdated comment regarding UNC paths on Windows 2026-05-16 17:05:33 +02:00
Michael Eischer 8b6eff5a47 filter: fix comment for validatePatterns 2026-05-16 17:05:33 +02:00
Michael Eischer 3cc592463f Don't lower case case-insensitive patterns in place 2026-05-16 15:35:42 +02:00
Michael Eischer c04a1d857d backend/retry: debug log correct error on failed file removal 2026-05-16 15:35:42 +02:00
Michael Eischer bd945df2ea archiver: note that fileSaver.Save expects non-blocking callbacks 2026-05-16 15:35:42 +02:00
Michael Eischer 5105015f5d check: filter packs while holding blobRefs lock
Not an actual problem with the current usage. But for consistency always
take the lock when interacting with blobRefs.
2026-05-16 15:35:42 +02:00
Michael Eischer d4397926cc restorer: drop redundant nil check 2026-05-16 15:35:42 +02:00
Michael Eischer 5caa33e7b9 repository/pack: prevent packer usage after error
In-depth hardening to prevent packer reuse after an error.
2026-05-16 15:35:42 +02:00
Michael Eischer ef750c4c5d archiver: reuse buffer if reading from file failed 2026-05-16 15:35:42 +02:00
Michael Eischer 3148494a92 archiver: check chunker error before updating the node
This is actually just a cosmetic issue as chunk.Length is 0 if the
chunker returned an error.
2026-05-16 12:03:43 +02:00
Michael Eischer 265d070255 ui: Fix data race and minor API cleanup (#21801) 2026-05-15 21:23:24 +02:00
Michael Eischer bf14a94600 Merge pull request #21784 from jtru/fuse-mount-hardlink-count
mount: Ensure a hard link count > 0 for all files
2026-05-14 11:25:20 +02:00
Michael Eischer 4547fd7b18 fuse: tweak comment 2026-05-14 11:18:16 +02:00
Michael Eischer d494e37dc1 ui/termstatus: reorder findUnchangedLines function 2026-05-14 10:42:13 +02:00
Michael Eischer 59697213f9 ui/termstatus: cleanup test code 2026-05-14 10:42:13 +02:00
Michael Eischer df2d65bb88 ui/termstatus: test skipping of unchanged lines 2026-05-14 10:42:13 +02:00
Michael Eischer bd8aad3b9b ui/termstatus: deduplicate error handling 2026-05-14 10:42:13 +02:00
Michael Eischer cf34130a05 ui/termstatus: simplify status tracking 2026-05-14 10:42:13 +02:00
Donggyu Kim e33bcede2f terminal: Do not write unchanged status lines
Check whether each status line has changed, and write
a line to the terminal only if it has.
2026-05-14 10:42:13 +02:00
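A minimal sketch of the idea in e33bcede2f, assuming the status is held as a slice of lines. The helper name `changedLines` is made up for this example and is not the `findUnchangedLines` function mentioned in the later ui/termstatus commits.

```go
package main

import "fmt"

// changedLines returns the indexes of the lines in next that differ from
// the line at the same position in prev. Lines beyond the end of prev
// count as changed. Lines that exist only in prev (and would need to be
// cleared on a real terminal) are not handled in this sketch.
func changedLines(prev, next []string) []int {
	var changed []int
	for i, line := range next {
		if i >= len(prev) || prev[i] != line {
			changed = append(changed, i)
		}
	}
	return changed
}

func main() {
	prev := []string{"[0:01] 10%", "/home/user/a.txt"}
	next := []string{"[0:02] 11%", "/home/user/a.txt"}
	// Only the first line differs, so only it would be rewritten.
	fmt.Println(changedLines(prev, next))
}
```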
Michael Eischer 4f781b69f9 add hardlink test 2026-05-13 22:29:34 +02:00
Michael Eischer f3854cf299 Merge pull request #21796 from restic/go-1.25
Bump minimum go version to 1.25 & update dependencies
2026-05-12 18:56:09 +02:00
Michael Eischer a213daef29 windows: improve randomness of temp file name
Creating two temporary files at nearly the same time could result in a
filename collision.
2026-05-11 22:50:48 +02:00
Michael Eischer e3f065ad54 windows: ignore temporary access denied error on temp file creation 2026-05-11 22:42:51 +02:00
Michael Eischer 4c94678d7d fix linter and compilation issues 2026-05-10 17:53:29 +02:00
Michael Eischer a241652787 windows: fix hang while reading from directory 2026-05-10 17:53:29 +02:00
Michael Eischer 5c935e71fa index: also preallocate hashed array tree 2026-05-10 00:35:17 +02:00
Michael Eischer 934c615e51 index: support index preallocation 2026-05-10 00:35:17 +02:00
Michael Eischer ba638b6602 indexmap: use bloom filter to drastically speed up check for unknown blobs
Only in use on 64-bit systems. Use the upper 28 bits of the id of an
index entry as bloom filter. This allows skipping the index entry
traversal most of the time if an id is not stored in the hashmap.

The bloom filter embedded in the index entry id is checked each time
before following a reference to an index entry. This further reduces
the risk of false positives. The bloom filter itself is basically for
free on modern CPUs.

The main performance costs of checking for unknown blobs in the index are
the essentially random RAM accesses for the initial bucket lookup as
well as following the next pointer in the index entries. With the bloom
filter most of the time only the initial bucket lookup is necessary.
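One way to picture the scheme is the following sketch. It is an interpretation under stated assumptions, not the indexmap code: it assumes a 64-bit reference whose lower 36 bits hold the entry index and whose upper 28 bits hold a k=1 bloom filter covering the chain behind that reference. The names `bloomBit`, `makeRef` and `mayContain` and the modulo-based bit selection are invented for illustration.

```go
package main

import "fmt"

const (
	bloomBits = 28             // m=28: filter width in bits
	indexBits = 64 - bloomBits // 36 bits remain for the entry index
	indexMask = uint64(1)<<indexBits - 1
)

// bloomBit maps a hash to one of the 28 filter bits (k=1: a single bit
// per key), positioned in the upper part of the reference.
func bloomBit(hash uint64) uint64 {
	return uint64(1) << (indexBits + hash%bloomBits)
}

// makeRef packs an entry index and the filter of the chain it heads.
func makeRef(index, filter uint64) uint64 {
	return index&indexMask | filter&^indexMask
}

// mayContain reports whether the chain behind ref can contain a key with
// the given hash. false is definitive, true may be a false positive.
func mayContain(ref, hash uint64) bool {
	return ref&bloomBit(hash) != 0
}

func main() {
	var head uint64 // empty chain: no filter bits set

	// Prepend two entries. Each new head reference inherits the filter
	// bits of the chain it points to and adds the bit for its own key.
	head = makeRef(1, head|bloomBit(3))
	head = makeRef(2, head|bloomBit(10))

	fmt.Println(mayContain(head, 3))  // true
	fmt.Println(mayContain(head, 10)) // true
	fmt.Println(mayContain(head, 4))  // false: chain traversal can be skipped
	fmt.Println(mayContain(head, 31)) // true: false positive, 31 and 3 share a bit
	fmt.Println(head & indexMask)     // 2
}
```

Because the filter travels inside the reference that has to be loaded anyway, testing it needs no additional memory access, which matches the "basically for free" remark above.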

This speeds up checking for unknown blobs by a factor of 5 (!), while
having no effect on the lookup of known blobs:

$ benchstat no-bloom with-bloom
name                old time/op  new time/op  delta
IndexHasUnknown-16  49.0ms ± 2%   9.9ms ± 7%  -79.70%  (p=0.000 n=10+10)
IndexHasKnown-16    48.0ms ± 3%  47.9ms ± 3%     ~     (p=0.968 n=10+9)

The bloom filter parameters m=28 k=1 were derived empirically, while
also leaving sufficient room for very large repositories. Before this
commit, the final merge index step took roughly 1 second per million
index entries. With the chosen bloom filter parameters, it would
currently take 19 hours just to merge an index large enough to exhaust
the 36 bits left for the index entry id. It is safe to assume that such
large repositories don't exist.
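The 19-hour figure can be reproduced with a short calculation, assuming that the 64 - 28 = 36 bits not used by the bloom filter address the index entries. That reading is an assumption of this sketch; the helper names are illustrative only.

```go
package main

import "fmt"

// maxEntries returns how many entries fit into the id bits that remain
// after reserving bloomBits bits of a 64-bit id for the filter.
func maxEntries(bloomBits uint) uint64 {
	return uint64(1) << (64 - bloomBits)
}

// mergeHours estimates the merge time in hours for the given number of
// entries at the quoted cost in seconds per million entries.
func mergeHours(entries uint64, secondsPerMillion float64) float64 {
	return float64(entries) / 1e6 * secondsPerMillion / 3600
}

func main() {
	n := maxEntries(28)
	fmt.Printf("%d entries, %.1f hours\n", n, mergeHours(n, 1.0))
	// prints "68719476736 entries, 19.1 hours"
}
```

By the same arithmetic, m=32 would leave 2^32 (about 4.3 billion) addressable entries.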

Comparison with other parameter sets:

$ m=28 k=1 versus m=32 k=1
name                old time/op  new time/op  delta
IndexHasUnknown-16  49.0ms ± 2%   9.7ms ±16%  -80.17%  (p=0.000 n=10+10)
IndexHasKnown-16    48.0ms ± 3%  48.4ms ± 3%     ~     (p=0.436 n=10+10)

$ m=28 k=1 versus m=24 k=1
name                old time/op  new time/op  delta
IndexHasUnknown-16  49.0ms ± 2%  10.8ms ±13%  -77.90%  (p=0.000 n=10+10)
IndexHasKnown-16    48.0ms ± 3%  47.9ms ± 3%     ~     (p=0.684 n=10+10)

$ m=28 k=1 versus m=28 k=2
name                old time/op  new time/op  delta
IndexHasUnknown-16  49.0ms ± 2%  24.9ms ± 5%  -49.27%  (p=0.000 n=10+10)
IndexHasKnown-16    48.0ms ± 3%  48.0ms ± 4%     ~     (p=1.000 n=10+10)

`k=2` outright wrecks the performance, most likely because it
performs worse on longer index entry chains, which also happen to be
the expensive ones to process.

`m=32` yields diminishing returns, while getting within an order of
magnitude of the largest known restic repositories.

Design alternatives:

In principle it would be possible to add a single large bloom filter
instead of embedding them in the index entry ids. However, this bloom
filter would necessarily incur additional random memory accesses and
thus slow things down overall.
2026-05-10 00:35:17 +02:00
Michael Eischer 320f709fbc index: modernize masterindex tests
`b.Loop()` drastically shortens benchmark execution times for tests with
an expensive initialization phase as it only has to happen once now.
2026-05-10 00:35:17 +02:00
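For readers unfamiliar with `b.Loop()` (available since Go 1.24), here is a small sketch of the pattern referred to in 320f709fbc. The `buildIndex` helper and the benchmark itself are invented for this example and are unrelated to the actual masterindex tests. The sketch drives the benchmark via `testing.Benchmark` so it can run as a plain program.

```go
package main

import (
	"fmt"
	"testing"
)

// buildIndex stands in for an expensive setup step (hypothetical helper).
func buildIndex(n int) map[int]struct{} {
	m := make(map[int]struct{}, n)
	for i := 0; i < n; i++ {
		m[i] = struct{}{}
	}
	return m
}

// setups counts how often the setup code in the benchmark runs.
var setups int

// BenchmarkLookup performs its setup before the loop. With b.Loop() the
// iteration count is scaled inside the loop, so the benchmark function
// does not need to be re-invoked with growing b.N values.
func BenchmarkLookup(b *testing.B) {
	setups++
	idx := buildIndex(100_000)
	for b.Loop() {
		_, _ = idx[42]
	}
}

func main() {
	res := testing.Benchmark(BenchmarkLookup)
	fmt.Println("setup runs:", setups)
	fmt.Println("iterations:", res.N)
}
```

With the older `for i := 0; i < b.N; i++` style, the benchmark function is called repeatedly while `b.N` is ramped up, so the setup cost is paid on every call.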
Michael Eischer e33ed5d0c1 index: make tests more representative 2026-05-10 00:35:17 +02:00