restic

mirror of https://github.com/restic/restic.git synced 2026-05-11 21:15:23 +00:00

Author	SHA1	Message	Date
Michael Eischer	5c935e71fa	index: also preallocate hashed array tree	2026-05-10 00:35:17 +02:00
Michael Eischer	934c615e51	index: support index preallocation	2026-05-10 00:35:17 +02:00
Michael Eischer	ba638b6602	indexmap: use bloom filter to drastically speed up check for unknown blobs Only in use on 64-bit systems. Use the upper 28 bits of the id of an index entry as bloom filter. This allows skipping the index entry traversal most of the time if an id is not stored in the hashmap. The bloom filter embedded in the index entry id is checked each time before following a reference to an index entry. This further reduces the risk of false positives. The bloom filter itself is basically for free on modern CPUs. The main performance cost of checking for unknown blobs in the index are the essentially random RAM accesses for the initial bucket lookup as well as following the next pointer in the index entries. With the bloom filter most of the time only the initial bucket lookup is necessary. This speeds up checking for unknown blobs by a factor of 5 (!), while having no effect on the lookup of known blobs: $ benchstat no-bloom with-bloom name old time/op new time/op delta IndexHasUnknown-16 49.0ms ± 2% 9.9ms ± 7% -79.70% (p=0.000 n=10+10) IndexHasKnown-16 48.0ms ± 3% 47.9ms ± 3% ~ (p=0.968 n=10+9) The bloom filter parameters m=28 k=1 were derived empirically, while also leaving sufficient room for very large repositories. Before this commit, the final merge index step took roughly 1 second per million index entries. With the chosen bloom filter parameters, it would currently take 19 hours to just merge such an index. It is safe to assume that such large repositories don't exist. 
Comparison with other parameter sets: $ m=28 k=1 versus m=32 k=1 name old time/op new time/op delta IndexHasUnknown-16 49.0ms ± 2% 9.7ms ±16% -80.17% (p=0.000 n=10+10) IndexHasKnown-16 48.0ms ± 3% 48.4ms ± 3% ~ (p=0.436 n=10+10) $ m=28 k=1 versus m=24 k=1 name old time/op new time/op delta IndexHasUnknown-16 49.0ms ± 2% 10.8ms ±13% -77.90% (p=0.000 n=10+10) IndexHasKnown-16 48.0ms ± 3% 47.9ms ± 3% ~ (p=0.684 n=10+10) $ m=28 k=1 versus m=28 k=2 name old time/op new time/op delta IndexHasUnknown-16 49.0ms ± 2% 24.9ms ± 5% -49.27% (p=0.000 n=10+10) IndexHasKnown-16 48.0ms ± 3% 48.0ms ± 4% ~ (p=1.000 n=10+10) `k=2` outright wrecks the performance. This is most likely the case as it performs worse on longer index entry chains, which also happen to be the expensive ones to process. `m=32` yields diminishing returns, while getting within an order of magnitude of the largest known restic repositories. Design alternatives: In principle it would be possible to add a single large bloom filter instead of embedding them in the index entry ids. However, this bloom filter would necessarily incur additional random memory accesses and thus slow things down overall.	2026-05-10 00:35:17 +02:00
Michael Eischer	320f709fbc	index: modernize masterindex tests `b.Loop()` drastically shortens benchmark execution times for tests with an expensive initialization phase as it only has to happen once now.	2026-05-10 00:35:17 +02:00
Michael Eischer	e33ed5d0c1	index: make tests more representative	2026-05-10 00:35:17 +02:00
Michael Eischer	39084a912e	Merge pull request #5700 from MichaelEischer/err-invalid-env	2026-05-10 00:18:40 +02:00
Michael Eischer	4c0dc9e202	index: support incremental index loading Do not require a full index reload if only a few additional index files have been added. This can drastically speed up loading the index in the mount command.	2026-05-07 22:52:03 +02:00
Michael Eischer	7077500a3b	Have `backup -vv` mention compressed size of added files (#5669) ui: mention compressed size of added files in `backup -vv` This is already shown for modified files, but the added files message wasn't updated when compression was implemented in restic. Co-authored-by: Ilya Grigoriev <ilyagr@users.noreply.github.com>	2026-02-18 21:24:29 +01:00
Michael Eischer	d1937a530b	clarify pack ID in decryption error (#5710) pack ID is included in full. In addition, the error message now says that it is a pack file.	2026-02-18 20:43:10 +01:00
Michael Eischer	5be6d9c73f	fail if RESTIC_READ_CONCURRENCY or RESTIC_COMPRESSION are invalid	2026-02-01 15:57:07 +01:00
gunar	7101f11133	Fail fast for invalid RESTIC_PACK_SIZE env values (#5592) Co-authored-by: Michael Eischer <michael.eischer@fau.de>	2026-02-01 15:45:31 +01:00
Michael Eischer	67c13c643d	Merge pull request #5691 from MichaelEischer/fix-rewriter-error	2026-02-01 12:09:52 +01:00
Michael Eischer	ef1d525f22	rewriter: return correct error if tree iteration fails	2026-01-31 22:07:07 +01:00
Michael Eischer	74d60ad223	rewriter: test KeepEmptyDirectory option	2026-01-31 22:01:23 +01:00
Michael Eischer	0d71f70a22	minor cleanups and typos	2026-01-31 22:01:23 +01:00
Winfried Plappert	ee154ce0ab	restic rewrite --include added function Count() to the *TreeWriter methods	2026-01-31 19:58:29 +00:00
Winfried Plappert	5148608c39	restic rewrite include - based on restic 0.18.1 cmd/restic/cmd_rewrite.go: introduction of include filters for this command: - add include filters, add error checking code - add new parameter 'keepEmptyDirectoryFunc' to 'walker.NewSnapshotSizeRewriter()', so empty directories have to be kept to keep the directory structure intact - add parameter 'keepEmptySnapshot' to 'filterAndReplaceSnapshot()' to keep snapshots intact when nothing is to be included - introduce helper functions 'gatherIncludeFilters()' and 'gatherExcludeFilters()' to keep code flow clean cmd/restic/cmd_rewrite_integration_test.go: add several new tests around the 'include' functionality internal/filter/include.go: this is where the include filter is defined internal/walker/rewriter.go: - struct RewriteOpts gains field 'KeepEmptyDirectory', which is a 'NodeKeepEmptyDirectoryFunc()' which defaults to nil, so that all subdirectories are kept - function 'NewSnapshotSizeRewriter()' gains the parameter 'keepEmptyDirecoryFilter' which controls the management of empty subdirectories when include filters are active internal/data/tree.go: gains a function Count() for checking the number of node elements in a newly built tree internal/walker/rewriter_test.go: function 'NewSnapshotSizeRewriter()' gets an additional parameter nil to keep things happy cmd/restic/cmd_repair_snapshots.go: function 'filterAndReplaceSnapshot()' gets an additional parameter 'keepEmptySnapshot=nil' doc/045_working_with_repos.rst: gets to mention include filters changelog/unreleased/issue-4278: the usual announcement file git rebase master -i produced this restic rewrite include - keep linter happy cmd/restic/cmd_rewrite_integration_test.go: linter likes strings.Contains() better than my strings.Index() >= 0	2026-01-31 19:42:56 +00:00
Michael Eischer	ce7c144aac	data: add support for unknown keys to treeIterator While not planned, it's also not completely impossible that a tree node might get additional top-level fields. As the tree iterator is built with a strict expectation of the top-level fields, this would result in a parsing error. Future-proof the code by simply skipping unknown fields.	2026-01-31 20:03:38 +01:00
Michael Eischer	81948937ca	data: test DualTreeIterator	2026-01-31 20:03:38 +01:00
Michael Eischer	fa8889eec4	data: test LoadTree+SaveTree cycle	2026-01-31 20:03:38 +01:00
Michael Eischer	6de64911fb	data: test TreeFinder	2026-01-31 20:03:38 +01:00
Michael Eischer	17688c2313	data: move TestTreeMap to data package to allow reuse	2026-01-31 20:03:38 +01:00
Michael Eischer	e1a5550a27	test: use generics in Equal function signature This simplifies comparing a typed value against nil. Previously it was necessary to cast nil into the proper type.	2026-01-31 20:03:38 +01:00
Michael Eischer	24d56fe2a6	diff: switch to efficient DualTreeIterator The previous implementation stored the whole tree in a map and used it for checking overlap between trees. This is now replaced with the DualTreeIterator, which iterates over two trees in parallel and returns the merge stream in order. In case of overlap between both trees, it returns both nodes at the same time. Otherwise, only a single node is returned.	2026-01-31 20:03:38 +01:00
Michael Eischer	350f29d921	data: replace Tree with TreeNodeIterator The TreeNodeIterator decodes nodes while iterating over a tree blob. This should reduce peak memory usage as now only the serialized tree blob and a single node have to be alive at the same time. Using the iterator has implications for the error handling however. Now it is necessary that all loops that iterate through a tree check for errors before using the node returned by the iterator. The other change is that it is no longer possible to iterate over a tree multiple times. Instead it must be loaded a second time. This only affects the tree rewriting code.	2026-01-31 20:03:38 +01:00
Michael Eischer	1e183509d4	data: rework StreamTrees to use synchronous callbacks The tree.Nodes will be replaced by an iterator that loads and serializes tree nodes on demand. Thus, the processing moves from StreamTrees into the callback. Schedule them onto the workers used by StreamTrees for proper load distribution.	2026-01-31 20:03:38 +01:00
Michael Eischer	25a5aa3520	dump: fix missing error handling if tree cannot be read	2026-01-31 19:18:36 +01:00
Michael Eischer	278e457e1f	data: use data.TreeWriter to serialize&write data.Tree Always serialize trees via TreeJSONBuilder. Add a wrapper called TreeWriter which combines serialization and saving the tree blob in the repository. In the future, TreeJSONBuilder will have to upload tree chunks while the tree is still being serialized. This will require a wrapper like TreeWriter, so add it right now already. The archiver.treeSaver still directly uses the TreeJSONBuilder as it requires special handling.	2026-01-31 19:18:36 +01:00
Michael Eischer	f84d398989	repository: prevent test deadlock within WithBlobUploader Calling t.Fatal internally triggers runtime.Goexit. This kills the current goroutine while only running deferred code. Add an extra context that gets canceled if the goroutine exits while within the user provided callback.	2026-01-31 19:18:36 +01:00
Michael Eischer	d82ea53735	data: fix invalid trees used in test cases data.TestCreateSnapshot which is used in particular by TestFindUsedBlobs and TestFindUsedBlobs could generate trees with duplicate file names. This is invalid and going forward will result in an error.	2026-01-31 19:18:36 +01:00
Michael Eischer	880b08f9ec	Merge pull request #5627 from MichaelEischer/faster-files-writer restore: tune fileswriter	2026-01-26 21:45:49 +01:00
Ilya Grigoriev	79c37f3d1a	ui: mention compressed size of added files in `backup -vv` This is already shown for modified files, but the added files message wasn't updated when compression was implemented in restic.	2026-01-15 18:39:16 -08:00
Michael Eischer	ebc51e60c9	Merge pull request #5626 from MichaelEischer/lazy-status ui: only redraw status bar if it has not changed	2025-12-03 21:29:35 +01:00
Michael Eischer	1e6ed458ff	remove old // +build comments	2025-11-30 11:53:23 +01:00
Michael Eischer	760d0220f4	restorer: scale file cache with workers count	2025-11-30 11:01:01 +01:00
Michael Eischer	24fcfeafcb	restore: cache file descriptors This avoids opening and closing files after each single blob write	2025-11-30 10:56:15 +01:00
Michael Eischer	0ee9360f3e	restore: reduce contention while writing files	2025-11-29 23:09:04 +01:00
Michael Eischer	ae6d6bd9a6	ui: only redraw status bar if it has not changed	2025-11-29 22:09:41 +01:00
Aneesh N	b9afdf795e	Fix: Correctly restore ACL inheritance state (#5465) * Fix: Correctly restore ACL inheritance state When restoring a file or directory on Windows, the `IsInherited` property of its Access Control Entries (ACEs) was always being set to `False`, even if the ACEs were inherited in the original backup. This was caused by the restore process calling the `SetNamedSecurityInfo` API without providing context about the object's inheritance policy. By default, this API applies the provided Discretionary Access Control List (DACL) as an explicit set of permissions, thereby losing the original inheritance state. This commit fixes the issue by inspecting the `Control` flags of the saved Security Descriptor during restore. Based on whether the `SE_DACL_PROTECTED` flag is present, the code now adds the appropriate `PROTECTED_DACL_SECURITY_INFORMATION` or `UNPROTECTED_DACL_SECURITY_INFORMATION` flag to the `SetNamedSecurityInfo` API call. By providing this crucial inheritance context, the Windows API can now correctly reconstruct the ACL, ensuring the `IsInherited` status of each ACE is preserved as it was at the time of backup. * Fix: Correctly restore ACL inheritance flags This commit resolves an issue where the ACL inheritance state (`IsInherited` property) was not being correctly restored for files and directories on Windows. The root cause was that the `SECURITY_INFORMATION` flags used in the `SetNamedSecurityInfo` API call contained both the `PROTECTED_DACL_SECURITY_INFORMATION` and `UNPROTECTED_DACL_SECURITY_INFORMATION` flags simultaneously. When faced with this conflicting information, the Windows API defaulted to the more restrictive `PROTECTED` behavior, incorrectly disabling inheritance on restored items. The fix modifies the `setNamedSecurityInfoHigh` function to first clear all existing inheritance-related flags from the `securityInfo` bitmask. 
It then adds the single, correct flag (`PROTECTED` or `UNPROTECTED`) based on the `SE_DACL_PROTECTED` control bit from the original, saved Security Descriptor. This ensures that the API receives unambiguous instructions, allowing it to correctly preserve the inheritance state as it was at the time of backup. The accompanying test case for ACL inheritance now passes with this change. * Fix inheritance flag handling in low-privilege security descriptor restore When restoring files without admin privileges, the IsInherited property of Access Control Entries (ACEs) was not being preserved correctly. The low-privilege restore path (setNamedSecurityInfoLow) was using a static PROTECTED_DACL_SECURITY_INFORMATION flag, which always marked the restored DACL as explicitly set rather than inherited. This commit updates setNamedSecurityInfoLow to dynamically determine the correct inheritance flag based on the SE_DACL_PROTECTED control flag from the original security descriptor, matching the behavior of the high-privilege path (setNamedSecurityInfoHigh). Changes: - Update setNamedSecurityInfoLow to accept control flags parameter - Add logic to set either PROTECTED_DACL_SECURITY_INFORMATION or UNPROTECTED_DACL_SECURITY_INFORMATION based on the original SD - Add TestRestoreSecurityDescriptorInheritanceLowPrivilege to verify inheritance is correctly restored in low-privilege scenarios This ensures that both admin and non-admin restore operations correctly preserve the inheritance state of ACLs, maintaining the original permissions flow on child objects. Addresses review feedback on PR for issue #5427 * Refactor security flags into separate backup/restore variants Split highSecurityFlags into highBackupSecurityFlags and highRestoreSecurityFlags to avoid runtime bitwise operations. This makes the code cleaner and more maintainable by using appropriate flags for GET vs SET operations. 
Addresses review feedback on PR for issue #5427 --------- Co-authored-by: Aneesh Nireshwalia <anireshw@akamai.com>	2025-11-28 19:22:47 +00:00
Winfried Plappert	ce57961f14	restic check with snapshot filters (#5469) --------- Co-authored-by: Michael Eischer <michael.eischer@fau.de>	2025-11-28 19:12:38 +00:00
Michael Eischer	f3a89bfff6	Merge pull request #5612 from MichaelEischer/repository-async-saveblob repository: add async blob upload method	2025-11-26 21:34:35 +01:00
Michael Eischer	5cc8636047	Merge pull request #5614 from MichaelEischer/fix-lookupblobsize repository: fix LookupBlobSize to also return pending blobs	2025-11-26 21:24:32 +01:00
Michael Eischer	6769d26068	archiver: improve test reliability	2025-11-26 21:21:16 +01:00
Michael Eischer	5607fd759f	repository: fix race condition for blobSaver shutdown wg.Go() may not be called after wg.Wait(). This prevents connecting two errgroups such that the errors are propagated between them if the child errgroup dynamically starts goroutines. Instead use just a single errgroup, and sequence the shutdown using a sync.WaitGroup. This is far simpler and does not require any "clever" tricks.	2025-11-26 21:18:22 +01:00
Michael Eischer	9f87e9096a	repository: add tests for SaveBlobAsync	2025-11-26 21:18:22 +01:00
Michael Eischer	d8dcd6d115	archiver: add buffer test	2025-11-26 21:18:22 +01:00
Michael Eischer	3f92987974	archiver: assert number of uploaded chunks in fileSaver test	2025-11-26 21:18:22 +01:00
Michael Eischer	7f6fdcc52c	archiver: convert buffer pool to use sync.Pool	2025-11-26 21:18:22 +01:00
Michael Eischer	dd6cb0dd8e	archiver: port to repository.SaveBlobAsync	2025-11-26 21:18:22 +01:00
Michael Eischer	046b0e711d	repository: add SaveBlobAsync method	2025-11-26 21:18:21 +01:00

2641 Commits
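
Commit 24d56fe2a6 above describes the DualTreeIterator as walking two trees in parallel and returning the merged stream in order, yielding both nodes together when a name exists in both trees. A rough sketch of that merge over two name-sorted slices might look as follows. The `Node`, `Pair`, and `DualMerge` names are hypothetical and the sketch returns a slice rather than a true iterator; it is not the data package's real API.

```go
package main

import "fmt"

// Node stands in for a tree node; only the name matters for the merge.
type Node struct{ Name string }

// Pair holds the nodes found under one name. Left or Right is nil when the
// name exists in only one of the two trees.
type Pair struct{ Left, Right *Node }

// DualMerge walks two slices sorted by name in lockstep and returns the
// merged sequence in name order.
func DualMerge(a, b []Node) []Pair {
	var out []Pair
	i, j := 0, 0
	for i < len(a) || j < len(b) {
		switch {
		case j >= len(b) || (i < len(a) && a[i].Name < b[j].Name):
			// name only in the left tree
			out = append(out, Pair{Left: &a[i]})
			i++
		case i >= len(a) || b[j].Name < a[i].Name:
			// name only in the right tree
			out = append(out, Pair{Right: &b[j]})
			j++
		default:
			// same name in both trees: return both nodes at once
			out = append(out, Pair{Left: &a[i], Right: &b[j]})
			i++
			j++
		}
	}
	return out
}

func main() {
	left := []Node{{"a"}, {"b"}, {"d"}}
	right := []Node{{"b"}, {"c"}}
	for _, p := range DualMerge(left, right) {
		fmt.Println(p.Left != nil, p.Right != nil)
	}
}
```

Unlike a map-based comparison, this keeps only the current position of each input in memory, which is the benefit the commit message points to.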