Only in use on 64-bit systems. Use the upper 28bits of the id of an index entry as bloom filter. This allows skipping the index entry traversal most of the time if an id is not stored in the hashmap. The bloom filter embedded in the index entry id is check each time before following a reference to an index entry. This further reduces the risk of false positives. The bloom filter itself is basically for free on modern CPUs. The main performance cost of checking for unknown blobs in the index are the essentially random RAM accesses for the initial bucket lookup as well as following the next pointer in the index entries. With the bloom filter most of the time only the initial bucket lookup is necessary. This speeds up checking for unknown blobs by a factor 5 (!), while having no effect on the lookup of known blobs: $ benchstat no-bloom with-bloom name old time/op new time/op delta IndexHasUnknown-16 49.0ms ± 2% 9.9ms ± 7% -79.70% (p=0.000 n=10+10) IndexHasKnown-16 48.0ms ± 3% 47.9ms ± 3% ~ (p=0.968 n=10+9) This bloom filter parameters m=28 k=1 were derived empirically, while also leaving sufficient room for very large repositories. Before this commit, the final merge index step took roughly 1 second per million index entries. With the chosen bloom filter parameters, it would currently take 19 hours to just merge such an index. It is safe to assume that such large repositories don't exist. Comparison with other parameter sets: $ m=28 k=1 versus m=32 k=1 name old time/op new time/op delta IndexHasUnknown-16 49.0ms ± 2% 9.7ms ±16% -80.17% (p=0.000 n=10+10) IndexHasKnown-16 48.0ms ± 3% 48.4ms ± 3% ~ (p=0.436 n=10+10) $ m=28 k=1 versus m=24 k=1 name old time/op new time/op delta IndexHasUnknown-16 49.0ms ± 2% 10.8ms ±13% -77.90% (p=0.000 n=10+10) IndexHasKnown-16 48.0ms ± 3% 47.9ms ± 3% ~ (p=0.684 n=10+10) $ m=28 k=1 versus m=28 k=2 name old time/op new time/op delta IndexHasUnknown-16 49.0ms ± 2% 24.9ms ± 5% -49.27% (p=0.000 n=10+10) IndexHasKnown-16 48.0ms ± 3% 48.0ms ± 4% ~ (p=1.000 n=10+10) `k=2` outright wrecks the performance. This is most likely the case as it performs worse on longer index entry chains, which also happen to be the expensive ones to process. `m=32` yields diminishing returns, while getting within an order of magnitude of the largest known restic repositories. Design alternatives: In principle it would be possible to add a single large bloom filter instead of embedding them in the index entry ids. However, this bloom filter would necessarily incur additional random memory accesses and thus slow things down overall.
Introduction
restic is a backup program that is fast, efficient and secure. It supports the three major operating systems (Linux, macOS, Windows) and a few smaller ones (FreeBSD, OpenBSD).
For detailed usage and installation instructions check out the documentation.
You can ask questions in our Discourse forum.
Quick start
Once you've installed restic, start off with creating a repository for your backups:
$ restic init --repo /tmp/backup
enter password for new backend:
enter password again:
created restic backend 085b3c76b9 at /tmp/backup
Please note that knowledge of your password is required to access the repository.
Losing your password means that your data is irrecoverably lost.
and add some data:
$ restic --repo /tmp/backup backup ~/work
enter password for repository:
scan [/home/user/work]
scanned 764 directories, 1816 files in 0:00
[0:29] 100.00% 54.732 MiB/s 1.582 GiB / 1.582 GiB 2580 / 2580 items 0 errors ETA 0:00
duration: 0:29, 54.47MiB/s
snapshot 40dc1520 saved
Next you can either use restic restore to restore files or use restic mount to mount the repository via fuse and browse the files from previous
snapshots.
For more options check out the online documentation.
Backends
Saving a backup on the same machine is nice but not a real backup strategy. Therefore, restic supports the following backends for storing backups natively:
- Local directory
- sftp server (via SSH)
- HTTP REST server (protocol, rest-server)
- Amazon S3 (either from Amazon or using the Minio server)
- OpenStack Swift
- BackBlaze B2
- Microsoft Azure Blob Storage
- Google Cloud Storage
- And many other services via the rclone Backend
Design Principles
Restic is a program that does backups right and was designed with the following principles in mind:
-
Easy: Doing backups should be a frictionless process, otherwise you might be tempted to skip it. Restic should be easy to configure and use, so that, in the event of a data loss, you can just restore it. Likewise, restoring data should not be complicated.
-
Fast: Backing up your data with restic should only be limited by your network or hard disk bandwidth so that you can backup your files every day. Nobody does backups if it takes too much time. Restoring backups should only transfer data that is needed for the files that are to be restored, so that this process is also fast.
-
Verifiable: Much more important than backup is restore, so restic enables you to easily verify that all data can be restored.
-
Secure: Restic uses cryptography to guarantee confidentiality and integrity of your data. The location the backup data is stored is assumed not to be a trusted environment (e.g. a shared space where others like system administrators are able to access your backups). Restic is built to secure your data against such attackers.
-
Efficient: With the growth of data, additional snapshots should only take the storage of the actual increment. Even more, duplicate data should be de-duplicated before it is actually written to the storage back end to save precious backup space.
Reproducible Builds
The binaries released with each restic version starting at 0.6.1 are reproducible, which means that you can reproduce a byte identical version from the source code for that release. Instructions on how to do that are contained in the builder repository.
News
You can follow the restic project on Mastodon @resticbackup or subscribe to the project blog.
License
Restic is licensed under BSD 2-Clause License. You can find the
complete text in LICENSE.
Sponsorship
Backend integration tests for Google Cloud Storage and Microsoft Azure Blob Storage are sponsored by AppsCode!
