Multi-server backup for the home with Restic, NFS and rclone/Backblaze
Everyone needs a backup system. If you’re tinkering a lot with Linux systems and self-hosting, you get to a point where you’re the sole person in control of your data, which means you can’t rely on Google or Dropbox to take care of backups for you.
As for me, I use a two-pronged approach to backing up: everything that is system configuration, I handle with Ansible, so systems can be brought back up in the same state as before from a fresh install (give or take a bit of tinkering as systems and packages evolve); and everything that’s pure data, I back up with Restic, with a single backup repository for all my servers, and redundancy between the local repo and a remote copy. This setup complies with the famous 3-2-1 rule (3 copies of the data, on 2 different media, 1 of them off-site), and it is the setup we’ll go into in this article!
Overview of the setup
I’ll assume that you are in a situation similar to mine: running various Linux servers, virtual or not, all at home. We’ll rely on one of the servers running NFS or some other kind of local file server, with each server having a remote directory already mounted for our purposes. In my case it’s going to be /mnt/restic on all servers.
The servers are all running Debian-based OSes, although I don’t think you’ll struggle much if you use something else. I have a couple of Raspberry Pis running Raspberry Pi OS, and a couple of x86 virtual machines running Debian.
Here’s how the setup will work:
- we’ll have a single Restic backup repository in the shared location
- every night, each server will push incremental data changes from the locations it monitors to the repository
- one of the servers will act as the master backup server and run its own update 1 hour later, to be sure that all the others are done already. After completing its own nightly update it will take care of backup maintenance, namely:
  - periodically running a prune operation on the repository to clear space
  - using rclone to sync (every night!) the local repository to a remote bucket provided by Backblaze (more on Backblaze further down; there are other options)
In case you’re wondering why we use a local repository and don’t back up directly to a remote location, here’s my answer: actually you could, and I initially did. But I also ran into corrupted backup issues. So I preferred to play it safe and have a local repository with nearly no latency, rather than rely directly on a much slower and more fragile remote connection. That way, if there are connectivity issues and in the worst case the remote repo gets corrupted, I’m likely to catch that by running a repository check from time to time (more on that later), and can fix it by simply re-syncing to the remote. Also, having the local repository ensures we have more redundancy, which is always good when it comes to backing up.
Why Restic?
In the Linux world, there are backup solutions for all tastes. Personally I’m a big fan of Restic for a few reasons.
Multi-host and multi-path
Restic works on a backup repository that you initialise and then push stuff to. It keeps track of who pushes what (which server) and from where (which path on the server). So if I want to back up /home/pierric/stuff on server1 and /home/pierric/stuff on server2, I can do that and there will be no conflict. When I look at the list of items that are backed up, I can look at them by origin server and by path as I want (and also by tag, as one can tag each backup action as needed).
Encrypted
You can push confidential data to your Restic repository without worry: everything is encrypted and nothing happens without the (strong!) password you choose. If people obtain read access to your backup repository, it’s useless to them without that password.
Easily synced
The backup repo is just a folder, so you can sync it further any way you want. In this article we’ll push incremental changes from that folder to a remote bucket using rclone, but you have a lot of options. On top of that, Restic supports some remote backends natively (including B2 buckets).
Mount to explore or restore
While the best way to restore missing files is to use the restore command indicating which path needs to be restored (which could be an entire backed up folder, a subfolder or a single file), Restic offers a nice way to explore a backup to check which files existed or what was in a file previously without restoring it. It has a mount command, which does what it sounds like: you can mount the backup repository to a folder of your choice, giving you a folder structure that you can explore by server and snapshot time (or tags). I’ve found this has some limitations, and using it to run a cp -ar command to restore deep folders sometimes causes weird errors about symlinks (where there are no symlinks…). So use this for exploration, but do full restore operations using the restore command. But it’s incredibly useful to quickly look for an older version of a file, especially when you don’t know when you made the change exactly. All the versions are out there for you to compare!
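To make the “compare versions” idea concrete, here is a minimal sketch that walks a mounted repository and prints a checksum for each snapshot that contains a given file. It assumes the repository is already mounted with the mount command and that the mount exposes a snapshots folder with one subfolder per snapshot plus a latest entry; list_versions is a hypothetical helper name, not a restic command.

```shell
#!/bin/bash
# Sketch: list every snapshot under a restic mount that contains a given file,
# with a checksum so that identical versions are easy to spot.
# list_versions is a hypothetical helper, not part of restic.
list_versions() {
    local mountpoint="$1" file="$2" snap
    for snap in "$mountpoint"/snapshots/*/; do
        snap="${snap%/}"
        # "latest" points at the newest snapshot, so skip it to avoid a duplicate
        [ "$(basename "$snap")" = latest ] && continue
        if [ -f "$snap$file" ]; then
            printf '%s  %s\n' \
                "$(sha256sum "$snap$file" | cut -d' ' -f1)" \
                "$(basename "$snap")"
        fi
    done
}
```

With the repository mounted on a folder such as /mnt/temp, calling list_versions /mnt/temp followed by the absolute path of the file prints one line per snapshot; two snapshots with the same checksum hold the same version of the file.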
Let’s do this, then
Installing restic
On Debian-based systems this should be as simple as
sudo apt update && sudo apt install restic
But a nice touch is to follow up with
sudo restic self-update
I recently realised that the Debian-packaged version was quite far behind the latest version, and this self-update process is a breeze: run it, wait for it, and you have the latest version working. I’ve noticed that reverting to an older version might cause backups made with a newer version to not work (without damaging the backup; it is just unreadable until you upgrade again).
Creating the backup repo
Let’s start with this:
restic -r /mnt/restic init
This will create the repo in /mnt/restic, the shared location mounted on all servers. That’s it, we’re done!
You can play around with the core commands:
# backup one folder with 2 tags
restic -r /mnt/restic backup --tag manual,testbackup /home/pierric/stuff
# see what would happen but don't change the repo (dry run)
restic -r /mnt/restic backup -n /etc
# restore a specific path to a folder, using the latest snapshot from which that path is available
restic -r /mnt/restic restore -i /home/pierric/stuff/folder1/file1.txt -t /home/pierric/recovery latest
# mount all the snapshots to browse them (then explore /mnt/temp/hosts/server1/latest, for example)
sudo restic -r /mnt/restic mount /mnt/temp
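Once a few servers have pushed data, it helps to list snapshots per host and per tag. The following is only a sketch: it assumes the repository lives at /mnt/restic (the shared mount from the overview), and snapshots_for_host, REPO and RESTIC_BIN are hypothetical names introduced for illustration. The snapshots command and its --host and --tag filters are regular restic options.

```shell
#!/bin/bash
# Hypothetical settings: adjust REPO to your own repository path.
RESTIC_BIN="${RESTIC_BIN:-restic}"   # override only useful for testing
REPO="${REPO:-/mnt/restic}"

# snapshots_for_host HOST [TAG]
# Lists the snapshots pushed by one host, optionally narrowed to one tag.
snapshots_for_host() {
    local host="$1" tag="${2:-}"
    if [ -n "$tag" ]; then
        "$RESTIC_BIN" -r "$REPO" snapshots --host "$host" --tag "$tag"
    else
        "$RESTIC_BIN" -r "$REPO" snapshots --host "$host"
    fi
}
```

For example, snapshots_for_host server1 manual would show only the snapshots that server1 created with the manual tag.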
Install and configure rclone for Backblaze B2
I use Backblaze (B2) as my remote cloud.
You could of course use any number of other options, S3 or rsync.net being 2 random examples. But I’ll now set up rclone using the B2 example, which means I’ll have a bucket ID and a bucket secret key to refer to (in B2’s own terms, an application key ID and an application key).
Let’s start by installing it.
sudo apt install rclone
Now let’s create the file /root/.config/rclone/rclone.conf with the following contents:
[b2backup]
type = b2
account = <INSERT BUCKET ID>
key = <INSERT BUCKET SECRET KEY>
hard_delete = true
Replace the <PLACEHOLDERS> accordingly.
That’s it, rclone is set up. We could test it by pushing our repo manually with the command:
sudo rclone sync /mnt/restic b2backup:<INSERT BUCKET NAME>
But of course we’ll automate this step in the next phase.
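Before trusting the automated sync, it can be reassuring to preview what rclone would transfer and, afterwards, to compare both sides. This is a sketch under assumptions: preview_sync, verify_sync and RCLONE_BIN are hypothetical names, while --dry-run and the check command are standard rclone features.

```shell
#!/bin/bash
RCLONE_BIN="${RCLONE_BIN:-rclone}"   # override only useful for testing

# preview_sync SOURCE DEST: show what a sync would copy or delete, without doing it
preview_sync() {
    "$RCLONE_BIN" sync --dry-run "$1" "$2"
}

# verify_sync SOURCE DEST: compare the files on both sides and report differences
verify_sync() {
    "$RCLONE_BIN" check "$1" "$2"
}
```

Both helpers take the same two arguments as the sync command above: the local repository folder first, then the b2backup: remote followed by your bucket name.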
A script to run the nightly update
Now that we have all the bits and pieces in place, we just need to write a script that will make a new incremental snapshot every night. That script will be slightly different depending on the server:
- the master server will back up its data, then run maintenance and sync operations
- other servers will just back up their own data (earlier than the master server, so they’re done by the time the syncing happens)
Let’s start with the non-master script. We’ll create /root/.restic/runrestic.sh (the choice of that folder is arbitrary, change it if you don’t like it) with the contents:
#!/bin/bash
# bash is required: cron would otherwise run this with sh, which rejects "function" and "[["
REMOTE_LOCATION='b2backup:<INSERT BUCKET NAME>'
LOCAL_REPO=/mnt/restic
RESTIC_CONF=/root/.restic
# exported so that the restic process started below can read it
export RESTIC_PASSWORD='<INSERT PASSWORD>'
function writelog() {
    echo "=== $(date -Iseconds) $*" >>"$RESTIC_CONF/backup.log"
}
writelog starting backup process
restic -r "$LOCAL_REPO" backup --tag auto --files-from "$RESTIC_CONF/includes.txt" >>"$RESTIC_CONF/backup.log" 2>&1
writelog backup process complete
writelog All done for today!
This script contains your Restic password, so protect it with chmod 0700 runrestic.sh and make sure it belongs to root (chown root:root runrestic.sh).
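If you would rather not keep the password inside the script at all, restic can also read it from a file named in the RESTIC_PASSWORD_FILE environment variable. The sketch below shows one possible way to create such a file with owner-only permissions; store_restic_password and the password file name are hypothetical choices, not something the setup in this article relies on.

```shell
#!/bin/bash
# store_restic_password DIR
# Reads the password from standard input and saves it as DIR/password,
# readable and writable by the owner only. Hypothetical helper.
store_restic_password() {
    local dir="$1"
    mkdir -p "$dir"
    # the restrictive umask only applies inside this subshell
    ( umask 077; cat >"$dir/password" )
    chmod 600 "$dir/password"
}

# In runrestic.sh, the inline password line could then become:
#   export RESTIC_PASSWORD_FILE=/root/.restic/password
```

The password is read from standard input so that it never appears as a command-line argument.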
The script is basically running the one command to back files up every night, adding a tag of auto (any other name would work) to identify those backups done via our automated process. We’ll need to create an includes.txt file where we put all the paths we want to back up, one on each line:
/etc
/boot/*.txt
/home/pierric/.ssh
/home/pierric/.local/bin
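One limitation of the script as written is that it logs “backup process complete” whether or not restic actually succeeded. A possible refinement, sketched here with hypothetical names (run_logged, LOG_FILE), is to wrap each command so that its exit code ends up in the log. As far as I know, restic’s backup command exits with a non-zero code when it fails, and with code 3 when some source files could not be read.

```shell
#!/bin/bash
# Hypothetical log location, matching the backup.log used by the script
LOG_FILE="${LOG_FILE:-/root/.restic/backup.log}"

writelog() {
    echo "=== $(date -Iseconds) $*" >>"$LOG_FILE"
}

# run_logged LABEL COMMAND [ARGS...]
# Runs the command, appends its output to the log and records its exit code.
run_logged() {
    local label="$1" status=0
    shift
    writelog "starting $label"
    "$@" >>"$LOG_FILE" 2>&1 || status=$?
    if [ "$status" -eq 0 ]; then
        writelog "$label complete"
    else
        writelog "$label FAILED with exit code $status"
    fi
    return "$status"
}

# Possible use in runrestic.sh:
#   run_logged "backup process" restic -r "$LOCAL_REPO" backup --tag auto \
#       --files-from "$RESTIC_CONF/includes.txt"
```

A quick grep for FAILED in backup.log then tells you whether any night went wrong.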
And the master version will be the same script as above, but before the last line we insert a few more things:
# ...
writelog starting forget process
restic -r $LOCAL_REPO forget --tag auto --keep-daily 7 --keep-weekly 8 --keep-monthly 12 --keep-yearly 1 --group-by host >> $RESTIC_CONF/backup.log 2>&1
writelog forget process complete
if [[ $(date +%d) == "01" ]]; then
writelog starting prune
restic -r $LOCAL_REPO prune >> $RESTIC_CONF/backup.log 2>&1
else
writelog no prune today
fi
writelog done with prune process
writelog starting sync
rclone sync $LOCAL_REPO/ $REMOTE_LOCATION >> $RESTIC_CONF/backup.log 2>&1
writelog sync complete
writelog All done for today!
We’ve added 3 steps:
- the forget step deletes older backups, keeping some versions based on some smart rules. With the parameters I’ve included, it keeps, for each host, 1 snapshot per day for the last 7 days, 1 per week for the last 8 weeks, 1 per month for the last 12 months, and the most recent yearly snapshot. The rest, provided it’s marked with the auto tag, will be deleted to clear space.
- the prune step ensures that any blocks of data no longer used after forget are deleted. It’s a longer operation, which does not need to run every day, so the script only runs it on the 1st of each month.
- the rclone sync step pushes the current version of the repo to the remote bucket (rclone will not copy everything every time, but push only the changes; the fact that the Restic repo does not reflect the original file structure is not a problem, and on routine usage the amount of data that is sent to the remote bucket is very reasonable).
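Since forget deletes snapshots, it is worth previewing a retention policy before letting it run unattended. The sketch below reuses the same retention parameters as the master script and adds restic’s --dry-run flag, so nothing is removed; preview_forget and RESTIC_BIN are hypothetical names introduced for this example.

```shell
#!/bin/bash
RESTIC_BIN="${RESTIC_BIN:-restic}"   # override only useful for testing

# preview_forget REPO
# Shows which snapshots the retention policy would keep and remove,
# without changing the repository.
preview_forget() {
    local repo="$1"
    "$RESTIC_BIN" -r "$repo" forget --dry-run --tag auto \
        --keep-daily 7 --keep-weekly 8 --keep-monthly 12 --keep-yearly 1 \
        --group-by host
}
```

Running preview_forget /mnt/restic prints, per host, the snapshots that would be kept and those that would be removed, which makes it easier to tune the numbers.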
Now all that’s left is to create a cron job to run this nightly. We’ll create /etc/cron.d/restic_nightly with this one line, on all servers:
0 2 * * * root /root/.restic/runrestic.sh
For the master server, however, replace the 2 by a 3. This way, the non-master servers will back up starting at 2am, and hopefully be done by 3am when the master server will do the same and then sync. Backup operations in restic can be done in parallel, so we don’t mind that all the other servers start at the same time; what cannot run alongside other operations is anything that needs an exclusive lock on the repository, such as prune or the check operation (see further down).
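Since “hopefully done by 3am” is only a hope, the master script could verify that no other host still holds a lock before it starts its maintenance steps. The sketch below assumes a locally mounted repository, where restic keeps its lock files in the locks subfolder; wait_for_unlocked and its timeout values are hypothetical. Note that a stale lock left behind by a crashed run would also make it wait (restic unlock clears those).

```shell
#!/bin/bash
# wait_for_unlocked REPO [TIMEOUT_SECONDS] [POLL_SECONDS]
# Returns 0 as soon as REPO/locks is empty, 1 if the timeout expires first.
wait_for_unlocked() {
    local repo="$1" timeout="${2:-3600}" poll="${3:-60}" waited=0
    while [ -n "$(ls -A "$repo/locks" 2>/dev/null)" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$poll"
        waited=$((waited + poll))
    done
    return 0
}

# Possible use in the master script, after its own backup and before forget:
#   wait_for_unlocked "$LOCAL_REPO" 3600 60 || writelog "repository still locked"
```

The check has to come after the master’s own backup, since that backup holds a lock of its own while it runs.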
Ansible version
As with most server-related things I’ll post around here, I’ll share the Ansible playbook extract inspired by my own server setup playbook. Note that it’s just that, an extract, not a whole playbook. Also, it assumes that the restic repo has been initialised manually already (I don’t automate this step since recovering the backup repo is part of my disaster recovery process, prior to running Ansible to set up my servers).
- name: install restic
  apt:
    name: restic
- name: upgrade restic
  command: restic self-update
- name: create restic directory
  file:
    path: "{{ item }}"
    state: directory
  with_items:
    - /root/.restic
    - /root/.config/rclone
- name: copy restic provisioning files
  template:
    src: "{{ item }}"
    dest: "/root/.restic/{{ item | basename | regex_replace('.j2$', '') }}"
    owner: root
    mode: "{% if 'runrestic.sh' in item %}0700{% else %}0600{% endif %}"
  with_fileglob:
    - templates/*.j2
  #no_log: true
- name: Copy rclone config file
  copy:
    src: files/rclone.conf
    dest: /root/.config/rclone/rclone.conf
    mode: '0600'
    owner: root
  no_log: true
  when: restic_is_master
- name: create cron job to run backup nightly
  cron:
    name: restic nightly run
    cron_file: restic_nightly
    user: root
    minute: "0"
    hour: "{{ '3' if restic_is_master else '2' }}"
    job: "/root/.restic/runrestic.sh"
The files to provide as templates are includes.txt.j2 and runrestic.sh.j2, with contents as per previous explanations. Similarly for rclone.conf.
Pros and cons and recommendations
I’ll start with the few things that aren’t perfect from my point of view:
- Restic needs lots of RAM to operate. On my Raspberry Pi 3B+ with less than 1GB of RAM, if I don’t add another 1GB of swap file, it crashes very quickly when executing. Adding the swap file solves this, but then it runs quite slowly, which is fine since it runs in the middle of the night.
- Mounting the snapshots as explained earlier is a killer feature, but it’s also a bit slow to become available, on a Raspberry Pi 4 typically 20-30 seconds. Also, as mentioned before, sometimes the mounted folder does not work well to copy entire folders, and restore is better.
Those pain points are not major, mind you!
As for the things I find very neat, I’ve listed them already: I find it awesome to have such a powerful, yet flexible backup system. Except for the 2 caveats just mentioned, performance is great and backup speed is very good overall. I’ve been able to restore data in many ways already (never a major disaster, but files I lost accidentally, or data folders to bring back while reinstalling a server from scratch).
Extra tips
For peace of mind, I regularly check the backup’s health by using the check command. Locally this looks like this:
sudo restic -r /mnt/restic check
Very simple. It will fully lock the repo, take some time to go through all the data, and hopefully tell you all is okay. If not, you may have the option to fix issues, or you might want to reset your backup immediately!
Additionally, I also check the remote repo, to ensure the sync process hasn’t corrupted it at some point. Because Restic supports B2 natively, we can do this in this way:
sudo \
B2_ACCOUNT_ID=<INSERT BUCKET ID> \
B2_ACCOUNT_KEY=<INSERT BUCKET SECRET KEY> \
restic -r b2:<INSERT BUCKET NAME> check
This will consume a bit of download bandwidth from the bucket provider, but if run only occasionally it should not generate significant costs.
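Typing the keys on the command line leaves them in the shell history. As an alternative sketch, they could live in a root-only file that is loaded just for the duration of the check. The file and the check_remote helper are hypothetical; B2_ACCOUNT_ID and B2_ACCOUNT_KEY are the same variables as in the command above.

```shell
#!/bin/bash
RESTIC_BIN="${RESTIC_BIN:-restic}"   # override only useful for testing

# check_remote ENV_FILE BUCKET
# ENV_FILE is a hypothetical file (mode 0600) containing two lines:
#   B2_ACCOUNT_ID=...
#   B2_ACCOUNT_KEY=...
check_remote() {
    local env_file="$1" bucket="$2"
    (
        # the subshell keeps the credentials out of the calling shell
        set -a                      # export every variable the file defines
        . "$env_file" || exit 1
        set +a
        "$RESTIC_BIN" -r "b2:$bucket" check
    )
}
```

Run as root, check_remote takes the path to that file and the bucket name, and restic still asks for the repository password as usual.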
Finally, I recommend playing more with restic’s commands. The official docs are really good!