How I back up my data using open source tools

A couple of weeks ago, I decided to set up a backup system for all my data. The goal was an automated system that needs no intervention.

I self-host a lot of things and have a lot of personal data. Popular tools like Google Drive are good for syncing, not for backup. The difference matters: a sync tool mirrors every change, including accidental deletions, while a backup keeps older copies you can restore from.

Backup software

Searching the internet led me to many open source backup tools, notably Restic and Borg Backup. Borg required a server to be run at the backup location. This was a deal breaker since I wanted to save the backup to multiple locations, with or without a server.

Restic seemed to perfectly fit the bill so I decided to give it a try. It provided all the features I wanted:

Backup location

The next step was to decide on a backup location. I wanted to keep two copies of the backup, one on-site and one off-site. For on-site, I went with a simple external SSD. For off-site, the decision required more effort.

There are a lot of popular cloud storage options: S3-compatible services like Amazon S3, Backblaze B2, and IDrive e2, and others like Hetzner Storage Box, Google Drive, and Dropbox.

I decided to go with Hetzner Storage Box. It seemed to fit the bill because:

I set up Hetzner Storage Box over WebDAV with rclone and used it as a backend in Restic to prepare the backup repository. This is a one-time process.

restic -r rclone:hetzner-webdav:restic-repo init

Restic will ask you to set a password to encrypt the repository. Make sure to choose a strong one and store it safely, since the repository cannot be decrypted without it.

After the repository is prepared, backing up is a single CLI command:
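For unattended runs, Restic can read the repository location and the password from environment variables instead of prompting. The snippet below is a minimal sketch of that, not necessarily how my setup is configured; the password file path is a hypothetical location.

```shell
# Repository used throughout this post.
export RESTIC_REPOSITORY="rclone:hetzner-webdav:restic-repo"

# Hypothetical path to a file containing only the repository password.
# Keep it readable by your user alone (for example, chmod 600).
export RESTIC_PASSWORD_FILE="$HOME/.config/restic/password"

# With both variables set, the -r flag can be omitted, e.g.:
#   restic snapshots
```

A password file with tight permissions keeps the secret out of the shell history and out of the script itself.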

restic -r rclone:hetzner-webdav:restic-repo backup /path/to/local/directory

The backup is saved as a snapshot. Since Restic deduplicates data, later snapshots only store what has changed, so their storage and time cost is low.

Restic supports removing old snapshots according to a policy. This is the policy I wrote, which is self-explanatory:
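A backup is only useful if it can be listed and restored. As a rough sketch, the two helpers below wrap Restic's `snapshots` and `restore` commands for the repository used in this post. The function names and the restore target directory are hypothetical.

```shell
REPO="rclone:hetzner-webdav:restic-repo"

# List every snapshot stored in the repository.
list_snapshots() {
    restic -r "$REPO" snapshots
}

# Restore the most recent snapshot into the directory given as $1.
restore_latest() {
    restic -r "$REPO" restore latest --target "$1"
}
```

For example, `restore_latest /tmp/restore-test` would place the restored files under that scratch directory, which makes it easy to spot-check a few files without touching the originals.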

restic forget --prune --keep-daily 10 --keep-weekly 8 --keep-monthly 12 --keep-yearly 5
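In this policy, each `--keep-*` option keeps the most recent snapshot for each of the last N days, weeks, months, or years that have snapshots. A snapshot is retained if any of the rules selects it. Before deleting anything, the same policy can be previewed with `--dry-run`. The sketch below assumes the repository and password are supplied through Restic's `RESTIC_REPOSITORY` and `RESTIC_PASSWORD_FILE` environment variables, since the command has no `-r` flag. The function name is hypothetical.

```shell
# Show which snapshots the policy would keep and remove, without deleting anything.
preview_retention() {
    restic forget --dry-run \
        --keep-daily 10 --keep-weekly 8 --keep-monthly 12 --keep-yearly 5
}
```

Running the preview after changing any of the numbers is a cheap way to confirm the policy does what you expect before `--prune` removes data for good.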

Automation

I used Syncthing to sync data from all my devices to my Raspberry Pi 5. On the Pi, I wrote a simple shell script to back up the data and remove older snapshots as per the policy above. Then I set up a cron job to run the script every day at 4 am.

0 4 * * * /path/to/backup/script.sh >> /path/to/log/file 2>&1
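The script itself is not shown here, so the following is only a sketch of what such a script could look like, built from the commands earlier in this post. The function name is hypothetical, the source path is the same placeholder used above, and it assumes the password is available non-interactively (for example through Restic's `RESTIC_PASSWORD_FILE` environment variable).

```shell
#!/bin/sh
# Sketch of a daily backup script; adjust the placeholder paths to your setup.

REPO="rclone:hetzner-webdav:restic-repo"
SOURCE_DIR="/path/to/local/directory"

run_backup() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') starting backup of $SOURCE_DIR"

    # Create a new snapshot; stop here if the backup fails.
    restic -r "$REPO" backup "$SOURCE_DIR" || return 1

    # Apply the retention policy and reclaim space from removed snapshots.
    restic -r "$REPO" forget --prune \
        --keep-daily 10 --keep-weekly 8 --keep-monthly 12 --keep-yearly 5 || return 1

    echo "$(date '+%Y-%m-%d %H:%M:%S') backup finished"
}
```

Adding a final `run_backup` line turns this into a complete script. Because the cron entry appends both stdout and stderr to the log file, the timestamps make it easy to tell where each run starts and ends, and stopping after a failed backup avoids pruning when no new snapshot was created.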

Once this is set up, the entire process is automatic. Syncthing syncs data in real time to the Pi, and the cron job makes sure the data is backed up every day.

Conclusion

I will evolve this system over time as my needs change. So far, I am happy with it. I invite you to suggest improvements and share your backup strategy as well.