Backups Using a Network Attached Raspberry Pi

After setting up my Raspberry Pi as a NAS, I wanted to set up backups that are easy to run and check on. Initially, I wanted them to run automatically, but another goal of the Pi setup was for me to turn my PC off more often. My current thinking is that if I’m going to turn the PC off, I’ll just run the backups manually, since turning my PC on every Saturday night (or whenever I’d schedule it) isn’t really automated anyway. If I go back on this and set up a cron job, I’ll be sure to post about that as well.

One problem I haven’t been able to solve yet is how to back up Windows itself with this setup. The Windows 7 backup tool fails, and neither of the two free backup applications I tried for Windows can see my network drives. I can include specific folders from that drive in my backup script and/or occasionally connect my external drive directly to my PC to run a full backup, which is probably what I’ll end up doing.

Apart from Windows, I have a few different backups I want to run:

1. The SD card for my Raspberry Pi.
2. A large hard drive with a lot of media files (movies, music, pictures, etc.).
3. An SSD that holds all of my games.
4. The folders on that same SSD for my personal software and game development projects, which I also want to back up to AWS.

Here is the script as it currently stands. I run it from the Windows Subsystem for Linux on Windows 10. The notes below explain the setup and why I chose the tools and configurations that I did.

#!/bin/bash
today=$(date '+%Y_%m_%d')

#backup raspberry pi
ssh username@ipaddress "sudo dd if=/dev/mmcblk0 bs=1M | gzip - | dd of=/media/pi/HDDName/pibackup/pibackup$today.gz" > /mnt/e/rsynclogs/pibackuplog$today.txt 2>&1

#back up all of my development work, including Unity projects, to S3 for offsite backups. Dates are added to the log file names so each run doesn't overwrite the previous one
aws s3 sync /mnt/e/Development s3://developmentFolder/ --delete > /mnt/e/rsynclogs/S3DevOutput$today.txt

aws s3 sync /mnt/e/Unity s3://unityFolder/ --delete > /mnt/e/rsynclogs/S3UnityOutput$today.txt

#back up the D drive, excluding a few folders, and write logs out
rsync -avP --delete --size-only --exclude-from '/mnt/d/rsynclogs/exclude.txt' --log-file=/mnt/d/rsynclogs/rsynclog$today.txt /mnt/d/ username@ipaddress:/media/pi/MediaBackup/

#back up the E drive, excluding a few folders, and write logs out
rsync -avW --delete --size-only --exclude-from '/mnt/e/rsynclogs/exclude.txt' --log-file=/mnt/e/rsynclogs/rsynclog$today.txt /mnt/e/ username@ipaddress:/media/pi/GamesBackup/

The Raspberry Pi backup is modified from: https://johnatilano.com/2016/11/25/use-ssh-and-dd-to-remotely-backup-a-raspberry-pi/

I don't have access to the network drives from the WSL terminal (or at least I don't know how to reach them without SSHing in), so I needed the output path to be relative to the Pi. The quotation marks enclose the commands that get sent to the Pi, so I extended them to include the output location. I also changed 'bs=1m' to 'bs=1M'; I believe the lowercase m is expected on macOS, but the uppercase M is required on most flavors of Linux.

In order to run this from the script, I had to set up my user to execute the sudo command without a password, which I did as follows:

At a terminal on the Pi, run 'sudo visudo' and add this as the last line: 'username ALL = NOPASSWD: ALL', where username is the user you SSH in as. If you are doing this as the pi user, I don’t think this will be necessary. I'd like to limit this to just the ‘dd’ command, but I'm not sure how to tell sudoers where dd lives. I may update this in the future.
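If I do revisit that, my understanding is that a sudoers rule can be scoped by giving the command's full path, so a tighter version would look something like the line below. I haven't tested this on the Pi, and the path is an assumption (check yours with 'which dd'):

```
username ALL = NOPASSWD: /bin/dd
```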

For setting up rsync with the correct flags, I used these two links:
https://www.howtogeek.com/175008/the-non-beginners-guide-to-syncing-data-with-rsync/
https://www.thegeekstuff.com/2011/01/rsync-exclude-files-and-folders/?utm_source=feedburner

Two notes for the rsync commands:

By default, the drives mount under the 'pi' user. Since I set up my backups with a different user, rsync gave me errors about not being able to set the times on the files when I ran the command. I’m pretty sure this was because that user didn’t have permissions on the drive. Adding the drives to fstab mounts them as root instead, which allows my user to access them since it has root permissions. I should have done this when setting up the drive as a NAS, but I only did it for the initial drive I was testing. See here for instructions on adding drives to fstab: https://www.howtogeek.com/139433/how-to-turn-a-raspberry-pi-into-a-low-power-network-storage-device/
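For reference, an fstab entry for one of the drives would look something like the line below. The UUID and filesystem type here are placeholders rather than my real values; run 'sudo blkid' on the Pi to find the actual UUID and type for your drive:

```
UUID=xxxx-xxxx  /media/pi/MediaBackup  ntfs-3g  defaults  0  0
```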

For my E drive rsync, I initially tried the same settings as the D drive, but the backup kept hanging on different files. I saw several recommendations about flags people claimed were the culprit. I tried toggling several of them, but the change that seemed to fix things was swapping -P for -W (suggested here: https://github.com/Microsoft/WSL/issues/2138), which forces entire files to be transferred instead of partial files. I could probably add --progress back in, but -v for verbose gives me enough output to see where issues arise. I’d advise adding --progress back in if you encounter issues and need to check where things are going wrong.

You can find instructions for setting up the AWS CLI tools and syncing with S3 in the AWS docs. I couldn’t find anything in there about logging, but StackOverflow had a good solution: https://stackoverflow.com/questions/35075668/output-aws-cli-sync-results-to-a-txt-file

The last thing I added to the script was a variable that grabs the current date so I don’t overwrite the Pi backup or the log files.

One thing I’d like to add is a way to clean up the Pi backups. At ~3GB each, it isn’t a big issue yet, but eventually I’ll want to prune them.
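When I get to it, the cleanup will probably be a small function along these lines, keeping only the newest few images. This is just a sketch; the backup directory and retention count below are assumptions to adjust for your own setup:

```shell
#!/bin/bash
# Sketch: prune old Pi image backups, keeping only the newest N.
# The path and retention count are assumptions; adjust for your setup.

prune_backups() {
    local dir="$1" keep="$2"
    # List backups newest-first, skip the first $keep, and remove the rest.
    ls -1t "$dir"/pibackup*.gz 2>/dev/null | tail -n +"$((keep + 1))" | while read -r old; do
        rm -- "$old"
    done
}

prune_backups /media/pi/HDDName/pibackup 4
```

If the glob matches nothing, ls fails quietly and the loop simply doesn't run, so it's safe to leave in the script even before any backups exist.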