Transferring backups via rsync and cron

I have a backup server that coordinates backups across many servers. I want my backup server to grab Discourse backups from my forum’s server.

I gave some thought to how I’d allow the backup server to access the backup files on the forum’s server. The best way I could come up with is allowing remote access as the www-data user (who owns Discourse’s backups).

I didn’t want to allow the backup server to shell into the forum’s server as root (for standard sysadmin reasons). I also wanted to avoid doing anything that I thought could cause Discourse to choke during backups or restores. I also wanted to avoid hosting another service on forum server.

Anyways, here’s how I did it.

Allow remote access as the www-data user

  1. Edit /etc/passwd and replace www-data's shell with /bin/bash rather than /usr/sbin/nologin.
  2. Edit /etc/passwd again and replace www-data's home directory with /home/www-data rather than /var/www (optional, but appealing to me).
  3. Add the backup server’s SSH key to /home/www-data/.ssh/authorized_keys.

rsync

Finally, on the backup server, I added an hourly cron command that ran the following script:

#!/usr/bin/env bash

set -xe

HOST="$1"
DIR="$2"
if [ -z "$HOST" ] || [ ! -d "$DIR" ]; then
	echo "$0 HOST DIR"
	exit 1
fi

# --ignore-existing will have rsync ignore any backups that have already been
# copied.
# --delay-updates ensures that only complete backups ever make it into $DIR. If
# this isn't specified, partial backups could end up in $DIR, and because
# --ignore-existing won't perform any kind of equality check, the problem will
# not be corrected or detected.
rsync --ignore-existing --delay-updates "$HOST:/var/discourse/shared/standalone/backups/default/*" "$DIR"

Hopefully this proves useful to someone out there.

3 Likes