I have a backup server that coordinates backups across many servers. I want my backup server to grab Discourse backups from my forum’s server.
I gave some thought to how I’d allow the backup server to access the backup files on the forum’s server. The best way I could come up with is allowing remote access as the www-data user (who owns Discourse’s backups).
I didn’t want to allow the backup server to shell into the forum’s server as root (for standard sysadmin reasons). I also wanted to avoid anything that could cause Discourse to choke during backups or restores, and I didn’t want to host another service on the forum server.
Anyways, here’s how I did it.
Allow remote access as the www-data user
- Edit /etc/passwd and replace www-data’s shell with /bin/bash rather than /usr/sbin/nologin.
- Edit /etc/passwd again and replace www-data’s home directory with /home/www-data rather than /var/www (optional, but appealing to me).
- Add the backup server’s SSH key to /home/www-data/.ssh/authorized_keys.
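For anyone who would rather script these steps than hand-edit files, here is a rough sketch. The function names are made up for illustration, and the first function only prints what the edited passwd file would look like instead of touching /etc/passwd. The second assumes sshd's default permission checks, which want .ssh to be mode 700 and authorized_keys to be mode 600.

```shell
#!/usr/bin/env bash
# Sketch only: both function names are hypothetical helpers, not part of the
# original write-up.

# Print a passwd-format FILE with USER's home directory (field 6) and login
# shell (field 7) replaced. FILE itself is not modified.
preview_passwd_edit() {
  local file="$1" user="$2" home="$3" shell="$4"
  awk -F: -v OFS=: -v u="$user" -v h="$home" -v s="$shell" \
    '$1 == u { $6 = h; $7 = s } { print }' "$file"
}

# Add PUBKEY to HOME_DIR/.ssh/authorized_keys, creating the directory and file
# with restrictive permissions. A key that is already listed is not added twice.
install_key() {
  local home_dir="$1" pubkey="$2"
  local ssh_dir="$home_dir/.ssh"
  local keys="$ssh_dir/authorized_keys"
  mkdir -p "$ssh_dir"
  chmod 700 "$ssh_dir"
  touch "$keys"
  chmod 600 "$keys"
  grep -qxF -- "$pubkey" "$keys" || printf '%s\n' "$pubkey" >> "$keys"
}
```

Running preview_passwd_edit /etc/passwd www-data /home/www-data /bin/bash shows the result before anything is committed. On a real system, usermod --shell /bin/bash --home /home/www-data www-data is probably a safer way to apply the change than editing by hand. If install_key is run as root, the .ssh directory and its contents still need to be chowned to www-data afterwards.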
rsync
Finally, on the backup server, I added an hourly cron command that ran the following script:
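#!/usr/bin/env bash
# Copy Discourse backups that are not yet in DIR from HOST.
# HOST is the SSH destination and must log in as www-data, for example
# www-data@forum.example.com (a placeholder hostname).
set -xe

HOST="$1"
DIR="$2"

if [ -z "$HOST" ] || [ ! -d "$DIR" ]; then
  echo "Usage: $0 HOST DIR" >&2
  exit 1
fi

# --ignore-existing will have rsync ignore any backups that have already been
# copied.
# --delay-updates stages each transferred file in a temporary directory and
# renames it into place only at the end of the transfer, so only complete
# backups ever make it into $DIR. This matters because --ignore-existing won't
# perform any kind of equality check: a partial backup sitting in $DIR under
# its final name would not be corrected or detected.
rsync --ignore-existing --delay-updates "$HOST:/var/discourse/shared/standalone/backups/default/*" "$DIR"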
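Since --ignore-existing never re-checks a file once it is in place, it may be worth testing the copied archives now and then. Below is a minimal sketch of one way to do that, assuming the backups are gzip-compressed files with names ending in .gz. check_backups is a hypothetical helper, not something the script above calls.

```shell
#!/usr/bin/env bash
# Sketch only: check_backups is a hypothetical helper, not part of the
# original script.

# Run `gzip -t` on every *.gz file in DIR. Prints "corrupt: FILE" for each
# file that fails the test and returns non-zero if any file failed.
check_backups() {
  local dir="$1" status=0 f
  for f in "$dir"/*.gz; do
    [ -e "$f" ] || continue   # the glob matched nothing
    if ! gzip -t -- "$f" 2>/dev/null; then
      echo "corrupt: $f"
      status=1
    fi
  done
  return "$status"
}
```

This only confirms that each file is a complete gzip stream. It does not compare the copy against the original on the forum server.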
Hopefully this proves useful to someone out there.