Proper way to secure/backup Discourse on self-hosted server?

Hi,

On a self-hosted server, what are the best ways to prevent our forum from being gone forever? How do we properly and securely back up our precious?

On a now deleted topic, @falco said:

Doing file system snapshots isn’t supported and may result in data loss.

Also, about the backup feature from Hetzner, the company says:

we recommend that you power down your server to ensure data consistency on the disk.

So I guess that’s not really a recommended solution then… Or is it?

On my forum, I use rclone to sync my local backup folders with a Google Drive folder.

If my server explodes, I have my weekly backups in gdrive.
If my local backups disappear and rclone deletes the backups in drive after syncing my now empty folder, my deleted backups will still be available, as they’ll be in gdrive trash.
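
For the second scenario, here is a minimal sketch of a guard that refuses to sync when the local folder holds no backups, so an empty folder is never mirrored to the remote. It assumes backups are `*.tar.gz` files and that `rclone` is on the PATH; the helper names `has_backups` and `sync_backups` and any remote name you pass are hypothetical, not part of rclone or Discourse.

```python
import subprocess
from pathlib import Path


def has_backups(local_dir, pattern="*.tar.gz"):
    """Return True if the local folder contains at least one backup archive."""
    return any(Path(local_dir).glob(pattern))


def sync_backups(local_dir, remote, dry_run=False):
    """Run `rclone sync`, but refuse when the local folder has no backups."""
    if not has_backups(local_dir):
        raise RuntimeError(f"No backups found in {local_dir}; refusing to sync")
    cmd = ["rclone", "sync", str(local_dir), remote]
    if dry_run:
        # With --dry-run, rclone only reports what it would do.
        cmd.append("--dry-run")
    subprocess.run(cmd, check=True)
```

Running it once with `dry_run=True` is a cheap way to see what would be copied or deleted before trusting it in a cron job.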

So I feel that’s a reasonably good way to secure my forum data.

But is it really? Is there any other reliable solution that is easy to install?
About rclone: it is compatible with a lot of storage systems. Are some of them better choices for storing and syncing our backups?

2 Likes

There will never be a 100% secure way to store data. With that clear, Discourse has a really great backup process that can run on a schedule.

If I didn’t trust a lot of devices and could increase my monthly expenditure, I’d start by offloading the backups to S3 with S3 replication enabled. From there, I’d have a script that copies them to my local machine, and maybe once per month I’d offload everything to an external drive.

With that, you have enough independent copies that they won’t all fail together. S3 reliability is pretty high, and your local machine should be in a pretty good state too, as you use it daily and it hasn’t failed (though it might, and surely sooner than a widespread failure in S3).

Since “secure” here isn’t about information security (encryption, etc.), the best way is having multiple copies in multiple places.

2 Likes

If you remotely sync `/var/discourse/containers` and `/var/discourse/shared/standalone/backups` you'll be good. If your server goes away, you'll need only the container `yml` file(s) and the most recent backup. I recommend daily backups. If you’re especially clever and devoted, you could have some trimming process on your rsync destination that keeps weekly, monthly, and yearly backups.
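
A trimming process like that could be sketched as below. This is only an illustration: the function name `backups_to_keep` and the default counts are assumptions, and it works on backup dates only, leaving the actual file deletion on the rsync destination up to you.

```python
from datetime import date


def backups_to_keep(dates, daily=7, weekly=4, monthly=12, yearly=5):
    """Return the set of backup dates to keep under a tiered retention policy."""
    newest_first = sorted(set(dates), reverse=True)
    keep = set(newest_first[:daily])  # the most recent daily backups

    def keep_newest_per_bucket(bucket_of, limit):
        # Keep the newest backup in each of the `limit` most recent buckets.
        seen = []
        for d in newest_first:
            bucket = bucket_of(d)
            if bucket not in seen:
                if len(seen) == limit:
                    break
                seen.append(bucket)
                keep.add(d)

    keep_newest_per_bucket(lambda d: tuple(d.isocalendar())[:2], weekly)  # ISO year + week
    keep_newest_per_bucket(lambda d: (d.year, d.month), monthly)
    keep_newest_per_bucket(lambda d: d.year, yearly)
    return keep
```

Any backup whose date is not in the returned set is a candidate for deletion on the destination.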

3 Likes

I just wrote this: Best Practices for Backups

7 Likes

See also this:

5 Likes

Back up to Amazon S3, which is automatic and built in.

4 Likes

We have used rsync for years and it works fine for us. We rsync our backups every day to an offsite backup that we control and manage, so if the datacenter meets with disaster, we have all the goods :slight_smile:

Also, when you think about backups and security, keep in mind that IT security consists of three key domains:

  • availability
  • integrity
  • confidentiality

When you back up your data, you need to consider all three of these domains.

If you have a high confidentiality requirement, backing up to third-party solutions (and clouds that belong to others and are not under your strict administrative control) may not be the best option for you.

Security is not one size fits all; it is based on your unique risk management model. This model comprises three key areas as well:

  • threat
  • vulnerability
  • criticality

It is the intersection of these three areas that helps drive your backup and recovery strategy.

  • Some web sites are under threat more than others because of their content or domain (business model), others are not really of interest to bad guys.

  • Some people know how to host securely, install the latest patches, secure their filesystem, etc., so they are less vulnerable than those who are not so knowledgeable (or just lazy) in this area.

  • Some people run very mission critical web sites and forums. If the web site goes down, for example, they might lose a lot of money in a single day (or an hour) or their brand integrity will be tarnished.

  • Others, if the site goes down, maybe only a few people notice or care and no money is lost.

Hence, without making this fun subject into a security tome, you must understand your own risk management requirements based on your unique business model and risk factors, not other people’s risk management models.

One size does not fit all… and this is one of the most important lessons IT people can understand about IT security (but very few actually do understand). Backups and recovery are a key part of the equation.

FWIW: We never trust our backups to any third party (never) and always keep them in a safe location under our technical and administrative control.


As a side story, a friend of mine is one of the world’s top cave divers (explorers). When he dives and explores underwater caves, he has double and triple redundancy (gas, masks, computers, lights, batteries, knives, scooters, and more). I have seen him stage over 40 bottles of gas and carry with him at least two underwater scooters. He knows how to manage risk underwater.

HOWEVER, this same world-famous scuba diving cave explorer never backs up his desktop computer, and often he goes online because his laptop crashed and he lost all his data. He says he does not care if he loses his PowerPoint presentations… so that is his personal risk management strategy. He values his life much more than a few digital files.

Such is life…


So, to answer your question: we have self-hosted for nearly 30 years. We always keep our backups off site using rsync and even sftp on a server we have access to, and we have never had a problem in 30 years of having servers on the Internet. I even have an extra copy in my home network on a little Mac Mini as a private storage device. That is what I consider “secure”… for my risk management model.

4 Likes

Thank you for all this information :+1:t6:

I wonder why I didn’t even mention s3 :thinking: maybe I was unconsciously thinking about free backup methods… Even though I have a gdrive subscription :upside_down_face:

That said, how do I properly estimate the s3 cost regarding Discourse backup storage?
I’m not sure about how to fill the calculator fields:

[image: calculator fields]

In my case, my backups (with uploads) are about 1GB and I would do daily backups with up to 4-7 days of backup retention I guess.

Another thing I didn’t talk about is that I’d like my co-administrator to also have access to the remote backups.
Currently on my gdrive, I shared with him the directory my backups are stored in.
Is it possible to share the access to s3 backups as well?

1 Like

Expect costs of 7GB-months (plus some room to grow) per month with an additional transfer charge every time you need to actually get one of the backups.
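
The 7 GB-months figure is roughly 1 GB per backup times 7 retained daily backups. As a rough sketch of the arithmetic, with the function name `estimate_monthly_cost` being hypothetical and both prices being inputs you must look up yourself (nothing here is a real S3 price):

```python
def estimate_monthly_cost(backup_size_gb, retained_backups, price_per_gb_month,
                          restores_per_month=0, transfer_price_per_gb=0.0):
    """Estimate the monthly bill for storing backups, plus optional restores.

    All prices are caller-supplied assumptions, not actual provider pricing.
    """
    # Storage: every retained backup occupies its size for the whole month.
    storage_gb_months = backup_size_gb * retained_backups
    storage_cost = storage_gb_months * price_per_gb_month
    # Transfer: charged only when a backup is actually downloaded.
    transfer_cost = restores_per_month * backup_size_gb * transfer_price_per_gb
    return storage_cost + transfer_cost
```

For example, `estimate_monthly_cost(1, 7, price)` covers the storage part for seven 1 GB backups; leave some headroom in `backup_size_gb` for the forum to grow.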

1 Like

Sending or retrieving a backup = 1 request?