Multiple backup schedules

Not sure whether a reply here or a new post is warranted: Feature request - separate schedules for ‘include attachments’ versus ‘don’t include attachments’.

Would really love daily small backups, possibly even multiple times per day since that database is small. With attachments, I wouldn’t be devastated to lose a week’s worth, since they’re much more limited and people can typically find their source again. This could increase the safety factor without overwhelming the storage.
I haven’t looked at the source but it would probably be a bit of an overhaul since the restores would have to be separate entities, or at least the ability to have different restore points for the 2 sources.

The suggested action for this request is usually to make the other database backups using an external tool of some sort.

If you move uploads to S3 then you can make database-only backups and not worry about uploads.

Fairly reasonable. As soon as I start considering external tools I think of ‘external ways to really mess things up’, since they’re typically capable of more havoc than the built-in idiot-proof(ish) admin console.

The last time I and others asked the same thing, the answers were the same as before:

  • one daily backup is enough
  • use external tools, like scripts and cron

Well, one daily backup of the database is not enough; hourly could come close.

Any external tool can do the job, that’s true. But every other app offers decent backups natively, and Discourse does not.

I really would like to know if the reason for that is

  • “we just don’t want it, so nobody else needs it either”
  • it is technically really hard and/or expensive

Well, and there is always a third option: #marketplace

If you want lots of backups with WordPress (a popular web platform) you need to use a backup plugin that costs money, so maybe not every other app does that natively. At least that’s what I’m doing, though it was a long time ago that I made that decision, so maybe it was a bad one.

The reason is that having lots of backups is a way to fill up your disk, which is one of the most common reasons that a self-hosted site goes down (which I think puts it in the “expensive” category). So the idea is that if you have enough skills to manage a zillion backups and manage your disk space then you can probably work this out in any of a bunch of ways. And if you want hourly backups, then you need to have those be database-only backups rather than dozens or hundreds of copies of your uploads.
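As a rough illustration of the disk-space side of this, here is a minimal Python sketch that keeps only the newest few backup files in a directory and deletes the rest. The function name `prune_backups`, the `*.dump` file pattern, and the default of 24 files are assumptions made for the example, not anything Discourse ships.

```python
from pathlib import Path


def prune_backups(directory, keep=24, pattern="*.dump"):
    """Delete all but the `keep` newest files matching `pattern`.

    Returns the names of the deleted files. The pattern and the
    default retention count are illustrative assumptions.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    # Sort newest first by modification time.
    backups = sorted(
        Path(directory).glob(pattern),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    deleted = []
    for old in backups[keep:]:
        old.unlink()
        deleted.append(old.name)
    return deleted
```

Something like this would have to run after every backup, otherwise hourly dumps pile up until the disk is full.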

So hourly backups make sense only if you have uploads on S3 so that you could then do database-only backups, and probably push those to S3 so you’re not worried about your local disk. And then that’s a pretty small number of self-hosted sites that want that.
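For anyone going the external-tool route, a database-only dump could be sketched roughly like this in Python. It assumes `pg_dump` is on the PATH and can reach the database from wherever the script runs (in a standard Docker install that likely means inside the container), and that the database is named `discourse`; the function names and the file naming scheme are made up for the example.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def dump_command(dbname, out_dir, now=None):
    """Build a pg_dump command line and the timestamped target path."""
    now = now or datetime.now(timezone.utc)
    # Hypothetical naming scheme: <dbname>-<UTC timestamp>.dump
    target = Path(out_dir) / f"{dbname}-{now:%Y%m%dT%H%M%SZ}.dump"
    # --format=custom writes a compressed archive meant for pg_restore.
    cmd = ["pg_dump", "--format=custom", f"--file={target}", dbname]
    return cmd, target


def run_dump(dbname, out_dir):
    """Run pg_dump and return the path of the file it wrote."""
    cmd, target = dump_command(dbname, out_dir)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return target
```

Calling `run_dump("discourse", "/some/backup/dir")` from an hourly cron job would give hourly database-only dumps. Note that a dump like this is not in the format the admin console produces, so it would need to be restored with `pg_restore` rather than through the built-in restore.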

If you have all of that in place, then a plugin that would do hourly database-only backups wouldn’t be more than an hour or two of work, or maybe 2-10 if you don’t know how to make a plugin and have to figure out how to make an hourly job.

That’s true. WordPress itself can’t do much. That’s why there are so many plugins, bad and good ones.

That’s not true. Only the extras cost money, not the backups themselves.

Of course, there is no point in backing up files, or the system itself, that often. The database is a totally different ball game. It should be backed up at least every 15 minutes if there is more traffic.

The question is really simple: how much content can you afford to lose?

If the maximum data loss you can afford is this small then you might consider using a Postgres replication solution instead of taking backups so frequently.

Is there some other data I’m not aware of? Or do you mean data in the broader sense, including all files: system, Docker/Discourse/etc., uploads?

Those files can be retrieved easily or at small cost (well, except uploaded ones, but that’s why we have S3 :wink:).

No, I meant database data.

Then it is mostly small in size, but it is the biggest thing if we think about the forum itself. But for some reason we are back to this:

I would really like to hear whether the main issue here is technical or one of attitude. Or is this actually part of the business model: if backups become easy and reliable, hosting loses one sales pitch (and I don’t know whether there is any such sales pitch at all). I’m just trying to understand why better backups are such a big question, even though they have been requested before.

There is no need to suspect that there is a strategy or evil plan behind this. I don’t think there is much interest for more frequent backups. If there was, someone would have written a plugin for it. That would be a few hours of work. I don’t see #marketplace flooded with requests for it.

I think it boils down to:

  • for small forums, you won’t lose much data anyway because there is not much new in a day, so more frequent backups are not really worth the effort
  • for larger forums, more frequent backups will take performance and storage
  • for really large forums, you want to look at different solutions (like replication to a hot standby database server)

Don’t forget that the odds of actually needing a backup are really small as well. In the history of Communiteq (over 8 years), we only needed to restore a backup once* and that was only because we were impatient and did not want to wait a few hours for a file system recovery.

*) (excluding restores on request of the client where they just wanted to roll back changes, mostly in non-production forums, and excluding our monthly restore test)

So if I search here, I won’t find any topics where Discourse crashed for some reason and the only solution was restoring from a backup?

But nice to hear that Discourse is so solid it doesn’t need an up-to-date backup. Well, we know that isn’t totally true.

So we have come full circle and are back where I started:

And the third option. As long as I’m de facto doing some testing for Discourse, here and on my own, I’m not willing to pay for such basic functionality.

Well, we are talking here without any comments from the team. Again :rofl:

How much do you think it would cost to develop? Depending on the price, I’d be willing to fund it right now if you’re interested. :dollar:

That issue has already been addressed by CDCK. Just give them some time to respond, that’s all.

This is the way. Please follow Running Discourse with a separate PostgreSQL server to run your own PostgreSQL instance and handle backup and high availability as you see fit if the daily backup is not enough for your use case.

I’m thinking $250-500, depending on how configurable it is and how much work is needed on the front end. I haven’t actually looked at what it would take, though. @RGJ might do it for less; he’s often surprised me with how quickly he can do things.

EDIT: Oh, this topic is closed. If you’re interested, you can contact me or post in #marketplace.