Maximum backup setting deletes objects on S3 too?


(Jesse Perry) #1

Perhaps this is intended, but it’s not what I expected, so I wanted to open it up for discussion.

My setup is that I have “Maximum backups” set to 15 days. I also have the S3 backup bucket enabled. That’s all working great.

My intention was that with this setup, after 15 days Discourse would delete a 16-day-old backup from my Discourse server. In reality, it deletes that backup from my Discourse server and from S3 as well.

I propose that instead it only deletes the backup from the Discourse server and leaves the S3 data untouched. Why? Because AWS/S3 has its own lifecycle management settings. That way I can offload historical data to S3, manage it there however I want for less money, and keep only the latest backups on my Discourse server.

(Dean Taylor) #2

Personally I:

  • turn “versioning” on for the bucket, so it doesn’t matter if Discourse deletes the backup;
  • reduce the maximum backups to a smaller number;
  • then (as you have mentioned) use Amazon’s lifecycle management to move backups to Glacier or remove them entirely after a set period.

As with anything backup related, always check that Glacier and versioning are working as you expect; some lifecycle management options can leave you with just delete markers and no file content.
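For anyone wanting to try this, here is a rough sketch of what such a lifecycle configuration might look like for a bucket with versioning turned on. It only builds the configuration and prints it as JSON; the function name, the rule ID, and the 7/60-day values are my own assumptions for illustration, not anything Discourse or AWS prescribes. In a versioned bucket a delete by Discourse leaves a delete marker and turns the backup into a noncurrent version, so the rule targets noncurrent versions and also cleans up delete markers once nothing is left behind them.

```python
import json


def versioned_backup_lifecycle(to_glacier_days=7, expire_days=60, prefix=""):
    """Build an S3 lifecycle configuration for a versioned backup bucket.

    All names and default day counts here are illustrative assumptions.
    """
    if expire_days <= to_glacier_days:
        raise ValueError("expire_days must be greater than to_glacier_days")
    return {
        "Rules": [
            {
                "ID": "archive-then-expire-deleted-backups",  # hypothetical ID
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix = whole bucket
                # Backups deleted by the client become noncurrent versions;
                # move them to Glacier after a while...
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": to_glacier_days, "StorageClass": "GLACIER"}
                ],
                # ...and remove them permanently later on.
                "NoncurrentVersionExpiration": {"NoncurrentDays": expire_days},
                # Remove delete markers that no longer have any versions
                # behind them, so they do not pile up.
                "Expiration": {"ExpiredObjectDeleteMarker": True},
            }
        ]
    }


if __name__ == "__main__":
    # The printed JSON has the shape the S3 lifecycle API expects, e.g. for
    # boto3's put_bucket_lifecycle_configuration(LifecycleConfiguration=...).
    print(json.dumps(versioned_backup_lifecycle(), indent=2))
```

Whatever values you pick, do a test restore from an archived version before relying on it.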

(Chris Saenz) #3

I would like this too… DeanMarkTaylor’s option above does look promising.

(Jesse Perry) #4

I didn’t realize that Versioning keeps records of deleted items too. Thanks! I’ll set that up.

(ljpp) #5

This is rather illogical behavior by Discourse. The S3 backups should be exclusively managed by the S3 life-cycle.

(Robin Ward) #6

The truth is I did not actually know about that feature of S3. Can you explain it in detail and how we could use it instead of our approach? That would help us get closer to a patch.

(Dean Taylor) #7

I would suggest adding an option to tell Discourse not to delete previous backups uploaded to AWS.

This way life-cycle management is easier to configure on AWS; currently, files being deleted by Discourse can leave you with just delete markers and no file content.

It would perhaps also avoid needing to give “delete” permission on the backup S3 bucket, which is kinda scary from a backup point of view.
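To illustrate the permission point, a policy for the backup user without any delete action could look roughly like the sketch below. The bucket name is a placeholder, and the exact set of actions Discourse needs is an assumption on my part (I have not verified it against Discourse itself), so treat this as a starting point only.

```python
import json


def backup_policy_without_delete(bucket):
    """Build an IAM policy document that allows upload, download and listing
    on a backup bucket but grants no delete action.

    The action list is an assumption, not a verified Discourse requirement.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Object-level actions: write new backups, read them back.
                "Action": ["s3:PutObject", "s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                "Effect": "Allow",
                # Bucket-level action: list the stored backups.
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


if __name__ == "__main__":
    # "my-discourse-backups" is a hypothetical bucket name.
    print(json.dumps(backup_policy_without_delete("my-discourse-backups"), indent=2))
```

With a policy like this, removal of old backups is left entirely to the bucket’s own lifecycle rules.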

(ljpp) #8

@eviltrout The easy approach from your development point of view would simply be not to delete any S3-stored backups, as life-cycle management is an essential part of S3’s core functionality. An on/off tick box is an obvious option, but I’m not sure it is needed; I would drop the whole deletion from the Discourse/client side.

Personally I keep one week’s worth of daily backups on my server and S3; older ones are moved to Glacier, then deleted after a couple of months. And that is most likely serious overkill, as you are likely to use one of the most recent backups in case of a failure.
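A schedule like that could be written down roughly as follows. This is only a sketch that builds the lifecycle configuration as a Python dict and prints it as JSON; the function name and rule ID are made up, “a couple of months” is taken to mean 60 days, and it assumes a bucket without versioning where the client no longer deletes anything.

```python
import json


def aging_backup_lifecycle(days_on_s3=7, days_until_delete=60, prefix=""):
    """Build a lifecycle configuration: keep backups in S3 for a week,
    move them to Glacier, then delete them.

    Day counts and names are illustrative assumptions.
    """
    # The expiration has to come after the transition.
    if days_until_delete <= days_on_s3:
        raise ValueError("days_until_delete must be greater than days_on_s3")
    return {
        "Rules": [
            {
                "ID": "week-on-s3-then-glacier-then-delete",  # hypothetical ID
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix = whole bucket
                # After a week, move the backup to Glacier.
                "Transitions": [
                    {"Days": days_on_s3, "StorageClass": "GLACIER"}
                ],
                # After roughly two months, delete it.
                "Expiration": {"Days": days_until_delete},
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(aging_backup_lifecycle(), indent=2))
```

Day counts are measured from when each backup object was created, so daily backups age out one per day.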

(Robin Ward) #9

Cool, thanks for filling me in. I’ve added a new site setting that will allow you to disable removal of backups from S3. Just check the box, install your own lifecycle policy and you should be good to go :slightly_smiling:

edit: the setting is called s3 disable cleanup

(Robin Ward) closed #10