Automatic Backups on Backblaze B2
here’s how i have it set up for a hypothetical site hosted on example.com
- make an account on backblaze (atm, no need to enter payment for <10GB which is free)
- create a bucket (backblaze > B2 Cloud Storage)
  - name: $sitename-discourse-$random, padded to 30 char
    - in this example: example-discourse-g87he56ht8vg
    - discourse needs bucket name to be lowercase letters, numbers, and dashes only
    - i suggest keeping it 30 char or less since that shows up nicely in backblaze’s webui without wrapping
  - private bucket
  - enable encryption (SSE-B2)
  - enable object lock
- create an application key (backblaze > account > app keys)
  - keyName: example-discourse
  - bucketName (Allow access to Bucket(s)): example-discourse-g87he56ht8vg
  - capabilities: read and write
  - leave namePrefix and validDurationSeconds blank
- configure discourse B2 settings (discourse > admin > settings)
  - backup_location: s3
  - s3_backup_bucket: example-discourse-g87he56ht8vg
  - s3_endpoint: this is shown on the bucket page (make sure to prepend with https://)
  - s3_access_key_id: (from previous step)
  - s3_secret_access_key: (from previous step)
    - backblaze only shows you the key once (at creation)!
  - btw, you can also set these as env vars in your container yml instead. this would let you restore with only that file and nothing else:
env:
  ## Backblaze B2 Backups
  # DISCOURSE_BACKUP_LOCATION: 's3' # uncomment to recover from cli
  DISCOURSE_S3_ENDPOINT: 'https://....backblazeb2.com'
  DISCOURSE_S3_BACKUP_BUCKET: 'example-discourse-g87he56ht8vg'
  DISCOURSE_S3_ACCESS_KEY_ID: '...'
  DISCOURSE_S3_SECRET_ACCESS_KEY: '...'
  # DISCOURSE_DISABLE_EMAILS: 'non-staff' # uncomment to disable email during a test restore
  ## you can restore with no data beyond this container yml.
  ## uncomment DISCOURSE_BACKUP_LOCATION above, build container (./launcher rebuild ...),
  ## and then run this inside container (it will restore from B2 bucket):
  ##   discourse enable_restore
  ##   discourse restore <example-com-...tar.gz> # choose restore filename by browsing B2 webui
  ## remember to disable restore afterwards
- configure backup retention
  - discourse:
    - backup_frequency: 1 (daily backups in this example, but you could do weekly)
    - maximum_backups: disregard this setting, let backblaze handle it
    - s3_disable_cleanup: true (Prevent removal of old backups from S3 when there are more backups than the maximum allowed)
  - backblaze (go to your bucket’s settings):
    - Object Lock (Default Retention Policy): 7 days
    - Lifecycle Settings (custom):
      - fileNamePrefix: default/example-com (optional)
      - daysFromUploadingToHiding: 8 days
        - this should be object lock + 1
      - daysFromHidingToDeleting: 1 day
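if you’d rather keep the lifecycle rule written down somewhere than click through the webui, here’s a minimal python sketch that builds the same rule as data. the three field names are the ones from the lifecycle settings above; the helper name build_lifecycle_rule is made up for illustration, and how you actually submit the rule (webui, cli, api) is up to you.

```python
import json

# object lock default retention from the bucket settings above
OBJECT_LOCK_DAYS = 7


def build_lifecycle_rule(lock_days, prefix="default/example-com"):
    """Build one lifecycle rule as a plain dict (hypothetical helper).

    Hiding is scheduled one day after the object lock expires, and
    deletion one day after hiding, as in the example settings.
    """
    return {
        "fileNamePrefix": prefix,
        "daysFromUploadingToHiding": lock_days + 1,
        "daysFromHidingToDeleting": 1,
    }


rule = build_lifecycle_rule(OBJECT_LOCK_DAYS)

# print the rule list as JSON so it can be compared with the bucket settings
print(json.dumps([rule], indent=2))
```

deriving daysFromUploadingToHiding from the lock period keeps the two numbers from drifting apart if you change the lock later.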
to summarize retention in this example:
- discourse creates backups every 1 day
- each backup file is immutable for 7 days after upload to B2 (object lock). this protects you against accidents, ransomware, etc.
- 7 days after upload, the object lock on the backup expires. since it’s mutable again, the lifecycle rule can hide the backup file at 8 days
- the next part of the lifecycle rule deletes any file 1 day after it’s hidden
so you get daily backups. for one week after upload, a backup can’t be deleted no matter what. it’s then hidden on day 8 and deleted on day 9, so really a backup lives for 9 days or so.
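the arithmetic above as a small python sketch, in case you want to plug in your own numbers. the upload date is arbitrary and the function name backup_timeline is made up; treat the dates as approximate, since the exact moment B2 applies a lifecycle rule isn’t something this models.

```python
from datetime import date, timedelta

# values from the example settings in this post
OBJECT_LOCK_DAYS = 7
DAYS_UPLOAD_TO_HIDE = 8
DAYS_HIDE_TO_DELETE = 1


def backup_timeline(uploaded):
    """Return the key dates in one backup file's life (hypothetical helper)."""
    lock_expires = uploaded + timedelta(days=OBJECT_LOCK_DAYS)
    hidden = uploaded + timedelta(days=DAYS_UPLOAD_TO_HIDE)
    deleted = hidden + timedelta(days=DAYS_HIDE_TO_DELETE)
    return {
        "uploaded": uploaded,
        "lock_expires": lock_expires,
        "hidden": hidden,
        "deleted": deleted,
    }


timeline = backup_timeline(date(2024, 1, 1))  # arbitrary example date
for event, day in timeline.items():
    print(f"{event:>12}: {day.isoformat()}")

# total lifetime of the file in days
lifetime_days = (timeline["deleted"] - timeline["uploaded"]).days
print("lifetime in days:", lifetime_days)
```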
hope that helps someone
on second thought, maybe it’s better to let discourse handle retention (maximum_backups). that way, your backups won’t automatically start expiring if discourse is down. you wouldn’t want a clock ticking on them while trying to recover. if you went that way, you could set maximum_backups=8 and s3_disable_cleanup=false in this example and not use a lifecycle policy in B2. you would still use the object lock policy (7 days), though.
edit: actually, i think you do still need a B2 lifecycle policy because i think files only get ‘hidden’ and not deleted when an S3 client deletes them. i’m using the “Keep only the last version of the file” policy, which is equivalent to daysFromHidingToDeleting=1, daysFromUploadingToHiding=null.
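a rough python sketch of why maximum_backups=8 pairs with a 7 day object lock. this is a simplified model and an assumption on my part, not something confirmed from discourse’s code: it assumes one backup per day, and that right after each new backup the oldest files beyond maximum_backups get removed. the function name deletion_ages is made up.

```python
OBJECT_LOCK_DAYS = 7  # object lock retention from the example


def deletion_ages(maximum_backups, days=30):
    """Simulate daily backups with pruning (simplified, assumed model).

    Returns the age in days of each backup at the moment it is pruned.
    """
    kept = []  # upload day of each backup still stored, oldest first
    ages = []
    for today in range(days):
        kept.append(today)  # today's backup is uploaded
        while len(kept) > maximum_backups:
            oldest = kept.pop(0)  # prune the oldest backup
            ages.append(today - oldest)
    return ages


for maximum_backups in (7, 8):
    youngest = min(deletion_ages(maximum_backups))
    # only count the lock as cleared when the age is strictly past it
    clear_of_lock = youngest > OBJECT_LOCK_DAYS
    print(f"maximum_backups={maximum_backups}: "
          f"pruned at age {youngest} days, clear of lock: {clear_of_lock}")
```

in this model a setting of 7 would try to remove a backup right as its lock expires, while 8 leaves a day of margin.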
i guess think it over and decide which approach is right for you.
btw, i realize there’s some back and forth in this post. i think it’s informative as-is, but if someone wants, i could make another slightly simpler post with my actual recommendations.