Whenever I do an /admin/upgrade I see a lot of these throughout:
When it starts:
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Stopping 3 Unicorn worker(s), to free up memory
And then this towards the end:
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
Waiting for Unicorn to reload
I know that it’s normal, but I seem to see more of them than I used to.
Instead of doing that, I just changed it from the Settings and reloaded the page, and the Dashboard stats work again - but I can’t rely on that, as I need S3 backups… BUT this is a good workaround so that we can check stats until the full fix is in.
I’m not sure what’s actually causing the timeout…
How many objects are in your backup bucket? Does Admin -> Backups load without problems when backup_location is set to S3?
Our /admin/backups page (or rather the underlying request for .backups.json) takes about 50 seconds to load each time, so it looks like the culprit behind the dashboard timing out.
There are six entries in the ‘filename’ list for us, as we only keep that many backups rolling in S3 over 7 days.
At a guess, each call to S3 for a file listing or file size is taking about 10 seconds per entry. Perhaps something changed in the AWS S3 client library recently?
Hmm, I wonder if it’s because we store our backups and uploaded files/images in the same bucket? I can see S3 enumerating the bucket listing and taking a long time due to all the root-level keys.
Let me try giving it a separate, clean bucket of its own and seeing if that works better.
Yes, that was it, problem solved. I created a clean new S3 bucket for the backup location and now the backups index loads quickly.
Problem Summary: With the introduction of direct backups to S3 via the new Backup Location feature, listing the objects at the ‘S3 backup location’ setting can take a long time and cause the /admin and /admin/backups pages to time out (over 30 seconds). This didn’t matter before, because the local file system was used as an interim storage location, but with a direct S3 backup the backups index needs to list every S3 object it finds.
Solution: Choose an S3 backup bucket with fewer existing objects in it, i.e. do not share your S3 uploads bucket with your Backup Location bucket.
Not sure if this is helpful, but one potential tweak would be to store the S3 backups under a ‘discoursebackup/’ prefix and pass that prefix to the object list call to AWS. That way the listing is filtered server-side and returns quickly, without needing a near-empty bucket:
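A minimal sketch of what that prefix filter could look like with the aws-sdk-s3 gem (the bucket name, region, and prefix here are just illustrative):

    require "aws-sdk-s3"

    s3 = Aws::S3::Client.new(region: "us-east-1")

    # Only keys under the backup prefix are enumerated, so upload objects
    # sitting in the same bucket never enter the listing.
    resp = s3.list_objects_v2(bucket: "my-discourse-bucket", prefix: "discoursebackup/")
    resp.contents.each { |obj| puts "#{obj.key} (#{obj.size} bytes)" }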
I think it’s just an unlucky edge case, Jeff, in that:
The S3 backups go in the root of the bucket chosen for backups, while ideally a location could be chosen within the bucket (a prefix for the S3 object keys).
Some Discourse stand-alone installs will have the same bucket specified for their S3 uploads as well, and the nature of that uploads storage hierarchy means there are 230+ existing root objects in it (the hashing of upload locations uses many root prefixes, or ‘directories’ in filesystem terms).
The AWS S3 listObjects call is notoriously slow without some sort of prefix filter.
The new Backup Location gets its info directly from S3 now, which is why this became an issue when it wasn’t before.
I doubt there are many like us who kept both backups and uploads in the same bucket, and it was super easy to just create a new S3 backup bucket anyway (which can then have its own versioning, lifecycle rules, Glacier, etc.), which is probably the most sensible deployment.
s3_upload_bucket and s3_backup_bucket support prefixes. You can use my-bucket-for-discourse-stuff/backups without problems.
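For example, a minimal sketch of pointing both settings at one bucket but separate prefixes, assuming they are set from the Rails console (the bucket name comes from above; the prefix names are just illustrative):

    # Both settings accept a "bucket/prefix" value, so uploads and backups
    # can share a bucket without their object listings overlapping.
    SiteSetting.s3_upload_bucket = "my-bucket-for-discourse-stuff/uploads"
    SiteSetting.s3_backup_bucket = "my-bucket-for-discourse-stuff/backups"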
As you have experienced yourself, putting everything into the same bucket without using a prefix is a really bad idea. The solution is to either use the bucket exclusively for backups or to use a prefix for backups.
Aside from this issue, you’re effectively combining data you publish to the web with backups of everything, including personally identifying user data.
With all the work done to secure the download of backups, I would be quite open to the idea of Discourse explicitly blocking the same bucket from being used in both cases.
You should put that in the help text for the setting? At the moment it only mentions the bucket name, implying the opposite: “The Amazon S3 bucket name that files will be uploaded into. WARNING: must be lowercase, no periods, no underscores.”
I’m glad I figured out why our (and others’) Discourse dashboards were broken for so long.
Let me know when the fix is in so that listing the backups no longer breaks the dashboard view.
AWS S3 bucket boundaries are no more or less secure than S3 object ACLs. If the backup files are not marked public, there is no more or less security risk than using a different bucket with different ACLs.
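A minimal sketch (bucket and key names hypothetical) of checking whether a given backup object is world-readable with the aws-sdk-s3 gem:

    require "aws-sdk-s3"

    s3 = Aws::S3::Client.new(region: "us-east-1")
    acl = s3.get_object_acl(bucket: "my-discourse-bucket",
                            key: "discoursebackup/site-backup.tar.gz")

    # The AllUsers group grantee is what makes an object publicly readable.
    public_read = acl.grants.any? do |grant|
      grant.grantee.type == "Group" &&
        grant.grantee.uri == "http://acs.amazonaws.com/groups/global/AllUsers" &&
        %w[READ FULL_CONTROL].include?(grant.permission)
    end

    puts public_read ? "backup object is publicly readable!" : "backup object is private"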