Don’t see uploaded images – 404ing and uploads tar.gz does not show them

(Jane Jojo) #1

This was originally posted under a different issue.

While exploring it, I downloaded the backups, both from S3 and from the local server, and found that neither the “1x” nor the “optimized” folder has all the images uploaded to the forum. That scares me. Here’s a screenshot of the folder; I know for sure that a lot more images were uploaded.

Any idea why I am missing them? What can I do to recover?


As I dig further into the issue, I realize that almost all of my post images are returning 404s. Here’s another example of a post with images uploaded 19 days ago:

Edit II:

I version all my buckets according to this guideline:

Access denied errors from Amazon S3
Find all posts and replies that have images inserted/attached
(Rishabh Nambiar) #2

Did you try this step? It might help recover archived items (not expired ones).

After the Glacier expiry date, AWS will permanently delete your files.
The kind of versioning policy you followed makes a lot of sense for backups but not for images.

Did you apply this policy very recently?

(Jane Jojo) #3

I set this policy 4-5 months ago. The surprising part is that the images aren’t there even in the local backup tar.gz (the one made before it’s uploaded).

I have made a video of the versioning policy

Did I screw up? Is there a way to recover these images?

Also, when I checked the S3 folders to initiate the restore, I couldn’t find the files.

(Rishabh Nambiar) #4

Which is close to the 15-day backup + 90-day expiry. I think that you have probably deleted those images.

In the video, you can see that some files are marked as Glacier class storage. You can select all and initiate a restore so the few images that have not expired yet can be moved back to S3.
The images that have crossed the 90-day expiry cannot be restored :sweat:
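
To make the arithmetic concrete, here is a rough Python sketch of which state an image would be in at a given age. The two constants are assumptions read off the "15 day + 90 day" figures above, not values taken from your actual bucket policy, and `lifecycle_state` is just an illustrative name.

```python
from datetime import date

# Assumed lifecycle values, based only on the figures mentioned in this thread.
TRANSITION_DAYS = 15  # assumption: age at which an object moves to Glacier
EXPIRY_DAYS = 90      # assumption: days spent in Glacier before permanent deletion


def lifecycle_state(uploaded: date, today: date) -> str:
    """Return "s3", "glacier" or "expired" for an object of the given age."""
    age = (today - uploaded).days
    if age < TRANSITION_DAYS:
        return "s3"        # still served normally
    if age < TRANSITION_DAYS + EXPIRY_DAYS:
        return "glacier"   # archived, can still be restored
    return "expired"       # past the expiry window, gone for good
```

Under those assumptions, an image uploaded 19 days ago would already be in Glacier, which would explain a 404 on something that is still restorable.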

I’m not sure about this. I will check, but maybe another team member can help.

(Jane Jojo) #5

Thanks so much.

What do I add here for the number of days?

(Rishabh Nambiar) #6

For now you could set it to anything, but remember that this restoration is only temporary.
After this time period, the restored data will be moved back to Glacier again.
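
If you would rather script the restore than click through the console, something along these lines might work. It is only a sketch: it assumes the `boto3` library and its `restore_object` call, and `request_restore`, the bucket name and the key in the comment are placeholders I made up. The `days` argument is the same "number of days" the console asks for.

```python
def request_restore(s3_client, bucket: str, key: str, days: int = 7) -> dict:
    """Ask S3 for a temporary copy of one Glacier object, kept for `days` days.

    `s3_client` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    return s3_client.restore_object(
        Bucket=bucket,
        Key=key,
        RestoreRequest={
            "Days": days,  # how long the temporary copy stays available
            "GlacierJobParameters": {"Tier": "Standard"},
        },
    )


# Hypothetical usage (bucket and key are placeholders):
#   import boto3
#   request_restore(boto3.client("s3"), "my-forum-uploads", "original/1X/example.png", days=7)
```

The restore is not instant, so the object will not be readable right after the call returns.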

One of these solutions can help:

Maybe you could restore the data, download everything that’s left, and then re-upload it to S3 after removing the expiration policies. I’m not 100% sure this will work, but either way the file names and folder structure must not change.
@zogstrip Will this work?
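
A rough sketch of that download and re-upload idea, in case it helps. It assumes a `boto3` S3 client is passed in and that you already have the list of restored keys; `download_all` and `reupload_all` are names I made up. The point is only that each file goes back under exactly the key it came from.

```python
from pathlib import Path


def download_all(s3_client, bucket: str, keys, download_root: str) -> None:
    """Download each key into download_root, mirroring the key's folder structure."""
    for key in keys:
        target = Path(download_root).joinpath(*key.split("/"))
        target.parent.mkdir(parents=True, exist_ok=True)
        s3_client.download_file(Bucket=bucket, Key=key, Filename=str(target))


def reupload_all(s3_client, bucket: str, download_root: str) -> list:
    """Upload every file under download_root back to the key it was downloaded from."""
    root = Path(download_root)
    uploaded = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # The relative path with forward slashes is the original S3 key.
            key = path.relative_to(root).as_posix()
            s3_client.upload_file(Filename=str(path), Bucket=bucket, Key=key)
            uploaded.append(key)
    return uploaded
```

The expiration policies would need to be removed between the two steps, otherwise the re-uploaded files would presumably be archived and expired again.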

(Régis Hanol) #7

That will work as long as the associated Upload and PostUpload records are present in the database.

Also, since uploads are stored externally (i.e. on S3), they aren’t backed up via the Discourse backup feature, since that would mean downloading all the files from S3 before taking the backup. That would often result in a full disk…

(Jane Jojo) #8

@zogstrip @rishabhn How can I check if I can salvage the images on my machine/VM/docker (if there are any)?

(Régis Hanol) #9

The uploaded images should be in /var/www/discourse/public/uploads/default/.
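
For a quick look at what is actually there, a small script like this could count the files under that directory by extension. It is a sketch only: `count_uploads` is a made-up name, and it assumes you run it somewhere that path is visible (for a Docker install, that probably means inside the container).

```python
from collections import Counter
from pathlib import Path


def count_uploads(root: str) -> Counter:
    """Count files under `root`, grouped by lowercase file extension."""
    counts = Counter()
    base = Path(root)
    if not base.is_dir():
        return counts  # nothing to scan, e.g. the path does not exist here
    for path in base.rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(no extension)"] += 1
    return counts


# Hypothetical usage, with the path from the post above:
#   print(count_uploads("/var/www/discourse/public/uploads/default/"))
```

An empty result would suggest there are no local copies left to salvage.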