Dear all,
after having searched the forum to the best of my abilities without finding an answer that solves it, I’m asking for support with an odd situation that arose after a recent change of Digital Ocean datacenter.
So, we had all our uploads stored in a Digital Ocean Spaces bucket, in the ams3 datacenter.
After 2 huge HW issues and the consequent service disruptions in a little more than a month, last weekend we decided to move all our files to the fra1 datacenter.
Here are the steps I followed:
- In preparation for the transfer, I uploaded all the files we had on ams3 (the 3 classic directories: original, optimized and tombstone) to the new bucket on fra1 using s3cmd.
- I went to the forum settings and set the new endpoint for attachments, CDN and backup bucket.
- I launched a full post rebake, expecting it to fix all things in one go.
Unfortunately this was not the case. Most of the attachments were “ported” correctly, but a few hundred were not. It’s not clear to me what happened, but these missing attachments were moved to the tombstone directory.
I thought that launching the rake task `rake uploads:recover_from_tombstone` would take care of that, but nope. The files are seen, but at the end of the task no attachments are recovered and the images are still not visible in posts.
I started to dig a bit deeper and found out that running `UploadRecovery.new(dry_run: true).recover` (found by digging in meta) in the rails console gave me precious information, such as the post URL as well as the short or long URL of the problematic image.
For the URLs returned in the short form, I wrote a bit of Python code to “translate” the short upload filename into the long form, so that I could go and check for the presence of the file in the bucket.
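A minimal sketch of that translation, in case it helps someone reproduce the check. It rests on two assumptions that should be verified against your own install: that the short form looks like `upload://<base62>.<ext>` with the base62 part encoding the file's SHA1, and that the base62 alphabet is digits, then lowercase, then uppercase. The function names are mine and purely illustrative.

```python
# Assumption: digits, then lowercase, then uppercase. If your results do not
# match any file in the bucket, try swapping the two letter ranges.
BASE62_ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base62_decode(text, alphabet=BASE62_ALPHABET):
    """Decode a base62 string into an integer (raises ValueError on bad characters)."""
    value = 0
    for ch in text:
        value = value * 62 + alphabet.index(ch)
    return value


def short_url_to_sha1(short_url):
    """Turn 'upload://<base62>.<ext>' into a 40-character hex SHA1 string."""
    prefix = "upload://"
    name = short_url[len(prefix):] if short_url.startswith(prefix) else short_url
    stem = name.split(".", 1)[0]  # drop the file extension
    # The SHA1 is 160 bits, so pad the hex form with leading zeros to 40 characters.
    return format(base62_decode(stem), "x").rjust(40, "0")
```

The resulting SHA1 is what appears as the file name inside the bucket directories, so it can be searched for in a bucket listing without having to rebuild the full path.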
I did, and I can confirm all the missing files are there, in the new bucket as well as in the old one. Some of the missing uploads I found sitting in the `tombstone` directory, as expected, but some others are oddly still in the `original` directory. The files are not corrupted. If I access them by URL they open correctly in both datacenters, and if I download them locally to my Linux box I can open them with no errors.
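For reference, the presence check can be sketched roughly like this: given the text output of `s3cmd ls --recursive s3://<bucket>` and a list of SHA1s, report under which top-level directory each file sits. The function name and the sample bucket name are hypothetical, and the parsing assumes every object line ends with an `s3://bucket/dir/.../<sha1>.<ext>` URL.

```python
from collections import defaultdict


def locate_uploads(listing, sha1s):
    """Map each SHA1 to the sorted top-level directories holding a file named after it."""
    wanted = set(sha1s)
    found = defaultdict(set)
    for line in listing.splitlines():
        # Keep only the part after "s3://", i.e. "bucket/dir/.../<sha1>.<ext>".
        _, sep, url_path = line.partition("s3://")
        if not sep:
            continue  # blank or unrelated line
        parts = url_path.split("/")
        if len(parts) < 3:
            continue  # no directory component below the bucket
        top_dir = parts[1]
        stem = parts[-1].split(".", 1)[0]  # file name without extension
        if stem in wanted:
            found[stem].add(top_dir)
    # SHA1s that never showed up map to an empty list.
    return {sha1: sorted(found[sha1]) if sha1 in found else [] for sha1 in sha1s}
```

An empty list for a SHA1 means no file with that name was found anywhere in the listing; `['tombstone']` or `['original']` shows where the file currently lives.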
Somehow the upload recovery process fails to pick them up and fix whatever is messed up in the DB.
So my questions are:
- Is there a way to understand why, even though the upload files are in `tombstone` (or in `original`), the rake task fails to recover them?
- What would be the correct set of steps to ensure that, in case of a bucket change or even a transition from DO to another AWS-compatible environment, all attachments are moved and prepared correctly for the swap? More generally, what should one do, step by step, in such a case? Clearly a simple rebake is not enough.
- What does the task `posts:invalidate_broken_images` do? I mean, what does “invalidate” mean?
Thanks in advance. I have been struggling with this for a week and I really need to put this to rest or I will go crazy.
FYI, the suggestion to re-upload all the 800+ attachments by hand is not considered a valid answer. There must be an algorithmic reason…