Summary
I believe the uploads:recover_from_tombstone script is not restoring everything it should from the tombstone. My reasoning is as follows:
Recently, our Discourse forum had a problem where uploaded images would no longer appear:
And here is an example topic illustrating the issue.
Description
The problem began around December 30. I investigated yesterday, and events unfolded as follows:
- I upgraded to the latest version of Discourse (v2.0.0.beta1 +123) hoping it would magically fix the issue. But sadly not.
- We noticed that the automatic backup had been stuck in a pending state for two weeks. I canceled it via the web interface.
- I noticed Sidekiq was clogged with a huge number (>200,000) of pending jobs, mainly digest mails. Jobs were being processed, but the queue was growing too fast to keep up. I decided to go for the nuclear option and flushall. Since then, the Sidekiq queue has been under control.
- I saw that the broken images still had src="transparent.png" and data-orig-src="upload://...". I tried to run the uploads:recover_from_tombstone script (described here) to see if it could fix things, but the problem persisted.
- I did some digging to find out whether the missing images were really gone:
  - Examined the HTML source of the example topic linked above; found the base62-encoded string upload://mJGteOa4aQSVkvJYnln9v1xx8lV in post 16.
  - Decoded the string to get the SHA1 hex:

        : curtis@sirius ~/code/tech/discourse (master) » echo 'require "Base62"\nprint Base62.decode("mJGteOa4aQSVkvJYnln9v1xx8lV").to_s(16)' | ruby
        9f59dfb3553495e35970b917c7f79efd72f24997

  - Then checked on the server running Discourse:

        : curtis@tera /var/discourse (master) » find . -name '*9f59dfb3553495e35970b917c7f79efd72f24997*'
        ./shared/standalone/uploads/tombstone/default/original/2X/9/9f59dfb3553495e35970b917c7f79efd72f24997.jpg

  So the file is there, but only in the tombstone.
- I tried running rake uploads:recover_from_tombstone again today; here is the abbreviated output:

      : curtis@tera /var/discourse (master) » sudo ./launcher enter app
      root@tera-app:/var/www/discourse# rake uploads:recover_from_tombstone
      1 / 191 ( 0.5%)
      ...
      39 / 191 ( 20.4%)
      Restoring /var/www/discourse/public/uploads/tombstone/default/original/2X/f/f984d493b9ec03b98bd8bf6d1fc28b7f77ea448e.jpg...Restored into /uploads/default/original/2X/e/eb3f4c7aef16b46bbb835ac1d7ccea75bf94da7d.jpg
      Restoring /var/www/discourse/public/uploads/tombstone/default/original/2X/f/f984d493b9ec03b98bd8bf6d1fc28b7f77ea448e.jpg...Restored into /uploads/default/original/2X/a/a14854b26712ca134dcd9951d28f9f14272f9e61.jpg
      Restoring /var/www/discourse/public/uploads/tombstone/default/original/2X/f/f984d493b9ec03b98bd8bf6d1fc28b7f77ea448e.jpg...Restored into /uploads/default/original/2X/f/f984d493b9ec03b98bd8bf6d1fc28b7f77ea448e.jpg
      40 / 191 ( 20.9%)
      ...
      111 / 191 ( 58.1%)
      Restoring /var/www/discourse/public/uploads/tombstone/default/original/2X/9/9d714b6674072837fb08955b0b323bc067e15bc7.png...Restored into /uploads/default/original/2X/9/9d714b6674072837fb08955b0b323bc067e15bc7.png
      112 / 191 ( 58.6%)
      ...
      116 / 191 ( 60.7%)
      Restoring /var/www/discourse/public/uploads/tombstone/default/original/2X/9/9032d15fd85a5b49becce010babd99051a815411.png...Restored into /uploads/default/original/2X/9/9032d15fd85a5b49becce010babd99051a815411.png
      117 / 191 ( 61.3%)
      ...
      159 / 191 ( 83.2%)
      Restoring /var/www/discourse/public/uploads/tombstone/default/original/2X/c/cce47b09e1b3febb74685ce135c754ae4699e72a.png...Restored into /uploads/default/original/2X/5/5af08ab461c4220d300c6efbafcca454441523d8.png
      160 / 191 ( 83.8%)
      ...
      191 / 191 (100.0%)
So it restored some files, but did not touch 9f59dfb3553495e35970b917c7f79efd72f24997 as I had hoped.
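For anyone who wants to repeat this check without installing the Base62 gem, here is a rough sketch of the two steps above: decoding an upload:// key into a SHA1, and scanning saved recovery output for that SHA1. The alphabet order (digits, then lowercase, then uppercase) is my assumption, inferred from the example in this post rather than taken from Discourse's source, and the helper names are made up for illustration.

```ruby
# Sketch only. ASSUMPTION: base62 alphabet is digits, lowercase, uppercase.
BASE62_ALPHABET = [*("0".."9"), *("a".."z"), *("A".."Z")].join.freeze

# Decode a base62 string into an Integer.
def base62_decode(str)
  str.each_char.reduce(0) do |acc, ch|
    digit = BASE62_ALPHABET.index(ch)
    raise ArgumentError, "not a base62 character: #{ch.inspect}" if digit.nil?
    acc * 62 + digit
  end
end

# Encode a non-negative Integer as base62 (handy for round-trip checks).
def base62_encode(num)
  return "0" if num.zero?
  out = +""
  while num > 0
    num, rem = num.divmod(62)
    out.prepend(BASE62_ALPHABET[rem])
  end
  out
end

# Turn "upload://<key>" (optionally with a file extension) into a
# 40-character SHA1 hex string. Integer#to_s(16) drops leading zeros,
# so the result is zero-padded.
def upload_key_to_sha1(key)
  bare = key.sub(%r{\Aupload://}, "").sub(/\.\w+\z/, "")
  base62_decode(bare).to_s(16).rjust(40, "0")
end

# Matches "Restoring <path>/<sha1>.<ext>...Restored into <path>/<sha1>.<ext>".
RESTORE_LINE = %r{Restoring \S*?/(\h{40})\.\w+\.\.\.Restored into \S*?/(\h{40})\.\w+}

# Return [source_sha1, destination_sha1] pairs found in recovery output.
def restored_pairs(log_text)
  log_text.scan(RESTORE_LINE)
end

# One line copied from the output above, as sample input.
sample_log = <<~LOG
  Restoring /var/www/discourse/public/uploads/tombstone/default/original/2X/9/9d714b6674072837fb08955b0b323bc067e15bc7.png...Restored into /uploads/default/original/2X/9/9d714b6674072837fb08955b0b323bc067e15bc7.png
LOG

sha1 = upload_key_to_sha1("upload://mJGteOa4aQSVkvJYnln9v1xx8lV")
touched = restored_pairs(sample_log).flatten.include?(sha1)
puts "#{sha1} mentioned in recovery output? #{touched}"
```

Feeding the full (unabbreviated) rake output through restored_pairs would also show which tombstone files were restored under a different hash than their own, as happened with f984d493... above.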
Questions
- Is this a bug or limitation of the recovery script? Looking at the source of that routine, I guess it only checks the src of each img and does not look at any data-orig-src attribute? Should it? Apologies if I am misreading things; my Ruby-fu is very weak.
- How should I proceed to fix my forum? I read on other threads that it is best not to manually copy things out of the tombstone. I am concerned that if I don’t fix this before Friday (when we migrate to hosted Discourse, woohoo!), these images will be permanently lost.
- Is it normal that, even though I already ran the script yesterday, some additional images were recovered from the tombstone today? Or is it indicative of some other bug? How can I check?
- Why are these images ending up in the tombstone in the first place? Is there an outstanding bug here? Should I have been disabling Sidekiq during upgrades?
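To make the first question concrete, here is a minimal sketch of what I mean by looking at both attributes. This is not the Discourse rake task and does not claim to mirror it; it is a regex-based illustration (a real implementation would presumably use an HTML parser), the function names are hypothetical, and the sample img tag is put together from the attribute values quoted earlier in this post.

```ruby
# Sketch only: collect upload references from <img> tags in cooked HTML,
# checking both src and data-orig-src.
IMG_TAG = /<img\b[^>]*>/i
# The lookbehind keeps e.g. data-src from being mistaken for src.
IMG_ATTR = /(?<![\w-])(src|data-orig-src)\s*=\s*"([^"]*)"/i

# Return one { attr:, value: } hash per src/data-orig-src attribute found.
def image_references(html)
  html.scan(IMG_TAG).flat_map do |tag|
    tag.scan(IMG_ATTR).map { |name, value| { attr: name.downcase, value: value } }
  end
end

# Return the upload:// URLs that appear only as data-orig-src, i.e. the
# ones a src-only scan would never see.
def placeholder_uploads(html)
  image_references(html)
    .select { |ref| ref[:attr] == "data-orig-src" && ref[:value].start_with?("upload://") }
    .map { |ref| ref[:value] }
end

# Illustrative input built from the attribute values quoted in this post.
html = '<p><img src="transparent.png" data-orig-src="upload://mJGteOa4aQSVkvJYnln9v1xx8lV"></p>'

placeholder_uploads(html).each { |url| puts url }
```

If the recovery routine really does look only at src, then for a post like this it would see transparent.png and nothing pointing at the tombstoned file, which would be consistent with what I observed.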