How to find any missing images?

I chose to move our site to a new host, and restored from backup after setting up a new DO droplet with 2 GB of RAM and doing a stock install.

I didn’t see any errors, and didn’t think to check all the images before shutting down the host.

As it turns out, images were missing and no amount of rebaking would help. (Initially posted here & here)

The damage was more easily visible when I ran this command, to remove all empty folders:

find /var/discourse/shared/standalone/uploads/default -type d -empty -delete
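Before deleting anything, it can help to list what would go. Below is a minimal Ruby sketch of a dry run, assuming the same uploads path as the command above; the helper name `empty_directories` is made up for illustration.

```ruby
require "find"

# List directories under `root` that are empty right now, without deleting them.
# Note: `find ... -empty -delete` works depth-first, so it can also remove a
# parent that only becomes empty once its empty children are gone. This
# listing does not simulate that cascade.
def empty_directories(root)
  Find.find(root).select { |path| File.directory?(path) && Dir.empty?(path) }.sort
end

# Example call (path taken from the command above):
#   puts empty_directories("/var/discourse/shared/standalone/uploads/default")
```

Comparing the size of that list against the total number of folders gives a rough idea of how much is missing before anything is changed on disk.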

About half the folders were removed, then I uploaded the files from the backup. Once that was done, I could see many of the missing images.

My question is… how do I find any other posts with missing images?

EDIT: Quite possibly related to https://meta.discourse.org/t/my-forum-old-images-not-see/97123/


I know @tgxworld has worked on this a lot recently, maybe he can advise?


Try running

./launcher enter app
rake posts:missing_uploads
rake uploads:missing

ok… this is what I got

# ./launcher enter app
root@community-2019-app:/var/www/discourse# rake posts:missing_uploads
xxxxxxxx.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.......x..x.....xx.....................
105 post uploads are missing.

103 uploads are missing.
100 of 103 are old scheme uploads.
70 of 2962 posts are affected.

root@community-2019-app:/var/www/discourse# rake uploads:missing
root@community-2019-app:/var/www/discourse# rake posts:missing_uploads
xxxxxxxx.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.......x..x.....xx.....................
105 post uploads are missing.

103 uploads are missing.
100 of 103 are old scheme uploads.
70 of 2962 posts are affected.



Can you check if the files are in the tombstone of the previous host? That or a previous backup will be our only hope of recovering the images.


The old host is gone. I’ve uploaded some of the numbered folders from /default, but not everything.

What other folders should be checked?

Do you have numbered folders from the tombstone folder?


Here’s a cropped screenshot of what I see

I know @vinothkannans is working on something that will recover the missing posts to uploads link. He’ll let you know here when it is ready.


Upgrade to the latest version of Discourse and run the rake posts:missing_uploads task again. The task itself will try to recover the missing old scheme uploads from the local “uploads” and “tombstone” directories. At the end it will display the count of uploads that were found in neither the database nor local storage.


Thanks, seems like a lot of progress

:/var/www/discourse# rake posts:missing_uploads
.............................................................................x.....xx.....................
3 post uploads are missing.

3 uploads are missing.
3 of 2965 posts are affected.

I’d be interested to know which 3, but that’s way better than the 105 which had problems before!


If you run the command below in the Rails console, you can get the list of missing uploads.

PostCustomField.where(name: Post::MISSING_UPLOADS)


here we go…

[1] pry(main)> PostCustomField.where(name: Post::MISSING_UPLOADS)
=> [#<PostCustomField:0x000055efb0edc790
  id: 337,
  post_id: 2396,
  name: "missing uploads",
  value: "[\"https://SITENAME/uploads/default/original/1X/da27c8c7525666c20b7e04bfd53ccb24647530c6.png\"]",
  created_at: Mon, 29 Apr 2019 20:50:02 UTC +00:00,
  updated_at: Mon, 29 Apr 2019 20:50:02 UTC +00:00>,
 #<PostCustomField:0x000055efb0ee30b8
  id: 338,
  post_id: 2785,
  name: "missing uploads",
  value: "[\"https://SITENAME/uploads/default/original/1X/022705388395c48c415d699d2332bc58d44346c0.png\"]",
  created_at: Mon, 29 Apr 2019 20:50:02 UTC +00:00,
  updated_at: Mon, 29 Apr 2019 20:50:02 UTC +00:00>,
 #<PostCustomField:0x000055efb0ee4238
  id: 339,
  post_id: 2792,
  name: "missing uploads",
  value: "[\"https://SITENAME/uploads/default/original/1X/27bfa704ffb22c4a1afe28c8956dc5ed41731bec.png\"]",
  created_at: Mon, 29 Apr 2019 20:50:02 UTC +00:00,
  updated_at: Mon, 29 Apr 2019 20:50:02 UTC +00:00>]

This is very useful… I can probably work it out from here.
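For anyone else reading the output above: each `value` is a JSON-encoded array of URLs, so it can be unpacked into a per-post list. Here is a rough sketch in plain Ruby. The `MissingUploadRow` struct and the `missing_urls_by_post` helper are hypothetical stand-ins; in the Rails console the same pairs could presumably be fetched with something like `PostCustomField.where(name: Post::MISSING_UPLOADS).pluck(:post_id, :value)`.

```ruby
require "json"

# Stand-in for the two PostCustomField columns that matter here
# (hypothetical struct; the real rows come from the Rails console).
MissingUploadRow = Struct.new(:post_id, :value)

# Map each post_id to the unique upload URLs listed in its JSON-encoded value.
def missing_urls_by_post(rows)
  rows.each_with_object({}) do |row, result|
    urls = JSON.parse(row.value)
    (result[row.post_id] ||= []).concat(urls).uniq!
  end
end

# Two of the records shown above, copied by hand.
rows = [
  MissingUploadRow.new(2396, '["https://SITENAME/uploads/default/original/1X/da27c8c7525666c20b7e04bfd53ccb24647530c6.png"]'),
  MissingUploadRow.new(2785, '["https://SITENAME/uploads/default/original/1X/022705388395c48c415d699d2332bc58d44346c0.png"]')
]

missing_urls_by_post(rows).each do |post_id, urls|
  urls.each { |url| puts "post #{post_id}: #{File.basename(url)}" }
end
```

Having just the file names per post makes it easier to search a local backup for them.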

I guess the problem was that the move was never done, and the backup script stopped backing up the old location?


for the record, of the 3 remaining…

1 was really missing
2 had their images… i wonder if what was listed as missing was from previous edits (cropping, etc)

Edited… I had been looking at topics, not posts.


I’m interested to hear the experience of others… I found the image mentioned in the first entry (da27c8c7525666c20b7e04bfd53ccb24647530c6.png) on my local backup.

It’s an image from another post ID 1604, not 2396. Post 1604 also references that same filename, and is not missing its image.

I don’t have 022705388395c48c415d699d2332bc58d44346c0.png on my local.

I do have the third image 27bfa704ffb22c4a1afe28c8956dc5ed41731bec.png on my local, and while I can’t guess what post it’s from, I don’t think we’d have deleted that post.

@vinothkannans what does post_id: 2396, in the output mean?

The post_id field gives you the unique ID of the post. Probably the easiest way to find the post’s topic from the post_id is to enter

Post.find(2396)

in your rails console. That will return the post’s topic_id and post_number. You can use those values to find the post through your site’s UI.


Also, you can navigate to the post by putting that post_id in the URL below

https://discourse.example.com/p/[POST_ID]


Ah, got it… I was confusing post_id with topic_id

If you expect more people to use rake posts:missing_uploads… printing each post’s position as site_fqdn/t/topic_id/post_number at the end of the output would help a lot.
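To make the two link forms from this thread concrete, here is a small Ruby sketch that builds both. The helper names are invented for illustration, `https://discourse.example.com` is a placeholder, and the topic_id and post_number in the second call are made-up values.

```ruby
# Link by post ID, as in https://discourse.example.com/p/[POST_ID].
def post_id_url(base_url, post_id)
  "#{base_url.chomp('/')}/p/#{post_id}"
end

# Link by topic ID and post number, as in site_fqdn/t/topic_id/post_number.
def topic_post_url(base_url, topic_id, post_number)
  "#{base_url.chomp('/')}/t/#{topic_id}/#{post_number}"
end

puts post_id_url("https://discourse.example.com", 2396)
# The topic_id (123) and post_number (4) are illustrative only.
puts topic_post_url("https://discourse.example.com/", 123, 4)
```

The first form only needs the post_id from the task's records; the second needs the topic_id and post_number that Post.find returns.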

PS: this corrects my earlier reply, but still, the remaining damage is quite small.


Thanks for continuing to work on this, @vinothkannans! :sparkles:

I just tried running the rake posts:missing_uploads task on my site (currently at v2.3.0.beta9 +484), and this is what I got:

# rake posts:missing_uploads
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.xxxxxx.xxxxxxxx.xxxxxxxxxx..........................x...........................................x.......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................x.................................................................................................................................................................................................................
451 post uploads are missing.

255 uploads are missing.
244 of 255 are old scheme uploads.
139 of 8812 posts are affected.

In other words, it looks like I have:

  • 244 missing old scheme uploads
  • 11 missing uploads
  • 196 missing post uploads
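Those three figures come from plain subtraction on the task's summary lines. A tiny Ruby sketch of that reading follows; the variable names are mine, and the split is only my interpretation of the output, not something the task reports directly.

```ruby
# Totals copied from the rake output above.
post_uploads_missing = 451
uploads_missing      = 255
old_scheme_missing   = 244

# Missing uploads that are not old scheme uploads.
other_uploads_missing = uploads_missing - old_scheme_missing

# How many more missing post uploads there are than missing uploads.
extra_post_uploads = post_uploads_missing - uploads_missing

puts "non-old-scheme uploads missing: #{other_uploads_missing}"
puts "post uploads beyond uploads:    #{extra_post_uploads}"
```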

Just as an aside…what’s the difference between a missing upload and a missing post upload? (If I’m reading the output correctly, I have 196 more of the latter.)

Anyway, because I appear to have a lot of missing images, I decided to do a bit more spelunking with PostCustomField.where(name: Post::MISSING_UPLOADS). Consider the first result:

#<PostCustomField:0x0000564c2cef9ee8
  id: 161,
  post_id: 43,
  name: "missing uploads",
  value:
   "[\"https://SITENAME/uploads/default/35/4608d96d1b27846f.png\",\"https://SITENAME/uploads/default/35/4608d96d1b27846f.png\"]",
  created_at: Fri, 17 May 2019 05:26:57 UTC +00:00,
  updated_at: Fri, 17 May 2019 05:26:57 UTC +00:00>

I navigated to https://SITENAME/p/43 and, sure enough, upon examining the post’s Markdown source I found a link to an image at /uploads/default/35/4608d96d1b27846f.png. Interestingly enough, though, the image displays fine and even exists at /var/discourse/shared/standalone/uploads/default/35/4608d96d1b27846f.png on the host server.
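The same check can be scripted: take the path part of each reported URL and look for it under the shared directory on the host. This is a sketch under the assumption that the uploads tree lives at /var/discourse/shared/standalone, as on my server; the helper names are hypothetical.

```ruby
require "uri"

# Assumed host-side directory that contains the "uploads" tree.
SHARED_ROOT = "/var/discourse/shared/standalone"

# Translate an upload URL into the file path it should have on the host.
def local_path_for(url, shared_root = SHARED_ROOT)
  File.join(shared_root, URI.parse(url).path)
end

# True when the file behind the URL exists on disk.
def present_on_disk?(url, shared_root = SHARED_ROOT)
  File.file?(local_path_for(url, shared_root))
end

url = "https://SITENAME/uploads/default/35/4608d96d1b27846f.png"
puts local_path_for(url)
puts present_on_disk?(url) ? "found on disk" : "not on disk"
```

Run on the host (outside the container), this separates URLs that are reported as missing but still have a file from those that are really gone.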

So…Discourse appears to think that this image is “missing”, but it’s not—it just hasn’t been migrated to the new upload scheme yet. Is this expected behavior and, if so, is there any way to migrate the other (200+) images like this to the new upload scheme?

Thanks!


Update: I think I may have identified the 11 missing uploads mentioned above…and they really do seem to be missing. Here’s what one of them looks like:

#<PostCustomField:0x0000564c2cf0b2d8
  id: 297,
  post_id: 2203,
  name: "missing uploads",
  value:
   "[\"https://SITENAME/uploads/default/original/1X/900eff3dd456ecaf5280568676c4717e27b46c85.jpg\"]",
  created_at: Fri, 17 May 2019 05:26:57 UTC +00:00,
  updated_at: Fri, 17 May 2019 05:26:57 UTC +00:00>

Note that this particular image was stored under /uploads/default/original/1X/ rather than /uploads/default/{SOME_NUMBER}/. It’s from a 2015 post and I can’t find it in any of my backups, so it—and the other 10 images like it—appear to be well and truly gone. :cry:
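To separate the two kinds of URL in bulk, a rough classifier could look like the Ruby sketch below. The patterns are inferred only from the example URLs in this thread (a numbered folder versus `original/1X/`), not from Discourse's source, and the `upload_scheme` helper and its labels are my own naming.

```ruby
# Patterns guessed from the URLs quoted in this thread.
NEW_SCHEME = %r{/uploads/[^/]+/original/\d+X/}
OLD_SCHEME = %r{/uploads/[^/]+/\d+/[^/]+\z}

# Label a URL as :new, :old, or :unknown based on its path shape.
def upload_scheme(url)
  case url
  when NEW_SCHEME then :new
  when OLD_SCHEME then :old
  else :unknown
  end
end

urls = [
  "https://SITENAME/uploads/default/35/4608d96d1b27846f.png",
  "https://SITENAME/uploads/default/original/1X/900eff3dd456ecaf5280568676c4717e27b46c85.jpg"
]

urls.each { |url| puts "#{upload_scheme(url)}: #{url}" }
```

Grouping the reported URLs this way should show whether the numbered-folder ones line up with the "old scheme uploads" count from the task.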

Still not sure why I have 196 more missing post uploads than uploads, though…
