How to find any missing images?

I also have a lot of missing images:

root@xxxxx-app:/var/www/discourse# rake posts:missing_uploads

37 post uploads are missing.

34 uploads are missing.
5 of 34 are old scheme uploads.
33 of 1013268 posts are affected.

but after

root@xxxxx-app:/var/www/discourse# rake uploads:missing
.
.
.

/var/www/discourse/public/uploads/default/original/2X/3/3a9bf205dec2b6bd0b3cc35a3be1f69499960713.JPG
/var/www/discourse/public/uploads/default/original/3X/2/d/2db0ff326859b94824b64c4e0c2b156c562b7a99.jpg
/var/www/discourse/public/uploads/default/original/3X/e/f/ef271ac232c31e747206b47e4de7e0570de9e030.jpg
**10604** of 101083 uploads are missing

/var/www/discourse/public/uploads/default/optimized/2X/3/3d3efaa44fb43b99ec290b75e289080fc448f709_1_657x500.gif
/var/www/discourse/public/uploads/default/optimized/2X/4/4e3b01361d7c30c3df27a9606271175d91edff6d_1_200x200.jpeg
/var/www/discourse/public/uploads/default/optimized/2X/6/68fcd443312e7c500d25a6067c04c98aa1066686_1_200x200.JPG
/var/www/discourse/public/uploads/default/optimized/2X/1/16452ee2749e93e6b47d1388a20e34c1e3832ee1_1_100x100.jpg
/var/www/discourse/public/uploads/default/optimized/2X/0/0e843544885c0ee7a0c350d66bbe852dd4f0a497_1_135x135.jpeg
/var/www/discourse/public/uploads/default/optimized/2X/f/fb3d38e8d1b0e8ae25606156d37b8045f7cbc2b3_1_200x200.jpg
/var/www/discourse/public/uploads/default/optimized/2X/5/54a076f5be7cfabcf0fa34e57b836d85a33a879e_1_200x200.jpg
7 of 247116 optimized_images are missing

I’m having this problem as well on a forum running v2.3.0.beta9. I haven’t restored from backups. It’d be good to know why it’s happening and what we can do to restore the missing uploads. The good news is that the missing uploads seem to be in the tombstone, so hope isn’t lost…

So far, I’ve rebuilt the app and run rake uploads:missing, but that hasn’t helped.

root@forum-app:/var/www/discourse# rake posts:missing_uploads
Looking for missing uploads on: default

146 post uploads are missing.

142 uploads are missing.
81 of 58082 posts are affected.
root@forum-app:/var/www/discourse# rake uploads:missing
Looking for missing uploads on: default

146 post uploads are missing.

142 uploads are missing.
81 of 58082 posts are affected.

The missing uploads are from twoish weeks ago.

Problem solved! rake uploads:recover_from_tombstone did the trick. It’d still be good to understand why this happened…
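
For anyone who wants to check whether their missing files are sitting in the tombstone before running the recovery task, here is a rough sketch in plain Ruby. It assumes the tombstone mirrors the live tree under uploads/tombstone/ (I have not verified that layout for every version), and both helper names are made up for illustration.

```ruby
# Map a live upload path to where a tombstoned copy would sit.
# ASSUMPTION: the tombstone mirrors the live tree under uploads/tombstone/.
def tombstone_path(upload_path)
  upload_path.sub("/uploads/", "/uploads/tombstone/")
end

# Return the subset of missing paths that have a file at the tombstone location.
def recoverable_from_tombstone(missing_paths)
  missing_paths.select { |path| File.exist?(tombstone_path(path)) }
end
```

Feeding it the paths printed by rake uploads:missing should give a rough idea of how many files the recovery task could bring back.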


@vinothkannans, @zogstrip Any updates on this?

Just checking in again. Does anyone know the answers to the following questions from a month ago?

  1. What’s the difference between a missing upload and a missing post upload? (Last I checked, my site appeared to have 196 more of the latter.)

  2. Why does Discourse report images that haven’t been migrated to the new upload scheme as “missing”?

  3. If this is expected behavior, how can I migrate the 200+ images like this on my site to the new upload scheme?


@vinothkannans can advise

  • missing upload - The upload record is found in DB but the file is missing in the file storage.
  • missing post upload - The upload record is not found in DB for an upload URL in the post.
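
To make the distinction concrete, here is a minimal sketch in plain Ruby. It does not touch Discourse itself: db_urls and files_on_disk are hypothetical stand-ins for the uploads table and the file store, and classify_upload is a name made up for this example.

```ruby
require "set"

# Classify one upload URL referenced by a post.
# db_urls:       URLs that have an upload record (stand-in for the uploads table)
# files_on_disk: URLs whose file actually exists (stand-in for the file store)
def classify_upload(url, db_urls:, files_on_disk:)
  # No record for a URL that a post references: a missing post upload.
  return :missing_post_upload unless db_urls.include?(url)
  # Record exists but the file is gone: a missing upload.
  return :missing_upload unless files_on_disk.include?(url)
  :ok
end
```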

I think those old-scheme uploads are also missing from the database and/or the filesystem.

You can do that by enabling the setting with the command below. It will migrate those uploads if they are found in both the database and the filesystem.

SiteSetting.migrate_to_new_scheme = true

Forgive me… where do we load these?

You need to enter that into the rails console as it’s a hidden site setting.


so:

root@site:~# cd /var/discourse/
root@site:/var/discourse# ./launcher enter app
root@site-app:/var/www/discourse# rails c
[1] pry(main)> SiteSetting.migrate_to_new_scheme = true

then exit, and rebuild?

EDIT:

Any other steps? Do we need to undo this afterwards?


Looks good. No need to rebuild or undo.


Thanks, @vinothkannans. :bowing_man:t2:

How will I know that this command has been successful? Should I try running rake posts:missing_uploads again after enabling this hidden site setting in the Rails console? Or maybe PostCustomField.where(name: Post::MISSING_UPLOADS)?

Any updates on this?

The migration runs in a background job. Once it has finished, the SiteSetting.migrate_to_new_scheme value reverts to false. After that, run the rake posts:missing_uploads task again.
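
If you would rather not keep checking by hand, something like the following could be pasted into the Rails console. It is only a sketch: wait_until is a helper written for this example (not part of Discourse), and the timeout and interval defaults are arbitrary.

```ruby
# Poll a block until it returns a truthy value or the timeout (in seconds) expires.
# Returns true if the condition was met, false on timeout.
def wait_until(timeout: 3600, interval: 30)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# In the Rails console, relying on the setting reverting to false when the job is done:
#   wait_until { !SiteSetting.migrate_to_new_scheme }
```

A false return only means the timeout ran out, not that the migration failed.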


Okay, I just followed your advice and the output does look hopeful. First, I entered the Discourse container:

$ cd /var/discourse
$ sudo ./launcher enter app

WARNING: We are about to start downloading the Discourse base image
This process may take anywhere between a few minutes to an hour, depending on your network speed

Please be patient

Unable to find image 'discourse/base:2.0.20190625-0946' locally
2.0.20190625-0946: Pulling from discourse/base
Digest: sha256:9899c60721649460283ac800836ac1ebecbc3ed8a97a496e514cf8c97f5b6d82
Status: Downloaded newer image for discourse/base:2.0.20190625-0946

Next, I ran rake posts:missing_uploads:

# rake posts:missing_uploads
Looking for missing uploads on: default
Fixing missing uploads: 
.........................................................................................................................................................................................................................................................
12 post uploads are missing.

12 uploads are missing.
1 of 12 are old scheme uploads.
3 of 8930 posts are affected.

(Only 12 missing post uploads this time! Nice!)

Finally, I set SiteSetting.migrate_to_new_scheme equal to true and exited the Rails console:

# rails c
[1] pry(main)> SiteSetting.migrate_to_new_scheme
=> false
[2] pry(main)> SiteSetting.migrate_to_new_scheme = true
=> true
[3] pry(main)> exit

After some time had passed, I confirmed that the value of SiteSetting.migrate_to_new_scheme had indeed changed to false and ran rake posts:missing_uploads again:

[1] pry(main)> SiteSetting.migrate_to_new_scheme
=> false
[2] pry(main)> exit
# rake posts:missing_uploads
Looking for missing uploads on: default
Fixing missing uploads: 
.
12 post uploads are missing.

12 uploads are missing.
1 of 12 are old scheme uploads.
3 of 8939 posts are affected.

The output is more or less the same, so I think that’s supposed to mean that all the posts using the old upload scheme have been migrated to the new upload scheme. However, the uploads directory still has a lot of numbered subfolders:

$ cd /var/discourse/shared/standalone/uploads/default/
$ ls
1    112  125  138  151  164  177  190  203  216  229  242  255  268  281  294  46  59  72  85  98
100  113  126  139  152  165  178  191  204  217  230  243  256  269  282  34   47  60  73  86  99
101  114  127  140  153  166  179  192  205  218  231  244  257  270  283  35   48  61  74  87  optimized
102  115  128  141  154  167  180  193  206  219  232  245  258  271  284  36   49  62  75  88  _optimized
103  116  129  142  155  168  181  194  207  220  233  246  259  272  285  37   50  63  76  89  original
104  117  130  143  156  169  182  195  208  221  234  247  260  273  286  38   51  64  77  90
105  118  131  144  157  170  183  196  209  222  235  248  261  274  287  39   52  65  78  91
106  119  132  145  158  171  184  197  210  223  236  249  262  275  288  40   53  66  79  92
107  120  133  146  159  172  185  198  211  224  237  250  263  276  289  41   54  67  80  93
108  121  134  147  160  173  186  199  212  225  238  251  264  277  290  42   55  68  81  94
109  122  135  148  161  174  187  200  213  226  239  252  265  278  291  43   56  69  82  95
110  123  136  149  162  175  188  201  214  227  240  253  266  279  292  44   57  70  83  96
111  124  137  150  163  176  189  202  215  228  241  254  267  280  293  45   58  71  84  97

What would be the easiest way to figure out which (if any) posts are referencing the images in those subfolders? A simple Rails console query would be fine.

Thanks!


Many of those directories may be empty, too. I think you can ignore them all.

The query below can help you.

Post.where("cooked LIKE '%/uploads/default/%' AND cooked NOT LIKE '%/uploads/default/original/%' AND cooked NOT LIKE '%/uploads/default/optimized/%'")
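
To see which numbered directories a given post actually references, the cooked HTML can be scanned with a regular expression. This is a rough sketch in plain Ruby: old_scheme_references is a hypothetical helper, and the pattern assumes old-scheme URLs look like /uploads/default/<number>/<file>, as the directory listing above suggests.

```ruby
# Collect old-scheme upload references from a chunk of HTML.
# Returns a hash of numbered directory => file names,
# e.g. { 293 => ["8d45810f8911c08c.jpg"] }.
def old_scheme_references(html)
  # A run of digits right after /uploads/default/ excludes original/ and optimized/.
  pattern = %r{/uploads/default/(\d+)/([^"'\s<>]+)}
  html.scan(pattern).each_with_object({}) do |(dir, file), refs|
    (refs[dir.to_i] ||= []) << file
  end
end

# Possible use in the Rails console with the query above:
#   Post.where("...").find_each do |post|
#     refs = old_scheme_references(post.cooked)
#     puts "#{post.id}: #{refs}" unless refs.empty?
#   end
```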

That output is extremely long, and I’m not seeing an easy way to pipe it to a file for processing. (The best I can do is hit the spacebar to page forward… and I’ve yet to see the end.)

How could we get this output in to a file, or maybe have it skip the message body/attachments?

OK, we worked out a way, one that skips the output of the three “most offending” columns:

"raw", "cooked", "raw_email"

require "csv"

# The trailing "; 42" makes the console echo 42 instead of every matching post.
posts = Post.where("cooked LIKE '%/uploads/default/%' AND cooked NOT LIKE '%/uploads/default/original/%' AND cooked NOT LIKE '%/uploads/default/optimized/%'"); 42
CSV.open("/tmp/posts-to-review.csv", "wb") do |csv|
  # Header row: every column except the three large text columns.
  csv << Post.attribute_names - ["raw", "cooked", "raw_email"]
  posts.each do |post|
    csv << post.attributes.except("raw", "cooked", "raw_email").values
  end
end
  • Skip the printed output by typing q.
  • Exit the Rails console with exit.
  • Clear the screen, then run cat /tmp/posts-to-review.csv.

I guess I could have channeled it to the shared folder… but this works.

(edited to add clearer steps)


Hmm…I just checked (using find . -maxdepth 1 -type d -empty | wc -l) and only found 19 empty numbered directories out of 262 total (using find . -maxdepth 1 -type d -iname '[0-9]*' | wc -l), or about 7%. So I don’t think I should ignore them entirely?
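
For reference, roughly the same count can be done in plain Ruby. This is just a sketch with a made-up helper name. Unlike the find commands above, it only looks at directories whose names are entirely digits, and it counts emptiness among those directories only.

```ruby
# Count the numbered subdirectories of `root` and how many of them are empty.
def numbered_dir_stats(root)
  numbered = Dir.children(root).select do |name|
    name.match?(/\A\d+\z/) && File.directory?(File.join(root, name))
  end
  empty = numbered.select { |name| Dir.empty?(File.join(root, name)) }
  { total: numbered.size, empty: empty.size }
end

# Possible use on the host:
#   numbered_dir_stats("/var/discourse/shared/standalone/uploads/default")
```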

Great, thanks! :+1:

I noticed that this uses the “cooked” attribute—presumably this means that the Markdown source may continue to use the old upload scheme? For example, I found a post that has this image tag in the source:

<img src="/uploads/default/293/8d45810f8911c08c.jpg" width="666" height="500">

However, if you hover your mouse over the rendered image it links to the following URL:

/uploads/default/original/2X/0/0d1e04b9215f210faf1d8509e6bede9c3319e02b.jpeg

Should we be concerned that the image URLs in the Markdown source don’t match the image URLs in the cooked HTML (including the file hashes)? For example, could I safely delete the /var/discourse/shared/standalone/uploads/default/293 directory without worrying about breaking the link to the image above? Or, put another way, will Discourse always know that /uploads/default/293/8d45810f8911c08c.jpg is an alias to /uploads/default/original/2X/0/0d1e04b9215f210faf1d8509e6bede9c3319e02b.jpeg, and will this mapping be preserved when the site is restored from a backup?
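
One way to check whether the old file and the new one are at least the same content is to hash the old file and compare it with the file name in the cooked URL. The sketch below is plain Ruby with hypothetical helper names. The path layout is inferred from the paths quoted earlier in this thread (2X uses one hash character as a directory, 3X uses two), and it assumes the 40-character file name is the SHA1 of the file contents.

```ruby
require "digest"

# Build a new-scheme path following the pattern seen in this thread:
#   original/<depth>X/<first depth-1 hash characters as directories>/<sha1>.<ext>
def new_scheme_path(sha1, extension, depth:, site: "default")
  parts = ["/uploads", site, "original", "#{depth}X",
           *sha1[0, depth - 1].chars, "#{sha1}.#{extension}"]
  parts.join("/")
end

# SHA1 of a file's contents.
# ASSUMPTION: this is what the 40-character hex file name encodes.
def file_sha1(path)
  Digest::SHA1.file(path).hexdigest
end

# Possible check on the host, using the file from the post above:
#   file_sha1("/var/discourse/shared/standalone/uploads/default/293/8d45810f8911c08c.jpg")
#   # compare with 0d1e04b9215f210faf1d8509e6bede9c3319e02b
```

A matching hash would only show that the two files have identical contents. It would not by itself prove that nothing still references the old path.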


Yes. The raw and cooked columns shouldn’t contain different URLs. Can you paste the full text of the post’s raw and cooked columns here?
