How can I find any missing images?

I also have a lot of missing images:

root@xxxxx-app:/var/www/discourse# rake posts:missing_uploads

37 post uploads are missing.

34 uploads are missing.
5 of 34 are old scheme uploads.
33 of 1013268 posts are affected.

but after

root@xxxxx-app:/var/www/discourse# rake uploads:missing
.
.
.

/var/www/discourse/public/uploads/default/original/2X/3/3a9bf205dec2b6bd0b3cc35a3be1f69499960713.JPG
/var/www/discourse/public/uploads/default/original/3X/2/d/2db0ff326859b94824b64c4e0c2b156c562b7a99.jpg
/var/www/discourse/public/uploads/default/original/3X/e/f/ef271ac232c31e747206b47e4de7e0570de9e030.jpg
**10604** of 101083 uploads are missing

/var/www/discourse/public/uploads/default/optimized/2X/3/3d3efaa44fb43b99ec290b75e289080fc448f709_1_657x500.gif
/var/www/discourse/public/uploads/default/optimized/2X/4/4e3b01361d7c30c3df27a9606271175d91edff6d_1_200x200.jpeg
/var/www/discourse/public/uploads/default/optimized/2X/6/68fcd443312e7c500d25a6067c04c98aa1066686_1_200x200.JPG
/var/www/discourse/public/uploads/default/optimized/2X/1/16452ee2749e93e6b47d1388a20e34c1e3832ee1_1_100x100.jpg
/var/www/discourse/public/uploads/default/optimized/2X/0/0e843544885c0ee7a0c350d66bbe852dd4f0a497_1_135x135.jpeg
/var/www/discourse/public/uploads/default/optimized/2X/f/fb3d38e8d1b0e8ae25606156d37b8045f7cbc2b3_1_200x200.jpg
/var/www/discourse/public/uploads/default/optimized/2X/5/54a076f5be7cfabcf0fa34e57b836d85a33a879e_1_200x200.jpg
7 of 247116 optimized_images are missing

I’m having this problem as well on a forum running v2.3.0.beta9. I haven’t restored from backups. It’d be good to know why it’s happening and what we can do to restore the missing uploads. The good news is that the missing uploads seem to be in the tombstone, so hope isn’t lost.

So far, I’ve rebuilt the app and run rake uploads:missing, but that hasn’t helped.

root@forum-app:/var/www/discourse# rake posts:missing_uploads
Looking for missing uploads on: default

146 post uploads are missing.

142 uploads are missing.
81 of 58082 posts are affected.
root@forum-app:/var/www/discourse# rake uploads:missing
Looking for missing uploads on: default

146 post uploads are missing.

142 uploads are missing.
81 of 58082 posts are affected.

The missing uploads are from twoish weeks ago.

Problem solved! rake uploads:recover_from_tombstone did the trick. It’d still be good to understand why this happened…

@vinothkannans, @zogstrip Any updates on this?

Just checking in again. Does anyone know the answers to the following questions from a month ago?

  1. What’s the difference between a missing upload and a missing post upload? (Last I checked, my site appeared to have 196 more of the latter.)

  2. Why does Discourse report images that haven’t been migrated to the new upload scheme as “missing”?

  3. If this is expected behavior, how can I migrate the 200+ images like this on my site to the new upload scheme?

@vinothkannans can advise

  • missing upload - The upload record is found in DB but the file is missing in the file storage.
  • missing post upload - The upload record is not found in DB for an upload URL in the post.
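
To make the distinction concrete, here is a minimal Ruby sketch using plain sets. The paths and variable names are made up for illustration, and this is not how the rake task is implemented; it only models the two definitions above.

```ruby
require "set"

# Hypothetical sample data, not taken from any real site.
db_upload_urls = Set[
  "/uploads/default/original/2X/a/aaa.jpg",
  "/uploads/default/original/2X/b/bbb.jpg",
]
files_on_disk = Set["/uploads/default/original/2X/a/aaa.jpg"]
urls_in_posts = [
  "/uploads/default/original/2X/a/aaa.jpg",
  "/uploads/default/293/8d45.jpg",
]

# Missing upload: a DB record exists, but the file is gone from storage.
missing_uploads = db_upload_urls - files_on_disk

# Missing post upload: a post references a URL that has no DB record at all.
missing_post_uploads = urls_in_posts.reject { |url| db_upload_urls.include?(url) }
```

In this toy data, `bbb.jpg` is a missing upload and the `/293/` URL is a missing post upload, which is why the two counts can differ on a real site.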

I think those old scheme uploads are not found in the database and/or filesystem too.

You can do it by enabling the setting with the command below. It will migrate those uploads if they’re found in both the DB and the filesystem.

SiteSetting.migrate_to_new_scheme = true

Forgive me.. where do we load these?

You need to enter that into the rails console as it’s a hidden site setting.

so:

root@site:~# cd /var/discourse/
root@site:/var/discourse# ./launcher enter app
root@site-app:/var/www/discourse# rails c
[1] pry(main)> SiteSetting.migrate_to_new_scheme = true

then exit, and rebuild?

EDIT:

any other steps.. do we need to undo this after?

Looks good. No need to rebuild or undo.

Thanks, @vinothkannans. :bowing_man:t2:

How will I know that this command has been successful? Should I try running rake posts:missing_uploads again after enabling this hidden site setting in the Rails console? Or maybe PostCustomField.where(name: Post::MISSING_UPLOADS)?

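Any updates on this?

Enabling that setting triggers a background job that performs the migration. Once it finishes, the SiteSetting.migrate_to_new_scheme site setting reverts to false. Then run the rake posts:missing_uploads task again.

OK, I just did what you suggested, and the output looks promising. First, I entered the Discourse container: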

$ cd /var/discourse
$ sudo ./launcher enter app

WARNING: We are about to start downloading the Discourse base image
This process may take anywhere between a few minutes to an hour, depending on your network speed

Please be patient

Unable to find image 'discourse/base:2.0.20190625-0946' locally
2.0.20190625-0946: Pulling from discourse/base
Digest: sha256:9899c60721649460283ac800836ac1ebecbc3ed8a97a496e514cf8c97f5b6d82
Status: Downloaded newer image for discourse/base:2.0.20190625-0946

Next, I ran rake posts:missing_uploads:

# rake posts:missing_uploads
Looking for missing uploads on: default
Fixing missing uploads: 
.........................................................................................................................................................................................................................................................
12 post uploads are missing.

12 uploads are missing.
1 of 12 are old scheme uploads.
3 of 8930 posts are affected.

(Only 12 missing post uploads this time! Great!)

Finally, I set SiteSetting.migrate_to_new_scheme to true and exited the Rails console:

# rails c
[1] pry(main)> SiteSetting.migrate_to_new_scheme
=> false
[2] pry(main)> SiteSetting.migrate_to_new_scheme = true
=> true
[3] pry(main)> exit

After a while, I confirmed that SiteSetting.migrate_to_new_scheme had indeed reverted to false, and ran rake posts:missing_uploads again:

[1] pry(main)> SiteSetting.migrate_to_new_scheme
=> false
[2] pry(main)> exit
# rake posts:missing_uploads
Looking for missing uploads on: default
Fixing missing uploads: 
.
12 post uploads are missing.

12 uploads are missing.
1 of 12 are old scheme uploads.
3 of 8939 posts are affected.

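The output is roughly the same, so I think this means all posts using the old upload scheme have been migrated to the new one. However, there are still a lot of numbered subfolders in the uploads directory: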

$ cd /var/discourse/shared/standalone/uploads/default/
$ ls
1    112  125  138  151  164  177  190  203  216  229  242  255  268  281  294  46  59  72  85  98
100  113  126  139  152  165  178  191  204  217  230  243  256  269  282  34   47  60  73  86  99
101  114  127  140  153  166  179  192  205  218  231  244  257  270  283  35   48  61  74  87  optimized
102  115  128  141  154  167  180  193  206  219  232  245  258  271  284  36   49  62  75  88  _optimized
103  116  129  142  155  168  181  194  207  220  233  246  259  272  285  37   50  63  76  89  original
104  117  130  143  156  169  182  195  208  221  234  247  260  273  286  38   51  64  77  90
105  118  131  144  157  170  183  196  209  222  235  248  261  274  287  39   52  65  78  91
106  119  132  145  158  171  184  197  210  223  236  249  262  275  288  40   53  66  79  92
107  120  133  146  159  172  185  198  211  224  237  250  263  276  289  41   54  67  80  93
108  121  134  147  160  173  186  199  212  225  238  251  264  277  290  42   55  68  81  94
109  122  135  148  161  174  187  200  213  226  239  252  265  278  291  43   56  69  82  95
110  123  136  149  162  175  188  201  214  227  240  253  266  279  292  44   57  70  83  96
111  124  137  150  163  176  189  202  215  228  241  254  267  280  293  45   58  71  84  97

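What’s the easiest way to determine which posts (if any) reference images in these subfolders? A simple Rails console query would do.

Thanks!

Many of those may be empty directories. I think you should ignore all of them.

The query below may help.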

Post.where("cooked LIKE '%/uploads/default/%' AND cooked NOT LIKE '%/uploads/default/original/%' AND cooked NOT LIKE '%/uploads/default/optimized/%'")
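
One thing to keep in mind: because of the NOT LIKE clauses, that query skips any post whose cooked HTML contains at least one original/ or optimized/ URL, even if the same post also contains an old-style URL. As a rough sketch of a per-URL check, the plain Ruby below pulls every upload path out of a string and keeps the ones outside those two folders. The constant `UPLOAD_URL`, the helper `old_scheme_urls`, and the sample HTML are all hypothetical and only meant to illustrate the idea.

```ruby
# Matches upload paths for the "default" site inside cooked HTML.
UPLOAD_URL = %r{/uploads/default/[^"'\s<>)]+}

# Returns the upload paths in `cooked` that are outside original/ and optimized/.
def old_scheme_urls(cooked)
  cooked.scan(UPLOAD_URL).reject do |url|
    url.start_with?("/uploads/default/original/", "/uploads/default/optimized/")
  end
end

# Hypothetical cooked HTML that mixes both schemes in one post.
cooked = <<~HTML
  <img src="/uploads/default/293/8d45810f8911c08c.jpg">
  <img src="/uploads/default/original/2X/0/0d1e04b9215f210faf1d8509e6bede9c3319e02b.jpeg">
HTML

p old_scheme_urls(cooked)
# => ["/uploads/default/293/8d45810f8911c08c.jpg"]
```

In the Rails console, something like this could presumably be applied to each post's cooked column after a broader `cooked LIKE '%/uploads/default/%'` filter, but I have not tested that on a live site.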

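The output is too long, and I can’t find an easy way to pipe it to a file for processing. (The best I can do so far is press the space bar to page forward... but I haven’t reached the end yet.)

How can we save the output to a file, or have it skip the message bodies/attachments?

OK, we found a way to skip the three “worst offender” columns: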

"raw", "cooked", "raw_email"

require "csv"

# The trailing "; 42" makes the console echo 42 instead of printing every matching post.
posts = Post.where("cooked LIKE '%/uploads/default/%' AND cooked NOT LIKE '%/uploads/default/original/%' AND cooked NOT LIKE '%/uploads/default/optimized/%'"); 42
CSV.open("/tmp/posts-to-review.csv", "wb") do |csv|
  # Header row: every column except the three large text columns.
  csv << Post.attribute_names - ["raw", "cooked", "raw_email"]
  posts.each do |post|
    csv << post.attributes.except("raw", "cooked", "raw_email").values
  end
end
  • Skip the printed output by typing q,
  • Leave Ruby with exit,
  • After clearing the screen, view the file with cat /tmp/posts-to-review.csv.

I suppose I could have written it to the shared folder... but this works too.

(Edited to add clearer steps)

Hmm... I just checked (using find . -maxdepth 1 -type d -empty | wc -l) and found only 19 empty numbered directories out of 262 total (using find . -maxdepth 1 -type d -iname '[0-9]*' | wc -l), about 7%. So I don’t think I should ignore them entirely?
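
For anyone who would rather do the same count from Ruby, here is a rough equivalent. The helper name `numbered_dir_stats` is hypothetical. Unlike the find commands above, it only considers directories whose names consist entirely of digits, and it checks emptiness only among those.

```ruby
# Counts the numbered subdirectories of `root` and how many of them are empty.
def numbered_dir_stats(root)
  numbered = Dir.children(root).select do |name|
    name.match?(/\A\d+\z/) && File.directory?(File.join(root, name))
  end
  empty = numbered.select { |name| Dir.empty?(File.join(root, name)) }
  { total: numbered.size, empty: empty.size }
end
```

Calling `numbered_dir_stats(".")` from inside the uploads/default directory should return a hash with the two counts; the non-numeric folders such as original and optimized are left out.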

Great, thanks! :+1:

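I notice this uses the “cooked” attribute. Does that mean the Markdown source may still use the old upload scheme? For example, I found a post whose source contains the following image tag: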

<img src="/uploads/default/293/8d45810f8911c08c.jpg" width="666" height="500">

However, if you hover over the rendered image, it links to the following URL:

/uploads/default/original/2X/0/0d1e04b9215f210faf1d8509e6bede9c3319e02b.jpeg

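Should we be worried that the image URL in the Markdown source doesn’t match the image URL (including the file hash) in the rendered HTML? For example, can I safely delete the /var/discourse/shared/standalone/uploads/default/293 directory without breaking the link to the image above? Or, to put it another way, does Discourse always know that /uploads/default/293/8d45810f8911c08c.jpg is an alias for /uploads/default/original/2X/0/0d1e04b9215f210faf1d8509e6bede9c3319e02b.jpeg, and is that mapping preserved when the site is restored from a backup?

Yes. The URLs in the raw and cooked content should not differ. Can you paste the full text of the post’s raw and cooked columns here?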