Uploads remap is a bit overenthusiastic

When a backup is restored to a database with a different name, there is a neat little piece of code in restorer.rb that attempts to fix this.

However, this renames ALL upload URLs, not just the ones for files that were extracted from the archive. If the database name occurs in an S3 path (which obviously does not change), the remap rewrites that path as well, causing all images to fail to load.

3 Likes

What do you think here @tgxworld?

Would changing this to include the trailing slash have helped?

DbHelper.remap("uploads/#{previous_db_name}/", "uploads/#{current_db_name}/")

That feels more correct in any case.

No - I’m talking about filenames like //discourse-cloud-file-uploads.s3.dualstack.us-west-2.amazonaws.com/standard11/uploads/DBNAME/original/1X/abcd.....jpg

So I think we’d need to exclude uploads starting with // or something like that.
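A minimal sketch of that exclusion, assuming local upload URLs are stored as paths beginning with /uploads/DBNAME/. The helper name `remap_local_upload_url` is hypothetical and is not the code in restorer.rb; it only illustrates skipping anything that starts with // or a scheme:

```ruby
# Hypothetical helper, not the actual restorer.rb code.
# Rewrites the database name only in local upload paths and leaves
# protocol-relative (//host/...) and absolute (scheme://...) URLs untouched.
def remap_local_upload_url(url, previous_db_name, current_db_name)
  # External URLs such as S3 paths start with "//" or a scheme; skip them.
  return url if url.start_with?("//") || url.match?(%r{\A[a-z][a-z0-9+.-]*://}i)

  old_prefix = "/uploads/#{previous_db_name}/"
  new_prefix = "/uploads/#{current_db_name}/"

  # Only rewrite when the path actually begins with the old local prefix.
  return url unless url.start_with?(old_prefix)

  new_prefix + url.delete_prefix(old_prefix)
end
```

With this, a local path such as /uploads/olddb/original/1X/abcd.jpg is moved to the new database name, while an S3 URL that merely contains uploads/olddb/ in its path is returned unchanged.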

We’re experiencing the pain of this quite a bit while migrating local uploads to S3. Unfortunately, no amount of hacky remaps can fix this properly. What we’re currently working on is removing/disallowing upload links in Post#raw. The short upload URL, ![test](upload://asdikajsdiasds.png), will be used as the public interface for access to uploads in raw.

Once that is done, we will only have to remap the URLs in the uploads table and do a rebake of affected posts.

6 Likes

“only” have to do a rebake… yes, on one hand I think it’s good to get rid of the hacky remaps, on the other hand it seems like the backup/restore and migrate from/to S3 processes are getting more and more complicated and hard.

See also a number of issues in Migrate_from_s3 problems which are kind of related but they’re not all covered by your proposed solution.

I do not think so; the moves will get significantly simpler. Walk the uploads table, upload all files to the new S3 bucket, rebake, and you are done.

They do, though, become more computationally expensive because a rebake is required.

This is a cost I think is worth bearing, given we get a correct move at the end of the process. But we still have the flexibility of doing a database replace :roller_coaster: as we do today for the cooked column and post revisions, if that is how we roll. The key change in the process is that we leave posts.raw alone and so have 0% corruption there. posts.raw and the auto-generated posts_uploads map will always be stable.

4 Likes
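The "walk the uploads table, then rebake" flow can be sketched in memory as follows. `Upload`, `Post`, `migrate_uploads`, and the `needs_rebake` flag are stand-ins invented for illustration, not the real Discourse models or migration code, and the actual file transfer to the bucket is left out:

```ruby
# Minimal in-memory sketch; these Structs are stand-ins, not Discourse models.
Upload = Struct.new(:id, :url)
Post = Struct.new(:id, :raw, :upload_ids, :needs_rebake)

# Hypothetical: point each upload at the new store, then flag every post that
# references a changed upload for rebake. posts.raw is never modified.
def migrate_uploads(uploads, posts, new_base_url)
  changed_ids = []

  uploads.each do |upload|
    new_url = "#{new_base_url}/#{File.basename(upload.url)}"
    next if upload.url == new_url # already migrated

    upload.url = new_url
    changed_ids << upload.id
  end

  # Affected posts are found through their upload references, not by
  # searching and replacing text in raw.
  posts.each do |post|
    post.needs_rebake = true if (post.upload_ids & changed_ids).any?
  end

  changed_ids
end
```

Because raw only ever contains upload:// short URLs in this model, the only mutable state is the url column and the rebake flag, which is what keeps the move free of corruption in raw.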

Ok - good point.

I think a significant improvement would be to walk the uploads table instead of posts (see the topic I linked above), and that would indeed leave the flexibility to replace in posts.cooked.

Is this issue still around, @sam?

Using upload:// links in raw Markdown helps a lot with this. I’m not sure whether @gerhard has amended the code so that a full rebake of posts is queued on restore. (I think we did.)

1 Like

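It depends. If optimized images are not included in the backup (which is the default), we rebake all posts that contain uploads. Otherwise, we rely on the remap only.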

2 Likes

We should probably change that to rebake unconditionally; it’s safer.

2 Likes
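The difference between the described behaviour and the proposed one comes down to a single condition. A rough sketch, where the method and parameter names are hypothetical and not taken from the Discourse codebase:

```ruby
# Hypothetical helper; names are illustrative only.
def rebake_posts_with_uploads?(includes_optimized_images:, unconditional: false)
  # Proposed behaviour: always rebake when `unconditional` is set.
  return true if unconditional

  # Described behaviour: rebake only when optimized images were left out of
  # the backup; otherwise rely on the remap alone.
  !includes_optimized_images
end
```

Under the described behaviour, a backup that includes optimized images skips the rebake; with `unconditional: true` the rebake always runs, trading extra computation for safety.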

OK, I just changed it. This will be included in the big refactoring of backup and restore.

5 Likes