Rebake with rails command or rake task doesn't work, but rebuilding HTML does. Why?

So, here’s where I am:

After running post.rebake!(invalidate_broken_images: true) on all my 40000 posts that contain the string [img], it worked for a lot of images… But far from all, despite them being hosted on the same external image hosting service.
For example, thousands of “working” casimages links (they point to valid images and show up in the composer preview on edit) were broken in the cooked version of the posts, and my script got them properly uploaded to the server and displayed. But for a lot of others it simply didn’t work, and I don’t know why.

Post.where('raw LIKE ?', '%[img]%').find_each do |p|
    p.rebake!(invalidate_broken_images: true)
end

I also have image links from other image hosts: some were uploaded, and for some it didn’t work.

I failed to see any difference between these posts and image links. They all had working images, and the fact that they used the same image host puzzled me.

I tried the operation multiple times and the results were inconsistent, regardless of the external hosting service… Some images were uploaded, some weren’t. It looked kind of random.
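To check whether the failures really are random, something like the following plain-Ruby sketch could compare each post's raw and cooked versions and count the remaining external images per host. The helper names (img_urls, count_by_host, still_external) are made up for illustration, and it assumes that a successfully downloaded image no longer appears under its original URL in the cooked HTML.

```ruby
require "uri"

# Hypothetical helper: pull every URL wrapped in [img]...[/img] out of raw BBCode.
def img_urls(raw)
  raw.scan(%r{\[img\](.*?)\[/img\]}i).flatten.map(&:strip)
end

# Hypothetical helper: count URLs per host so failures can be compared host by host.
def count_by_host(urls)
  urls.each_with_object(Hash.new(0)) do |url, counts|
    host = begin
      URI.parse(url).host
    rescue URI::InvalidURIError
      nil
    end
    counts[host || "(unparseable)"] += 1
  end
end

# Hypothetical helper: URLs from raw that still appear verbatim in the cooked HTML.
# Assumption: these are images that are still hotlinked, not served from local uploads.
# Caveat: URLs containing "&" are HTML-escaped in cooked and will not match verbatim.
def still_external(raw, cooked)
  img_urls(raw).select { |url| cooked.include?(url) }
end
```

In a Rails console this could be fed with p.raw and p.cooked for each post, then count_by_host would show whether one host fails more often than the others.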

It reminds me a bit of the issue that @Amethi encountered: Some linked images not displaying/show as broken - #8 by Amethi where it worked on some images only, without any explanation.


:information_source: I’ll talk only about casimages here though my imported forum used various other image hosters.

So, I thought that maybe casimages temporarily blacklisted my IP if I tried to retrieve too many images from their servers. That could explain both the fact that it didn’t work for all images and the randomness of the success of uploading the images from my server.
There were even cases where the Rebuild HTML option worked, but only at first: the images were displayed instead of a broken image icon, though they were still hosted on the external hosting service. Then, when the pull external image Sidekiq task was triggered, it broke the images again.
Same when using Rails scripts with rebake!(invalidate_broken_images: true).
:weary:
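One way to test the blacklisting theory would be to fetch a failing image URL from the same machine and look at the HTTP status. This is only a sketch: classify_status and probe are hypothetical helpers, and it assumes the host signals throttling with 429 (the standard "Too Many Requests" status) or a 403, which casimages may or may not do.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: map an HTTP status code to a rough diagnosis.
def classify_status(code)
  case code.to_i
  when 200..299 then :ok
  when 300..399 then :redirect
  when 403 then :forbidden
  when 429 then :rate_limited
  else :other_error
  end
end

# Hypothetical helper: fetch a URL once and return [status code, diagnosis].
# Needs network access; run it from the same machine that does the rebake.
def probe(url)
  response = Net::HTTP.get_response(URI(url))
  [response.code.to_i, classify_status(response.code)]
end
```

Running probe on the same URL right after a failed rebake, and again an hour later, would show whether the answer changes over time.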

So, I’m currently trying a slower approach, where I wait 5 seconds between each of my Rails rebake! commands:

# Use one scope for both the count and the loop so the progress total matches.
scope = Post.where('lower(raw) LIKE ?', '%[img]https:%')
total = scope.count
i = 0
scope.find_each do |p|
    p.rebake!(invalidate_broken_images: true)
    i += 1
    print "#{i}/#{total}\r"
    sleep(5) # pause between posts to avoid hammering the image host
end
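Since this run takes days, a variant that records failures instead of stopping at the first exception might be safer. The sketch below uses a hypothetical each_throttled helper, not something that ships with Discourse. The sleeper parameter exists only so the pause can be swapped out when trying it without waiting.

```ruby
# Hypothetical helper: run a block over each item with a pause in between,
# recording failures instead of aborting the whole run.
def each_throttled(items, delay:, sleeper: ->(seconds) { sleep(seconds) })
  failures = []
  items.each_with_index do |item, index|
    sleeper.call(delay) if index > 0 # no pause before the first item
    begin
      yield item
    rescue StandardError => e
      failures << [item, e.message]
    end
  end
  failures
end

# Possible use in a Rails console (find_each without a block returns an Enumerator):
#   scope = Post.where('lower(raw) LIKE ?', '%[img]https:%')
#   failures = each_throttled(scope.find_each, delay: 5) do |p|
#     p.rebake!(invalidate_broken_images: true)
#   end
```

Note that this only catches exceptions raised by rebake! itself. Images that fail later in the background Sidekiq job would not show up in failures.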

I’ll see in ~60 hours if it went better…

I’d like to understand the fundamentals of my issue here and why a “normal” rebake can’t upload an image on the server (if I’m not temporarily blacklisted by casimages).

Note that this time, the certificate of casimages’s server seems OK :smile:

I also don’t understand what invalidate_broken_images really does. I’m not very familiar with Discourse’s code.

I looked through the code for occurrences of invalidate_broken_images and found these files:

Why is it searching specifically for the <img string? My posts are from an imported phpBB and the raw version contains only [img] BBCode, not <img> tags; so how would it have an effect on my posts (and it did, see my previous message)? :thinking:

I also don’t really understand the difference between these two methods (?):

It seems that rebake sets the default arguments to false, and that rebake! sets the default argument to true.

How are these two related (I’m aware of the purpose of the ! character in Ruby, by the way), and why are they in different files?

My goal is only to understand why my external images are sometimes uploaded, sometimes not, and if I can find a reliable way to upload them properly and automatically, even if it implies uploading an image every hour. :sweat_smile:
I’ve been on this for almost two weeks and it’s driving me (and the people whose server I migrated) crazy. :woozy_face:

Also, there is nothing in Discourse’s logs, apart from multiple “Sidekiq is consuming too much memory (using: 592.25M)” messages. Note that I’m working on Ubuntu via WSL on Windows 10, but I intend to use a working solution (if I find one…) on our VPS.
