Rebaking all posts is sometimes recommended for various reasons.
My own rebake raised some issues, and I have seen or can foresee other potential ones, so I’d like to know how they can be avoided.
If I rebake my 2M posts, it will trigger too many requests to YouTube and my IP will be blacklisted, preventing Discourse from generating previews.
If the original URLs behind existing oneboxes (whose titles, thumbnails, and excerpts are copied into the cooked field of the Discourse database) have since broken or been redirected, it seems that the oneboxes will break and we will lose this information.
I decided to drop support for Facebook (and thus, to my knowledge, Instagram) on my forums for several reasons. If I rebake all my posts, I suppose that every link that was previously oneboxed properly will break. Is that right?
It feels to me that we need an enhancement to make rebake more careful:
- rate limit requests to selected sites, or perhaps to all sites
- keep the original onebox if the refetch fails for any reason
In other words, I think we need a non-damaging rebake, at least as a selectable option.
(There will be some Discourse communities that value an up-to-date 404, or that don’t value old posts at all, but there will also be communities that very much want to preserve old threads intact.)
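To make the two suggestions above concrete, here is a minimal sketch in Python of what a non-damaging refetch could look like. It is not Discourse code (Discourse is written in Ruby), and every name in it (`HostRateLimiter`, `rebake_onebox`, `fetch`, `cached_html`) is hypothetical. It only illustrates the idea: wait between requests to the same host, and keep the previously stored onebox HTML whenever the refetch fails or returns nothing.

```python
import time
from urllib.parse import urlparse


class HostRateLimiter:
    """Hypothetical helper: enforces a minimum delay between requests to one host."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests to the same host
        self.clock = clock                # injectable so the limiter can be tested
        self.sleep = sleep
        self.last_request = {}            # host -> time of the last request

    def wait(self, url):
        host = urlparse(url).netloc
        now = self.clock()
        last = self.last_request.get(host)
        if last is not None:
            remaining = self.min_interval - (now - last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_request[host] = now


def rebake_onebox(url, cached_html, fetch, limiter):
    """Refetch a onebox, falling back to the stored HTML if the refetch fails.

    `fetch` is an assumed callable that returns fresh onebox HTML for a URL
    and raises (or returns an empty value) when the preview is unavailable.
    """
    limiter.wait(url)
    try:
        fresh = fetch(url)
    except Exception:
        return cached_html  # refetch failed: keep the original onebox
    return fresh if fresh else cached_html
```

A community that prefers an up-to-date 404 could skip the fallback, which is why this would make sense as a selectable option rather than the default behaviour.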
Is there any value at all in automatically re-fetching the contents of a tweet? I would be fine with rebake skipping the refetch of oneboxed content unless a box were ticked.
Thanks for posting that. It could impact something I’m working on. Do you have any idea how many requests it takes to trigger the limit?
Looking at the YouTube API docs, it seems that they allow up to 10k GET requests per 24 hours, but this is for requests made with an API key: YouTube Data API Overview | Google for Developers. It’s not clear to me how unauthenticated requests to pull in the video preview images are rate limited.
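For a rough sense of scale, here is a back-of-the-envelope calculation in Python. It assumes, purely for illustration, that each YouTube link costs exactly one request and that the 10k-per-24-hours figure applies, which is only stated for requests made with an API key. The function name `days_to_refetch` is made up for this sketch.

```python
import math


def days_to_refetch(link_count, daily_quota=10_000):
    """Days needed to refetch `link_count` links at one request per link.

    Assumption: one request per link and a fixed daily quota. The real
    limits for unauthenticated preview requests are unknown.
    """
    return math.ceil(link_count / daily_quota)


# Worst case: every one of 2M posts contains one YouTube link.
print(days_to_refetch(2_000_000))  # 200
```

The real number of YouTube links on a forum will be far lower than the post count, so this is only an upper bound under the stated assumptions.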
I resolved the issue by using Onebox Assistant without any API, just with the plugin enabled. I have no idea how it resolved my problem, and I don’t know whether it would still work nowadays.