Race condition between RSS and embed importer

We are quite happy users of @eviltrout's embed tutorial for our sites, but there is one small thing that regularly bites us:

  1. We publish our site with a static generator.
  2. Via the Sidekiq admin panel we trigger the pollfeed job so that the original post gets created from the RSS feed.
  3. Sometimes a user finds the new post before the pollfeed job has run and visits the site, so the article gets embedded via the scraper in the embed code. We then lose all styling, images and so on, which would have been preserved by the RSS import.

We are curious to hear ideas on how to solve this race condition. Should we also push the new article from the site via the discourse-api? Or add an option to disable the scraper for some embed sites?
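
To make the discourse-api idea concrete, here is a rough sketch of what the static generator's publish step could do: create the topic itself, tied to the article URL, before any visitor can trigger the scraper. This is only a sketch under assumptions. It assumes that `POST /posts.json` accepts an `embed_url` parameter on your Discourse version and that API requests authenticate with the `Api-Key` and `Api-Username` headers. The function name `build_topic_request` and all example values are made up for illustration. Note also that content posted this way may be cooked and sanitized differently from what the RSS import produces, so the styling question would need testing.

```python
import json
import urllib.request


def build_topic_request(base_url, api_key, api_username, title, raw, embed_url):
    """Build (but do not send) a request that creates a topic for an article.

    Assumption: the target Discourse accepts `embed_url` on POST /posts.json
    and links the new topic to that URL, so the embed code finds it later.
    """
    payload = {
        "title": title,
        "raw": raw,              # article body as rendered by the site generator
        "embed_url": embed_url,  # canonical URL of the article (assumed parameter)
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/posts.json",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Api-Key": api_key,
            "Api-Username": api_username,
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical values; replace with your own forum, key and article.
    req = build_topic_request(
        base_url="https://forum.example.com",
        api_key="changeme",
        api_username="system",
        title="A new article on our site",
        raw="<p>Full article body with <img src='https://example.com/a.png'></p>",
        embed_url="https://example.com/blog/new-article/",
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Running this as part of the deploy, before the new page goes live, would close the window in which the scraper can win the race, provided the assumption about `embed_url` holds.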

Why is the RSS import different from the regular import? If your markup is confusing the crawler, you can use the embed whitelist selector to improve its accuracy.


Well, it strips out all images, for example, which is not what we want right now. Is there any way to prevent that?

Should we add the figure and img tags to the whitelist to make it work?