We want to build an automated pipeline that can update the Discourse custom theme we built, which has different tabs for different kinds of scraped content. The pipeline:
Scrapes content from sources (RSS feeds, websites, etc.)
Structures the data with metadata: title, source, type (news/conferences), URL, date
Uses the Discourse API to:
Create a topic under the correct category and update content within the specific tabs of the custom theme.
Add relevant tags (to make it appear under the correct tab)
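As a starting point, the topic-creation step above can be sketched against the Discourse REST API (`POST /posts.json`, authenticated with `Api-Key`/`Api-Username` headers). The field mapping, category id, and URLs here are assumptions to adapt to your site:

```python
import json
import urllib.request

def build_topic_payload(item, category_id):
    """Map one scraped item onto the body of a Discourse POST /posts.json call."""
    return {
        "title": item["title"],
        # "raw" is the body of the topic's first post; render metadata as markdown
        "raw": f"{item['title']}\n\nSource: {item['source']}\n{item['url']}",
        "category": category_id,
        # the tag is what makes the topic show up under the right theme tab
        "tags": [item["type"]],
    }

def post_topic(base_url, api_key, api_username, payload):
    """Create the topic via the Discourse REST API (stdlib only, no gems)."""
    req = urllib.request.Request(
        f"{base_url}/posts.json",
        data=json.dumps(payload).encode(),
        headers={
            "Api-Key": api_key,
            "Api-Username": api_username,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    return urllib.request.urlopen(req)

# Hypothetical scraped item; category id 12 is a placeholder.
item = {
    "title": "Example conference",
    "source": "example.org",
    "type": "conferences",
    "url": "https://example.org/conf",
    "date": "2024-01-01",
}
payload = build_topic_payload(item, category_id=12)
# post_topic("https://forum.example.com", "<api key>", "system", payload)
```

The actual network call is left commented out; the interesting part is the payload shape, which a scheduler can reuse per scraped row.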
What is the best way to store the scraped data and render it: a local database, or an external CMS to store and schedule content?
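For the local-database option, a single table with a "published" flag is often enough: the scraper inserts rows, and a scheduled job picks up unpublished ones and pushes them to Discourse. A minimal SQLite sketch (table and column names are assumptions):

```python
import sqlite3

# Use a real file path in practice; :memory: keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IF NOT EXISTS scraped_items (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        source    TEXT,
        type      TEXT,          -- 'news' or 'conferences': maps to a tag/tab
        url       TEXT UNIQUE,   -- UNIQUE so re-scraping doesn't duplicate rows
        date      TEXT,
        published INTEGER DEFAULT 0
    )
""")
conn.execute(
    "INSERT OR IGNORE INTO scraped_items (title, source, type, url, date) "
    "VALUES (?, ?, ?, ?, ?)",
    ("Example conference", "example.org", "conferences",
     "https://example.org/conf", "2024-01-01"),
)
# The scheduled job would fetch these, push them, then set published = 1.
pending = conn.execute(
    "SELECT title, type, url FROM scraped_items WHERE published = 0"
).fetchall()
```

The `UNIQUE` constraint plus `INSERT OR IGNORE` gives cheap de-duplication across repeated scrape runs.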
Thanks pfaffman for the plugin suggestions. However, we don't have RSS feed data; we are storing the scraped data in a standalone database. Can we use this plugin to connect to the standalone database, fetch the needed data, and render the content?
It was an example. You could either turn your scraped data into an RSS feed or modify the plugin to read whatever format you want to put it in.
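The first option, turning database rows into a feed the plugin can poll, is a small amount of code. A rough stdlib sketch of rendering rows as RSS 2.0 (the field names mirror the hypothetical schema above):

```python
import xml.etree.ElementTree as ET

def items_to_rss(items, feed_title="Scraped content"):
    """Render stored rows as a minimal RSS 2.0 document for a feed-polling plugin."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    for it in items:
        entry = ET.SubElement(channel, "item")
        ET.SubElement(entry, "title").text = it["title"]
        ET.SubElement(entry, "link").text = it["url"]
        ET.SubElement(entry, "pubDate").text = it["date"]
        ET.SubElement(entry, "category").text = it["type"]
    return ET.tostring(rss, encoding="unicode")

xml_out = items_to_rss([{
    "title": "Example conference",
    "url": "https://example.org/conf",
    "date": "Mon, 01 Jan 2024 00:00:00 +0000",  # RSS expects RFC 822 dates
    "type": "conferences",
}])
```

You would serve this string from any small HTTP endpoint and point the polling plugin at its URL.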
What I would probably do is write the scraper in Ruby and integrate it into a plugin.
Or maybe use the Discourse API Ruby gem, put it in a GitHub Action, and have it push the data. I'm planning to do that for a client who is hosted and can't use a custom plugin.
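That GitHub Actions route could look roughly like the workflow below: a scheduled job that installs the `discourse_api` gem and runs a push script against the forum. The cron cadence, script name, and secret names are all assumptions, not part of the original suggestion:

```yaml
# .github/workflows/push-scraped-data.yml (hypothetical name)
name: Push scraped data to Discourse
on:
  schedule:
    - cron: "0 6 * * *"   # daily at 06:00 UTC; pick whatever cadence fits
  workflow_dispatch:       # allow manual runs too
jobs:
  push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: "3.2"
      - run: gem install discourse_api
      # push_to_discourse.rb is an assumed script that reads your database
      # and creates topics via the discourse_api gem
      - run: ruby push_to_discourse.rb
        env:
          DISCOURSE_API_KEY: ${{ secrets.DISCOURSE_API_KEY }}
          DISCOURSE_URL: ${{ secrets.DISCOURSE_URL }}
```

Keeping the API key in repository secrets means the workflow works the same on hosted Discourse, where a custom plugin isn't an option.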