We use the rss-polling plugin to import entries from dozens of feeds. From time to time, some feeds will just vanish from the list. We don’t know under which conditions, we don’t know how, and we cannot find any log of such action. We haven’t found a pattern either. They just vanish, and we need to create their new entries again (when we realize they have disappeared, which isn’t trivial with so many feeds).
Is there a functionality written in the code that would eliminate an RSS feed under certain conditions? Is there a log that would register when this happens?
The only thing we do with RSS feeds is create new ones manually at /admin/plugins/rss_polling. Nothing else.
I have noticed that the order of RSS feeds on that page changes from time to time. This is already strange, and it makes debugging harder: the chronological order of feed creation is lost, and the new order doesn’t seem to be alphabetical or numerical.
We don’t know how often this problem happens because we cannot detect it right away when it happens. In a best case scenario we can only wait for the owner of the feed to tell us that new entries are not appearing, or we realize ourselves that, say, a weekly podcast has been silent on the forum lately.
I have the impression that this problem might be related to situations where the CPU gets close to 100%, or something else in the server reaching maximum capacity. Even if the scheduler does a good job at polling only 5 feeds at a time, on the hosting analytics we can see that when RSS feeds are being updated the CPU basically reaches 100%.
Maybe this is why forums with just a couple of feeds don’t notice it, whereas with many feeds something might occasionally break regardless of the scheduler?
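To make "polling only 5 feeds at a time" concrete, here is a minimal sketch of batching in Python. It is not the plugin's actual scheduler code; the function name `in_batches` and the example URLs are made up for illustration, and only the batch size of 5 comes from the post above.

```python
from typing import Iterator, List, Sequence


def in_batches(feed_urls: Sequence[str], size: int = 5) -> Iterator[List[str]]:
    """Yield the feeds in consecutive groups of at most `size` (hypothetical helper)."""
    for start in range(0, len(feed_urls), size):
        yield list(feed_urls[start:start + size])


# 12 made-up feeds are processed as groups of 5, 5 and 2.
feeds = [f"https://podcast{n}.example/rss" for n in range(12)]
for batch in in_batches(feeds):
    print(len(batch), "feeds in this batch")
```

Batching limits how many feeds are handled at once, but the total work per polling cycle still grows with the number of feeds, which fits the observation that only large installations see the CPU peak.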
In any case, I wonder why Discourse should delete RSS feeds in the first place, and do it silently. If an RSS feed is problematic for some reason, the poller could skip it this one time and log the problem. From this point two things could happen:
1. It was an occasional problem and the next scheduler runs will work just fine. The log entry will be forgotten.
2. The problem persists, every skip leaves an entry in the error log, and when someone realizes there is a problem the admins can check the error log and find the details.
Ideally admins would get some kind of notification, but I understand if this adds more work to a potential solution. Skip instead of delete + error log entry would be a big improvement already.
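The "skip instead of delete + error log entry" behaviour proposed above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the plugin is implemented: `poll_all`, the `fetch` callback, and the logger name are all hypothetical.

```python
import logging
from typing import Callable, Dict, Iterable, List, Tuple

logger = logging.getLogger("rss_polling_sketch")  # hypothetical logger name


def poll_all(
    feed_urls: Iterable[str],
    fetch: Callable[[str], List[str]],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Poll every feed; a failing feed is skipped and logged, never removed."""
    entries: Dict[str, List[str]] = {}
    skipped: List[str] = []
    for url in feed_urls:
        try:
            entries[url] = fetch(url)
        except Exception as exc:
            # The feed stays configured; only this run is skipped.
            logger.error("Skipping feed %s this run: %s", url, exc)
            skipped.append(url)
    return entries, skipped


def demo_fetch(url: str) -> List[str]:
    """Stand-in for a real HTTP fetch; fails for one made-up feed."""
    if "broken" in url:
        raise TimeoutError("simulated timeout")
    return [url + "#latest-episode"]


feeds = ["https://a.example/rss", "https://broken.example/rss"]
entries, skipped = poll_all(feeds, demo_fetch)
print("imported from:", sorted(entries))
print("skipped this run:", skipped)
```

The list of configured feeds is never modified by the poller here, so a temporary failure leaves a trace in the log and nothing else.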
To explain the impact this problem has on our project: we run a portal of independent podcasts. They contact us to be aggregated, and we create a channel (subcategory) where their programs are imported and where listeners can like and comment… It gives a very bad impression of our project and of the Discourse platform when they realize we have stopped aggregating them, and we have to tell them that somehow their feed has been deleted…
Let’s start with this fix for now. I’m not saying this is the cause of feeds mysteriously being deleted, but adding at least a confirmation dialog box before actually deleting a feed is a start.
I’ll work on more improvements later when I have more time. I want to add a proper UI for editing and saving individual feeds, instead of updating and saving ALL of them on every edit.
@icaria36 if you update your plugin you will get this latest change. I’ll follow up again when I have more improvements ready.
Thank you for the patch! I immediately upgraded to the new version and added the missing RSS feeds (about 20!), and today I checked and… feeds have been silently removed again. I don’t know when, I don’t know how, I still don’t know which ones, but there are about a dozen missing. I haven’t seen any dialog asking about feeds before removing them, or anything at all. Also, I took care of re-adding RSS feeds when the server CPUs were fine, avoiding the RSS polling time, just to remove that factor from the equation.
I have the impression that there is a connection between the fact that the list changes its sorting and the fact that feeds disappear. Whatever causes the change of sorting might be traumatic enough to also have casualties. A useful question could be: why does the list change its sorting in the first place? Feeds disappearing while the list sorting remains the same would be just as bad, but it would feel… cleaner.
This problem is tedious to fix on our side (finding out which feeds are missing before their owners notice and ask, then re-adding the feed entries), but it is even more tedious to detect. The only way is to check category after category to see which feeds are missing, or at least to count manually how many feeds there are compared to the last time we counted.
A great time-saver for debugging would be to have a string showing the total number of feeds in /admin/plugins/rss_polling, i.e. just “N feeds”. And/or make the list numbered.
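Until something like that exists in the UI, the manual count can be replaced by comparing two saved snapshots of the feed list. The sketch below assumes you already have the feed URLs as plain lists (for example, copied from the admin page into a text file); how you obtain them is left open, and the function names are made up.

```python
from typing import Iterable, List


def missing_feeds(previous: Iterable[str], current: Iterable[str]) -> List[str]:
    """Return feed URLs present in the earlier snapshot but absent now."""
    return sorted(set(previous) - set(current))


def summary(current: Iterable[str]) -> str:
    """The kind of "N feeds" string suggested above."""
    return f"{len(set(current))} feeds"


# Made-up snapshots taken a week apart.
last_week = ["https://a.example/rss", "https://b.example/rss", "https://c.example/rss"]
today = ["https://c.example/rss", "https://a.example/rss"]

print(summary(today))
print("missing since last snapshot:", missing_feeds(last_week, today))
```

Because the comparison uses sets, a change in the order of the list does not produce false alarms; only feeds that are really gone are reported.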
I’ve improved the admin UI quite a bit with my latest changes, and it should address the issue of RSS feeds mysteriously being removed.
There is now individual create, update, and delete logic instead of it trying to update every single feed anytime you modified something. Should work much better now.
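For anyone wondering why this difference matters, here is a hypothetical Python illustration. It is not the plugin's code, and the thread does not confirm that this is what caused the disappearing feeds; it only shows how a "save everything" operation can drop entries when the submitted list is stale or incomplete, while per-feed operations cannot. `FeedStore` and all URLs are made up.

```python
from typing import Dict


class FeedStore:
    """In-memory stand-in for wherever feeds are kept (hypothetical)."""

    def __init__(self) -> None:
        self.feeds: Dict[str, str] = {}  # feed URL -> category name

    def replace_all(self, submitted: Dict[str, str]) -> None:
        # Bulk save: whatever the client sent becomes the whole list.
        self.feeds = dict(submitted)

    def create(self, url: str, category: str) -> None:
        # Individual save: touches exactly one feed.
        self.feeds[url] = category

    def delete(self, url: str) -> None:
        del self.feeds[url]


def seeded_store() -> FeedStore:
    store = FeedStore()
    for name in ("a", "b", "c"):
        store.create(f"https://{name}.example/rss", name)
    return store


# An admin page that only knows about feeds a and b (stale or partial copy).
stale_page = {"https://a.example/rss": "a", "https://b.example/rss": "b"}

bulk = seeded_store()
bulk.replace_all({**stale_page, "https://d.example/rss": "d"})
print(sorted(bulk.feeds.values()))  # feed c is gone without anyone deleting it

single = seeded_store()
single.create("https://d.example/rss", "d")
print(sorted(single.feeds.values()))  # a, b, c and d are all still there
```

With individual operations, a feed can only disappear through an explicit `delete` call for that feed, which is also the natural place for a confirmation dialog.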
Thank you so much for your swift action! I will upgrade tonight, make sure that we have all our feeds enabled, and report here if there is anything (hopefully nothing at all!).
Wow, this is so much better! Each feed is edited individually, saving one change at a time. Saving a new feed is an instant action. Before, when you had many feeds, adding a new feed and pressing the save icon meant a few seconds of processing before the full list was refreshed. This feels solid and trustworthy. Thank you very much!
I will wait a couple of weeks to verify that no feeds are lost.
A couple of suggestions to improve the UI, just in case they are useful and land at a good time:
The “+” to add a new feed is at the top but the new row is added at the bottom. If the list of feeds is longer than the screen, the user clicks the “+” and doesn’t see anything happening. I know this page well, and at the beginning I thought the button was broken. Then I thought to check the end of the list and there was a new row there, waiting, just like before.
Adding the new row at the beginning of the list, right below the “+” button, could be a good alternative. If this puts the new feed at the top of the list, this is good too if it goes first in the scheduler. New feeds have higher chances of bringing more work and things to check than the established ones.
There are a couple of strings “Feed Settings” and “Discourse Settings” that seem to be just hanging there. “Feed Settings” seems redundant, given that the list is right below and self-explanatory. Is “Discourse Settings” supposed to be a link to /admin/site_settings/category/plugins?filter=plugin%3Adiscourse-rss-polling? Or maybe they are supposed to be tabs for two different subpages?
Ya, good point. I’ll keep this in mind. We do have this newish topic, “Creating consistent admin interfaces”, and I think some improvements to the rss-polling plugin will trickle down from that. I imagine we will likely create a separate edit screen, like we do for other plugins, rather than editing things inline.
Yes, thank you for pointing this out. I can see how this is confusing. “Feed Settings” is a label for the first two columns, indicating that those columns affect the RSS feed. The “Discourse Settings” label is for the last 3 columns, indicating that modifying them affects things in Discourse.
It’s been a week, we have added a couple dozen RSS feeds on top of the more than a hundred we already had, and the list is rock solid now. We haven’t detected any problems!