RSS Polling plugin removes tags added manually

I’m importing RSS feeds with the RSS Polling plugin. I’m not adding any tags by default. Instead, the plan is that users add the tags manually. But the plugin removes the tags on the next pull. See for instance

EDIT: I thought it could be related to the setting create post for category and tag changes, but I have changed it and the tags are still being deleted. For example:

EDIT 2: Ok, this is even stranger. There is a post saying that some tags have been removed, but the tags are still there…

Running Discourse 3.3.0.beta3-dev ( c13f64d35b ) and RSS Polling 0.0.1 be7b56e.

Maybe related? There is this recurring error in the logs:

Job exception: undefined method `name' for an instance of String 

/var/www/discourse/app/models/topic_embed.rb:125:in `map'
/var/www/discourse/app/models/topic_embed.rb:125:in `import'
/var/www/discourse/plugins/discourse-rss-polling/app/jobs/jobs/discourse_rss_polling/poll_feed.rb:52:in `block in poll_feed'
/var/www/discourse/plugins/discourse-rss-polling/app/jobs/jobs/discourse_rss_polling/poll_feed.rb:41:in `each'
/var/www/discourse/plugins/discourse-rss-polling/app/jobs/jobs/discourse_rss_polling/poll_feed.rb:41:in `poll_feed'
/var/www/discourse/plugins/discourse-rss-polling/app/jobs/jobs/discourse_rss_polling/poll_feed.rb:20:in `execute'
/var/www/discourse/app/jobs/base.rb:305:in `block (2 levels) in perform'
rails_multisite-6.0.0/lib/rails_multisite/connection_management/null_instance.rb:49:in `with_connection'
rails_multisite-6.0.0/lib/rails_multisite/connection_management.rb:21:in `with_connection'
/var/www/discourse/app/jobs/base.rb:292:in `block in perform'
/var/www/discourse/app/jobs/base.rb:288:in `each'
/var/www/discourse/app/jobs/base.rb:288:in `perform'
sidekiq-6.5.12/lib/sidekiq/processor.rb:202:in `execute_job'
sidekiq-6.5.12/lib/sidekiq/processor.rb:170:in `block (2 levels) in process'
sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:177:in `block in invoke'
/var/www/discourse/lib/sidekiq/pausable.rb:132:in `call'
sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:179:in `block in invoke'
sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:182:in `invoke'
sidekiq-6.5.12/lib/sidekiq/processor.rb:169:in `block in process'
sidekiq-6.5.12/lib/sidekiq/processor.rb:136:in `block (6 levels) in dispatch'
sidekiq-6.5.12/lib/sidekiq/job_retry.rb:113:in `local'
sidekiq-6.5.12/lib/sidekiq/processor.rb:135:in `block (5 levels) in dispatch'
sidekiq-6.5.12/lib/sidekiq.rb:44:in `block in <module:Sidekiq>'
sidekiq-6.5.12/lib/sidekiq/processor.rb:131:in `block (4 levels) in dispatch'
sidekiq-6.5.12/lib/sidekiq/processor.rb:263:in `stats'
sidekiq-6.5.12/lib/sidekiq/processor.rb:126:in `block (3 levels) in dispatch'
sidekiq-6.5.12/lib/sidekiq/job_logger.rb:13:in `call'
sidekiq-6.5.12/lib/sidekiq/processor.rb:125:in `block (2 levels) in dispatch'
sidekiq-6.5.12/lib/sidekiq/job_retry.rb:80:in `global'
sidekiq-6.5.12/lib/sidekiq/processor.rb:124:in `block in dispatch'
sidekiq-6.5.12/lib/sidekiq/job_logger.rb:39:in `prepare'
sidekiq-6.5.12/lib/sidekiq/processor.rb:123:in `dispatch'
sidekiq-6.5.12/lib/sidekiq/processor.rb:168:in `process'
sidekiq-6.5.12/lib/sidekiq/processor.rb:78:in `process_one'
sidekiq-6.5.12/lib/sidekiq/processor.rb:68:in `run'
sidekiq-6.5.12/lib/sidekiq/component.rb:8:in `watchdog'
sidekiq-6.5.12/lib/sidekiq/component.rb:17:in `block in safe_thread'
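
For readers of the trace: the top two frames suggest that a `map` inside the TopicEmbed import is calling `name` on a value that is a plain String. Below is a minimal Ruby sketch of that failure mode, assuming the tags sometimes arrive as strings rather than tag records. `TagRecord`, `tag_names`, and `safe_tag_names` are hypothetical names used only for illustration; none of this is Discourse code.

```ruby
# Hypothetical stand-in for a tag record; NOT Discourse's Tag model.
TagRecord = Struct.new(:name)

# Assumed shape of the failing call: every element is expected to respond to #name.
def tag_names(tags)
  tags.map { |tag| tag.name }
end

# Tolerant variant: accepts an array of tag records and/or plain strings, or nil.
def safe_tag_names(tags)
  Array(tags).map { |tag| tag.respond_to?(:name) ? tag.name : tag.to_s }
end

tag_names([TagRecord.new("rss")]) # fine: every element responds to #name

begin
  tag_names(["rss"]) # a String has no #name method
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end
```

Whether the real cause is a list of strings being passed where records are expected can only be confirmed by looking at line 125 of `topic_embed.rb` in the affected version.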

And I think I have found a pattern:

  • If the “Tags” field in /admin/plugins/rss_polling is empty, the tags added manually are removed on the next poll.
  • If that field has a tag, then the tags added manually seem to stay.

After testing further, I’m pretty sure that the problem is that tags are automatically removed when the RSS feed on /admin/plugins/rss_polling doesn’t have any tags assigned.

I can repro this. :raised_hand:

My step-by-step:

  • Add https://meta.discourse.org/c/bug/1.rss to RSS Polling
  • Set user and category, but leave tags blank (save)
  • Wait for topics to be pulled in
  • Select a couple and manually add tags
  • Wait for the next poll
  • See that a topic with manually added tags has been edited to remove them

Expected: Polled topic with manually added tags should not be edited to remove those tags
Actual: Polled topic tags are overwritten

(No error in the /logs though)

Strange. It keeps reporting this error. 1344 instances accumulated in a week. Looks generic enough.

Is there a fix on the near-term roadmap?

I imagine it is still early. Just wondering if there is any news.

This is happening in TopicEmbed in core

I’ve updated this so that if tags are nil or missing (as they are in the repro), the tags won’t get updated:

This fixes the repro as listed
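
A rough Ruby sketch of the guard described above, for illustration only. It is not the actual core change: `tags_after_poll` and its arguments are made-up names, and it assumes the feed's configured tags arrive as `nil` when the field is left blank.

```ruby
# Hypothetical helper illustrating the nil guard; not the actual TopicEmbed code.
# current_tags: tags currently on the topic (including manual additions)
# feed_tags:    tags configured for the feed, or nil when none are set (assumption)
def tags_after_poll(current_tags, feed_tags)
  return current_tags if feed_tags.nil? # nothing configured: leave the topic's tags alone
  feed_tags                             # otherwise the configured tags replace the current ones
end

tags_after_poll(["manual-tag"], nil)           # => ["manual-tag"]
tags_after_poll(["manual-tag"], ["from-feed"]) # => ["from-feed"]
```

Note that in this sketch only the blank case is protected; a feed with a configured tag would still replace manually added tags.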

Hi, I think there is a new problem now. I just updated Discourse to 3.3.0.beta4-dev ( 7b8863fcd5 ) and now there are some imported posts that keep being updated at every poll, with no diff changes to be seen. This is what these posts have in common:

  • They are all imported with RSS Polling.
  • They received one tag when they were imported, the one set in RSS Polling.
  • We added a second tag manually.

Here is an example diff. No changes to be seen:

For us, this is a regression. The previous bug could be circumvented by adding a default tag to all RSS feeds. Then tags could be added manually without trouble. Now our Latest list is being spammed by these recurrent updates without changes.

Updated: or maybe it’s something more specific or local? Because not all the topics with tags added manually are being updated. I’m removing and re-adding tags in some of the updated posts to see if I find a pattern. I will reply here with any findings.

Ok, I can confirm this pattern for the topics that are resurfacing:

  • They are all imported with RSS Polling.
  • They received one tag when they were imported, the one set in RSS Polling.
  • We added a second tag manually.

I was confused because only some topics with an additional tag keep being refreshed and not all of them, but the answer is simple: the affected topics are still present in the RSS feed, whereas older topics that have dropped out of the feed (which only carries recent entries) are logically not being triggered.

If possible, the implementation should be as simple as this:

  1. If an imported topic is new, import the tags defined in the RSS Polling settings, if any.
  2. If an imported topic isn’t new, don’t check tags at all.

This way newly imported topics come with the expected tags (or none, if no tags are defined) and existing topics don’t get any changes/refreshes because of manually edited tags.
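
The two rules above can be sketched in Ruby as follows. This is only an illustration of the proposal: `tags_for_import` and its keyword arguments are hypothetical and do not reflect how the plugin or core structure this logic.

```ruby
# Hypothetical sketch of the two proposed rules; not plugin or core code.
def tags_for_import(topic_is_new:, existing_tags:, feed_tags:)
  if topic_is_new
    Array(feed_tags) # rule 1: a new topic gets the feed's configured tags (possibly none)
  else
    existing_tags    # rule 2: an existing topic keeps whatever tags it has now
  end
end

tags_for_import(topic_is_new: true, existing_tags: [], feed_tags: ["news"])
# => ["news"]
tags_for_import(topic_is_new: false, existing_tags: ["news", "manual"], feed_tags: ["news"])
# => ["news", "manual"]
```

Because the second branch never compares tags, a re-polled topic would not be flagged as changed on account of its tags.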

Can you revert this patch until a tested solution is in place, please? Our Latest main page is occupied by these old entries and we are lucky that we just started tagging and there is only a handful of entries manually tagged. Otherwise I can remove the second tags for now…

I am still seeing it remove tags when pulling.

You can see that the “meta-hmd” tag is being removed. This tag was added manually, and the next pull removes it.
The “UploadVR” tag is configured in the RSS Polling plugin. :slightly_frowning_face:

As @RGJ linked, the embed import sees that the tags have changed and re-imports the post, removing the added tag.

Maybe a toggle could be added to ignore Discourse topic tag changes?

Added an option to disable tag updates on RSS polls. Let us know if this resolves your use case.

Awesome, thank you for the speedy fix!

I was wondering if this PR can be evaluated

The patch @Heliosurge mentions comes from RSS Polling setting to use pubDate to set the date of imported topics. For me it is very relevant to this topic because we commissioned that feature and we have that RSS Polling version installed on our server. We have thousands of imported topics with correct dates, and I fear that installing the stock RSS plugin to test yesterday’s patch by @featheredtoast might break things by changing thousands of topics to incorrect dates or something.

For what it’s worth, we have been running the patched version for weeks and we have imported dozens of different feeds without a single glitch. It works great and as intended.

I have switched back to the official plugin.

Any topics no longer being polled should be fine.

As you reported, even prior to this patch, old imported topics that are no longer recent enough to appear in the feed should remain unchanged.

So any topics you tagged manually that are no longer being pulled will be fine.

@Heliosurge if you can test the fix here that would be great. We are still adding new feeds almost on a daily basis and for us keeping the PubDate is crucial since each import might have dozens or even hundreds of entries. In comparison, we can wait for the solution on the tags.

If you’re still adding brand-new feeds, those new feeds will not have pubDate honored, as you know.

As I mentioned, any old feed entries that are no longer being fetched due to age will not have their tags changed.

I am not sure how old an RSS topic has to be for it to no longer be in the queue.