Best practices for AI antispam and AI post triage operating together

I’ve been having fantastic success with Discourse AI spam detection. In spite of my initial trepidation, it’s been extremely effective at keeping my self-hosted Discourse instance spam-free.

Because the Discourse instance I administer is attached to a weather forecasting site, there is the more-than-occasional discussion of politicized topics like climate change, the current administration’s policies re: NOAA and NASA, and other similar items. Because we have a very small moderation team that can’t be around all the time, I’ve set up an Automation that uses a “post triage” persona + prompt to check all new and edited posts for “culture war” items and flag them for attention. (I’ve got the triage automation set to flag only, not hide; the idea is to get human eyes on contentious topics faster to make sure the conversation stays civil.)

This is all working great. However, sometimes, the antispam AI and the triage AI both set flags on the same post. I’ve adjusted my triage prompt a bit to try to work around it, but I’m wary of compromising the effectiveness of triage by screwing with the prompt too much.

Are other folks dealing with the issue of having posts double-flagged by both the antispam and a forum triage automation? What’s the right solution here? Should I not use a triage automation with antispam, or am I missing some setting to have automation not flag posts with flags already set, or something?

I want to re-emphasize that everything is working great, and both systems are very effective! I just want to see if there’s a way I can avoid having things flagged twice and have the two different AI tasks stay out of each other’s way. Advice appreciated!

If double-flagging is causing a problem, then it seems like a bug in one or both of the modules. They should probably just not evaluate posts that are already flagged (and then maybe check again right before trying to set a flag, in case the other module flagged the post in the meantime).
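To make the check-before-flag idea above concrete, here is a minimal Python sketch. It is not Discourse code and does not use any Discourse API: `Post`, `flag_if_unflagged`, and the `is_problem` callback are all hypothetical names, and the in-memory `flags` set just stands in for whatever the forum actually stores.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Post:
    """Hypothetical stand-in for a forum post; not a Discourse model."""
    id: int
    text: str
    # Names of the modules that have flagged this post so far.
    flags: set[str] = field(default_factory=set)


def flag_if_unflagged(post: Post, source: str,
                      is_problem: Callable[[str], bool]) -> bool:
    """Flag `post` on behalf of `source` unless it already carries a flag.

    Returns True only if this call added a flag.
    """
    if post.flags:                 # first check: skip the evaluation entirely
        return False
    if not is_problem(post.text):  # stand-in for the (slow) LLM evaluation
        return False
    if post.flags:                 # second check: someone else flagged meanwhile
        return False
    post.flags.add(source)
    return True


post = Post(1, "buy cheap pills, and climate change is a hoax")
flag_if_unflagged(post, "antispam", lambda t: "cheap pills" in t)  # adds a flag
flag_if_unflagged(post, "triage", lambda t: "climate" in t)        # skipped
```

In a real system the second check and the write would have to happen atomically (for example inside one database transaction); this sketch only shows the ordering of the checks.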

Hmm, this certainly feels like a “sequencing” thing. I wonder if you could just switch to a single persona that handles both spam and triage? Or use one triage automation for TL0-1 that does spam + triage and another for TL2 that only does triage?
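The trust-level split suggested above amounts to routing each post to exactly one reviewer, so no post is evaluated twice. A rough Python sketch of that routing follows; the persona names are made up, and treating TL3 and above the same as TL2 is my assumption, since the suggestion only mentions TL2.

```python
def pick_persona(trust_level: int) -> str:
    """Return the single (hypothetical) persona that should review a post."""
    if trust_level <= 1:
        # New and basic users: one persona checks for spam and culture-war content.
        return "spam_and_triage"
    # TL2 and up (assumption): spam is unlikely, so only run triage.
    return "triage_only"


for tl in range(5):
    print(tl, pick_persona(tl))
```

Because every trust level maps to one persona, a post can pick up at most one AI flag under this scheme.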


Yeah, good calls, though it seems like this would mean disabling the built-in antispam feature and relying on automation instead, unless I’m missing something (very possible!).

Lemme think about this. That might in fact be the best way to do things.


This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.