Discourse AI - Toxicity

Falco · August 26, 2024, 19:21

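We found that the toxicity model works well for instances with zero tolerance for typical toxicity, such as "brand"-owned instances. For other, more community-oriented Discourse instances it was too strict and generated too many flags where moderation is more lenient.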

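Because of that, our current plan is to deprecate Toxicity detection and move this feature to our AI Triage plugin, where we provide a customizable prompt so admins can adapt automated toxicity detection to the levels their instance allows.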
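As a rough illustration of the idea, and not the AI Triage plugin's actual implementation, a prompt-based check with a per-instance tolerance might look like the sketch below. The policy wording, `POLICIES`, `build_prompt`, `triage_post`, and the `fake_llm` stand-in are all hypothetical; a real setup would send the prompt to whichever LLM the admin has configured.

```python
from typing import Callable

# Hypothetical per-instance policies; an admin would write their own wording.
POLICIES = {
    "strict": "Flag any insult, profanity, or hostile tone.",
    "lenient": (
        "Flag only threats, slurs, or targeted harassment. "
        "Casual profanity and heated disagreement are allowed."
    ),
}


def build_prompt(post: str, tolerance: str) -> str:
    """Combine the instance's policy with the post to classify."""
    policy = POLICIES[tolerance]
    return (
        "You are a forum moderation assistant.\n"
        f"Policy: {policy}\n"
        "Answer with exactly one word, TOXIC or OK.\n\n"
        f"Post:\n{post}"
    )


def triage_post(post: str, tolerance: str, llm: Callable[[str], str]) -> bool:
    """Return True if the model's answer says the post should be flagged."""
    answer = llm(build_prompt(post, tolerance))
    return answer.strip().upper() == "TOXIC"


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, for demonstration only."""
    lenient = "Casual profanity" in prompt
    post = prompt.split("Post:\n", 1)[1].lower()
    if "i will hurt you" in post:
        return "TOXIC"
    if "damn" in post and not lenient:
        return "TOXIC"
    return "OK"
```

With this stub, the same mildly profane post is flagged under the "strict" policy and passes under the "lenient" one, which is the kind of per-instance adjustment a fixed toxicity model could not offer.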

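We also plan to offer our customers a hosted LLM moderation model, such as https://ai.google.dev/gemma/docs/shieldgemma or https://arxiv.org/abs/2312.06674. These performed very well in our internal evaluations against the same dataset used in the original Jigsaw Kaggle competition that spawned Detoxify.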
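The post does not say which metrics the internal evaluation used. Purely as a generic sketch, a moderation model can be scored against labeled comments with precision and recall. The `evaluate` helper, the toy keyword classifier, and the sample data below are hypothetical and are not drawn from the Jigsaw dataset.

```python
from typing import Callable, Iterable, Tuple


def evaluate(
    classify: Callable[[str], bool],
    examples: Iterable[Tuple[str, bool]],
) -> Tuple[float, float]:
    """Return (precision, recall) of `classify` on labeled examples."""
    tp = fp = fn = 0
    for text, is_toxic in examples:
        predicted = classify(text)
        if predicted and is_toxic:
            tp += 1  # correctly flagged
        elif predicted and not is_toxic:
            fp += 1  # flagged a harmless post
        elif not predicted and is_toxic:
            fn += 1  # missed a toxic post
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labeled sample: (text, is_toxic).
samples = [
    ("you are an idiot", True),
    ("thanks for the help", False),
    ("this idiot-proof guide is great", False),
    ("get lost, loser", True),
]


def toy_classifier(text: str) -> bool:
    """Naive keyword match, standing in for a real moderation model."""
    return "idiot" in text


precision, recall = evaluate(toy_classifier, samples)
```

Low precision corresponds to the "too many flags" problem described above: harmless posts get flagged alongside toxic ones.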

Topic	Replies	Views	Activity
Setting up toxicity detection in your community Site Management moderation, automation, how-to, ai	0	856	Aug 7, 2024
Have AI check for inappropriate post or at least words and flag the post Support ai, ai-toxicity	3	404	Jul 7, 2023
Discourse Google Perspective API Plugin official, perspective-api	2	20976	Aug 10, 2024
Setting up NSFW detection in your community Site Management moderation, automation, how-to, ai	0	719	Oct 10, 2024
AI flagging too sensitive Support ai, ai-toxicity	2	578	Mar 31, 2024