Discourse AI - Toxicity

Discourse · April 24, 2023, 7:39pm

This topic covers the configuration of the Toxicity feature of the Discourse AI plugin.

Required user level: Administrator

The Toxicity module can automatically score the toxicity of every new post and chat message in your Discourse instance. You can also enable automatic flagging of content that crosses a threshold.

Classifications are stored in the database, so as soon as the plugin is enabled you can use Data Explorer to report on how new content in Discourse is being classified. We will soon ship some default Data Explorer queries with the plugin to make this easier.
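As a rough illustration of the kind of report this makes possible, the Python sketch below aggregates a handful of stored classifications per category. Data Explorer itself runs SQL against the database; the record layout used here (target_type, target_id, and a classification mapping of category to score) is a hypothetical stand-in, not the plugin's actual schema, and the scores are made-up sample values.

```python
from collections import defaultdict


def summarize(records):
    """Return {category: (count, mean_score, max_score)} across all records."""
    by_category = defaultdict(list)
    for record in records:
        for category, score in record["classification"].items():
            by_category[category].append(score)
    return {
        category: (len(values), sum(values) / len(values), max(values))
        for category, values in by_category.items()
    }


# Hypothetical sample rows; real rows would come from the database.
records = [
    {"target_type": "Post", "target_id": 1,
     "classification": {"toxicity": 10, "insult": 4}},
    {"target_type": "Post", "target_id": 2,
     "classification": {"toxicity": 90, "insult": 70}},
]

summary = summarize(records)
for category, (count, mean, peak) in sorted(summary.items()):
    print(f"{category}: n={count} mean={mean:.1f} max={peak}")
```

A summary like this can help when choosing flag thresholds, since it shows how scores are distributed before any automatic flagging is turned on.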

Settings

ai_toxicity_enabled: Enables or disables the module
ai_toxicity_inference_service_api_endpoint: URL where the API for the toxicity module is running. If you are using CDCK hosting, this is handled for you automatically. If you are self-hosting, check the self-hosting guide.
ai_toxicity_inference_service_api_key: API key for the toxicity API configured above. If you are using CDCK hosting, this is handled for you automatically. If you are self-hosting, check the self-hosting guide.
ai_toxicity_inference_service_api_model: We offer three different models: original, unbiased, and multilingual. The unbiased model is recommended over original because it tries not to carry over biases introduced by the training material into the classification. For multilingual communities, the multilingual model supports Italian, French, Russian, Portuguese, Spanish, and Turkish.
ai_toxicity_flag_automatically: Automatically flag posts/chat messages when the classification for a specific category surpasses the configured threshold. Available categories are toxicity, severe_toxicity, obscene, identity_attack, insult, threat, and sexual_explicit. There’s an ai_toxicity_flag_threshold_${category} setting for each one.
ai_toxicity_groups_bypass: Users in these groups will not have their posts classified by the toxicity module. By default, this includes staff users.
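The way the flagging settings above interact can be sketched in Python. This is a minimal illustration of the described behavior, not the plugin's implementation: the function names, the group names, the sample scores, and the assumption that scores and thresholds share the same numeric scale are all hypothetical.

```python
# Categories listed for ai_toxicity_flag_automatically.
CATEGORIES = (
    "toxicity", "severe_toxicity", "obscene", "identity_attack",
    "insult", "threat", "sexual_explicit",
)


def categories_over_threshold(scores, thresholds):
    """Return the categories whose score surpasses its configured threshold."""
    return [
        category
        for category in CATEGORIES
        if category in scores
        and category in thresholds
        and scores[category] > thresholds[category]  # "surpasses": strictly greater
    ]


def should_flag(scores, thresholds, user_groups=(),
                bypass_groups=("staff",), flag_automatically=True):
    """Decide whether a post or chat message would be flagged automatically."""
    if not flag_automatically:
        return False
    # Authors in a bypass group are never classified, so they are never flagged.
    if set(user_groups) & set(bypass_groups):
        return False
    return bool(categories_over_threshold(scores, thresholds))


# Hypothetical values: one threshold per category, standing in for the
# ai_toxicity_flag_threshold_${category} settings.
thresholds = {category: 80 for category in CATEGORIES}
scores = {"toxicity": 91, "insult": 85, "threat": 3}

print(categories_over_threshold(scores, thresholds))
print(should_flag(scores, thresholds, user_groups=["trust_level_1"]))
print(should_flag(scores, thresholds, user_groups=["staff"]))
```

In this sketch a single category above its threshold is enough to flag, which matches the description of one threshold setting per category; a score exactly equal to the threshold does not flag.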

Additional resources

Last edited by @hugh 2024-08-06T05:37:39Z

Last checked by @hugh 2024-08-06T05:37:44Z


