Moderation API Plugin

:information_source: Summary The Discourse Moderation API Plugin adds automated moderation capabilities to your Discourse forum. It uses the Moderation API’s battle-tested detection engine and includes an improved moderation dashboard aimed at making moderators up to 10x more efficient.
:hammer_and_wrench: Repository Link https://github.com/moderation-api/discourse-moderation-api
:open_book: Install Guide How to install plugins in Discourse

:hammer_and_wrench: Highlighted Features

Moderation API is a full-stack moderation solution.

Automated Moderation Actions

  • Automatically flag comments and topics.
  • Choose from 20+ pre-built models for common use cases or build your own.
  • Detect toxicity, NSFW content, PII, spam, self-promotion, illegal activity, and more.
  • Set custom thresholds for automated flagging.

LLM-Powered Detection

  • Integrate your community guidelines into an AI agent.
  • Utilize AI as the first line of defense or as a trusted moderator to flag comments.

Custom Model Training

  • Develop your own AI models for the highest accuracy in moderation.
  • Use moderator actions as feedback to train and refine models.
  • Continuously enhance automated flagging through machine learning.

Enhanced Review Queue

  • Optionally use Moderation API’s review queue for a streamlined and modern moderation experience.
  • Create multiple review queues tailored to different languages, categories, or specific purposes.
  • Develop moderation workflows for escalating content.
  • Assign moderators to specific review queues for efficient management.

Dashboard Analytics

  • Monitor AI activities and outcomes through the Moderation API dashboard.
  • Gain insights into common issues and identify areas for improvement.

Seamless Integration

  • Easily integrates with existing Discourse workflows and user roles.
  • Choose between the Discourse review queue or Moderation API’s review queue.
  • Utilizes built-in moderation actions from Discourse.

:rocket: Configuration

Follow these steps to configure the Moderation API Plugin:

Create a Project in Moderation API

  • Navigate to your Moderation API Dashboard.
  • Create a new project and select the labels you wish to detect.

(Optional) Test and Adjust Thresholds

  • Use the threshold sliders to determine the strictness of your moderation.
  • Test the API response in the playground.
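
To make the effect of the threshold sliders concrete, here is a minimal sketch of threshold-based flagging. It is an illustration only, not the plugin's or the Moderation API's actual code: the function name, the label names, the score values, and the default threshold of 0.5 are all assumptions.

```python
from typing import Dict, List


def labels_over_threshold(
    scores: Dict[str, float],
    thresholds: Dict[str, float],
    default_threshold: float = 0.5,  # assumed fallback, not a documented value
) -> List[str]:
    """Return the labels whose score meets or exceeds its threshold."""
    flagged = []
    for label, score in scores.items():
        limit = thresholds.get(label, default_threshold)
        if score >= limit:
            flagged.append(label)
    return sorted(flagged)


# Hypothetical per-label scores in the range [0, 1].
scores = {"toxicity": 0.82, "spam": 0.40, "pii": 0.55}

strict = {"toxicity": 0.5, "spam": 0.3, "pii": 0.5}
lenient = {"toxicity": 0.9, "spam": 0.9, "pii": 0.9}

print(labels_over_threshold(scores, strict))   # ['pii', 'spam', 'toxicity']
print(labels_over_threshold(scores, lenient))  # []
```

Lower thresholds flag more content (stricter moderation), and higher thresholds flag less.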

Set API Key

  • Locate your API key under Integrate in your project dashboard.
  • In the Admin panel of Discourse, navigate to Settings > Moderation API.
  • Paste your API key into the Moderation API Key field.
  • Save the changes.

Enable the Plugin

  • Select your flagging behavior (see options below). You can start with “Nothing” to test the plugin without performing any actions.
  • Enable the plugin to start analyzing new posts. The plugin does not analyze pre-existing content.

(Optional) Add Your Community Guidelines

  • Go to the Model Studio in Moderation API.
  • Create a new AI agent.
  • Incorporate your guidelines as rules for the agent. If you have extensive guidelines, consider creating multiple agents.
  • Add the agent to your project.


:triangular_flag_on_post: Flagging Behaviors

The plugin offers four different flagging behaviors, determining the actions taken when the Moderation API flags a comment.

1. Flag (Default Behavior)

The plugin bot adds an Inappropriate flag to the comment, which Discourse then handles according to your configuration. Typically, this means the comment appears in the review queue but may stay visible until a moderator acts on the flag or additional users flag it. Review your flag-related settings in Discourse to customize this.

2. Queue for Review

The comment is instantly hidden and added to the review queue for moderators to approve or reject.

3. Block Post

The comment is never posted. The author receives an error message indicating that the comment was blocked by the automated moderation system. (You can customize the error message.)

4. Nothing

No immediate actions are taken. The comment is analyzed and will appear in the Moderation API dashboard if flagged. This option is useful for testing the Moderation API before fully enabling the plugin.
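
The four behaviors can be summarized as a small decision function. This is a sketch of the descriptions above, not the plugin's implementation: the `FlaggingBehavior` and `Outcome` names and their fields are hypothetical, and the Flag outcome reflects only the typical case, since the real result depends on your Discourse flag settings.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

DEFAULT_BLOCK_MESSAGE = "Your post has been blocked by our moderation system."


class FlaggingBehavior(Enum):
    FLAG = "flag"
    QUEUE_FOR_REVIEW = "queue_for_review"
    BLOCK_POST = "block_post"
    NOTHING = "nothing"


@dataclass
class Outcome:
    posted: bool           # whether the comment is created at all
    hidden: bool           # whether it is hidden immediately
    in_review_queue: bool  # whether moderators see it in the review queue
    error_message: Optional[str] = None


def apply_behavior(
    behavior: FlaggingBehavior,
    flagged: bool,
    block_message: str = DEFAULT_BLOCK_MESSAGE,
) -> Outcome:
    """Map a moderation result to what happens to the comment."""
    if not flagged or behavior is FlaggingBehavior.NOTHING:
        # Analysis still happens, but nothing changes in Discourse.
        return Outcome(posted=True, hidden=False, in_review_queue=False)
    if behavior is FlaggingBehavior.FLAG:
        # Typical case: visible until moderators or other users act.
        return Outcome(posted=True, hidden=False, in_review_queue=True)
    if behavior is FlaggingBehavior.QUEUE_FOR_REVIEW:
        return Outcome(posted=True, hidden=True, in_review_queue=True)
    # BLOCK_POST: the comment is rejected and the author sees an error.
    return Outcome(posted=False, hidden=False, in_review_queue=False,
                   error_message=block_message)


print(apply_behavior(FlaggingBehavior.BLOCK_POST, flagged=True).error_message)
```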


:white_check_mark: TODO

  • Enable actions from Moderation API’s review queue to remove content from Discourse.
  • Sync actions from Discourse’s review queue with the review queue in Moderation API.
  • Allow separate moderation projects for different categories.
  • Flag content using a selected Discourse flagging category (currently using Inappropriate).

:wrench: Settings

Below is a table of available settings for the Moderation API Plugin along with their descriptions:

| Setting | Description | Default |
|---|---|---|
| Enable Moderation API | Controls whether the plugin is active. | Disabled |
| Flagging Behavior | What happens when content is flagged: Queue for review, Flag post, Block post, or Nothing. | Flag post |
| Block Message | The message shown to users when their post is blocked. | “Your post has been blocked by our moderation system.” |
| Notify on Post Queue | Send notifications when posts are queued for review. | Enabled |
| Check Private Messages | Apply moderation to private messages. | Disabled |
| Skip Groups | User groups that bypass moderation checks. | None |
| Skip Categories | Forum categories that bypass moderation checks. | None |
| API Key | Your Moderation API authentication key. | None |
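
As a rough illustration of how these settings could interact, the sketch below models them with the defaults listed above and decides whether a post would be sent for analysis. The field names are hypothetical and are not the plugin's actual setting keys. Skipping analysis when no API key is set is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set


@dataclass
class PluginSettings:
    # Defaults mirror the settings listed above; names are illustrative.
    enabled: bool = False
    flagging_behavior: str = "flag_post"
    block_message: str = "Your post has been blocked by our moderation system."
    notify_on_post_queue: bool = True
    check_private_messages: bool = False
    skip_groups: Set[str] = field(default_factory=set)
    skip_categories: Set[str] = field(default_factory=set)
    api_key: str = ""


def should_check(
    settings: PluginSettings,
    author_groups: Iterable[str],
    category: Optional[str],
    is_private_message: bool,
) -> bool:
    """Decide whether a new post would be sent for moderation."""
    if not settings.enabled or not settings.api_key:
        return False  # assumption: no analysis without a key
    if is_private_message and not settings.check_private_messages:
        return False
    if settings.skip_groups & set(author_groups):
        return False  # author belongs to a bypassing group
    if category in settings.skip_categories:
        return False
    return True


settings = PluginSettings(enabled=True, api_key="example-key",
                          skip_groups={"staff"})
print(should_check(settings, ["staff"], "General", False))        # False
print(should_check(settings, ["trust_level_1"], "General", False))  # True
```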

:credit_card: Subscriptions

You can install the plugin immediately and take advantage of our free tier or 30-day trial. For extended features and higher usage limits, explore our subscription options.


:books: Documentation


:hammer_and_wrench: Support



Disclaimer: While the Discourse Moderation API Plugin significantly enhances moderation capabilities, it is essential to review and understand the implications of automated moderation. Always ensure transparency with your community regarding the use of AI in moderation processes.

Privacy Note: This plugin processes user-generated content to enforce moderation rules. Ensure compliance with your privacy policies and inform users about data processing practices.



From the GitHub repo README:

You can install the plugin right away and use our free tier or 30 day trial.

I couldn’t find info about a free tier on the website or the API documentation. What are the limitations?

Also, is the pay-as-you-go plan only available when we exceed the quota of a paid plan?


The free tier is available for hobby projects. Feel free to send a message to get set up.

Correct, PAYG is opt-in for paid plans and applies when you exceed the included quota.


Love to see more AI moderation tools! Can you please clarify what this provides that Discourse AI triaging does not? Thank you!


Yes, of course. This could probably be clearer in the original post.

First, let me mention that Moderation API gives you access to a complete moderation platform, of which the detection engine is just one part. You’ll essentially be partnering with a company that has years of experience in solving content moderation.

But if we just focus on the detection/triaging:

  1. Better accuracy: You can pick from 20+ pre-built classifiers to handle the most common use cases. This makes it very easy to get started, and we’re constantly improving our models so you don’t have to worry about the latest and greatest.
    You’ll usually get better and more robust results with a well trained classifier compared to a prompt engineered LLM.

  2. Context awareness: Moderation API’s detection engine can also look at previous messages in a thread and an author’s history to provide better analysis. I think this is a big improvement compared to the built-in triage.

  3. Specialized LLMs: I believe Discourse lets you choose between a couple of models like gpt-4o and claude, whereas Moderation API also supports LLMs trained specifically for content moderation, like Llama Guard, with more coming. Our LLMs also come pre-configured with prompts to make them perform the best based on our data.

  4. Train custom models: Once you’re hooked into Moderation API, you’re also able to train your own models on your specific data.

  5. Compliance: We host our models on our own servers and can provide custom DPAs for companies where compliance and regulation are a priority. In some cases we can even provide on-premise solutions.

  6. Cost: The best part is that we can do it cheaper at large volumes, and in any case provide flat rates for predictable costs.

I hope this makes it clearer. Configuring a project gives you many more options and much more flexibility than just writing a prompt, so I’d say it’s a much more powerful and specialised solution.
