Discourse Google Perspective API

:discourse2: Summary: Google Perspective API is the official Discourse plugin for Google's Perspective API
:hammer_and_wrench: Repository Link: https://github.com/discourse/discourse-perspective-api
:open_book: Install Guide: How to install plugins in Discourse

Features

What is the Perspective API?

From the official site, “Perspective is an API that makes it easier to host better conversations. The API uses machine learning models to score the perceived impact a comment might have on a conversation. This model was trained by asking people to rate internet comments on a scale from very toxic to very healthy contribution. Toxic is defined as… a rude, disrespectful, or unreasonable comment that is likely to make you leave a discussion.”

What can the discourse-perspective-api plugin do?

  • Ask users to confirm that they really want to submit a potentially toxic post.
  • Automatically flag toxic posts for moderators and admins to review.
  • Optionally scan private categories and PMs for toxic content.

Configuration

Where do I get a Perspective API key?

Follow these instructions to create a Google Cloud account and gain access to an API key.

https://developers.perspectiveapi.com/s/docs-get-started

The API can be used free of cost; here are the API Reference docs.
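
If you'd like to verify that your key works before configuring the plugin, you can call the Comment Analyzer endpoint directly. Here is a minimal Python sketch; the endpoint URL, request shape, and attribute names (`TOXICITY`, `SEVERE_TOXICITY`) come from the API Reference docs, while the sample comment and the use of the `requests` library are just for illustration:

```python
# Minimal sanity check for a Perspective API key: score one comment.
# Requires the third-party `requests` package (pip install requests).
import requests

API_KEY = "YOUR_API_KEY"  # the key from the registration steps above
URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

payload = {
    "comment": {"text": "You are a disgrace and everyone hates you."},
    "languages": ["en"],  # the plugin currently supports English only
    "requestedAttributes": {
        "TOXICITY": {},         # the "standard" model
        "SEVERE_TOXICITY": {},  # the experimental "severe toxicity" model
    },
}

response = requests.post(URL, params={"key": API_KEY}, json=payload, timeout=10)
response.raise_for_status()

# Each requested attribute comes back with a probability-like summary
# score between 0 and 1 (higher means more likely to be toxic).
for attribute, result in response.json()["attributeScores"].items():
    print(attribute, round(result["summaryScore"]["value"], 3))
```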

Site Settings Walkthrough

(Admin → Type ‘perspective’ in the Filter text field)

The API is currently only available for the English language.
The default thresholds are set reasonably high, but these settings offer some room for fine-tuning how the plugin behaves. Play around with the live demo on the official docs linked above to get a feel for how the thresholds behave; a simplified sketch of how the settings fit together follows the list below.

  • perspective_enabled:
    Enable the plugin for filtering potentially toxic posts.

  • perspective_toxicity_model:
    Choose the toxicity model for Google’s Perspective API. You can read more about how these models were developed in the API Reference docs.

    • standard
      Classifies rude, disrespectful, or unreasonable comments that are likely to make people leave a discussion. The threshold is easier to cross on the standard model: posts that use curse words or insults in a friendly way are flagged easily. If you choose a high threshold such as 0.9, the standard model will flag fewer posts and take fewer incorrect actions.

    • severe toxicity (experimental)
      This model uses the same algorithm as the standard model, but is trained to recognize examples that raters considered ‘very toxic’. This makes it much less sensitive to comments that include positive uses of curse words, for example. Posts are flagged only when extreme toxicity is detected, so the threshold for this model can reasonably be lowered to around 0.7.

    For example, a post containing "I f*****g love you man" would get flagged under the standard model (using the default thresholds) but not with the severe toxicity model.

  • perspective_notify_posting_min_toxicity_enable:
    Enable checking of potentially toxic content while a user is composing a post, and push a notification in the composer when they write something toxic.

    • perspective_notify_posting_min_toxicity:
      If the API returns a score above this threshold, the user is asked to confirm that they really want to post potentially toxic content. This is the confidence level of post toxicity, between 0 and 1, used to check content while a user is composing a post; a score of 1 means extremely toxic. A value above 0.9 should catch only highly toxic posts, depending on the model used. Since the user is warned before posting, a slightly lower threshold such as 0.85 works well here (see the sketch after this list).

  • perspective_flag_post_min_toxicity_enable:
    Flag potentially toxic posts after they have been submitted and send messages notifying admins/moderators, who can then review the flagged posts.

    • perspective_flag_post_min_toxicity:
      If the API returns a score above this threshold, the post is flagged for admins/moderators to review. This is the confidence level of post toxicity, between 0 and 1, used to check content after a user has posted; a score of 1 means extremely toxic. A value above 0.9 should flag only highly toxic posts, depending on the model used.

  • perspective_google_api_key:
    API key for the Perspective API that you have received after completing the registration process mentioned above.

  • perspective_check_private_message:
    Check and flag private messages if toxic.
    Note: The content of the PM will be sent to moderators/admins.
    Also applies to backfill mode.

  • perspective_check_secured_categories:
    Enable this setting to additionally check private (secured) categories for toxic content.

  • perspective_backfill_posts:
    Query toxicity for existing posts and record the results in post custom fields.
    Enabling this mode disables online checking for posts.

  • perspective_historical_inspection_period:
    The period, in days, to wait before starting a new query iteration after the previous one finishes. Used only if perspective_backfill_posts is enabled.
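
The plugin itself is implemented in Ruby, but as a rough illustration of how the threshold settings above fit together, here is a hypothetical Python sketch. It assumes, consistent with the descriptions above, that each check is a simple comparison of the API score against the configured threshold; all names and values are illustrative, not the plugin's actual code.

```python
# Hypothetical sketch of how the threshold settings interact. The real
# plugin is written in Ruby; everything below is illustrative only.

NOTIFY_ENABLED = True    # perspective_notify_posting_min_toxicity_enable
NOTIFY_THRESHOLD = 0.85  # perspective_notify_posting_min_toxicity
FLAG_ENABLED = True      # perspective_flag_post_min_toxicity_enable
FLAG_THRESHOLD = 0.90    # perspective_flag_post_min_toxicity


def on_compose(score: float) -> str:
    """Before submission: warn the author if the draft looks toxic."""
    if NOTIFY_ENABLED and score >= NOTIFY_THRESHOLD:
        return "ask the user to confirm before posting"
    return "submit normally"


def on_posted(score: float) -> str:
    """After submission: flag the post for moderator review."""
    if FLAG_ENABLED and score >= FLAG_THRESHOLD:
        return "flag the post and notify admins/moderators"
    return "no action"


# A post scoring 0.88 triggers the composer warning (0.88 >= 0.85) but,
# if submitted anyway, is not flagged afterwards (0.88 < 0.90).
print(on_compose(0.88))
print(on_posted(0.88))
```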

Screenshots

What a user sees when trying to submit a toxic post:

(screenshot)

What admins/moderators see when a toxic post is submitted:

(screenshot)

CHANGELOG

TODO


Big thanks to @fantasticfears for creating this plugin!
