|Summary|Integration between AI features and Discourse|
|Repository Link|GitHub - discourse/discourse-ai|
|Install Guide|How to install plugins in Discourse|
Please check our blog post about this plugin: Introducing Discourse AI
We are happy to announce a brand new Discourse plugin that we have been working on: Discourse AI.
Discourse AI is our one-stop solution for integrating Artificial Intelligence with Discourse, both enabling new features and enhancing existing ones. With this first release, we are shipping 5 different Discourse AI modules.
Discourse AI Modules
For Discourse AI, we have opted to keep all of its features in a single plugin, separated into modules that you can enable independently and customize for your community’s needs.
We’ve also made it a priority not to lock you into a single company’s API, so every community can pick the provider that makes sense for it, balancing data privacy, performance, feature sets, and vendor lock-in.
With the sentiment module, we will automatically classify every post in your community across sentiment (positive or negative) and/or emotion (joy, surprise, anger, disgust, fear, sadness, or neutral). This will allow your staff team to have insights into the community’s health and will help you to diagnose the sentiment across axes like category, topic, and user level.
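As a rough illustration of breaking sentiment down along an axis such as category, the sketch below computes the share of each sentiment label per group. The post dictionaries, their keys, and the `sentiment_share_by` helper are hypothetical and are not the plugin's actual data model or reporting code.

```python
from collections import Counter, defaultdict


def sentiment_share_by(posts, axis):
    """Return, for each value of `axis`, the fraction of posts per sentiment label.

    `posts` is a list of dicts with a "sentiment" key and an `axis` key
    (for example "category"). These keys are assumptions for illustration.
    """
    counts = defaultdict(Counter)
    for post in posts:
        counts[post[axis]][post["sentiment"]] += 1

    shares = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        shares[key] = {label: n / total for label, n in counter.items()}
    return shares


# Hypothetical, already-classified posts.
posts = [
    {"category": "support", "sentiment": "negative"},
    {"category": "support", "sentiment": "positive"},
    {"category": "feature", "sentiment": "positive"},
]

shares = sentiment_share_by(posts, "category")
```

The same helper could group by a topic or user key instead, which is one way to picture the "axes" mentioned above.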
Composer AI Helper
After composing your post, click on the icon and select any of the following options:
- Suggest titles
- Translate to English
And after a couple of seconds, you will get some help from the AI.
This is enabled here on Meta for users at trust level 3 and above (TL3+).
The toxicity module can scan both new posts and chat messages and score them for toxicity across a variety of labels. These scores are available in reports, where your community moderators can identify content that may not be appropriate for your instance.
And, if you want to go one step further, you can enable automatic flagging of content that crosses a customizable toxicity threshold. Flagged content is placed in the Discourse Review Queue, where your moderation team can review it manually.
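To make the threshold idea concrete, here is a minimal sketch of per-label threshold checking. The label names, threshold values, and function names are assumptions chosen for illustration; they are not the plugin's settings or code.

```python
# Hypothetical labels and thresholds, for illustration only.
THRESHOLDS = {"toxicity": 0.8, "insult": 0.85, "threat": 0.7}


def labels_over_threshold(scores, thresholds):
    """Return the sorted labels whose score meets or exceeds its threshold.

    Labels without a configured threshold are ignored.
    """
    return sorted(
        label
        for label, score in scores.items()
        if label in thresholds and score >= thresholds[label]
    )


def should_flag(scores, thresholds=THRESHOLDS):
    """A post is sent for review if at least one label crosses its threshold."""
    return bool(labels_over_threshold(scores, thresholds))


# Hypothetical classifier output for a single post.
scores = {"toxicity": 0.91, "insult": 0.40, "threat": 0.70, "other": 0.99}
crossed = labels_over_threshold(scores, THRESHOLDS)
```

Lowering a threshold sends more content to review at the cost of more false positives, which is why a customizable value per community makes sense.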
NSFW Image Detection
The NSFW module will automatically scan every new upload in user posts and classify each image against what is usually considered NSFW content. The classification results are available to your moderator team via reports and, optionally, you can enable automatic flagging of content that crosses a certainty threshold.
This powers two modules at the moment:
Semantic Related Topics
When you get to the end of a topic, Discourse presents you with 5 suggestions of topics to read next. Currently, we pick 5 random topics for anonymous users and use unread topics for logged-in users to populate that list, which makes it quick to generate but not very useful when you are researching a specific subject.
With the new Semantic Related Topics feature, we will use Semantic Textual Similarity between the current topic and all the other topics in your instance to suggest topics that are potentially more relevant to what a person is looking for.
This is enabled here on Meta for everyone, including anonymous users.
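The idea behind similarity-based suggestions can be sketched with plain Python. This example assumes each topic already has an embedding vector and that cosine similarity is the comparison metric; both the toy vectors and the `related_topics` helper are hypothetical, and a real deployment would delegate the search to a vector index rather than scanning every topic in memory.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_topics(current_id, embeddings, limit=5):
    """Return up to `limit` topic ids ranked by similarity to the current topic."""
    current = embeddings[current_id]
    scored = [
        (cosine_similarity(current, vector), topic_id)
        for topic_id, vector in embeddings.items()
        if topic_id != current_id
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [topic_id for _, topic_id in scored[:limit]]


# Toy 2-dimensional embeddings keyed by topic id (real embeddings have hundreds of dimensions).
embeddings = {
    1: [1.0, 0.0],
    2: [0.9, 0.1],
    3: [0.0, 1.0],
    4: [0.5, 0.5],
}

suggestions = related_topics(1, embeddings, limit=2)
```

With these toy vectors, topic 2 points in nearly the same direction as topic 1 and ranks first, while topic 3 is orthogonal and ranks last.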
This uses the same logic as Semantic Related Topics, but to power search results.
This module can summarize topics and chat channels, for times when you need a quick way to figure out what is going on.
Check each module documentation topic:
As we said above, we are committed to offering new AI features without compromising your privacy. See below the current providers and models for each module. CDCK handles hosting for open-source models in our infrastructure and API keys for SaaS providers like OpenAI.
- Toxicity detection is powered by unitaryai/detoxify (GitHub).
- Sentiment uses j-hartmann/emotion-english-distilroberta-base and cardiffnlp/twitter-roberta-base-sentiment-latest (both on Hugging Face).
- NSFW detection uses GantMan/nsfw_model and bhky/opennsfw2 (both on GitHub).
- Semantic Related Topics uses either UKPLab/sentence-transformers (GitHub) or OpenAI to generate embeddings, and pgvector/pgvector (GitHub) for storage and search.
- Composer AI Helper uses either OpenAI or Anthropic APIs.
- Summarization uses philschmid/bart-large-cnn-samsum, philschmid/flan-t5-base-samsum, or pszemraj/long-t5-tglobal-base-16384-book-summary (all on Hugging Face), OpenAI, or Anthropic.
We are being very mindful with our experimentation around AI. The algorithms we are leaning on are only as good as the data they were trained on. Bias, inaccuracies and hallucinations are all possibilities we need to allow for. We regularly revisit, test and refine our AI modules.
Check the docs on self-hosting the API services at Discourse AI - Self-Hosted Guide
Will this be available on Discourse hosting? Which plans?
This is available in preview for Enterprise customers. Please contact our support team to get it installed and configured on your instance.
Rollout for select modules for other tiers will follow later.
Will CDCK offer a SaaS version of the AI services API for self-hosted communities?
Not at the moment, but this is something we may consider given the feedback from our community.