Watched Words / Flagging - Advice Sought

RobMeade · November 27, 2018, 7:17pm

Hi all,

I’m after a little advice/guidance from those who have been overseeing/moderating/managing community sites longer than myself (a couple of years).

Our community is wonderfully diverse, it has attracted people from all around the world, different cultures, languages, beliefs, gender, age - insert any protected characteristic here and I’m sure we have representation - it’s great!

In order to keep things clean I added a series of watched words some time ago (well over a year), they are typically swear words, hard to describe them without using them here, but some were the strong flavour, whilst others were perhaps, for some, more everyday use words. In addition I added some terms which could be used as racial or homophobic insults/slurs.

I found it challenging to determine which of the swear words should be added and which shouldn’t. It’s difficult to gauge other people’s tolerances, or what they find offensive/inappropriate, short of doing nothing and waiting for someone to say.

This evening I became aware of the automated, templated, messages which Discourse sends out when the system flags a topic as inappropriate based on the watched words list. In one example, someone had made a list of terms relating to sexuality, one of these got flagged up because of the list of watched words and the message got sent. When I saw the message that got sent it stated that a number of people in the community had flagged the topic as inappropriate, where, in fact, it was simply because it had triggered one of the words on the list.

Needless to say, the message felt a little heavy-handed for the term that had been used. I appreciate I can amend the templated messages, but I wanted to ask: are there any situations where these messages are sent based on the number of flags, or is this just strongly written to act as more of a deterrent?

I’m left with the decision of whether to amend the content of the message or remove some words from the watched words. The problem is, some of these words might be used legitimately in conversation. I added them so that a moderator could review the term in context, editing the topic where necessary after agreeing/disagreeing with the flag. At that time, I wasn’t aware of the messages that got sent to the users, so it felt like the right thing to do. Whilst it creates a bit of an admin task, it doesn’t happen too often because, generally, we have a very polite community.

The process we currently use is along these lines:

check the flagged topic
asterisk out the swear word
agree with the flag, but choose to keep the post

The above allows us to see the number of times a user has had topics flagged, e.g. repeated offenders, but also allows a more gentle approach to resolving it. Should we see those repeat offenders getting lots of flagged topics we could then handle each one individually/personally as appropriate.

I appreciate there is the censor option in the settings where I can add words as well, I’ve not tried this, but I assume that we wouldn’t receive any notification/flag for these, they would just be automatically replaced with asterisks and there wouldn’t be anything on the user’s profile to indicate the flags?
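To make the "automatically replaced with asterisks" idea concrete, here is a minimal sketch of what a censor-style replacement could look like. It is only a toy illustration and not Discourse's implementation: the function name `censor`, the whole-word matching, and the use of one asterisk per letter are all assumptions, and the real feature may match and mask words differently.

```python
import re


def censor(text, words):
    """Mask each listed word with asterisks of the same length.

    Hypothetical helper for illustration only; matching is
    case-insensitive and limited to whole words.
    """
    if not words:
        return text
    # Build one alternation pattern, escaping each word so it is matched literally.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    # Replace each match with as many asterisks as it has characters.
    return pattern.sub(lambda m: "*" * len(m.group(0)), text)
```

Note that nothing in this sketch records who used the word, which is the trade-off raised above: a pure replacement leaves no flag to review and no history against the user.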

As I’ve said, we do have a fabulous community. I believe more often than not the use of swear words comes from people’s frustration when things aren’t going as they would like while learning new subject matter, and their post then gives them the ability to vent. I am also probably a little bit protective of our younger audience. I could probably count the number of youngsters we have on two hands, but I know we have some, and I’d like things to stay nice for them - fully appreciating they probably all know far more words (and maybe worse!) than myself.

So I suppose I’m probably asking a few questions above, sorry if I have rambled. Can anyone offer any suggestions on a process to prevent/manage the use of swear words, racial/homophobic insults/slurs which is perhaps better than the above?

I could almost do with another option, one for words which might be used in day to day conversation, but just need someone to cast their eye at to be sure the context is ok - but without sending any form of message to the user until an action is taken which deemed the context to be bad.

I’ve tried to tread carefully above without actually using any of the words I refer to in our watched words so my apologies for the lack of examples/scenarios.

I should add, I don’t personally find a lot of these things offensive, I’m just trying to be mindful of the fact that we are all different and what one person says, even if no malice was intended, may cause someone else to be offended.

Any thoughts/ideas/suggestions are welcomed - thanks in advance

awesomerobot · November 27, 2018, 8:50pm

I see what you mean, a separate flag message (or substitution of part of the existing message) for watched words would be useful.

Here’s the portion of irrelevant text we send when a watched word is auto-flagged:

Multiple community members flagged this post before it was hidden, so please consider how you might revise your post to reflect their feedback.

That’s true when a post is flagged by users, as it often requires multiple user flags before being auto-hidden, but with watched words your post is auto-hidden immediately.

You could alternatively use the “require approval” variety of watched words and change that message. By default new users don’t need any posts approved, so unless you changed that setting the require approval message can be specifically customized for the watched words case.

You could modify this to say something like “We’ve received your new post but it contains a word that is sometimes used to offend others. A moderator will review your post before it’s published.”
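As a rough way to compare the two varieties described above, the sketch below models what happens to a post under a "flag" watched word versus a "require approval" watched word. It is a simplified model of the behaviour described in this thread, not Discourse code: `Action`, `Outcome`, `handle_post`, and the whole-word matching are hypothetical names and assumptions, and the two message strings are taken from the text quoted above.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional

# Message currently sent when a watched word auto-flags a post (quoted above).
FLAG_MESSAGE = (
    "Multiple community members flagged this post before it was hidden, "
    "so please consider how you might revise your post to reflect their feedback."
)
# Suggested custom wording for the "require approval" case (quoted above).
APPROVAL_MESSAGE = (
    "We've received your new post but it contains a word that is sometimes "
    "used to offend others. A moderator will review your post before it's published."
)


class Action(Enum):
    FLAG = "flag"
    REQUIRE_APPROVAL = "require_approval"


@dataclass
class Outcome:
    visible: bool                     # is the post shown to other users?
    needs_review: bool                # does a moderator have to look at it?
    message_to_author: Optional[str]  # what the author is told, if anything


def handle_post(text: str, watched: Dict[str, Action]) -> Outcome:
    """Return the outcome for the first watched word found (assumed precedence)."""
    for word, action in watched.items():
        if re.search(r"\b" + re.escape(word) + r"\b", text, re.IGNORECASE):
            if action is Action.FLAG:
                # Auto-hidden immediately, with the multi-flag wording.
                return Outcome(False, True, FLAG_MESSAGE)
            # Held for approval, with the customised wording.
            return Outcome(False, True, APPROVAL_MESSAGE)
    # No watched word: the post is published as normal.
    return Outcome(True, False, None)
```

In both cases the post is held back until a moderator acts. The practical difference in this model is only the wording the author receives, which is why customising the approval message may be the gentler route.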

RobMeade · November 28, 2018, 1:04pm

Hi Kris,

Thank you for your reply, appreciated

I see what you mean, a separate flag message (or substitution of part of the existing message) for watched words would be useful.

Yeah, a kind of soft approach, working in the same way as the flagging, but without a message going out to the user until a decision/action has been taken perhaps, but still providing the same options to the moderator/administrator for keeping the post, deleting the post, editing the post etc.

The paragraph you highlighted from the email was in fact the very one which a community member copied and pasted back to us to query.

Thanks for the idea regarding the Requires Approval, I obviously wouldn’t want to delay anyone’s posts too much, despite being on the forum pretty much every day there could be a number of hours before I return, this could clash with region time zone differences too and give the user a less than satisfactory experience.

Out of interest, when a post is flagged as requiring approval, are the options to then only approve/disallow? As mentioned above, it has been useful to have individuals flagged so that we could potentially identify repeat offenders. If someone triggered the Needs Approval based on a term - what are the outcomes of the moderators/admins response? e.g. is the user still flagged in any way? Are the options different from those you see when a post is currently flagged as inappropriate?
