Watched Words / Flagging: advice please

RobMeade · November 27, 2018, 7:17 PM

Hi all,

I have only a few years of experience running, moderating, and administering community sites myself, so I would like some advice from those who have been doing this for longer than I have.

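Our community is wonderfully diverse, with people from all over the world spanning different cultures, languages, beliefs, genders, ages, and indeed every "protected characteristic". I am really grateful for this!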

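To help keep the community "clean", I added a list of watched words over a year ago. These are typically swear words. It is hard to be specific here, but some are very strong, while others are words that some people would use every day. I also added "terms" that could be used as racist or homophobic insults or slurs.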

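Deciding which swear words should or should not be on the list is very hard. It is difficult to gauge other people's tolerance, or what they find offensive or inappropriate, other than by doing nothing and waiting for someone to complain.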

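Tonight I became aware of the automated template message that Discourse sends when it deems a topic inappropriate based on the watched words list. In one case, a user had written a list of terms relating to sex and gender, one of which matched a watched word, and the message was sent. The message said that "multiple community members flagged this topic as inappropriate", when in fact a word on the list had simply been triggered.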

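Needless to say, the message felt rather heavy-handed for the term that was used. I understand that the template message itself can be edited, but are these messages really sent based on a number of reports? Or are they just worded strongly as a deterrent?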

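I am now faced with deciding whether to edit the message or to remove some words from the watched words list. The problem is that some of these words can legitimately be used in conversation. I originally added them so that a moderator could check the term "in context", edit the topic if necessary, and then agree or disagree with the flag. At the time I did not know about the message being sent to the user, so that felt like the right approach. It does add a little admin work, but our community is generally very polite, so it does not happen often.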

The process we currently use is as follows:

Review the flagged topic
Asterisk out the swear words
Agree with the flag, but choose to "keep" the post

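This approach lets us see how many times a user has been flagged in the past (repeat offenders, for example) while still resolving things with a gentler touch. If the same user keeps picking up flags, each case can be handled individually and appropriately.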

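I also understand that the settings include a "censor" option to which words can be added. I have not tried it, but I assume that for these words no notification or flag would be raised, the words would automatically be replaced with asterisks, and no flag would be recorded against the user's profile.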
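To make the assumption above concrete, here is a minimal sketch of what a "censor" style replacement could look like. It is not Discourse's implementation: the word list, the function name `censor`, and the choice of asterisks as the mask character are all hypothetical and simply follow the guess in the paragraph above.

```python
import re

# Hypothetical placeholder list; a real site would configure its own words.
CENSORED_WORDS = ["darn", "heck"]


def censor(text, words=CENSORED_WORDS, mask="*"):
    """Replace each whole-word, case-insensitive match with mask characters.

    The mask character is an assumption for illustration only.
    """
    if not words:
        return text
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    # Keep the masked word the same length as the original word.
    return pattern.sub(lambda m: mask * len(m.group(0)), text)
```

Matching on word boundaries means a longer word that merely contains a listed word is left alone, which matters for terms that also appear inside ordinary words.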

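As I said, we have a wonderful community. Most of the swearing seems to come from frustration when something in new learning content does not go as expected, and those posts are probably a way for people to "vent". I may also be a little overprotective of our younger readers. There may only be enough younger members to count on two hands, but they are there, and I would like to keep the community "in good shape" for them (I fully appreciate that they probably know far more words than I do, and worse ones).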

So, I have posed a few questions above, and I apologise if I have rambled. Can anyone suggest a better process than the one above for preventing and managing the use of swear words and racist or homophobic insults or slurs?

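Ideally, I would like a separate option for words that might be used in everyday conversation: one where someone still has to take a look to check whether the context is appropriate, but no message is sent to the user unless the context is judged to be bad.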

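I apologise for the lack of specific examples and scenarios. I have tried to word this carefully without directly using the words that are actually on our watched list.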

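I should add that I do not personally find many of these words especially offensive. But we are all different, and I want to allow for the "possibility" that something one person says without malice could still hurt someone else.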

Opinions, ideas, and suggestions are all welcome. Thank you.

awesomerobot · November 27, 2018, 8:50 PM

I see what you mean, a separate flag message (or substitution of part of the existing message) for watched words would be useful.

Here’s the portion of irrelevant text we send when a watched word is auto-flagged:

Multiple community members flagged this post before it was hidden, so please consider how you might revise your post to reflect their feedback.

That’s true when a post is flagged by users, as it often requires multiple user flags before being auto-hidden, but with watched words your post is auto-hidden immediately.

You could alternatively use the “require approval” variety of watched words and change that message. By default new users don’t need any posts approved, so unless you changed that setting the require approval message can be specifically customized for the watched words case.

You could modify this to say something like “We’ve received your new post but it contains a word that is sometimes used to offend others. A moderator will review your post before it’s published.”
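As a rough illustration of why a separate message makes sense, here is a small sketch of the two paths described above. It is not how Discourse is implemented: the threshold value, the status names, the `triage` function, and the shortened message texts are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical number of user flags needed before a post is auto-hidden.
HIDE_THRESHOLD = 3

# Shortened, illustrative message texts (not the real templates).
COMMUNITY_FLAG_MESSAGE = (
    "Multiple community members flagged this post before it was hidden."
)
WATCHED_WORD_MESSAGE = (
    "Your post contains a word that is sometimes used to offend others. "
    "A moderator will review it before it is published."
)


@dataclass
class Post:
    text: str
    user_flags: int = 0


def triage(post, watched_words):
    """Return (status, message) so each path gets wording that is true for it."""
    tokens = set(re.findall(r"\w+", post.text.lower()))
    if tokens & {w.lower() for w in watched_words}:
        # A watched word acts immediately; no community member was involved.
        return ("needs_review", WATCHED_WORD_MESSAGE)
    if post.user_flags >= HIDE_THRESHOLD:
        # Only here is "multiple community members" an accurate statement.
        return ("hidden", COMMUNITY_FLAG_MESSAGE)
    return ("published", None)
```

The point of the sketch is only that the reason a post was held determines which message is accurate, so the two cases should not share one template.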

RobMeade · November 28, 2018, 1:04 PM

Hi Kris,

Thank you for your reply, appreciated

I see what you mean, a separate flag message (or substitution of part of the existing message) for watched words would be useful.

Yeah, a kind of soft approach, working in the same way as the flagging, but without a message going out to the user until a decision/action has been taken perhaps, but still providing the same options to the moderator/administrator for keeping the post, deleting the post, editing the post etc.

The paragraph you highlighted from the email was in fact the very one which one community member copy and pasted back to us to query.

Thanks for the idea regarding Requires Approval. I obviously wouldn't want to delay anyone's posts too much: despite my being on the forum pretty much every day, there could be a number of hours before I return. This could clash with regional time zone differences too, and give the user a less than satisfactory experience.

Out of interest, when a post is flagged as requiring approval, are the options then only to approve or disallow? As mentioned above, it has been useful to have individuals flagged so that we could potentially identify repeat offenders. If someone triggered Needs Approval based on a term, what are the outcomes of the moderator's or admin's response? For example, is the user still flagged in any way? Are the options different from those you see when a post is currently flagged as inappropriate?
