Watched words

pfaffman · August 26, 2025, 7:02pm

I’ve had at least two sites get hit with a wave of spam that looks like it’s designed to poison LLMs. The same attack has also been reported here at least once (Anyone else currently undergoing mass spam attack?). The best solution is to set up Discourse AI - Spam detection, which I do recommend, but it’s a bit of a bother. Here’s a stopgap you can implement that will take just a few minutes.

It assumes that you have a Unix-like operating system (e.g., Linux or macOS). If you use Windows and can copy/paste into a terminal, you can ssh to your Discourse server and paste this in.
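The script further down calls curl and jq, so it may be worth confirming that both are installed before pasting it. A minimal sketch; missing_tools is a made-up helper name, not part of the script:

```bash
#!/usr/bin/env bash
# Print the name of each requested command that is not on PATH.
# missing_tools is a hypothetical helper, not part of the upload script.
missing_tools () {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# The upload script uses curl for the HTTP request and jq for URL-encoding.
MISSING="$(missing_tools curl jq)"
if [ -n "$MISSING" ]; then
  echo "Install these first:"
  echo "$MISSING"
else
  echo "curl and jq are available."
fi
```

If anything is reported missing, install it with your system's package manager before continuing.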

What it does is create a set of watched words generated from a recent attack that I saw. If you’re handy with nano or similar, you can edit it before you run it. If not, you can run this script and then delete the words you don’t like with one click per word.

The block words are potentially very annoying since they’ll keep legitimate users from creating posts with those words in them, so take a look to make sure none of those words are likely to appear in legitimate posts on your forum!
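One rough way to do that check, if you happen to have some legitimate post text in a plain-text file, is to grep it for each phrase. This is only a sketch: phrases_found_in and legit_posts.txt are hypothetical names, and a plain substring search may report more (or fewer) hits than Discourse's own watched-word matching would.

```bash
#!/usr/bin/env bash
# Print each phrase that occurs (case-insensitively) somewhere in the given file.
# phrases_found_in and legit_posts.txt are hypothetical names for illustration.
phrases_found_in () {
  local file="$1" phrase
  shift
  for phrase in "$@"; do
    # -F: literal string, -i: ignore case, -q: only set the exit status
    grep -F -i -q -- "$phrase" "$file" && printf '%s\n' "$phrase"
  done
  return 0
}

# Example: check a few of the block words against a local sample, if one exists.
if [ -f legit_posts.txt ]; then
  phrases_found_in legit_posts.txt "toll free" "technical support" "call us" "helpline"
fi
```

Any phrase it prints is a candidate to remove from the block list, or to move to the flag list, before you run the script.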

Put your site URL, API key, and API user in the boxes below (they’ll stay in your browser only, but you can also paste the code as it is and edit the file afterwards if you prefer; the placeholders in the script are =URL=, =API_KEY=, and =API_USERNAME=) and then copy/paste the code block into a terminal. It will create upload_watched_words_full.sh and make it executable. You can then run it with ./upload_watched_words_full.sh.

cat <<'EOF' > upload_watched_words_full.sh
#!/usr/bin/env bash
# Usage: ./upload_watched_words_full.sh

DISCOURSE_URL="=URL="
API_KEY="=API_KEY="
API_USERNAME="=API_USERNAME="

# High-confidence block words
BLOCK_WORDS=(
  "customer service number"
  "contact number"
  "support number"
  "refund phone number"
  "toll free"
  "24/7 support"
  "helpline"
  "call us"
  "live representative"
  "technical support"
  "lufthansa"
  "royal caribbean"
  "coinbase"
  "robinhood"
  "reservation number"
  "booking number"
  "flight cancellation"
  "name change fee"
  "║"
  "⇆"
  "★"
  "®️"
  "™️"
)

# Medium-risk flag words
FLAG_WORDS=(
  "customer service"
  "customer support"
  "support team"
  "help desk"
  "hotline"
  "agent"
  "representative"
  "contact us"
  "phone support"
  "service center"
)

# Require-approval words
REQUIRE_APPROVAL_WORDS=(
  "urgent"
  "immediate action"
  "act now"
  "limited time"
  "exclusive offer"
  "approve this"
  "verify account"
)

# Function to send words in batch
add_words () {
  local ACTION="$1"
  shift
  local WORDS=("$@")

  # Build words[] parameters
  local DATA=""
  for w in "${WORDS[@]}"; do
    DATA+="words%5B%5D=$(printf '%s' "$w" | jq -s -R -r @uri)&"
  done
  DATA+="replacement=&action_key=${ACTION}&case_sensitive=false&html=false"

  echo "Uploading ${ACTION} words..."
  curl -s -X POST "${DISCOURSE_URL}/admin/customize/watched_words.json" \
    -H "Api-Key: ${API_KEY}" \
    -H "Api-Username: ${API_USERNAME}" \
    -H "Content-Type: application/x-www-form-urlencoded" \
    --data "$DATA"
  echo -e "\nDone."
}

# Upload block words
add_words "block" "${BLOCK_WORDS[@]}"

# Upload flag words
add_words "flag" "${FLAG_WORDS[@]}"

# Upload require-approval words
add_words "require_approval" "${REQUIRE_APPROVAL_WORDS[@]}"
EOF

# Make the script executable
chmod +x upload_watched_words_full.sh

echo "Script 'upload_watched_words_full.sh' created and made executable."
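Because the script calls curl with -s and just prints whatever comes back, a failed upload can be easy to miss. Here is a minimal sketch of one way to surface the HTTP status instead. describe_status is a made-up helper, and the mapping from status codes to likely causes is my guess based on ordinary HTTP conventions, not documented Discourse behavior.

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn an HTTP status code into a short hint.
# The causes listed are assumptions based on general HTTP conventions.
describe_status () {
  case "$1" in
    2??)     echo "ok" ;;
    401|403) echo "rejected: check the API key and API username" ;;
    429)     echo "rate limited: wait a bit and retry" ;;
    000)     echo "no response: check the site URL" ;;
    *)       echo "unexpected status $1" ;;
  esac
}

# Inside add_words, the curl call could capture the status instead of the body:
#   STATUS="$(curl -s -o /dev/null -w '%{http_code}' -X POST ... --data "$DATA")"
#   echo "${ACTION}: $(describe_status "$STATUS")"
describe_status 200
describe_status 403
```

The -o /dev/null and -w '%{http_code}' options are standard curl; curl reports 000 when it gets no HTTP response at all.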

RGJ · August 26, 2025, 9:10pm

But Watched word Approval doesn't work if a user edits the reply?

pfaffman · August 26, 2025, 9:15pm

Well, I haven’t paid attention to that. My naive hope is that the spammers won’t know to do that.

Mine_Zcash · August 26, 2025, 11:12pm

The bots we had a few years back were posting gibberish like dhfhstyhjfhhr as the topic title to spawn threads and get past the keyword filter, then they would edit it into the real “best online casinos” spam message.

pfaffman · August 26, 2025, 11:37pm

It’s bizarre that editing the post gets past the watched words.

Maybe this batch of spammers won’t think to do that. Or maybe this won’t help at all.

pacharanero · September 9, 2025, 12:25pm

Thanks Jay, great script, and thanks for creating it. Has anyone who used it found that it worked, given the ‘edit the post’ workaround? Are the bots editing posts, or just posting the spam text directly?

pfaffman · September 9, 2025, 3:45pm

I don’t know, but if that’s a problem, you could change settings so that some users can’t edit posts (Edit post allowed groups), perhaps by requiring TL2 and adjusting settings to make it easier or harder to get to TL2.

For regular users, especially new ones, not being able to edit a post probably isn’t that big a deal, and may not even be something they expect.

pacharanero · September 9, 2025, 9:41pm

Good point. I’ll change it so that tl_0 cannot edit posts.

Reviewing one of the recent posts:

It looks like it was initially just garbage and was then edited into spam. As far as I understand it, this would have circumvented Watched Words.

Given that post edits from innocuous gibberish to spam are becoming part of the modus operandi of bots in their efforts to thwart spam filters, does the Discourse team think tl_0 should by default not be able to edit posts?

pfaffman · September 9, 2025, 9:59pm

Wow. I’m surprised that the default includes TL0.

Good to know for sure that these spammers are using that trick.

I think that the AI Spam module would have caught it.

pacharanero · September 9, 2025, 10:31pm

Oh, for sure it would. However, although it’s not super expensive, it is a bit of a pain to have to set up AI on every Discourse to repel spam. Many of my forums would not need it (or want it) for any of the other quite useful Discourse AI features.

pacharanero · September 9, 2025, 10:44pm

The solution here is not to prevent tl0, or any other group, from editing posts.

The solution is to make sure that editing a post does not bypass any of the site’s protections. An edited post can (as we have seen) contain spam, hate, or other unwanted behavior. If edited posts bypass watched words and other filters, this will surely become a standard approach, not only for bots but also for humans who want to get around a site’s protections.

pfaffman · September 10, 2025, 3:20pm

Agreed. It seems bizarre that it doesn’t work that way. And if I understand correctly, it’s been broken for a long time.

Oh, it’s the “expected behavior”:

I bet this is how this came to be, maybe.

It seems like the watched words could just be applied before_save, but I haven’t looked at the code.

mcdanlj · September 10, 2025, 4:26pm

There is an open question about what it would mean for certain types of watched words to apply on edit. But it is also clear that this limitation substantially reduces the value of watched words.

I would expect it to be helpful to propose a specification of exactly what should happen for each type when a post is edited; that would make the feature request more actionable. I haven’t done that work, though I have occasionally spent time thinking about specific cases.
