I think watched words could be improved if visually similar Unicode characters also matched.
Without that, spammers can use many variations of the same word to circumvent the word filter. I’ve been getting hammered by crafty, motivated spammers who have been pushing Discourse’s anti-spam features to the absolute limit, and this is one of the techniques they’re using.
Perhaps this could be useful: the janlelis/unicode-confusable gem on GitHub, e.g. Unicode::Confusable.confusable? "ℜսᖯʏ", "Ruby"
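
To make the idea concrete, here is a rough sketch of how lookalike folding could work before a watched-word comparison. It does not use the gem above and is not how Discourse works today; the LOOKALIKES table, skeleton, and matches_watched_word? are hypothetical names I made up, and the table only covers a handful of characters. A real filter would need the full Unicode confusables data.

```ruby
# Hypothetical, hand-picked lookalike table (illustration only).
# A real filter would need the full Unicode confusables data.
LOOKALIKES = {
  "ս" => "u", # U+057D ARMENIAN SMALL LETTER SEH
  "ᖯ" => "b", # U+15AF CANADIAN SYLLABICS AIVILIK B
  "ʏ" => "y", # U+028F LATIN LETTER SMALL CAPITAL Y
  "а" => "a", # U+0430 CYRILLIC SMALL LETTER A
  "о" => "o", # U+043E CYRILLIC SMALL LETTER O
}.freeze

# Fold text to a comparable "skeleton":
# 1. NFKC normalization collapses compatibility forms (e.g. "ℜ" becomes "R").
# 2. Downcasing makes the comparison case-insensitive.
# 3. Remaining lookalikes are replaced via the table above.
def skeleton(text)
  folded = text.unicode_normalize(:nfkc).downcase
  folded.each_char.map { |ch| LOOKALIKES.fetch(ch, ch) }.join
end

# True if any watched word appears in the text once both are folded.
def matches_watched_word?(text, watched_words)
  folded = skeleton(text)
  watched_words.any? { |word| folded.include?(skeleton(word)) }
end

puts skeleton("ℜսᖯʏ")
puts matches_watched_word?("Learn ℜսᖯʏ today", ["Ruby"])
```

With this approach a single watched word like "Ruby" would catch the disguised variants too, so the word list would not need one entry per spelling trick. The trade-off is that plain substring matching on folded text can produce false positives, so it probably needs the same word-boundary handling watched words already use.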