I think watched words could be improved if visually similar Unicode characters also matched.
For example:
abcabcabc
𝘢𝘣𝘤𝘢𝘣𝘤𝘢𝘣𝘤
𝐚𝐛𝐜𝐚𝐛𝐜𝐚𝐛𝐜
ab𝘤𝘢𝘣𝐜𝐚𝐛𝐜
This essentially lets spammers create many variations of the same word to circumvent the word filter. I've been getting hammered by crafty, motivated spammers, so they've really been pushing Discourse's anti-spam features to the absolute limit. This is one of the techniques they're using.
Perhaps this could be useful: https://github.com/janlelis/unicode-confusable
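
To illustrate the idea, here is a rough sketch in Python. It is not how Discourse's watched words work today, and it does not use the gem linked above. The helper names `fold_lookalikes` and `matches_watched_word` are made up for this example. It relies only on Unicode NFKC normalization, which maps styled letters (such as the mathematical bold and italic alphabets) back to their plain equivalents before comparing.

```python
import unicodedata
from collections.abc import Iterable


def fold_lookalikes(text: str) -> str:
    """Hypothetical helper: reduce styled letter variants to a plain form.

    NFKC replaces compatibility characters (for example mathematical
    bold or sans-serif italic letters) with their ordinary counterparts.
    casefold() then removes case differences.
    """
    return unicodedata.normalize("NFKC", text).casefold()


def matches_watched_word(text: str, watched_words: Iterable[str]) -> bool:
    """Hypothetical check: does the folded text contain any folded watched word?"""
    folded_text = fold_lookalikes(text)
    return any(fold_lookalikes(word) in folded_text for word in watched_words)


watched = ["abcabcabc"]

samples = [
    "abcabcabc",
    # Sans-serif italic a, b, c (U+1D622..U+1D624), repeated three times
    "\U0001D622\U0001D623\U0001D624" * 3,
    # Bold a, b, c (U+1D41A..U+1D41C), repeated three times
    "\U0001D41A\U0001D41B\U0001D41C" * 3,
    # Plain, italic, and bold letters mixed in one word
    "ab\U0001D624\U0001D622\U0001D623\U0001D41C\U0001D41A\U0001D41B\U0001D41C",
]

for sample in samples:
    print(sample, matches_watched_word(sample, watched))
```

One caveat: NFKC only covers characters that Unicode defines as compatibility variants. Lookalikes from other scripts, such as Cyrillic "а" (U+0430) standing in for Latin "a", pass through unchanged. Catching those needs the Unicode confusables data, which as far as I understand is what the linked gem is built on.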