Accented characters cause false positives in Watched Words

Watched Words can produce false positives on words that contain accented characters. The word filter seems to treat a letter with an accent or other diacritic as a blank space instead of as part of the word, so the word is split at that letter and the text before it is matched on its own.

Repro steps:

  • Add ‘anal’ to the blocked Watched Words
  • As a non-admin user, attempt to use analógico in a post

Result: the post is blocked.

Attempting the same with analog works as intended, and the post is allowed.
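
For anyone who wants to see the suspected cause in isolation, here is a minimal sketch in Python. It is not Discourse's code: `is_blocked_ascii` is a hypothetical helper, and the assumption is that the filter wraps the watched word in word boundaries that only recognise ASCII letters, which `re.ASCII` emulates here.

```python
import re


def is_blocked_ascii(watched_word: str, post: str) -> bool:
    """Hypothetical matcher whose word boundaries only know ASCII letters."""
    pattern = rf"\b{re.escape(watched_word)}\b"
    # re.ASCII makes \b treat every non-ASCII letter (ó, ç, ş) as a non-word
    # character, so a boundary is found right before the accented letter.
    return re.search(pattern, post, flags=re.ASCII | re.IGNORECASE) is not None


print(is_blocked_ascii("anal", "analógico"))  # True: false positive
print(is_blocked_ascii("anal", "analog"))     # False: allowed as expected
```

Under that assumption, the "ó" ends the word early, which matches the behaviour in the repro steps.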

I was able to reproduce this on my end. The bug also affects characters with a cedilla, such as ç and ş.

Support for UTF-8 characters in watched words has been implemented in this PR:

This should correctly detect word boundaries for all words, including those that contain UTF-8 characters.
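
As a rough illustration of what Unicode-aware boundary detection looks like, here is a short Python sketch. It does not reproduce the PR: `is_blocked_unicode` is a hypothetical helper, and the lookaround-based boundaries are one possible approach chosen for the example.

```python
import re


def is_blocked_unicode(watched_word: str, post: str) -> bool:
    """Hypothetical matcher with Unicode-aware word boundaries."""
    # For str patterns Python's \w is Unicode-aware by default, so accented
    # letters count as word characters. The lookarounds require that the
    # match is neither preceded nor followed by a word character.
    pattern = rf"(?<!\w){re.escape(watched_word)}(?!\w)"
    return re.search(pattern, post, flags=re.IGNORECASE) is not None


print(is_blocked_unicode("anal", "analógico"))  # False: no longer blocked
print(is_blocked_unicode("gar", "garçon"))      # False: cedilla is part of the word
print(is_blocked_unicode("gar", "un gar ici"))  # True: standalone word still matches
```

With boundaries like these, a watched word only matches when it stands alone, regardless of whether the neighbouring letters are ASCII.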
