Russian characters in Watched Words list are failing to be properly identified

CCP_Aurora · February 9, 2021, 12:48pm

I’ve been expanding the watched words list for our company and found an odd issue. We’d like to use the watched words list for all supported languages, but it incorrectly flags certain words that are fine in Russian, seemingly because it does not detect all of the characters in the word.

Example 1: Regular watched words with English characters work fine

Example 2: If I add a character to the front of this, it no longer flags it (which is working as intended)

Example 3: But certain Russian characters look identical to English characters while having a different Unicode code point, and they seem not to be detected as part of the word.

абля is being improperly flagged even though it is not on the list. Deleting and re-typing the “a” on an English keyboard results in the word no longer being flagged (likely due to a different coding of the character). This is resulting in perfectly fine words being improperly flagged, which is undesired.
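The difference between the two look-alike letters can be checked directly. The following is a minimal Python sketch (the `describe` helper is a made-up name for illustration) that prints the code point and Unicode name of the Latin and Cyrillic "a":

```python
import unicodedata

LATIN_A = "\u0061"      # "a" as typed on an English keyboard layout
CYRILLIC_A = "\u0430"   # visually identical "a" from a Russian layout


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe(LATIN_A))        # U+0061 LATIN SMALL LETTER A
print(describe(CYRILLIC_A))     # U+0430 CYRILLIC SMALL LETTER A
print(LATIN_A == CYRILLIC_A)    # False: same glyph, different characters
```

The two characters render the same but compare as unequal, which would explain why retyping the letter changes the result.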

Another example is себ being improperly flagged in the same manner, when only еб is on the watched words list.
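One way this kind of false positive can arise is a word-boundary check that only treats ASCII letters as word characters. The sketch below is an assumption for illustration, not the actual watched words implementation: the pattern shape and the `build_pattern` and `is_flagged` names are hypothetical. It uses Python's `re.ASCII` flag to imitate an ASCII-only `\W`, under which the Cyrillic letter in front of the watched word is mistaken for a separator:

```python
import re


def build_pattern(word, unicode_aware):
    # Hypothetical watched-word pattern: the word must be preceded by a
    # non-word character (or the start of the text) and followed by a
    # non-word character (or the end of the text).
    # With re.ASCII, \W means "anything except a-z, A-Z, 0-9 and _",
    # so every Cyrillic letter counts as a non-word character.
    flags = 0 if unicode_aware else re.ASCII
    return re.compile(r"(?:\W|^)(" + re.escape(word) + r")(?=\W|$)", flags)


def is_flagged(text, word, unicode_aware):
    return build_pattern(word, unicode_aware).search(text) is not None


WATCHED = "\u0435\u0431"            # the watched word from the example above
HARMLESS = "\u0441" + WATCHED       # the longer word that should pass
LATIN_PREFIX = "s" + WATCHED        # same idea with a Latin first letter

print(is_flagged(HARMLESS, WATCHED, unicode_aware=False))      # True (false positive)
print(is_flagged(HARMLESS, WATCHED, unicode_aware=True))       # False
print(is_flagged(LATIN_PREFIX, WATCHED, unicode_aware=False))  # False
```

Under this assumption, a Latin first letter blocks the match while a Cyrillic one does not, which is consistent with the behaviour described above. A Unicode-aware boundary check avoids the false positive in this sketch.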

If anyone has workaround suggestions for this I’d be happy to hear them! Thanks

sam · February 10, 2021, 6:24am

Hi @CCP_Aurora, we will have a look. I recall that getting the regexes to work properly with Unicode and handle boundaries correctly was a bit of an adventure. This certainly looks like a bug.

@gerhard may have some ideas as well; I recall he worked on similar issues in the past.
