Nope. Already tried it with a normal account. Nothing that is wild-carded gets masked out.
Incidentally, if a word is included in the censored pattern setting (note: this is in Settings, not in Watched Words), then it gets masked even when I’m an admin. But that is beyond the scope of this question.
I’m running v1.9.0.beta17 +78. Should I be trying with the latest?
I’m also having trouble using Watched Words to stop my community from violating the TOS in topics and private messages.
As @schungx found, censored pattern from Site Settings works great, but I’d rather use Watched Words with Required Approval to stop users from sneaking around the regex I’m using.
However, I was only able to trigger the flagging system when creating topics and writing replies. Private messages just won’t trigger anything (Approval, Censor, Flag or Block).
This was tested on v2.0.0.beta1 +26 with 2 test accounts.
Watched Words Censor now works for words with wildcards.
However, it doesn’t work if watched_words_regular_expression is true.
I don’t think the censor function even treats the pattern as a regular expression at all.
Repro:
1. Settings > watched words regular expression ==> ON
2. Add xyz* to Censor
3. In the compose window, type xyz123
4. See that it is censored, because the * is being treated as a wildcard. If the pattern were treated as a regular expression, xyz* should only match xy followed by zero or more z’s (xy, xyz, xyzz…)
This is always assuming that the pattern is a word pattern and \b pairs are auto-wrapped around it. If the pattern is a regular expression, the \b pairs can obviously be omitted, because the user can put them in themselves.
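To make the difference concrete, here is a minimal Python sketch of the two interpretations. It is not Discourse’s actual code: the helper names wildcard_to_regex and is_censored are hypothetical, and it assumes that * means "any run of non-space characters" in wildcard mode and that the \b pairs are auto-wrapped in both modes.

```python
import re


def wildcard_to_regex(pattern: str) -> str:
    """Hypothetical helper: escape the pattern, then turn each * into \\S*."""
    return re.escape(pattern).replace(r"\*", r"\S*")


def is_censored(word: str, pattern: str, as_regex: bool) -> bool:
    """Assumed behaviour: the pattern is wrapped in \\b pairs in both modes."""
    body = pattern if as_regex else wildcard_to_regex(pattern)
    return re.search(r"\b(?:" + body + r")\b", word) is not None


# Wildcard mode: xyz* matches anything starting with xyz.
print(is_censored("xyz123", "xyz*", as_regex=False))  # True
# Regex mode: xyz* means "xy followed by zero or more z".
print(is_censored("xyz123", "xyz*", as_regex=True))   # False
print(is_censored("xyzzz", "xyz*", as_regex=True))    # True
```

With the setting ON, the second result is what I would expect for xyz123, but what I actually see matches the first.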
Well, if you put in *shit* then obviously that is what you’ll get… Since you are explicitly asking the system to filter out anything containing these words.
Usually you’ll be using shit* for example…
But of course, it won’t be 100% fool-proof if you use any wildcard. For example:
@eviltrout is this setting meant to force every word to explicitly include the word boundaries in the patterns? This feature was added for a specific customer, so removing the \b around the patterns could have… surprising consequences!
Agreed. Censor wasn’t updated to support watched_words_regular_expression, so I’ll need to implement it.
Yes, this was intentional. If you write the regular expression yourself, you can control whether it’s on a boundary or not. Some of the watched words we imported were not, for example! It’s a power feature.
Yup. Wrapping with \b in regular mode makes sense, since it is simpler (at least for English). A small pitfall is that it screws up on non-ASCII letters, but that’s a small issue comparatively speaking.
When a site turns on regular expressions, you assume that the admin knows what he/she is doing and writes correct regexps. Then those \b would be an unnecessary limitation.
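A small sketch of the non-ASCII pitfall with auto-wrapped \b, written in Python. It assumes an engine whose \b only knows ASCII word characters, which is how JavaScript regexes behave by default; the re.ASCII flag is used here to mimic that, and matches_word is a hypothetical helper, not anything from Discourse.

```python
import re


def matches_word(pattern: str, text: str, ascii_only: bool) -> bool:
    """Wrap a literal word in \\b pairs and search for it in the text."""
    flags = re.ASCII if ascii_only else 0
    wrapped = r"\b" + re.escape(pattern) + r"\b"
    return re.search(wrapped, text, flags) is not None


# With ASCII-only \b, "é" is not a word character, so there is no
# boundary after it and the watched word is missed.
print(matches_word("café", "un café noir", ascii_only=True))   # False
print(matches_word("café", "un café noir", ascii_only=False))  # True

# The opposite error: "caf" is seen as a whole word inside "café".
print(matches_word("caf", "un café noir", ascii_only=True))    # True
print(matches_word("caf", "un café noir", ascii_only=False))   # False
```

So with ASCII-only boundaries you get both misses and false hits on accented words, which is why I call it a pitfall rather than a blocker.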
I’m thinking it might be better not to deal with POSIX regex at all and to limit it to PostgreSQL wildcards (_ %).
IMHO, assuming that an admin who wants regex will know regex is, in most cases, quite a leap. Even devs who have advanced programming skills in general can have problems getting regex right.
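For reference, a rough Python sketch of what the two PostgreSQL wildcards mean: % matches any run of characters (including none) and _ matches exactly one character, with the pattern applied to the whole string. The helper names like_to_regex and like_match are made up for illustration, and the sketch ignores LIKE’s escape character.

```python
import re


def like_to_regex(pattern: str) -> str:
    """Hypothetical helper: translate % and _ and escape everything else."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any run of characters, possibly empty
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def like_match(text: str, pattern: str) -> bool:
    """LIKE-style match: the pattern must cover the whole string."""
    return re.fullmatch(like_to_regex(pattern), text, re.DOTALL) is not None


print(like_match("xyz123", "xyz%"))  # True
print(like_match("xyz", "xy_"))      # True
print(like_match("xyz123", "xy_"))   # False
print(like_match("abc", "a.c"))      # False, "." is a literal here
```

The appeal is that there are only two special characters to explain, so there is much less room for an admin to get a pattern wrong.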
First of all, I believe the markdown processing in Discourse is actually done via JavaScript so it is natural to use JS regex.
Secondly, there are tons of online tools to check regexes.
Thirdly, common regexes are not difficult. The difficult ones are those that try to make regex do what it wasn’t meant to do. Most of the normal scenarios are actually quite simple.