Getting through Akismet


Users on our forum (the Hopscotch forum) have recently found a way to bypass the profanity filter by putting an invalid HTML tag between two parts of a word.
For example, the word gmail is blocked by default. But you can do:

Gm<any invalid tag>ail

This makes the word ‘gmail’ show up, because the tag in the middle is invisible.

One of our users is concerned that this could be used to throw off our spam filter, Akismet, by doing something like:

Ge<dfg>t f<dgf>ree co<shd>upons on some<hh>thing today! www<fg>
As a test, I made a post with a bunch of repeated Chinese characters on an alt account that had never posted before, the kind of thing that would normally be flagged as spam, but put invisible HTML tags between some of the characters. It didn’t get flagged as spam.
Can Akismet be made to ignore these tags?
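
To make the bypass above concrete, here is a minimal sketch of stripping tag-like text before running a word check. It is only an illustration under assumptions: `BLOCKED_WORDS`, `strip_tags`, and `contains_blocked_word` are hypothetical names, and this is not how Discourse or Akismet actually process posts.

```python
import re

# Hypothetical blocklist for illustration; not the forum's real word list.
BLOCKED_WORDS = {"gmail"}

# Matches anything that looks like a tag, valid or not:
# "<", then any characters other than ">", then ">".
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_tags(text: str) -> str:
    """Remove tag-like sequences so split words are rejoined."""
    return TAG_PATTERN.sub("", text)


def contains_blocked_word(text: str) -> bool:
    """Case-insensitive substring check against the blocklist."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKED_WORDS)


raw = "Gm<xyz>ail"
print(contains_blocked_word(raw))              # False: the tag splits the word
print(contains_blocked_word(strip_tags(raw)))  # True: the word is rejoined
```

A pattern this simple would also eat legitimate text that happens to sit between a `<` and a `>`, so it would only be suitable for the copy of the post handed to the filter, not for what gets displayed.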

(Nick S) #2

In case anyone needs more info, I told him this on the other forums.

Also: tags aren’t only used to “unblock” other words. They could also be used as spam by adding random letters to the end of a post in an HTML tag like this:

(Insert random ad here)


If this is repeated constantly, it could get past Akismet.

It could also be used to make empty posts, since it bypasses the twenty-character minimum as well.

(Jeff Atwood) #3

Sure, there are lots of ways to do this; look up zero-width Unicode spaces, for example. It is literally impossible to come up with a method that will fix every workaround and hack. For that matter, you could post an image full of horrible stuff (or spam) that no text processor can even see.
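
As an illustration of the zero-width trick, here is a rough sketch of how such characters hide inside a word and how a filter could strip a known set of them first. The character list and the `strip_zero_width` helper are assumptions made for this example, not anything Discourse or Akismet is known to do, and the list is not exhaustive.

```python
# A small, non-exhaustive set of invisible characters (an assumption for this sketch):
# zero width space, zero width non-joiner, zero width joiner,
# word joiner, and zero width no-break space (BOM).
ZERO_WIDTH_CHARS = "\u200b\u200c\u200d\u2060\ufeff"

# Translation table that deletes every character listed above.
_REMOVE_TABLE = str.maketrans("", "", ZERO_WIDTH_CHARS)


def strip_zero_width(text: str) -> str:
    """Return text with the listed zero-width characters removed."""
    return text.translate(_REMOVE_TABLE)


disguised = "gm\u200bail"             # renders like "gmail" but is 6 characters long
print("gmail" in disguised)                    # False
print("gmail" in strip_zero_width(disguised))  # True
```

Some of these characters are meaningful in real text (the joiners matter in several scripts and in emoji sequences), which is one more reason a blanket fix is hard.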

(Why should I tell you my name?) #4

Yeah, as @codinghorror said, and as I had already said on the Hopscotch forums, there isn’t a workaround for everything. Discourse can’t fix everything.

(cpradio) #5

Part of this is where moderating comes into play. If you have a set of users who are doing it intentionally, you need to set them straight. Give them an official warning and if they continue to do so, suspend them for a set amount of time. They need to know it isn’t okay and is against the community’s rules.

Unfortunately, if you just ignore it or let it go, there is a good chance it will get way out of control, and that puts you in a really hard place from a moderating perspective. Sitepoint had that issue well before Discourse, and it ended up forcing the staff to be pretty heavy-handed with editing and removing posts that attempted to circumvent our rules. That left the community with a horrible impression of an overbearing moderation staff, and that is even harder to overcome.