Suggestion: Wildcard Block Email Address

It would be good if there were a way to add wildcard blocked email addresses, e.g. for when a spammer uses the gmail dot trick.

E.g.

example@gmail.com
example+random12345@gmail.com
ex.a.mple+random12345@gmail.com
e.xamp.le@gmail.com

These are all the same email address, so spammers can use one gmail address to make unlimited accounts easily.
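
To make the equivalence concrete, here is a minimal Python sketch that collapses the variants above to one comparison key. It assumes Gmail's usual behaviour (dots in the local part are ignored and anything after a `+` is a tag); `canonical_gmail` is a hypothetical helper name, not a Discourse function.

```python
def canonical_gmail(address: str) -> str:
    """Collapse Gmail dot and plus variants to a single comparison key."""
    local, _, domain = address.strip().lower().rpartition("@")
    if domain != "gmail.com":
        # Other providers may treat dots and plus signs differently; leave as-is.
        return f"{local}@{domain}"
    local = local.split("+", 1)[0]   # drop the +tag
    local = local.replace(".", "")   # Gmail ignores dots in the local part
    return f"{local}@{domain}"


variants = [
    "example@gmail.com",
    "example+random12345@gmail.com",
    "ex.a.mple+random12345@gmail.com",
    "e.xamp.le@gmail.com",
]
# All four spellings reduce to the same key.
keys = {canonical_gmail(v) for v in variants}
```

Addresses on other domains are deliberately left untouched, since the dot rule is specific to Gmail.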

I believe blocking an address with wildcards, like below, would be a good solution:
e*x*a*m*p*l*e*@gmail.com

I don’t necessarily think that all registrations using these gmail address variations should be blocked, just that it would be useful that if a gmail address is blocked, all variations are blocked too or that we can manually add a wildcard gmail to the email blacklist.

Are you seeing an actual specific problem or is this just a theory? If it is a specific problem can you share the specific spammer emails?

Yes it’s an actual problem I’m experiencing, I have spammers regularly making tens of thousands of accounts per single gmail account with the dot method and a sufficient pool of IPs.

I’m only seeing the dot trick being used; I’m not 100% sure whether the + method works as well. Last I checked it was possible to register using email addresses with + characters, so that trick should work too.

For example, this email (not a real email):
constantinehamilton1337x@gmail.com

Can make 8,388,608 (2^23) unique email addresses using the dot method only and essentially unlimited using the + method. That makes it super efficient for spammers. A domain blacklist isn’t viable, seeing as it’s gmail.

You can see a generator here (gets laggy over 8k combinations): Gmail Dot Trick Generator

If this was actually implemented with a wildcard-like approach (instead of being handled automatically by Discourse), you’d probably want to be much more specific than e*x*a*m*p*l*e*@gmail.com. Doing it that way could result in blocking innocent people, especially if the spammer’s email address is relatively short. Looking specifically for . and + would probably be much safer.
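
A rough Python sketch of the over-matching risk, using the standard `fnmatch` module as a stand-in for whatever wildcard syntax would actually be used. `dot_plus_pattern` is a hypothetical helper, and the bystander address is made up for illustration.

```python
import re
from fnmatch import fnmatchcase

WILDCARD = "e*x*a*m*p*l*e*@gmail.com"


def dot_plus_pattern(local: str, domain: str = "gmail.com") -> re.Pattern:
    """Match only dot-insertion and +tag variants of one specific local part."""
    body = r"\.?".join(re.escape(ch) for ch in local)
    return re.compile(rf"^{body}(\+[^@]*)?@{re.escape(domain)}$", re.IGNORECASE)


strict = dot_plus_pattern("example")

spam_variant = "ex.a.mple+random12345@gmail.com"
bystander = "elixirandmaples@gmail.com"  # made-up address of an unrelated person

# The loose wildcard catches both; the dot/plus-only pattern catches just the variant.
wildcard_hits = [a for a in (spam_variant, bystander) if fnmatchcase(a, WILDCARD)]
strict_hits = [a for a in (spam_variant, bystander) if strict.match(a)]
```

The loose wildcard only requires the letters to appear in order, so any longer address containing them as a subsequence gets caught as well.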

What is your levenshtein_distance_spammer_emails setting at: the default of 2 or the max of 3?

Thanks for the heads up about this setting levenshtein_distance_spammer_emails. I’ve never seen or modified it before - it’s at the default of 2.
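
For anyone wondering how far an edit-distance threshold gets you against the dot trick, here is a small Python sketch. It assumes the setting is a threshold on the Levenshtein distance between a new address and a previously blocked one; that is a reading of the setting name, not something checked against the Discourse source.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


blocked = "example@gmail.com"
two_dots = levenshtein(blocked, "e.xamp.le@gmail.com")     # two inserted dots
three_dots = levenshtein(blocked, "e.x.a.mple@gmail.com")  # three inserted dots
```

Each inserted dot costs one edit, so under that assumption a spammer only needs one more dot than the threshold to get past it.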

I don’t understand your math. You can add only a single dot between characters, so each N-character address is good for only 2*n addresses. You could probably have a plugin that saved or compared against the dot-removed address and disallowed +addresses.

@pfaffman - I was just going off the figures given by the Gmail Dot Trick Generator: for every additional character above 2, the number of addresses doubles (it freezes at about 8k though).

I think 2*n, if I understand what you mean by this (as in a 26 character address would have 52 combinations?), would be too low, as they can add multiple dots throughout the address.
E.g:
constantinehamilton1337x@gmail.com
con.stantinehamilton1337.x@gmail.com
co.nst.antineh.amilton1.3.37x@gmail.com
constantineh.a.m.ilto.n13.37x@gmail.com
c.o.nsta.ntinehamil.ton1337x@gmail.com

Anyhow, whatever the exact figure is, it’s a lot. Yeah, your suggested solution would make sense!
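
For the record, the count can be sketched in a few lines of Python. The assumption is that each of the n-1 gaps between the n characters of the local part independently either has a dot or not, which matches the generator's doubling rule; the function names are made up for this example.

```python
from itertools import product


def dot_variant_count(local: str) -> int:
    """Number of dot placements: each of the len-1 gaps either has a dot or not."""
    return 2 ** max(len(local) - 1, 0)


def dot_variants(local: str):
    """Yield every dotted form of a short local part (exponential, keep it small)."""
    gaps = max(len(local) - 1, 0)
    for dots in product(("", "."), repeat=gaps):
        # Pair each character with the dot (or nothing) that follows it.
        yield "".join(ch + dot for ch, dot in zip(local, dots + ("",)))


small = sorted(dot_variants("abc"))
big = dot_variant_count("constantinehamilton1337x")  # 24 characters, 23 gaps
```

Under that assumption the 24-character example works out to 2^23, roughly 8.4 million dot variants before the + method is even considered.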

Yeah. I wasn’t doing the math right. I was allowing just one dot. I once almost knew that math, but didn’t this morning. :wink:

But a plugin that saved a dot- and plus-free version of the address as an additional address would do what you want and wouldn’t be that hard.

Note … when you block sam.sam@gmail.com we now automatically block sam.sam+1@gmail.com and so on…

This feature has been working very well @sam :slight_smile:

I think the previous implementation you created could still be quite useful as an additional anti-spam feature, it worked incredibly well for the short time it was available and enabled (default off).

Otherwise spammers can still create bulk accounts with one gmail address before a moderator or admin notices, e.g. by creating the accounts but not posting anything immediately.

Admins/mods will need to manually find and open each individual account to ban/delete them, which can be quite tedious, especially when one spammer can create hundreds or thousands of accounts with one gmail address before being banned. Searching for the emails is also difficult, e.g. j.ohan.2.1@gmail and jo.ha.n21@gmail.

If they aren’t manually hunted down, then the spammers keep a large pool of accounts to play whack-a-mole with, while only needing to expend one gmail account to obtain them.
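
As a stopgap for the search problem, something like the following Python sketch could be run over an exported user list to find clusters of accounts that share one mailbox. The `(username, email)` input format and both function names are assumptions for illustration, not an existing Discourse tool.

```python
from collections import defaultdict


def gmail_key(email: str) -> str:
    """Comparison key: for gmail.com, ignore dots and any +tag in the local part."""
    local, _, domain = email.lower().rpartition("@")
    if domain == "gmail.com":
        local = local.split("+", 1)[0].replace(".", "")
    return f"{local}@{domain}"


def find_clusters(accounts, min_size=2):
    """Group (username, email) pairs that share one underlying mailbox."""
    groups = defaultdict(list)
    for username, email in accounts:
        groups[gmail_key(email)].append(username)
    return {key: names for key, names in groups.items() if len(names) >= min_size}


# Made-up export of (username, email) pairs.
accounts = [
    ("user1", "j.ohan.2.1@gmail.com"),
    ("user2", "jo.ha.n21@gmail.com"),
    ("user3", "someone.else@gmail.com"),
]
clusters = find_clusters(accounts)
```

Grouping on the normalized key sidesteps the need to guess which dotted spellings a spammer happened to use.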

@sam Just to follow up after more field testing, I believe that the previous implementation that was reverted is definitely much more effective against motivated spammers. I’m still getting a significant amount of registrations using these permuted gmail tricks.

I’m very grateful that the current protection was implemented, and it is very effective. However, I think it’s a bit of a hole to allow unlimited accounts to be created using the same email until they are specifically noticed and manually banned. It puts more burden on moderators (who, I believe, can’t see account emails unless that is enabled), especially in the absence of bulk account removal tools (e.g. select several accounts from the accounts/search list with checkboxes and ban/remove them all). That means a moderator needs to manually navigate to each individual account to remove/ban it, which is especially difficult when searching for accounts with permuted emails.

Seeing as the previous implementation was optional (off by default), had already been developed, and worked as intended before it was removed, it just seems a shame that it’s not available anymore for communities that would want to use it for additional anti-spam protection against motivated spammers.

Why not both?

This is why I said certain characters have to be completely disallowed from emails (optionally). Specifically, the characters that allow sub-addressing (see Email address - Wikipedia), such as plus, period, hyphen, etc. With a regex you could block it per service as well, e.g. “no email with a plus ending in @gmail.com is allowed”. cc @sam

The previous implementation still allowed +addressing while keeping things down to 1 canonical address per account (which I think is probably safer).

So you could be registered as sam+discourse-meta@gmail.com which is handy for internal gmail rules you have going. But then it would ban new accounts from sam@gmail.com or sam+1@gmail.com.

Not against adding an allowlist, but I think enforcing canonicals is pretty handy for the gmail case and is not a terrible default.

Safety’s not really the goal here. The site in question needs a more extreme solution due to the scope of the problem they are facing. As long as it is optional (add your own “email protection regex”) then it seems perfectly safe to me, for sites that need it, they can opt into Full Lockdown Mode.

We currently have

blocked email domains

I guess we could add:

blocked email patterns

Getting the regex right, though, is somewhat annoying given all the escaping needed. I worry about giving options like this, because the odds of people getting the regex right and as intended are quite low. They need to remember to escape both dots and pluses.

.*\+.*@gmail\.com

We could I guess do a non regex based simplified pattern that just expands * and ?.

*+*@gmail.com
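
A minimal Python sketch of what such a simplified pattern could expand to, assuming `*` means "any run of characters" and `?` means "exactly one character", with everything else taken literally. `simple_pattern_to_regex` is a hypothetical name, not an existing setting or helper.

```python
import re


def simple_pattern_to_regex(pattern: str) -> re.Pattern:
    """Expand only * (any run of characters) and ? (one character); escape the rest."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # dots and pluses become literals
    return re.compile("^" + "".join(parts) + "$", re.IGNORECASE)


plus_at_gmail = simple_pattern_to_regex("*+*@gmail.com")
```

The site operator never has to think about escaping: the `.` in `gmail.com` and the `+` are escaped automatically, which is exactly the part people tend to get wrong by hand.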

:wave: Sorry for the late response!

If the previous implementation was re-added as an option, I believe this would entirely solve the gmail issue. At least in my case. It’s quite perfect in my opinion and adds enough resource costs to the spammers to make fighting it manageable. It’d really be the difference between requiring 24hr full time high intensity moderation and not.

I’ve blocked several domains that allow similar tricks, and I make use of the allowed email domains list. The problem is that people can create many accounts before one of their accounts gets banned/blocked (which activates the blocking of permutations of that gmail address for new accounts, but leaves existing accounts untouched). That makes it quite a burden for moderation, and cleaning up each individual account afterwards is tedious.

For example, I’ve had a thread with ~200 or so replies, one post per account, all made with the same gmail address, and there are a lot of similar cases. That is an example where the accounts are easy to find; searching for them via permutations of the original gmail address is really difficult. Some will farm a large number of accounts using a small handful of gmails and not post on them until months later.

For regex blocking as a solution, blocking + signs would be fairly harmless, but blocking periods (.) would likely block a significant number of legit emails, e.g. john.smith@gmail.com. Blocking addresses with more than one period would probably have minimal collateral damage; it would still allow several permutations of a gmail address, but far fewer than if 2+ periods were allowed.
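
Roughly what that rule could look like, as a Python sketch. The two patterns and `is_rejected` are illustrative only (not a proposed Discourse setting), and the sample addresses other than john.smith@gmail.com are made up.

```python
import re

# A plus sign anywhere in the local part of a gmail.com address.
HAS_PLUS = re.compile(r"^[^@]*\+[^@]*@gmail\.com$", re.IGNORECASE)
# Two or more periods in the local part of a gmail.com address.
MULTI_DOT = re.compile(r"^[^@]*\.[^@]*\.[^@]*@gmail\.com$", re.IGNORECASE)


def is_rejected(email: str) -> bool:
    """Reject Gmail addresses with a plus sign or two or more periods in the local part."""
    return bool(HAS_PLUS.match(email) or MULTI_DOT.match(email))


samples = {
    "john.smith@gmail.com": is_rejected("john.smith@gmail.com"),
    "john+news@gmail.com": is_rejected("john+news@gmail.com"),
    "j.ohan.2.1@gmail.com": is_rejected("j.ohan.2.1@gmail.com"),
    "jo.han21@gmail.com": is_rejected("jo.han21@gmail.com"),
}
```

Under this rule a local part of n characters still has n accepted spellings (no dot, or a single dot in any of the n-1 gaps), so it cuts the pool down a lot without closing it.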

IMO the previous implementation is ideal and not unreasonable to implement as an optional protection. Most popular social sites won’t allow signing up with several gmail permutations because the trick is heavily exploited by spammers.

Thanks :slight_smile:

@sam I feel quite strongly that sites should be allowed to implement this optional level of email regex lockdown if they need it. Otherwise we’re going against one of the core principles of Discourse, which is to be “safe by default”.

We can get this done for the next release. I still stand behind my original implementation though: canonicalization is the most friendly solution for site operators. You check a box and tada, the issue is fixed. With regex, you learn regex (so there go 5 hours) and end up with a fix that lets spam accounts slip through, or is user hostile (no dots, no pluses), or is a compromise.

That said, sure, we can slot regex support in for the next release.

Nahh, it’s real easy, just “no emails allowed with plus or period in them” which is admittedly quite restrictive and obviously we would not want it on by default… But it’s like the bamwar thing: there will always be enough bad actors that you have to make the nuclear launch button, even if you don’t want to use it…

It’s like nuclear war. Once you have nukes on the table, the “user friendly” options aren’t possible any more, you just have to hope most of the time you never need to go there.