Gmail dot trick

Correct, that’s why we would distil down to the canonical email to ensure they are unique. It’s covered above. We can’t store the canonical email as their email address though, as it’s not the one they provided.

Domain blacklists already exist, but we can’t assume that just because a user can also be reached by a googlemail or gmail address that we should reject one or the other. Hence referring back to a canonical “master”.

There are sites today where users are quite legitimately using plus addressing and dots. The point isn’t to inconvenience legitimate practices, only to curtail the unreasonable side effects such as two users for one canonical address.

If the registration process required users to provide their email with the periods and plus suffix stripped, enforced on the client side with consent (akin to form validation), storing it as their account email would be ok.

Not ideal or perfect, but potentially simpler and a worthy trade-off in some cases where the choice is between inconveniencing a few users and inconveniencing an entire forum with spam.

There are gmail accounts where the primary canonical email includes periods. They would be the users most affected and confused by forcibly removing them during registration.

I don’t think that this would be the best implementation either and definitely would not be default option friendly.

Right, what I meant was having an option menu similar to the already existing email domain blacklist for inputting which email domains should be affected and the parameters of what should/shouldn’t be used to decide if an email address is unique/canonical as being discussed in this thread. Potentially also which domains should be considered the same host e.g. gmail/googlemail.

Regarding gmail and googlemail, I think we’re in agreement. Same regarding the dots and + signs.

Essentially, allow the first registration to go through, but disallow the user from being able to make multiple accounts using that same email. Or at least minimise it within reason.

john@googlemail.com registers first → accepted
john@gmail.com registers later → rejected

matthew+{randomstring}@gmail.com registers first → accepted
matthew@gmail.com registers later → rejected
matthew@googlemail.com registers later → rejected
m.att.he.w@gmail.com registers later → rejected
matthew+{randomstring}@gmail.com registers later → rejected
m.a.tt.he.w+{randomstring}@googlemail.com registers later → rejected
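
To make the accept/reject list above concrete, here is a rough Python sketch of how I picture the comparison working. It is only an illustration: `canonical_email`, `Registry` and `GMAIL_DOMAINS` are names I made up, and the rules (drop the + suffix, drop dots and merge googlemail.com into gmail.com) are just the ones discussed in this thread, not an existing implementation.

```python
# Hypothetical sketch; all names and rules here are illustrative assumptions.
GMAIL_DOMAINS = {"gmail.com", "googlemail.com"}


def canonical_email(address: str) -> str:
    """Reduce an address to a comparison key; the address as typed is not changed."""
    local, _, domain = address.strip().lower().rpartition("@")
    local = local.split("+", 1)[0]        # drop any +suffix
    if domain in GMAIL_DOMAINS:
        local = local.replace(".", "")    # dots are ignored by Gmail
        domain = "gmail.com"              # treat googlemail.com as the same host
    return f"{local}@{domain}"


class Registry:
    """First registration per canonical address wins; later ones are rejected."""

    def __init__(self) -> None:
        self._by_canonical: dict[str, str] = {}

    def register(self, address: str) -> bool:
        key = canonical_email(address)
        if key in self._by_canonical:
            return False
        self._by_canonical[key] = address  # store the address exactly as provided
        return True
```

The user keeps the address they typed (plus suffix and all); only the lookup key is normalised, so nobody's mail filtering is affected.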

The googlemail vs gmail issue (and other providers that have several alt domains) is vastly less significant than the dot and + address issues. Handling those cases would be nice though.

That’s a really user-hostile change, and totally unnecessary. The reason these features exist to begin with is to identify the source of email. If I register using the email address stephen+meta@gmail.com I can configure a rule that allows any email sent to that address to be labelled. If meta is compromised and my email address ends up receiving spam at that alias I now know where the breach occurred. Crippling the way I use email isn’t the solution, distilling my email address down to a canonical version for comparison achieves the same end result without creating any user inconvenience.

Right, and that’s tied to the concept of a canonical address. If the feature went ahead as it was originally discussed we would really benefit from the ability to associate domains. Every dot and plus permutation and domain variation would be compared to ‘one true email’ for that mailbox without causing any friction.

Providing we don’t create any pain for users, there’s no reason this feature couldn’t ship on by default.

Agreed, imperfect solution = imperfect. I only mentioned this as an alternative, potentially simpler-to-implement solution. It’s the last portion of my post, presented as an alternative to the primary suggestions I was making, which agree with a lot of the discussions in this thread, including allowing +'s and dots, just not duplicate accounts.

That said, legitimate users using +'s in emails on non-tech forums/sites is generally an edge case from what I’ve seen.

Really sounds fantastic. :content:

My post was primarily getting at how the canonical addresses are calculated for different email domains, so that it isn’t limited to use with gmail/googlemail only. I was essentially attempting to say that it could be a good long-term implementation to have user options for how the canonical addresses are calculated on a per-domain basis.

Some other providers allow + but not period permutations, for example, meaning that the period permutations are unique emails.
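
A per-domain option could be little more than a small rules table. The sketch below is hypothetical: `DomainRule`, `RULES` and the `plus-only.example` provider are invented for illustration, and domains without a rule are left untouched.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DomainRule:
    strip_plus: bool = True          # drop everything from the first "+"
    strip_dots: bool = False         # drop "." in the local part
    alias_of: Optional[str] = None   # domain to treat as the same host


# Hypothetical admin-editable table, similar in spirit to a domain blacklist.
RULES = {
    "gmail.com": DomainRule(strip_dots=True),
    "googlemail.com": DomainRule(strip_dots=True, alias_of="gmail.com"),
    "plus-only.example": DomainRule(),  # made-up provider: + aliases, dots are significant
}
NO_RULE = DomainRule(strip_plus=False)  # unknown domains: compare as typed


def canonical_email(address: str) -> str:
    local, _, domain = address.strip().lower().rpartition("@")
    rule = RULES.get(domain, NO_RULE)
    if rule.strip_plus:
        local = local.split("+", 1)[0]
    if rule.strip_dots:
        local = local.replace(".", "")
    return f"{local}@{rule.alias_of or domain}"
```

With this shape, a provider where period permutations are genuinely different mailboxes just gets a rule with `strip_dots=False`.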

A gmail/googlemail-only implementation would be great though, and I don’t see any reason that it couldn’t be shipped on by default either.

Could you provide an example of one? I ask because the majority of gmail users are oblivious to the dot trick. They signed up for an address with the dot, they give everyone that version of their email and would become very confused if they were told that “their email” was invalid.

I rarely encounter people who even realise their alias minus the dots will still reach them.

Sure, I’ll PM you an example now which I’ve sent to Sam. Just because I’m not sure if it’s a good idea to publicly post this in a thread with this title, as it seems that quite a lot of spammers still don’t know about it luckily.

Yeah agreed, that would be the main confusion for regular users with that imperfect solution.

There’s no way we’d go for such a complicated approach. We aren’t going to “normalize” emails.

Either you are in email lockdown mode, which completely disallows certain problematic characters in an email address (per hardcoded email domain, maybe) or you aren’t.

That’s it. Boolean toggle. Email lockdown mode, Y/N?
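
For comparison with the canonical approach, the toggle described here might look roughly like the following. This is a sketch under my own assumptions: `FORBIDDEN_BY_DOMAIN` and `registration_allowed` are invented names, and which characters are disallowed for which domains is a guess at what "hardcoded" could mean.

```python
# Hypothetical hardcoded table: characters not allowed in the local part, per domain.
FORBIDDEN_BY_DOMAIN = {
    "gmail.com": ".+",
    "googlemail.com": ".+",
}


def registration_allowed(address: str, lockdown: bool) -> bool:
    """With lockdown off nothing changes; with it on, flagged characters are rejected."""
    if not lockdown:
        return True
    local, _, domain = address.strip().lower().rpartition("@")
    forbidden = FORBIDDEN_BY_DOMAIN.get(domain, "")
    return not any(ch in forbidden for ch in local)
```

Note that nothing is normalised or stored here; the address is simply accepted or refused as typed.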

Per:

This is now complete.

Use the site setting enforce_canonical_emails (default false) to enable this protection.

Once on, we disallow duplicate registrations for people using the . hack in googlemail.com and gmail.com and the + hack globally.

Fix is very safe and has zero impact out-of-the-box when it is disabled.

A side-effect of the implementation is that 1 more duplicate account will slip through once you enable the setting, as we do not store canonical form emails in the user email table unless you turn on the setting. This is perfectly acceptable imo, cause in general I am unable to find cases of this exact abuse across quite a few sites we host.
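
To show why exactly one more duplicate can slip through, here is a toy Python model of the behaviour described. It is not the actual implementation: `UserEmails` and `canonical_email` are invented for illustration, and only the setting name `enforce_canonical_emails` and the rules (dots for gmail.com/googlemail.com, + everywhere) come from this post.

```python
def canonical_email(address: str) -> str:
    local, _, domain = address.lower().rpartition("@")
    local = local.split("+", 1)[0]                 # + hack: stripped globally
    if domain in ("gmail.com", "googlemail.com"):
        local = local.replace(".", "")             # . hack: Gmail domains only
    return f"{local}@{domain}"


class UserEmails:
    """Toy user email table: (email, canonical) rows."""

    def __init__(self) -> None:
        self.enforce_canonical_emails = False      # default false
        self.rows: list[tuple[str, str | None]] = []

    def register(self, email: str) -> bool:
        email = email.lower()
        if any(stored == email for stored, _ in self.rows):
            return False                           # exact duplicates are always refused
        if not self.enforce_canonical_emails:
            self.rows.append((email, None))        # canonical form is not stored
            return True
        key = canonical_email(email)
        if any(canon == key for _, canon in self.rows):
            return False
        self.rows.append((email, key))
        return True
```

If sam@gmail.com registered while the setting was off, its row has no canonical form, so s.am@gmail.com is still accepted after the setting is turned on. From then on the canonical form is stored and s.a.m@gmail.com is refused.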

Storing the canonical form at all is problematic. What format do they take?

The spec is here:

If the site setting is not enabled nothing happens… zero, zilch.

Thanks for the kind words @markersocial

Sorry not to reply earlier, have been busy on other tasks… just getting caught up on meta:

Detecting spam, bogus registrations, DDOS attacks, intrusions, and cyberspace situational awareness in general and all the other similar classes of detection-oriented and multi-sensor data fusion cybersecurity problems is one of my favorite topics, as you seem to know :slight_smile:

Having been on the front lines and fought many a “hands-on” cyber battle in real time, let me give you two more hints when under attack like this:

(1) Detection is often more of an art than a pure science. The reason is that the more the attackers know about your detection and mitigation algorithms and techniques, the more they will mutate and adapt to your defenses.

(2) Also, never forget the “OODA Loop”: Observe-Orient-Decide-Act. The one(s) in the cyber battle who can get inside the OODA loop of the opponent(s) will generally be the winner.

I am pleased to read you are enjoying cyberdefense and looking at the larger picture. It sounds like you have got everything under control (from what I quickly read in summary in this discussion) and that the fine meta team has also committed a helpful change for you.

If you fall under attack and need any help, don’t hesitate to reach out to me. I’m long retired from the world of chasing profits and filling up my coffers (thank goodness!), so there is never a fee to consult with me. Helping others who have interesting tech problems, especially in the area of cybersecurity and cyberwar is a higher priority for me than accumulating more wealth.

I am here for you if you need someone to bounce ideas off of and from what I have read of your replies of recent, it sounds like you have things under control.

Great job!

@codinghorror my thinking here is that this change is pointless and I should just revert my change

None of our hosted sites are asking for it or for extreme email blocking modes. None of this is a problem in practice because we purge our inactive accounts anyway and spam scan profiles.

Spammers can just run an SMTP server, which is easier than automating gmail, and they have access to an unlimited number of email addresses that way

Plus addressing is very widely used in legitimate ways

The most common issue around problems with dots in gmail is not spam, but email typos

I guess the only change I support in core is expanding blocked emails to block canonical emails, at least that is an improvement to the block email feature and solves the OP

Eg if you block Jane@gmail.com it also blocks j.ane+1@gmail.com

Any other changes can go in plugins

Does this sound ok?

Yeah that’s kind of the point I was making on the call…

I guess we’re back to shipping a plugin, because IMHO the only effective “solution” is to completely block period and plus (. and +) characters in emails when you are in lockdown mode.

Basically it is a blacklist regex for email and you’d tweak it as you see fit, to add certain providers, certain characters, whatever. Very flexible, very powerful.
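
As a rough idea of what such a plugin could do, here is a Python sketch of a regex blacklist. The two patterns and the names `BLOCKED_EMAIL_PATTERNS` and `is_blocked` are purely illustrative assumptions; an admin would tweak the list to taste.

```python
import re

# Hypothetical admin-tweakable list; each pattern is matched against the whole address.
BLOCKED_EMAIL_PATTERNS = [
    # any "." or "+" in the local part of a gmail.com / googlemail.com address
    re.compile(r"^[^@]*[.+][^@]*@(gmail|googlemail)\.com$", re.IGNORECASE),
    # any "+" in the local part, whatever the provider
    re.compile(r"^[^@]*\+[^@]*@", re.IGNORECASE),
]


def is_blocked(address: str) -> bool:
    return any(pattern.search(address) for pattern in BLOCKED_EMAIL_PATTERNS)
```

Adding a provider or a character is then a one-line change to the pattern list rather than a change to core.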

I think improving email blacklist is an easy change that can benefit all sites, I can think of zero downsides

If I block sam@gmail.com do I really want to allow s.am@gmail.com

Regarding plugin, I guess we can deal with this when we have a real problem on our hosting. We already support blacklisting domains

Do many know this feature was added? (Did they get a message telling them? Maybe they don’t follow that much here on meta!?)

Personally, I don’t find the change “pointless” at all. I would actually put it in core and set it up as enabled by default. My own thinking here is: does a user have the ability to create multiple accounts with the exact same email address? Why let someone do it with a Gmail address, then? (Additionally, if it’s enabled by default in the first place, it “solves” the problem of letting one more account be created after activation.)

The idea would be to have an option to ALLOW multiple accounts with a single gmail email and the “gmail tricks” (now, I can understand the desire to not want to add the canonical email storage if it feels all this isn’t needed)

This feature seems fine @sam I think we should ship that default off / blank.

A bit of background that you’re probably missing is that forum admins have been told that the plus addressing trick is an excellent way to create unprivileged test accounts on the forums, to check category permissions, many times here on Meta. Banning the trick can’t be done by default, because there are legitimate uses for multiple accounts and this is one legitimate use that shows up pretty much everywhere.

Giving special permissions to use a duplicate email for your “unprivileged user test account” is kind of an oxymoron.

You’re right on that.

I reverted my change here and instead introduced this new awesome default.

This means that if evil.person+77@gmail.com gets blocked we will go ahead and block evilperson@gmail.com instead.

Then when e.v.i.l.person@gmail.com tries to sneak in they will be blocked due to canonical matching.
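
In Python terms, the idea is roughly the following. This is a sketch only, not the code that was committed: `BlockList` and `canonical_email` are invented names, and the rules are the ones described earlier in this thread (strip the + suffix everywhere, strip dots for gmail.com and googlemail.com).

```python
def canonical_email(address: str) -> str:
    local, _, domain = address.strip().lower().rpartition("@")
    local = local.split("+", 1)[0]                 # drop the + suffix for any provider
    if domain in ("gmail.com", "googlemail.com"):
        local = local.replace(".", "")             # drop dots for Gmail domains only
    return f"{local}@{domain}"


class BlockList:
    """Blocked emails are stored and looked up in canonical form."""

    def __init__(self) -> None:
        self._blocked: set[str] = set()

    def block(self, address: str) -> str:
        key = canonical_email(address)
        self._blocked.add(key)
        return key                                 # what actually got stored

    def is_blocked(self, address: str) -> bool:
        return canonical_email(address) in self._blocked
```

Since both sides of the comparison go through the same function, blocking Jane@gmail.com also covers j.ane+1@gmail.com, while unblocked addresses are never touched.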

This entirely solves the OP here, and is a very clean and safe change all Discourse instances can benefit from.

Going to close this off as complete in a week.

This topic was automatically closed after 7 days. New replies are no longer allowed.