Is it possible to hash the email database?

It sounds like a forum for whistleblowers so nobody within the “work” organization can be traced as an identifiable email user.

If that is the purpose, or something like it, then you would also have to ensure that the “work” users do not, and more importantly cannot, use “work” resources to access the website.

If you don’t, then “work” can find out by monitoring IP addresses, process usage, browser history, post times, etc. With enough vectors they can triangulate an individual even if no single vector points directly to that person.


Problem is, even if you hash emails and never store them in plain text, there are still ways to identify users.

For one, any company that suspected an employee of using the site would just need to try to re-register with the same email. You can’t reverse the hash, but if you have a good idea what the original address was, you can resubmit it and you will receive an error that the user is already registered.

Hashing works when there are LOTS of potential combinations. A basic 8 character password (say 2 special characters, 2 numbers and 4 lower case letters) drawn from a 68-character set has up to 68^8 combinations, so you’ve 457,163,239,653,376 to work through. Compare that to the typical corporate address book, which has hundreds or maybe thousands of email addresses at best. By hashing every address in that book and comparing the results, corporate security could identify every user with an account very quickly indeed.
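
To make the address-book point concrete, here is a minimal sketch. It assumes the forum stores an unsalted SHA-256 of the lower-cased address; that storage scheme, the function names and the example addresses are all made up for illustration and are not how any particular forum software works.

```python
import hashlib


def hash_email(email):
    """Unsalted SHA-256 of a normalised address (assumed storage scheme)."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def identify_users(stored_hashes, address_book):
    """Return every address-book entry whose hash appears in the stored set."""
    return [addr for addr in address_book if hash_email(addr) in stored_hashes]


# Hypothetical data: what the forum keeps, and what the employer already has.
stored = {hash_email("alice@example.com"), hash_email("carol@example.com")}
address_book = ["alice@example.com", "bob@example.com", "carol@example.com"]

matches = identify_users(stored, address_book)
print(matches)  # ['alice@example.com', 'carol@example.com']

# Search-space comparison from the post: passwords versus a small address book.
password_space = 68 ** 8
print(password_space, "candidate passwords vs", len(address_book), "candidate emails")
```

A per-user salt or a secret server-side key would make a leaked table harder to test offline, but neither helps against the re-registration check, because there the site does the hashing for whoever submits the address.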

SSO in this case would be even riskier. Linking it to a corporate IdP would be a total no-no, but if you don’t, you still have no means to verify that the users belong to the organization.


Thank you @Remah and @Stephen for the feedback.

The aim is to have honest discussion about work-related stuff, not really whistleblowing, although we acknowledge the chance of people using the forum as such. That’s why we try to know as little as possible about our users outside of the initial verification.

I guess one way is to just open the forum and do what we can regarding privacy. We’ll be honest about what the forum is and isn’t.

I think you’re both correct that it’s not that easy to be anonymous in the digital world.

It is easy to be anonymous, but that is not what you are asking for: you want users to first be identifiable as belonging to a set of non-anonymous users. That is what makes it hard to guarantee anonymity, because an audit trail already exists.

What you want is something like an arm’s-length relationship between the two states of identification and anonymity. You were trying to provide both from the same system, when the confirmation that someone belongs to the organisation must happen totally independently from Discourse.
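
One way to picture that separation is a rough sketch with two components that never share identifiers. Everything here is hypothetical: the `Verifier` and `Forum` classes, the invite-code idea and the example address are assumptions for illustration, not a Discourse feature. It also assumes the verifier does not log which code went to which address, and it does nothing about timing correlation between a code being issued and an account being created.

```python
import hashlib
import secrets


def digest(code):
    """SHA-256 of a one-time code; the forum stores only this."""
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


class Forum:
    """Knows usernames and code digests, never email addresses."""

    def __init__(self):
        self._unused_digests = set()
        self.usernames = set()

    def accept_digest(self, code_digest):
        self._unused_digests.add(code_digest)

    def register(self, code, username):
        d = digest(code)
        if d not in self._unused_digests or username in self.usernames:
            return False
        self._unused_digests.remove(d)  # each code works exactly once
        self.usernames.add(username)
        return True


class Verifier:
    """Knows email addresses, never usernames (hypothetical separate service)."""

    def __init__(self, members):
        self._members = set(members)
        self._already_issued = set()  # who received a code, not which code

    def issue_code(self, email, forum):
        if email not in self._members or email in self._already_issued:
            return None
        self._already_issued.add(email)
        code = secrets.token_urlsafe(32)  # random, carries no identity
        forum.accept_digest(digest(code))
        return code


forum = Forum()
verifier = Verifier({"alice@example.com"})
code = verifier.issue_code("alice@example.com", forum)
first = forum.register(code, "anon_42")
second = forum.register(code, "anon_43")
print(first, second)  # True False
```

The forum side can confirm that a code is valid without ever seeing an address, and the verifier side never learns the chosen username. Whether that is private enough still depends on trusting the verifier not to keep records.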
