Pwned Passwords Validator

Yeah, I agree that aspect of it seemed quite problematic to me as well from the beginning. And he obviously cannot include the email in the hash. So it’s better to just disallow the top x most common passwords, which we already have. (Though it’s not updated on the fly as the list changes over time.)

2 Likes

What would really solve the issue here, in my humble opinion, would be to apply best practices: prevent an IP from attempting to log in more than X times without success, and don’t let people try passwords at non-human speed (humans don’t try a password every second). That would fix the security issue without forbidding all of these passwords.

The way I see it, someone can make a list that gives all the combinations of A-Z 1-9 up to X length. This means nothing if you can’t brute force the target, unless it’s a targeted attack, and then there’s not much you can do.

Is anything like that currently implemented?

Yes, there is extensive rate limiting across the whole of Discourse. By default login attempts are limited to 6 per minute and 30 per hour (per IP).
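
For anyone wondering what limits like these look like in practice, here is a minimal sketch of a per-IP sliding-window limiter. It is not Discourse's actual implementation: the class name, the in-memory storage, and the choice not to count rejected attempts are assumptions for illustration. Only the 6 per minute and 30 per hour figures come from the post above.

```python
import time
from collections import defaultdict, deque

# Limits quoted above: 6 attempts per minute and 30 per hour, per IP.
LIMITS = [(60, 6), (3600, 30)]  # (window in seconds, max attempts)


class LoginRateLimiter:
    """Hypothetical in-memory limiter, for illustration only."""

    def __init__(self, limits=LIMITS, clock=time.monotonic):
        self.limits = limits
        self.clock = clock
        self.attempts = defaultdict(deque)  # ip -> timestamps of allowed attempts

    def allow(self, ip):
        now = self.clock()
        history = self.attempts[ip]
        longest = max(window for window, _ in self.limits)
        # Forget attempts that are older than the longest window.
        while history and now - history[0] >= longest:
            history.popleft()
        # Refuse if any window is already full.
        for window, max_attempts in self.limits:
            recent = sum(1 for t in history if now - t < window)
            if recent >= max_attempts:
                return False
        # Assumption: only allowed attempts are recorded.
        history.append(now)
        return True
```

With limits like these, an attacker guessing from a single IP gets at most 30 tries per hour, which is why the size of a guess list matters far less online than it does against a stolen hash database.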

9 Likes

Hmm, can you extrapolate here though? It would seem to me that the distribution of lengths might be quite different if you focus on the 1 mil most common ones since those will obviously tend to be shorter.

1 Like

I think it’s a good idea to alert the user but maybe not prevent them from using the password if they insist. Explain it better, maybe something along the lines of

“The encrypted version of this password has been found in a public database of stolen user information and may put your account at increased risk of being hacked by brute force attacks. If you use this password across multiple web services we strongly recommend changing them to unique passwords to stay secure, especially for mail accounts.”

And let them hit submit a second time to use it anyway.
Hopefully WebAuthn will make passwords a thing of the past someday :wink:

2 Likes

This makes no sense to me (hence the thread bump). Once a password is known to have been cracked (or found in plaintext), it is significantly less secure. This represents a huge decrease in entropy even with very forgiving assumptions:

  • a random 10-character password, even with a very limited character set (strawman example of only lower-case letters: Pr = 1/26^10 = 1/1.4e14)
  • a password that is known to appear in the HIBP list (Pr = 1/5e8)
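
The arithmetic behind those two bullets can be checked with a few lines of Python. This is only a sketch under the same strawman assumptions (uniformly random lower-case letters, a list of about 5e8 entries); the variable names are mine.

```python
import math

random_space = 26 ** 10   # 10 lower-case letters, chosen uniformly at random
breached_space = 5e8      # approximate size of the HIBP list, as quoted above

# Entropy in bits is log2 of the number of equally likely candidates.
random_bits = math.log2(random_space)
breached_bits = math.log2(breached_space)

# How many orders of magnitude smaller the attacker's search space becomes.
lost_orders = math.log10(random_space / breached_space)

print(f"random 10-char password: {random_bits:.1f} bits")
print(f"known to be on the list: {breached_bits:.1f} bits")
print(f"search space shrinks by about 10^{lost_orders:.1f}")
```

Under these assumptions the random password is worth roughly 47 bits, and one known to be on the list roughly 29 bits.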

the minute someone else used it and it was also breached and/or decoded - crucial difference.

What is the practical problem with preventing a password that has only appeared in 1 list? As a user it’s definitely something I’d like to know about so I can avoid using that password again. As a site administrator I don’t want my users using compromised passwords. This seems to be well worth the usability tradeoff.

No, it is not a crucial difference because statistically all passwords will be breached over time.

Remember we already block the top million most common passwords, and have for years, so you are already covered in the ways that matter.

3 Likes

Statistically all cryptography and hashing algorithms will be broken over time. When the first hint appears that an algorithm is close to being broken we stop using it.

Why should passwords be any different? All security is a bet that the entropy being used is sufficient for the given task, and there is always an assumption in there about the length or intensity of attack that can be endured.

Given that you lose at least 6 orders of magnitude of entropy the first time a password appears in a list (realistically probably several more), this seems like an easy decision.

Yes, this is a much better situation than what exists with many other software packages available today. HIBP is still a good improvement for sites that want more guarantees on top of that.

Seems like an opportunity for a “Generate passphrase” button via plug in if that’s technically feasible…

Would exclude anything on the pwned list and the user can elect to make their own which has the “basic” 1m exclusion list built in already. The button method means you can skip having to educate the user about pwned lists and entropy which generally they’re not going to have time for.
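
A rough sketch of what such a button could do behind the scenes, assuming a word list and some blocklist check are available. The word list, the function name, and the `is_blocked` callback are all hypothetical; Python is used only for illustration, and the same steps would work in the browser.

```python
import secrets

# Hypothetical tiny word list; a real one would contain thousands of words.
WORDS = ["correct", "horse", "battery", "staple",
         "orbit", "lantern", "maple", "quartz"]


def generate_passphrase(words=WORDS, n_words=4,
                        is_blocked=lambda phrase: False, max_tries=100):
    """Pick random words until the phrase passes the blocklist check."""
    for _ in range(max_tries):
        # secrets.choice uses a cryptographically secure random source.
        phrase = " ".join(secrets.choice(words) for _ in range(n_words))
        if not is_blocked(phrase):
            return phrase
    raise RuntimeError("could not find an unblocked passphrase")


print(generate_passphrase())
```

The `is_blocked` hook is where the 1m exclusion list or a pwned-list lookup would plug in.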

Discourse does a good enough job already though imo.

Why generate the password at the server?

Assuming the user needs to save it, wouldn’t it make more sense for the user to take responsibility for password generation? Their credential manager of choice likely already does this.

2 Likes

Because it might be better than continually telling them that apple, apple1, apple 123, apple 12345 is not acceptable under the 1m verboden list. I wouldn’t much expect the feature on Discourse personally, but if it puts the pwned passwords thing to bed somewhat gracefully and leaves it in the user’s hands if it goes wrong, then it might be worth looking at.

And you’re suggesting that client-side password managers will generate passwords like this?

Advocating for your users to use a password manager will be much more effective than mandating a password from a single site. The latter will just lead to them using said password everywhere else and expediting its compromise.

Fixing stupid is harder than fixing what’s been pwned. Holding hands with something akin to https://www.useapassphrase.com/ throws those who aren’t interested in the academics of it a bone, imo.

I’ve built systems used by tens of millions, and we’ve piloted various approaches relating to passwords, including those godawful “must contain x and y” rules.

Unless you educate users on password management you only increase the likelihood of password fatigue, re-use and the dreaded “post-it under the keyboard”.

Generating passwords at the server also creates a new attack vector and surface which needs to be secured. Personally I could do without ALL of the above.

3 Likes

I would have thought JS would generate a passphrase randomly clientside then hash it and compare it to a list. The education part can be a stub above “Make me a password”. I don’t really think any of this is something discourse needs to concern itself with tbh, but so long as people are misattributing anger for their own behaviours to the brand it might be worth considering.
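
The "hash it and compare it to a list" step can be sketched like this, assuming the Pwned Passwords range lookup, where only the first 5 hex characters of the SHA-1 hash are sent and the response lists matching hash suffixes as `SUFFIX:COUNT` lines. The function names are hypothetical, the network call is left out, and Python stands in for the client-side JS.

```python
import hashlib


def sha1_prefix_suffix(password):
    """Split the upper-case SHA-1 hex digest into a 5-char prefix and the rest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(suffix, range_body):
    """Look for our suffix in a range response made of SUFFIX:COUNT lines."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0  # suffix not listed, so no known breach under this prefix


prefix, suffix = sha1_prefix_suffix("password")
# Only `prefix` would be sent to the service; the comparison happens locally.
```

Because the suffix never leaves the client, the service cannot tell which password was checked, and nothing like the email address is involved in the lookup.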

I think your best bet, then, is to develop or commission a plugin that does what you suggest.

1 Like

That’s also exactly what this plugin is for.

I have mulled this over for a bit since writing this plugin, and I’ve come to agree with Jeff.

“breached” means that there was someone, somewhere that thought up that phrase that appeared on “a list”. Imagine if someone created a tool that:

  • Made a list of strings public.
  • Systematically generated and appended unique strings to the list.

Jeff’s argument is that this means that eventually, all strings will be “insecure” because they appear on some public list somewhere - and eventually, your “super secure” passphrase would also appear on this ever-expanding list.

The whole idea of passwords leans on the “statistical impossibility” of someone being able to guess the password for your account. An attacker technically doesn’t need to know a password; they just need to make a realllly good guess. They can push this in their favor by loading what it means to make a “good guess”:

  • If they have a breach, try the same password for the same account
  • If they have a list of breaches, try the passwords that a lot of people use

We’ve already discussed how Discourse is good at item #2 above. I think what you’re trying to target is #1 but it is rather unnecessary and a usability nightmare to have something like this (assuming any password, not tied to the username, should not be used) in core… But if you want what you’re describing for your site, this is the plugin for you.

4 Likes

Yes. Frankly this is a terrible argument because it is completely disconnected from any notion of probability, which is the only correct way to reason about this.

Fortunately we’re talking about an edge case that appears after throwing out the most commonly used passwords, which is already a pretty good strategy by itself.

1 Like