The API is backed by a ton of breached password dumps, which makes it infeasible to ship with core. It’s a plugin here because it depends on an external API. The advantage is that as more breaches come in and the Have I Been Pwned service is updated, this API will update accordingly. The downside is that 3rd party services can change or go away, which is why this falls into plugin territory.
Please also note that this hasn’t been graced with official status yet. I wrote this over the weekend because it seemed like pretty low hanging fruit and I wanted to take a crack at it.
Based on this data – 883,203 out of the 1 million most common passwords are under 10 characters. That’s 88% of the dataset.
Extrapolating this to 500 million, that means you could discard all but 60 million for the purposes of Discourse, because it’s impossible to have a password under 10 chars. It also ignores sorting, because you probably only want to block passwords that are actually common (repeated).
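To make the arithmetic above easy to check, here is a rough sketch. It assumes the ratio seen in the top 1 million holds across the whole 500 million, which is a guess, since the most common passwords are not a uniform sample. The function name is made up for illustration.

```python
# Figures quoted in the post above.
SAMPLE_SIZE = 1_000_000        # the "1 million most common passwords" list
SHORT_IN_SAMPLE = 883_203      # entries in that list under 10 characters
FULL_CORPUS = 500_000_000      # approximate size of the full data set


def estimate_long_passwords(sample_size: int, short_in_sample: int, corpus_size: int) -> int:
    """Estimate how many corpus entries have 10+ characters.

    Assumption: the sample's short/long ratio applies to the whole corpus.
    Integer arithmetic is used so the result is exact.
    """
    long_in_sample = sample_size - short_in_sample
    return corpus_size * long_in_sample // sample_size


long_estimate = estimate_long_passwords(SAMPLE_SIZE, SHORT_IN_SAMPLE, FULL_CORPUS)
short_share = SHORT_IN_SAMPLE / SAMPLE_SIZE

print(f"Share under 10 characters: {short_share:.0%}")
print(f"Estimated 10+ character passwords in the corpus: {long_estimate:,}")
```

With these inputs the estimate lands a little over 58 million, which is where the round "60 million" figure comes from.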
Probably not a bad idea. I saw the count, but was not aware of the associated blog, and figured it could do something with the counts, but wasn’t sure what during this round. Thanks for the link.
Fair point. The core product does a pretty Damn Good Job™ already at password management, so it’s definitely not a necessity.
I built this because HIBP’s cache of password breaches has been an excellent source for checking risk. It’s got a dead easy API, so it seemed like a simple additional option to give the more tinfoiled types.
The larger point is that if you filter to the 60 million (out of 500 million) passwords that are 10 characters or more, and then ordered by frequency to further reduce it to the top million, that’d be pretty easy to ship with Discourse – no external API dependency or plugin required.
Unless I missed it, the passwords are SHA1 hashed and there is a “number of times” value but no “character count” value. The “fast search” way (“range”) sends only the first 5 characters of the password’s hash, which narrows down the set of hashes that the full hash then has to be compared against.
So I’m not seeing an easy way to skip passwords that are less than 10 characters, but it looks like the low frequency matches might be able to be eliminated.
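For anyone who wants to see the range lookup concretely, here is a minimal sketch in Python (not the plugin’s actual code). It assumes the public `https://api.pwnedpasswords.com/range/<prefix>` endpoint, which returns one `SUFFIX:COUNT` line per matching hash. The function names and the User-Agent string are made up for illustration.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/{prefix}"


def split_hash(password: str) -> tuple[str, str]:
    """Return (first 5 hex chars, remaining 35 hex chars) of the SHA-1 hash."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range_body(suffix: str, body: str) -> int:
    """Find our suffix in a range response and return its count (0 if absent)."""
    for line in body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count.strip())
    return 0


def pwned_count(password: str) -> int:
    """Ask the range endpoint how often this password appears in the data set.

    Only the 5-character hash prefix leaves the machine; the comparison
    against the full hash happens locally.
    """
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL.format(prefix=prefix),
        headers={"User-Agent": "pwned-range-sketch"},  # hypothetical client name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
    return count_in_range_body(suffix, body)
```

Note that nothing in the response reveals the length of the original password, only hash suffixes and counts. That is why filtering by count is straightforward but filtering by length is not.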
Just a minor comment: this is cryptic information for certain users. Perhaps “Pwned Passwords” could have a link to the site? Or if html isn’t rendered in those messages, it could be replaced with the URL?
The number of times a password has been hacked is irrelevant as far as I can tell. If my account has been hacked, then you want me to be prevented from using that same password again, especially with the same email.
Don’t know whether their “standard” login has ever been hacked somewhere
Therefore it makes sense to disallow passwords that have “only” been compromised once, doesn’t it? And the more unique (long) the password, the more this would hold true, in my thinking.
For a password which has only been hacked 3 times (for example), I believe there’s a rather bigger chance that different people are using that password than that the same person is reusing it. Why? Because it’s an unused password.
100% secure passwords would really deter people from using your forum, just because you don’t accept their password, even though it is one made of 16 characters.
I’ve added the threshold. It defaults to banning the password if it appears at all, which takes care of both concerns.
Troy seems to have added the count in v2 as feedback:
Now on the one hand, you could argue that once a password has appeared breached even just once, it’s unfit for future use. It’ll go into password dictionaries, be tested against the username it was next to and forever more be a weak choice regardless of where it appears in the future. However, I got a lot of feedback from V1 along the lines of “simply blocking 320M passwords is a usability nightmare”. Blocking half a billion, even more so.
It still defaults to the functionality you describe, but admins can now tune to their liking.
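Roughly, the threshold logic could look like the sketch below. This is an illustration, not the plugin’s code: the function and parameter names are hypothetical, and it assumes the setting means “reject once the breach count reaches this number”, with a default of 1 so that any appearance at all is rejected.

```python
def is_password_rejected(breach_count: int, threshold: int = 1) -> bool:
    """Decide whether a password should be refused.

    breach_count: how many times the password appears in the breach data.
    threshold: hypothetical admin setting; the default of 1 rejects any
               password that has appeared in a breach at all.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    return breach_count >= threshold


# Default behaviour: a single appearance is enough to reject.
print(is_password_rejected(1))
# An admin who only wants to block commonly breached passwords raises the bar.
print(is_password_rejected(3, threshold=10))
```

Keeping the default strict and letting admins loosen it covers both positions in this thread: “once breached, always unfit” and “only block what is actually common”.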
Updated to read “has appeared previously in a data breach. Please use a more secure password.” which should address this.
I’m not really a fan of Troy. I don’t think he has the right concepts at work here, and that’s another example of what I mean. Disallowing people from seeing the length of the passwords is an absolute deal breaker, since length = security.
Just because a password is long doesn’t mean that it was created by a password manager for a single site.
The very fact that it is being entered on a Discourse site proves that it isn’t in use on only one site.
Edited to add: I’m not sure what the solution is. I’m just saying that there’s a pretty good argument that both ends of the length spectrum may contain passwords that should be disallowed. Whether that’s the case in reality, I don’t know, and even if it is, the number might be so small that it’s safe to just ignore it.