Pwned Passwords Validator

Discourse Pwned Passwords Validator

Adds a password validator that calls Troy Hunt’s Pwned Passwords API.

How do you use it? Install it. And… that’s it! Password changes will be blocked if a password match is found via the API.

If a user tries changing a password to one that has been found, the following message will appear:

Inspiration from: Upgrading common password prevention - Pwned Passwords v2

How would you compare this to the Core list of 2344 10 char common passwords,
https://github.com/discourse/discourse/blob/master/lib/common_passwords/10-char-common-passwords.txt

Apples to Oranges, 1 to 10, other?

500 million passwords?!

500 million passwords, yes :wink:

The API is backed by a ton of breached password dumps, which makes it infeasible to ship with core. It’s a plugin here because it depends on an external API. The advantage is that as more breaches come in and are added to the Have I Been Pwned service, this API will update accordingly. The downside is that 3rd party services can change or go away, which is why this falls into plugin territory.

Please also note that this hasn’t been graced with official status yet. I wrote this over the weekend because it seemed like pretty low-hanging fruit and I wanted to take a crack at it.

Are you taking into account this ranking: Troy Hunt: I've Just Launched "Pwned Passwords" V2 With Half a Billion Passwords for Download ?

Maybe the [number of hits] counter could be a changeable setting with a tooltip taking the admin who’s editing it to above mentioned link?

I think this could be more useful than disallowing the use of passwords which were only counted 10 times for example (10 times out of 5kkk).

Be aware that once you move to 10 character and greater passwords, the value of the external lists is … kinda debatable, because the number of samples is so many orders of magnitude less. Here’s an example, using the most common 1 million of 10 million leaked passwords:

Based on this data – 883,203 out of the 1 million most common passwords are under 10 characters. That’s 88% of the dataset.

Extrapolating this to 500 million, that means you could discard all but roughly 60 million for the purposes of Discourse, because it’s impossible to have a password under 10 chars there. It also ignores sorting, because you probably only want to block passwords that are actually common (repeated).
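For anyone who wants to check that extrapolation, here is a quick back-of-the-envelope sketch in Python. It only uses the figures quoted above, and it assumes (as the post does) that the full 500 million set has the same length distribution as the 1 million sample, which may not hold.

```python
# Figures quoted in the post above.
sample_size = 1_000_000       # most common passwords examined
under_10_chars = 883_203      # of those, shorter than 10 characters
full_corpus = 500_000_000     # approximate size of Pwned Passwords v2

# Share of the sample that is too short to matter for Discourse.
share_short = under_10_chars / sample_size

# Assumption: the full corpus has the same length distribution as the sample.
# Integer arithmetic keeps the result exact.
remaining = full_corpus * (sample_size - under_10_chars) // sample_size

print(f"{share_short:.1%} of the sample is under 10 characters")
print(f"{remaining:,} passwords of 10+ characters would remain")
```

That lands at a bit over 58 million, which is where the "60 million" round number comes from.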

Probably not a bad idea. I saw the count, but was not aware of the associated blog, and figured it could do something with the counts, but wasn’t sure what during this round. Thanks for the link.

Fair point. The core product does a pretty Damn Good Job™ already at password management, so it’s definitely not a necessity.

I built this because HIBP’s cache of password breaches has been an excellent source for checking risk. It’s got a dead easy API, so it seemed like a simple additional option to give the more tinfoiled types.

The larger point is that if you filter to the 60 million (out of 500 million) passwords that are 10 characters or more, and then order those by frequency to further reduce the list to the top million, that’d be pretty easy to ship with Discourse – no external API dependency or plugin required.
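A minimal sketch of that filter-then-rank idea, in Python. It assumes you have a plaintext dump with one entry per leaked occurrence, so lengths and repeat counts can be computed. `MIN_LENGTH`, `top_long_passwords` and the sample data are made up for illustration and are not part of Discourse or the plugin.

```python
from collections import Counter

# Assumption: 10 is the minimum password length discussed in this thread.
MIN_LENGTH = 10

def top_long_passwords(leaked, limit):
    """Return the `limit` most frequent leaked passwords of MIN_LENGTH or more."""
    # Drop anything too short to be accepted, then count repeats.
    counts = Counter(p for p in leaked if len(p) >= MIN_LENGTH)
    # most_common() sorts by descending frequency.
    return [password for password, _ in counts.most_common(limit)]

# Invented sample: one entry per leaked occurrence.
sample = [
    "password", "1234567890", "1234567890", "qwertyuiop",
    "letmein", "1234567890", "qwertyuiop", "correcthorse",
]
print(top_long_passwords(sample, 2))
```

With the real data, `limit` would be the one million mentioned above, and the result could be written out as a flat text file.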

Damn! You got me! Guilty as charged … :thinking::laughing:

May I suggest the plugin could build these passwords on rebuild? Would this take a lot of time? (In this case it means that Discourse :heart: always has the latest passwords.)

Unless I missed it, the passwords are SHA1 hashed and there is a “number of times” value but no “character count” value. The “fast search” way (“range”) matches on the first 5 characters of the hash, which narrows down the set of candidates that then get compared against the hash of the full password.
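To make the “range” lookup concrete, here is a rough Python sketch of the client side. It is not the plugin’s code. It assumes the response is plain text with one `SUFFIX:COUNT` line per match, and the helper names and the canned response (including its counts) are invented so the example runs without a network call.

```python
import hashlib

def split_sha1(password):
    """Return (prefix, suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    # Only the 5-character prefix would be sent to the API;
    # the remaining 35 characters stay on the client.
    return digest[:5], digest[5:]

def count_in_range(body, suffix):
    """Look up `suffix` in a range response of 'SUFFIX:COUNT' lines."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0  # suffix not listed, so no known breach for this hash

prefix, suffix = split_sha1("password")
# A real client would now GET https://api.pwnedpasswords.com/range/<prefix>.
# Invented stand-in for the response body (the counts are made up):
sample_body = (
    "0018A45C4D1DEF81644B54AB7F969B88D65:1\r\n"
    "1E4C9B93F3F0682250B6CF8331B7EE68FD8:3\r\n"
)
print(prefix, count_in_range(sample_body, suffix))
```

Note that nothing in the response says how long the original password was, only its hash suffix and how many times it was seen.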

So I’m not seeing an easy way to skip passwords that are less than 10 characters, but it looks like the low frequency matches might be able to be eliminated.

Without character count, we’re hosed, since that reduces the search space by 88%.

is a known password on Pwned Passwords

Just a minor comment: this is cryptic information for certain users. Perhaps “Pwned Passwords” could have a link to the site? Or if html isn’t rendered in those messages, it could be replaced with the URL?

The number of times a password has been hacked is irrelevant as far as I can tell. If my account has been hacked, then you want me to be prevented from using that same password again, especially with the same email.

Most people

  1. Reuse passwords
  2. Don’t know whether their “standard” login has ever been hacked somewhere

Therefore it makes sense to disallow passwords that have “only” been compromised once, doesn’t it? And the more unique (long) the password, the more this would hold true, in my thinking.

I disagree:

For a password which has only been hacked 3 times (for example), I believe there’s a bigger chance that different people are using that password than that the same person is reusing it. Why? Because it’s an unused password.

Demanding 100% secure passwords would really deter people from using your forum, just because you don’t accept their password, even though it is one made of 16 characters.

I’ve added the threshold. It defaults to banning the password if it appears at all, which takes care of both concerns.

Troy seems to have added the count in v2 in response to feedback:

Now on the one hand, you could argue that once a password has appeared breached even just once, it’s unfit for future use. It’ll go into password dictionaries, be tested against the username it was next to and forever more be a weak choice regardless of where it appears in the future. However, I got a lot of feedback from V1 along the lines of “simply blocking 320M passwords is a usability nightmare”. Blocking half a billion, even more so.

It still defaults to the functionality you describe, but admins can now tune to their liking.
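Roughly, the threshold idea boils down to something like the Python sketch below. This is an illustration, not the plugin’s actual code: the names are hypothetical, and whether the real setting compares with “at least” or “more than” is an assumption on my part.

```python
# Hypothetical default, modelling "ban the password if it appears at all".
DEFAULT_THRESHOLD = 1

def is_rejected(breach_count, threshold=DEFAULT_THRESHOLD):
    """Reject a password seen at least `threshold` times in breach data."""
    # Assumption: the comparison is inclusive (>=).
    return breach_count >= threshold

print(is_rejected(0))                 # never seen in a breach
print(is_rejected(1))                 # seen once, default threshold
print(is_rejected(3, threshold=10))   # seen 3 times, admin raised the bar
```

An admin who only cares about genuinely common passwords would raise the threshold, while the default keeps the stricter “any hit at all” behaviour.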

Updated to read “has appeared previously in a data breach. Please use a more secure password.” which should address this.

I’m not really a fan of Troy. I don’t think he has the right concepts at work here, and that’s another example of what I mean. Disallowing people from seeing the length of the passwords is an absolute deal breaker, since length = security.

Possibly so, but as I said, the longer the password, the more likely that it is only used by one person and that that person is compromised but doesn’t know it.

Not true, if the password is in fact unique and not reused across different sites. Burned on one site doesn’t necessarily mean burned on all sites.

Having multiple hits on the same password absolutely is, though.

Just because a password is long doesn’t mean that it was created by a password manager for a single site.

The very fact that it is being entered on a Discourse site proves that it isn’t in use on only one site.

Edited to add: I’m not sure what the solution is, I’m just saying that there’s a pretty good argument for both ends of the length spectrum possibly containing passwords that should be disallowed. Whether that’s the case in reality, I don’t know, and even if it is, the number might be so small that it’s safe to just ignore it.

None of what you said matters; it’s the combination of username/email and password that makes it secure. The fact that a password appears in a list doesn’t make it inherently insecure.

Here’s what does, though: knowing that {x} unique users all selected that password, above a statistical threshold of interest.

Imagine a world where every password was invalidated for every human, the minute any other human, anywhere, chose that password. That’s what Troy was building toward, and it’s… stupid.
