We figured a minimum password length of 10 was a bit too high and relaxed it to 8, but later were surprised to see that “password” was now allowed as a password, despite the block_common_passwords option still being on.
Digging into common_passwords.rb, it turns out the list of common passwords which Discourse checks against has been filtered to remove all the ones under 10 characters. Makes sense for the defaults.
Changing the password list looks like a bit of a pain as it’s stored inside the Docker container and the path isn’t configurable. A plugin could probably override it but I’m not sure that’s the right approach or if the plugin would be able to hook things early enough (unchecked).
On our forum, we’ll go back to the default 10 character minimum for a simple life. So we don’t really need anything, but maybe one or more of these makes sense:
The password list could be configurable.
Pro: May also benefit non-English forums.
The default password list on disk could be unfiltered, then filtered as it is loaded into memory, based on the minimum length setting.
Pro: Retains the speed and memory benefits, at a minor cost to startup time.
Con: Would require new code to re-load/re-filter the list if the setting was changed, or an admin education message telling them the server has to be restarted.
Con: For multi-site, the password list is stored in memory once for all sites. If each site had different minimum length settings then new code would need writing.
The minimum length setting could be moved into app.yml.
Pro: Could do the filtering at rebuild time.
Con: Setting is harder to find, and takes the site down to change.
Display a warning that reducing min_password_length effectively breaks block_common_passwords.
Pro: Probably the easiest. It could simply be added to the option description.
Or something else. People who know Discourse better may have better solutions.
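To make the “unfiltered on disk, filtered at load” option concrete, here is a minimal sketch. It is not Discourse’s actual code: the method name `load_common_passwords` and the tiny sample list are made up for illustration, and a real version would read the list from a file and take the minimum from the site setting.

```ruby
require "set"

# Hypothetical helper, not part of Discourse: keeps only the entries that a
# user could actually submit under the current minimum length.
def load_common_passwords(lines, min_length)
  lines.map(&:strip)
       .reject(&:empty?)
       .select { |pw| pw.length >= min_length }
       .to_set
end

# Stand-in for the unfiltered list on disk (illustrative entries only).
UNFILTERED = %w[123456 password qwerty123 monkey1234 trustno1 password123]

at_ten   = load_common_passwords(UNFILTERED, 10)
at_eight = load_common_passwords(UNFILTERED, 8)

puts at_ten.include?("password")    # false: filtered out, as with the shipped list
puts at_eight.include?("password")  # true: still blocked once the minimum drops to 8
```

The multi-site concern still applies: each site with a different minimum would need its own filtered set, or the check would have to filter at lookup time instead.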
After pondering this a bit longer, it does bring up some relevant issues.
What about other languages? Surely “monkey1234” or even “password” itself is not a common password in languages other than English?
What about languages with ideograms, where one character is an entire word? There, a password of just a few characters is roughly equivalent to “correct horse battery staple” in English.
So perhaps the case here is stronger than I thought for making this file dynamic so it can be loaded per language at a minimum, @neil? That also opens the door to “we want to make 6 chars the minimum password, even if we know the risks are dire”.
This points to an issue of what “character” means to your password process. When people say “8 character passwords are perilously close to no password at all”, I read that as “8 character ASCII passwords are perilously close to no password at all”. If your password includes non-ASCII, your encoding comes into play. In UTF-8, every single non-ASCII character will be two or more octets. That at least doubles the byte length of those characters compared with ASCII.
(I purposely avoid all non-ASCII password characters myself, because I am pessimistic that all interfaces will encode the same way. ISO-8859-1 or worse, Windows code pages, may lurk somewhere; and a string of *********** gives me no way to check encoding.)
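A quick way to see the encoding point for yourself. The sample password is made up, and the only claim here is about byte lengths in the two encodings, not about how any particular site stores or hashes the bytes.

```ruby
# Escapes are used so the string is the same no matter how this file is saved.
password = "p\u00E4ssw\u00F6rd"   # "pässwörd": 8 characters, 2 of them non-ASCII

utf8   = password.encode("UTF-8")
latin1 = password.encode("ISO-8859-1")

puts password.length             # 8  characters
puts utf8.bytesize               # 10 bytes: each non-ASCII character takes 2
puts latin1.bytesize             # 8  bytes: one byte per character
puts utf8.bytes == latin1.bytes  # false: same text, different bytes to hash

# Characters outside the Basic Multilingual Plane take 4 bytes each in UTF-8.
amphora_bytes = "\u{1F3FA}".bytesize
puts amphora_bytes               # 4
```

If one interface sent the ISO-8859-1 bytes and another the UTF-8 bytes, a hash of the raw bytes would not match, which is the failure being worried about above.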
41.36/8 = 5.170; consulting the “Entropy per symbol for different symbol sets” table on the Wikipedia password strength page, I see that’s the value for “Case insensitive alphanumeric”, which is almost certainly not what Discourse is using. I believe 52.56 bits is the correct value for how Discourse treats passwords, and 65.7 is the minimum acceptable amount with a 10 “character” password. That table does not address UTF-8.
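The arithmetic behind those figures, as a small sketch. It assumes each symbol is chosen uniformly at random, and it assumes the 52.56 and 65.7 figures correspond to the 95 printable ASCII characters (that symbol count is my reading of the numbers, not something stated above).

```ruby
# Entropy in bits of a password of `length` symbols drawn uniformly at random
# from an alphabet of `symbol_count` symbols: length * log2(symbol_count).
def entropy_bits(symbol_count, length)
  length * Math.log2(symbol_count)
end

puts Math.log2(36).round(3)         # 5.17  bits per symbol, case-insensitive alphanumeric
puts entropy_bits(36, 8).round(2)   # 41.36 bits for 8 such symbols
puts entropy_bits(95, 8).round(2)   # 52.56 bits for 8 printable-ASCII symbols
puts entropy_bits(95, 10).round(2)  # 65.7  bits for 10 printable-ASCII symbols
```

None of this says anything about passwords people actually pick, which are far from uniformly random.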
It isn’t so much what Discourse uses for passwords as what users use for passwords. The overwhelming majority of users, left to their own devices, will use lowercase letters and numbers (if you’re lucky), thus if you’re going to be iteratively brute-forcing a password, that’s where you’ll start. Dictionary-based brute-forcing is a different problem: there, it’s a matter of “if the password is on the list, you’re toast, otherwise you’re fine”. Thus, against a dictionary brute-force tool, there could very easily be an 8-character (or even shorter!) password that is secure, while a very long and otherwise complicated password would be compromised if it happened to have gotten onto the list of passwords that the specific attacker at issue tries.
That makes me wonder if anyone’s put together something like a bloom filter dataset of “all known passwords”, perhaps with a certain minimum length (because iterative brute-forcing of short passwords is, indeed, trivial). A bloom filter would be far, far more efficient, space- and search-time-wise, than a static list, and it would allow you to block every password ever known to have ended up on a dictionary list somewhere.
Remember, kids: there ain’t no kill like overkill…
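For anyone who hasn’t met one, here is a toy version of the Bloom filter idea. It is only a sketch: the class, the sizes, and the sample passwords are made up, and a real one would use a packed bit array and be sized for the expected number of entries and an acceptable false-positive rate.

```ruby
require "digest"

# Toy Bloom filter: answers "definitely not in the set" or "probably in the set".
class BloomFilter
  def initialize(bit_count, hash_count)
    @bit_count  = bit_count
    @hash_count = hash_count
    @bits = Array.new(bit_count, false) # simple but wasteful; illustration only
  end

  def add(item)
    positions(item).each { |i| @bits[i] = true }
    self
  end

  # Never false for an item that was added; may rarely be true for one that wasn't.
  def include?(item)
    positions(item).all? { |i| @bits[i] }
  end

  private

  # Derive hash_count bit positions from one SHA-256 digest (double hashing).
  def positions(item)
    h1, h2 = Digest::SHA256.digest(item).unpack("Q>2")
    h2 |= 1 # keep the step non-zero
    (0...@hash_count).map { |k| (h1 + k * h2) % @bit_count }
  end
end

known = BloomFilter.new(10_000, 7)
%w[password monkey1234 qwertyuiop letmein123].each { |pw| known.add(pw) }

puts known.include?("monkey1234")                    # true
puts known.include?("correct horse battery staple")  # false, barring a false positive
```

The trade-off is that a false positive rejects a password that was never on any list, so the filter has to be sized so that this is rare enough not to annoy people.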
If someone is brute forcing with lists of common passwords, they would probably use all lists for all languages. Is the list we used only from English websites? There are words that aren’t English, and some that are just common ways that people press their knuckles into the keyboard. We could add passwords from non-English lists.
I think we should support replacing the list in some reasonable fashion first. Fully localizing the common password list isn’t necessary yet but there should be a reasonable way of plugging in a different password list to start with, as a first step to getting there.
Is there a maximum length? AFAIK you could hypothetically use the entire collected works of Homer as a password if you wanted to. Do you have an example of one you’ve tried that gave you a “too long” error? Am I not seeing something in this?
The problem that’s being solved there is that hashing an extremely long password (I think the triggering example was 20,000 characters) takes rather a long time (as in, more than a second) to complete, so it’s a fairly trivial DoS attack to send a couple of dozen requests with huge passwords to /login; each one ties up a unicorn doing extremely CPU-intensive work.
If your password is over 200 characters in length, I both salute your diligence and wonder about your priorities. However, if you wish to continue in your endeavours, I would expect a PR changing User.max_password_length into a site setting to be accepted, and it would not be a hard thing to write.
Oh hun, that ship sailed a long, long time ago… in particular, we allow reducing the minimum password length, and that’s arguably more “no, seriously, you don’t want to do that” than increasing the max password length.
I suppose we could benchmark where the CPU time required goes from “that takes a while” to “OMFG no wai!” (if that wasn’t how 200 was determined initially), but someone’s going to have a password that’s a bit longer than that, and a really fast CPU, and wonder why they can’t log in…
It’s unfortunate that the time it takes to hash the password is so proportional to the length of the password. I would have hoped that it would hash the password once first to normalize the length then run the rounds on the result. I.e. I wish it was O(length) + O(2^cost), but it looks like it’s O(length * 2^cost) instead. This seems to be consistent between both bcrypt and pbkdf2, so maybe I’m betraying the fact that I’m not a crypto expert and there’s a specific reason for this. (Hence why I’m not suggesting hashing it before passing the result into pbkdf2.)
I’m not sure what my point is here (lamenting password maximums?).
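On the “hash it once first” wish, one thing that can be checked for PBKDF2 with HMAC specifically: HMAC (RFC 2104) replaces any key longer than the hash’s block size with the hash of that key, so for SHA-256 a password over 64 bytes yields the same derived key as its SHA-256 digest does. The sketch below only demonstrates that equivalence with Ruby’s OpenSSL bindings. The salt, iteration count, and key length are arbitrary values for illustration, it says nothing about bcrypt, and it does not measure whether any given implementation repeats the key hashing on every round.

```ruby
require "openssl"

long_password = "a" * 20_000       # far longer than SHA-256's 64-byte block size
salt          = "hypothetical-salt" # illustrative values, not Discourse's settings
iterations    = 1_000
key_length    = 32

# PBKDF2-HMAC-SHA256 over the full 20,000-byte password.
direct = OpenSSL::PKCS5.pbkdf2_hmac(
  long_password, salt, iterations, key_length, OpenSSL::Digest.new("SHA256")
)

# The same derivation, but fed the 32-byte SHA-256 digest of the password.
prehashed_key = OpenSSL::Digest.new("SHA256").digest(long_password)
prehashed = OpenSSL::PKCS5.pbkdf2_hmac(
  prehashed_key, salt, iterations, key_length, OpenSSL::Digest.new("SHA256")
)

puts direct == prehashed  # true: HMAC already reduces long keys to their digest
```

So for this one construction, pre-hashing an over-length password does not change the result; whether it is wise to do in an application is a separate question that this sketch does not settle.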
Prompted me to pull out this alt of mine (@elijah) to test emoji passwords on Discourse. I didn’t want something that hard to type for my main.
I found that the password change box gives me a green “okay” after entering eight “Miscellaneous Symbols and Pictographs” block characters, e.g. 🏺 and nearby, but upon submit it changes to a red “too short”. I have to go the full ten characters even with all non-ASCII. After the password change, I was able to log in with both FF and Chromium.
I’m not sure my password manager is up to Unicode 8, though, as it seems unhappy with 🏺 (U+1F3FA), so long term this particular password will give me trouble.
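One possible explanation for the client and server disagreeing (unverified, I have not read the relevant Discourse code) is that different layers count “characters” differently. The sketch below just shows three counts for the same eight-emoji string; a length check based on UTF-16 code units, for example, would see 16 rather than 8.

```ruby
amphora  = "\u{1F3FA}"   # AMPHORA, outside the Basic Multilingual Plane
password = amphora * 8

code_points = password.length                        # what Ruby's String#length counts
utf16_units = password.encode("UTF-16LE").bytesize / 2
utf8_bytes  = password.bytesize

puts code_points  # 8
puts utf16_units  # 16: each of these characters needs a surrogate pair
puts utf8_bytes   # 32: four bytes each in UTF-8
```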