Lifting the restrictions on allowed characters in usernames is one of the oldest feature requests. Starting with Discourse 2.3.0.beta9 it’s finally possible to use Unicode characters within usernames and group names.
New Site Settings
There are two new site settings: `allowed unicode username characters` and `unicode usernames`.
`allowed unicode username characters` lets you restrict usernames to certain Unicode characters (e.g. `[äöüßÄÖÜẞ]` or `\p{Greek}`). By default Discourse permits letters (Ll, Lm, Lo, Lt, Lu), marks (Mc, Me, Mn) and numbers (Nd, Nl, but not No). The setting can only narrow that default set: it’s not possible to allow additional characters, and it’s not possible to forbid ASCII letters and numbers.
You should tailor it to your community’s needs and only allow characters and scripts that are needed for the languages used by your community.
Take a look at the Ruby documentation if you want to know more about character classes and character properties in regular expressions.
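As a rough illustration of the default character set described above, the general categories can be written as a Ruby regular expression. This is only a sketch: the constant and method names are hypothetical and not taken from the Discourse source, and the real validation may differ in detail.

```ruby
# Approximation of the default set: letters (L = Ll, Lm, Lo, Lt, Lu),
# marks (M = Mc, Me, Mn) and the number categories Nd and Nl (but not No).
# Hypothetical names, not Discourse's actual implementation.
DEFAULT_USERNAME_CHAR = /\A[\p{L}\p{M}\p{Nd}\p{Nl}]\z/

# Returns true if the single character falls into one of the default categories.
def default_permitted?(char)
  DEFAULT_USERNAME_CHAR.match?(char)
end

default_permitted?("\u00FC") # LATIN SMALL LETTER U WITH DIAERESIS (Ll) => true
default_permitted?("\u0301") # COMBINING ACUTE ACCENT (Mn)              => true
default_permitted?("\u0663") # ARABIC-INDIC DIGIT THREE (Nd)            => true
default_permitted?("\u2167") # ROMAN NUMERAL EIGHT (Nl)                 => true
default_permitted?("\u00BD") # VULGAR FRACTION ONE HALF (No)            => false
```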
`unicode usernames` is disabled by default, and we strongly advise you to configure the `allowed unicode username characters` setting before enabling it in order to prevent homograph username spoofing.
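To see why homograph spoofing is a concern, consider two usernames that look identical but use letters from different scripts. The following is a minimal sketch; the `scripts_used` helper and the sample names are made up for illustration and are not part of Discourse.

```ruby
latin   = "paypal"            # Latin letters only
spoofed = "p\u0430yp\u0430l"  # U+0430 CYRILLIC SMALL LETTER A replaces both "a"s

# A small, non-exhaustive set of scripts to check for (assumption for this demo).
SCRIPTS = { latin: /\p{Latin}/, cyrillic: /\p{Cyrillic}/, greek: /\p{Greek}/ }.freeze

# Hypothetical helper: lists which of the scripts above occur in a name.
def scripts_used(name)
  SCRIPTS.select { |_script, pattern| pattern.match?(name) }.keys
end

latin == spoofed       # => false, although both render almost identically
scripts_used(latin)    # => [:latin]
scripts_used(spoofed)  # => [:latin, :cyrillic]
```

Restricting the allowed characters to the scripts your community actually needs removes most of these lookalike combinations.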
Example allowed values:
- zh_CN Chinese: `[\p{Han}]`
- zh_TW Chinese: `[\p{Han}]`
- ko Korean only: `[\p{Hangul}]`
- ja Japanese: `[\p{Han}\p{Katakana}\p{Hiragana}]`
- ja Japanese (カタカナ only): `[\p{Katakana}]`
- fi Finnish: `[åäöÅÄÖ]`
- cs Czech: `[ěščřžýáíéóůúďťň]`
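If you want to try out a value before saving it, a quick check in Ruby can help. The sketch below assumes the behavior described earlier (ASCII letters and numbers are always accepted, everything else must be in the default set and match the configured value). The method names are hypothetical, and the sketch ignores other characters Discourse may accept in usernames, such as separators.

```ruby
# Default set as described in this post: letters, marks, Nd and Nl numbers.
DEFAULT_SET = /[\p{L}\p{M}\p{Nd}\p{Nl}]/

# Hypothetical helper: ASCII letters and digits always pass; any other
# character must be in the default set AND match the configured allowlist.
def char_allowed?(char, allowlist)
  return true if char.match?(/\A[A-Za-z0-9]\z/)
  char.match?(/\A#{DEFAULT_SET}\z/) && char.match?(/\A#{allowlist}\z/)
end

# Checks every character of a username against a setting value given as a string.
def username_chars_allowed?(username, allowlist_source)
  allowlist = Regexp.new(allowlist_source)
  username.each_char.all? { |char| char_allowed?(char, allowlist) }
end

username_chars_allowed?("山田太郎", '[\p{Han}\p{Katakana}\p{Hiragana}]') # => true
username_chars_allowed?("山田", '[\p{Katakana}]')                        # => false
username_chars_allowed?("alice42", '[\p{Hangul}]')                       # => true
```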
Letter Avatar Service
The Letter Avatar Service has been updated and we added support for generating avatars with the most commonly used scripts. Feel free to create a pull request on GitHub to add a font from the Google Noto Fonts family if you encounter missing avatars for your language.
Enabling `unicode usernames` is only possible when the `external system avatars enabled` site setting is enabled, because the internal avatar generator doesn’t support Unicode. You can run your own instance of the Letter Avatar Service if you can’t or don’t want to rely on the external service.
We even support the brand new glyph for “令和” (Reiwa), U+32FF, that was added in Unicode 12.1 in May 2019.
Good to know…
Discourse counts grapheme clusters (“user-perceived characters”) instead of Unicode codepoints when it validates username length (`min username length` and `max username length` site settings). The Letter Avatar Service also uses the first grapheme cluster of a username to generate an avatar.
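The difference between codepoints and grapheme clusters is easy to see in Ruby, which has offered `String#grapheme_clusters` since Ruby 2.5. The helper names below are hypothetical and only sketch the idea; they are not Discourse’s own validation code.

```ruby
# "e" followed by U+0301 COMBINING ACUTE ACCENT renders as one character.
name = "e\u0301va"

name.length                    # => 4 (codepoints)
name.grapheme_clusters.length  # => 3 (user-perceived characters)

# Hypothetical helper: length check based on grapheme clusters.
def username_length_valid?(username, min, max)
  username.grapheme_clusters.length.between?(min, max)
end

# Hypothetical helper: the part of the name an avatar would be based on.
def avatar_grapheme(username)
  username.grapheme_clusters.first
end

username_length_valid?(name, 3, 20)  # => true
avatar_grapheme(name)                # => "e\u0301" (the accented "e" as one cluster)
```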
You should also take a look at the `reserved usernames` site setting. You might want to add additional usernames now that your forum supports Unicode in usernames.
Feedback
Did you enable Unicode usernames for your community? We’d like to hear your feedback.
Also, we want to ship sensible default values for the `allowed unicode username characters` setting for each locale supported by Discourse. Please feel free to suggest regular expressions in a reply.