Understanding PII storage in Discourse

:bookmark: This guide explains what personally identifiable information (PII) Discourse stores by default, where it’s stored, who can access it, and how you can minimize PII collection using DiscourseConnect.

:person_raising_hand: Required user level: Administrator

Discourse stores certain personally identifiable information (PII) to support core functionality like moderation, account management, and user authentication. Understanding what data is collected and how it’s stored helps you make informed decisions about privacy and compliance.

Summary

Discourse stores several types of PII including IP addresses, email addresses, and social login credentials. This information is primarily used for moderation, duplicate account detection, and user authentication. Site administrators can minimize PII storage by implementing DiscourseConnect (SSO), which allows you to control what information is passed to Discourse.

What PII does Discourse store?

IP addresses

Discourse stores the following IP addresses for each user to assist your moderation team in detecting duplicate accounts:

  • Registration IP address - The IP address used when the account was created
  • Last used IP address - The most recent IP address from which the user accessed the site

For example, if you visit your site on your mobile device at 11:00 AM and then on your tablet at 12:00 PM, only the tablet’s IP address will be stored as the “last used” IP address.
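The overwrite behavior described above can be sketched as follows. This is a minimal illustration with a hypothetical `IpRecord` struct and made-up documentation-range IP addresses; it is not Discourse's actual user model.

```ruby
# Hypothetical in-memory model, for illustration only.
IpRecord = Struct.new(:registration_ip, :last_used_ip) do
  # Each visit overwrites the last used IP; the registration IP is left untouched.
  def record_visit(ip)
    self.last_used_ip = ip
    self
  end
end

record = IpRecord.new("203.0.113.10", "203.0.113.10") # account created
record.record_visit("198.51.100.7")                   # mobile device at 11:00 AM
record.record_visit("192.0.2.44")                     # tablet at 12:00 PM

puts record.registration_ip
puts record.last_used_ip
```

Only two addresses are kept per user in this model: the one from signup and the one from the most recent visit.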

Who can access IP addresses

  • Administrators - Full access to all IP information
  • Moderators - Can view IP addresses by default (can be disabled with the moderators_view_ips site setting)
  • The system - Uses IP addresses internally for spam detection and duplicate account identification

Email addresses

Email addresses are stored as plain text in the database and are visible to anyone with database access.

Who can access email addresses

  • Administrators - Full access to all email addresses
  • Moderators - Can view email addresses by default (can be disabled with the moderators_view_emails site setting)
  • Database administrators - Anyone with direct database access

Full names (real names)

Discourse can collect and store users’ full names (also referred to as “real names”), which are separate from their usernames. Full names are stored as plain text in the database alongside other user information.

How full names are collected

Full names can be provided in several ways:

  • During registration - Users may enter their full name during the signup process (depending on configuration)
  • Via SSO/DiscourseConnect - The external authentication provider can pass the full name (name field) when creating or updating a user, and can override the local name if configured
  • Through profile editing - Users can add or update their full name from their profile preferences
  • From social logins - When users authenticate via social providers, their display name is often used as the full name

Who can access full names

Full names are stored as plain text in the name column of the users table in the database and can be accessed by:

  • Administrators - Full access to all full names
  • Moderators - Can view full names by default (controlled by the same permissions as email access)
  • Database administrators - Anyone with direct database access
  • Public users - May see full names depending on the enable_names and related display settings

Configuration options

Administrators can control how full names are collected and displayed using these site settings:

  • full_name_requirement - Controls whether the full name field appears during signup and whether it’s required
  • auth_overrides_name - When enabled, the name from your SSO/DiscourseConnect provider cannot be changed by users
    • Useful for maintaining consistent identity across your systems
  • use_name_for_username_suggestions - When enabled, Discourse will use the full name when suggesting usernames during registration
  • enable_names - Master switch that shows the user’s full name on their profile, user card, and emails. Disable it to hide full names everywhere
    • Default: enabled

:information_source: The following display settings only take effect when enable_names is enabled:

  • display_name_on_posts - Shows a user’s full name on their posts in addition to their @username
  • prioritize_username_in_ux - Controls whether the username or full name appears more prominently in the interface
    • Default: enabled (username takes priority)
  • display_name_on_email_from - When enabled, uses the full name in the “From” field of email notifications

:information_source: Discourse deduplicates names automatically: if a user’s full name and username are very similar (ignoring spaces, underscores, and capitalization), only one is displayed to avoid redundancy. You can disable this behavior using the Remove Name Suppression on Posts theme component, which forces both the full name and username to always display on posts.
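As a rough sketch of how the display settings and name deduplication interact, the hypothetical helpers below decide which names appear on a post. The function names are illustrative, and the comparison assumes "very similar" means identical after removing spaces and underscores and ignoring case; Discourse's real implementation may differ.

```ruby
# Normalize for comparison: ignore case, spaces, and underscores.
def normalize_name(value)
  value.to_s.downcase.delete(" _")
end

# Assumption: names count as redundant when they match exactly after normalization.
def names_redundant?(full_name, username)
  normalize_name(full_name) == normalize_name(username)
end

# Returns the names to show on a post, most prominent first.
# `settings` is a plain hash keyed by the site setting names discussed above.
def post_byline(username:, full_name:, settings:)
  show_name = settings[:enable_names] &&
              settings[:display_name_on_posts] &&
              !full_name.to_s.strip.empty? &&
              !names_redundant?(full_name, username)
  return [username] unless show_name

  settings[:prioritize_username_in_ux] ? [username, full_name] : [full_name, username]
end

settings = { enable_names: true, display_name_on_posts: true, prioritize_username_in_ux: true }
p post_byline(username: "bilbo_baggins", full_name: "Bilbo Baggins", settings: settings)
p post_byline(username: "ringbearer", full_name: "Bilbo Baggins", settings: settings)
```

In this sketch the first call shows only the username because the two names normalize to the same string, while the second shows both.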

Federated social login information

When users authenticate through social login providers (Google, Facebook, GitHub, etc.), Discourse stores various pieces of information:

  • Email
  • Provider account ID
  • Name
  • Avatar

The specific data stored depends on the provider and the information it shares, and this list may change over time.

Example: Google OAuth2

When a user signs in with Google, Discourse retains the following information in the database:

provider_name: "google_oauth2",
provider_uid: "11791234567812345",
info: {
  "name"=>"Bilbo Baggins",
  "email"=>"bilbo.baggins@gmail.com",
  "image"=>"https://lh3.googleusercontent.com/a/ACg8ocJD5vR-JuZZ16mGf51uYH0KyKGoKXF36U3inbh4Bzne0CpuTlH23g=s96-c",
  "last_name"=>"Baggins",
  "first_name"=>"Bilbo",
  "email_verified"=>true,
  "unverified_email"=>"bilbo.baggins@gmail.com"
}

Example: Facebook OAuth

A redacted example for Facebook login shows:

provider_name: "facebook",
provider_uid: "123456789",
info: {
  "name"=>"Bilbo Baggins",
  "email"=>"bbaggins@shire.net",
  "image"=>"https://graph.facebook.com/v5.0/123456789/picture?access_token=swordfish&width=480&height=480",
  "last_name"=>"Baggins",
  "first_name"=>"Bilbo"
}

:information_source: The specific fields stored may change depending on the provider or over time as authentication protocols evolve.
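If you need to share or log a record like the ones above during an audit, one option is to mask the personal fields first. The sketch below is a hypothetical helper, not part of Discourse; the key list is taken from the two examples and would need extending for other providers. Note that `provider_uid` is also an identifier and sits outside the `info` hash.

```ruby
# Keys in the `info` hash that carry personal data in the examples above.
PII_KEYS = %w[name email image first_name last_name unverified_email].freeze

# Returns a copy of the info hash with personal values masked.
# Non-personal keys (such as "email_verified") are passed through unchanged.
def redact_info(info)
  info.to_h { |key, value| [key, PII_KEYS.include?(key) ? "[redacted]" : value] }
end

info = {
  "name" => "Bilbo Baggins",
  "email" => "bbaggins@shire.net",
  "email_verified" => true
}
p redact_info(info)
```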

Who can access social login information

  • Administrators - Full access to associated account information through the admin panel and database
  • Moderators - May have limited access depending on site configuration
  • Individual users - Can view and manage their own associated accounts from their user preferences

Minimizing PII storage with DiscourseConnect

To avoid storing certain personally identifiable information in Discourse, you can use DiscourseConnect to handle the login process for your users entirely.

How DiscourseConnect reduces PII exposure

With DiscourseConnect, you fully control the user information passed back to Discourse. Since you manage the implementation, you can create privacy-focused alternatives to traditional identifiers.

Example approach: Instead of giving Discourse the user’s real email address, you can create a unique but PII-free email address.

For example, if the internal unique ID for a user is U123456, you might pass back an email address like:

user-U123456@example.com
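A minimal sketch of this approach on the identity-provider side might look like the following. The helper names, the secret, and the nonce value are illustrative assumptions. The parameter names (`nonce`, `external_id`, `email`) and the Base64 plus HMAC-SHA256 signing reflect the DiscourseConnect protocol as commonly documented, so check them against the official DiscourseConnect documentation before relying on this.

```ruby
require "openssl"
require "uri"

# Build a PII-free email address from the internal user ID.
def pseudonymous_email(internal_id, domain: "example.com")
  "user-#{internal_id}@#{domain}"
end

# Build the signed payload returned to Discourse after a successful login.
# Only a pseudonymous email and the internal ID are shared; no real name or email.
def build_sso_response(nonce:, internal_id:, secret:)
  params = {
    "nonce" => nonce,               # echoed back from Discourse's request
    "external_id" => internal_id,
    "email" => pseudonymous_email(internal_id)
  }
  # pack("m0") is strict Base64 (no line breaks), equivalent to Base64.strict_encode64.
  payload = [URI.encode_www_form(params)].pack("m0")
  signature = OpenSSL::HMAC.hexdigest("SHA256", secret, payload)
  { "sso" => payload, "sig" => signature }
end

response = build_sso_response(nonce: "abc123", internal_id: "U123456", secret: "change-me")
puts URI.encode_www_form(response) # query string to append to the return URL
```

Because the email address is derived only from the internal ID, the payload contains nothing that identifies the person outside your own system.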

Additional privacy benefits

Using DiscourseConnect also hides any connection to federated social logins from Discourse. From Discourse’s perspective, the type of login the user uses (social, mobile, etc.) is irrelevant, as that’s handled on your side. Discourse only knows what the login provider tells it.

MFA and external authentication

Can MFA be enforced on top of external authentication?

:warning: Discourse does not currently layer its own MFA on top of external authentication in the way you might expect.

When the enforce_second_factor_on_external_auth site setting is enabled, users who have two-factor authentication enabled on Discourse cannot log in with external authentication methods such as social logins.

This setting effectively makes users choose between:

  • Using external authentication (social logins) without 2FA on Discourse
  • Using username/password login with 2FA on Discourse
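The trade-off above can be modeled as a simple rule. This is a sketch of the behavior as described in this guide, using a hypothetical function name, and is not Discourse's actual code.

```ruby
# Returns true when a user may log in through an external provider.
# External login is blocked only when the site setting is on AND the user has 2FA enabled.
def external_login_allowed?(user_has_2fa:, enforce_second_factor_on_external_auth:)
  !(enforce_second_factor_on_external_auth && user_has_2fa)
end

[true, false].product([true, false]).each do |setting, has_2fa|
  allowed = external_login_allowed?(
    user_has_2fa: has_2fa,
    enforce_second_factor_on_external_auth: setting
  )
  puts "setting=#{setting} user_has_2fa=#{has_2fa} external_login_allowed=#{allowed}"
end
```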

:information_source: For the most secure setup with SSO, implement MFA in your identity provider rather than within Discourse.
