Discourse has many SEO features that work straight out of the box. With these sensible defaults, community managers can focus on cultivating a community rather than on optimizing for search engines. That said, there are some settings you can change and some things you should know, along with general tips and tricks, all covered below.
Static view for search engines
Discourse has a static HTML view with no JavaScript to help web crawlers index your site faster. The content between the dynamic and static view is identical and nothing will be omitted or stripped out when the site is crawled by search engines.
Here’s a comparison of what a user sees and what a search engine sees:
[Image: topic list, user view and search engine view]
[Image: topic, user view and search engine view]
Meta tags
In Discourse, the generic meta tags essential for SEO are generated automatically from the content on the page. The title tag, for instance, is derived from the site or topic title, and the description is generated from the content of the first post. Per-page customization of metadata is limited: to alter these values, you need to adjust the settings or content fields they are generated from:
- The Title, Description and Short site description site settings
- The category names
- The posts’ titles and content
- And so on
Social media meta tags
Discourse automatically generates Open Graph and Twitter (X) Card meta tags for rich social sharing:
Open graph tags
- og:site_name, og:type, og:url, og:title, og:description
- og:image - Configurable via the opengraph_image setting
- og:article:section - Category breadcrumbs
- og:article:tag - Topic tags
X (formerly Twitter) card tags
- twitter:card - “summary” or “summary_large_image”
- twitter:title, twitter:description, twitter:image
- twitter:label1/data1 - Reading time estimate
- twitter:label2/data2 - Like count
Configuration settings
- opengraph_image - Default OG image for social sharing
- twitter_summary_large_image - X (formerly Twitter) large card image
- site_x_summary_large_image_url - X (formerly Twitter) summary image
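To check which social tags a page actually emits, a small script can collect them from the HTML. The following is a minimal sketch using only Python's standard library. The sample HTML and its values are invented for illustration and are not copied from real Discourse output.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collects <meta> tags keyed by their `property` or `name` attribute."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        key = attr_map.get("property") or attr_map.get("name")
        if key and "content" in attr_map:
            # Repeated keys (for example several og:article:tag entries)
            # overwrite each other in this simple sketch.
            self.tags[key] = attr_map["content"]


def collect_meta_tags(html):
    parser = MetaTagCollector()
    parser.feed(html)
    return parser.tags


# Hypothetical page head, used only to exercise the parser.
SAMPLE_HTML = """
<head>
  <meta property="og:site_name" content="Example Forum">
  <meta property="og:title" content="Welcome to the forum">
  <meta name="twitter:card" content="summary">
</head>
"""

tags = collect_meta_tags(SAMPLE_HTML)
social = {k: v for k, v in tags.items() if k.startswith(("og:", "twitter:"))}
print(social)
```

In practice you would feed the function the HTML of one of your own topic pages instead of the sample string.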
URL structure and encoding
Non-Latin characters and URLs
Discourse, by default, strips out non-Latin characters from topic URLs when the locale is set to EN. To avoid this, you can change the locale to the primary non-Latin language or change the slug generation method setting from ASCII to encoded.
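The difference between the two slug methods can be illustrated with a rough sketch. The functions `ascii_slug` and `encoded_slug` below are hypothetical and are not Discourse's actual slug code. They only show why an ASCII-only slug loses non-Latin titles while an encoded slug keeps them.

```python
import re
import unicodedata
from urllib.parse import quote


def ascii_slug(title):
    """Keep only ASCII letters and digits; everything else becomes a separator."""
    # Decompose accented characters so that "é" can fall back to "e".
    normalized = unicodedata.normalize("NFKD", title)
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


def encoded_slug(title):
    """Keep non-Latin characters and percent-encode them for use in a URL."""
    words = re.sub(r"\s+", "-", title.strip().lower())
    return quote(words, safe="-")


print(ascii_slug("Привет мир"))    # empty string: every character was stripped
print(encoded_slug("Привет мир"))  # percent-encoded UTF-8 bytes
print(ascii_slug("Café menu"))     # cafe-menu
```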
International SEO and hreflang tags
For multi-language communities, Discourse supports hreflang tags to help search engines understand language variations:
- Enable via the content_localization_enabled setting
- Configure supported locales with content_localization_supported_locales
When enabled, Discourse generates alternate links for each supported language, helping Google and other search engines serve the correct language version to users.
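As a rough illustration of what alternate links look like, the sketch below renders one hreflang tag per locale. The helper `hreflang_links` and the URLs are placeholders. How Discourse addresses each language version is not covered here, so the URL scheme is an assumption.

```python
from html import escape


def hreflang_links(alternates):
    """Render one <link rel="alternate"> tag per (locale, URL) pair."""
    return [
        f'<link rel="alternate" hreflang="{escape(locale)}" href="{escape(url)}">'
        for locale, url in alternates.items()
    ]


# Placeholder URLs; the real address of each language version may differ.
alternates = {
    "en": "https://forum.example.com/t/welcome/1",
    "ja": "https://forum.example.com/t/welcome/1?lang=ja",
}

links = hreflang_links(alternates)
for link in links:
    print(link)
```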
Sub-folder vs. subdomain setup
Discourse leans towards subdomains over sub-folders because of their technical simplicity. Google doesn’t really have a preference between the two[1], but Discourse strongly recommends avoiding sub-folder setups unless you have a deep technical understanding of them.
Canonicalization
Google prefers to index the canonical version of a page. In Discourse, for a topic with multiple replies, the canonical link (the first post) is what Google receives, and Google then decides what to index. Topics longer than 20 posts are paginated, and each page has its own canonical link and contains up to 20 posts.
For example, the canonical tag for the last reply on this topic will be https://meta.discourse.org/t/try-out-the-new-sidebar-and-notification-menus/238821?page=12.
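The pagination rule can be sketched as simple arithmetic. This is a minimal sketch that assumes pages are filled strictly by post number, 20 posts per page, and that the first page has no ?page parameter. Deleted or hidden posts may shift the real boundaries, and `canonical_url` is a hypothetical helper.

```python
POSTS_PER_PAGE = 20  # page size stated in the text


def canonical_url(topic_url, post_number):
    """Return the canonical URL of the page that contains `post_number` (1-based)."""
    if post_number < 1:
        raise ValueError("post_number must be 1 or greater")
    page = (post_number - 1) // POSTS_PER_PAGE + 1
    # Assumption: the first page is addressed without a ?page parameter.
    return topic_url if page == 1 else f"{topic_url}?page={page}"


base = "https://meta.discourse.org/t/try-out-the-new-sidebar-and-notification-menus/238821"
print(canonical_url(base, 1))    # first page, no query string
print(canonical_url(base, 230))  # ends with ?page=12
```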
Embedded topics
When topics are embedded on external websites, you can use the embed_set_canonical_url setting to point the canonical URL to the original embed location. This prevents duplicate content issues when the same topic appears both on your forum and the embedding site.
Schema markup
Discourse uses extensive schema.org markup to help search engines understand your content. Each content type includes rich structured data that appears in search results and helps with discoverability.
Topics
Topics use the DiscussionForumPosting schema to represent the main discussion thread. It includes:
- headline - The topic title
- datePublished - When the topic was created
- articleSection - The category name
- keywords - All topic tags
- publisher - Your site/organization information
- author - The original poster’s information
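To make the properties listed above concrete, here is a sketch of how they fit together, written as JSON-LD for readability. All values are invented placeholders, and Discourse may emit the same properties in a different syntax (such as microdata), so treat this as an illustration of the vocabulary rather than the exact markup.

```python
import json

# All values below are invented placeholders.
topic_schema = {
    "@context": "https://schema.org",
    "@type": "DiscussionForumPosting",
    "headline": "How do I configure sitemaps?",
    "datePublished": "2025-01-15T09:30:00Z",
    "articleSection": "Support",
    "keywords": "sitemap, seo",
    "publisher": {"@type": "Organization", "name": "Example Forum"},
    "author": {"@type": "Person", "name": "example_user"},
}

print(json.dumps(topic_schema, indent=2))
```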
Posts and replies
Individual posts within topics use the Comment schema, including:
- author - Post author with name and profile link
- text - The post content
- datePublished and dateModified - Creation and edit timestamps
- interactionStatistic - Like counts using InteractionCounter with LikeAction type
Breadcrumbs
Category navigation uses the BreadcrumbList schema that appears in search results, showing the category hierarchy path. Each breadcrumb includes:
- itemListElement - Individual category links
- position - Order in the hierarchy
- Category colors for visual distinction
Categories and topic lists
Category pages and topic list views use the ItemList schema:
- Ordered with ItemListOrderDescending for chronological sorting
- Includes position metadata for each item
- Helps search engines understand content structure and hierarchy
Homepage
The site homepage includes JSON-LD structured data for WebSite with SearchAction, which enables the Google search box to appear directly in search results for your site.
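A WebSite entry with a SearchAction generally has the shape sketched below. The helper `website_search_schema` is hypothetical, and the search endpoint (`/search` with a `q` parameter) is an assumption, so compare it with the JSON-LD in your own homepage source.

```python
import json


def website_search_schema(site_url):
    """Build WebSite + SearchAction structured data for `site_url`."""
    base = site_url.rstrip("/")
    return {
        "@context": "https://schema.org",
        "@type": "WebSite",
        "url": base,
        "potentialAction": {
            "@type": "SearchAction",
            # Assumption: the search endpoint is /search with a `q` parameter.
            "target": base + "/search?q={search_term_string}",
            "query-input": "required name=search_term_string",
        },
    }


schema = website_search_schema("https://forum.example.com/")
print(json.dumps(schema, indent=2))
```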
About page
Your About page uses the AboutPage schema with Organization information, helping search engines understand your community’s identity and purpose.
Sitemap
Discourse incorporates a sitemap index located at /sitemap.xml which is enabled by default via the enable sitemap setting. This facilitates better indexing by search engines. There are other sitemaps as well:
- Recent Sitemap (/sitemap_recent.xml) - Topics bumped in the last 3 days (cached for 1 hour)
- News Sitemap (/news.xml) - Topics created in the last 72 hours in Google News format (cached for 5 minutes), useful for news-oriented communities
- Paginated Sitemaps - Full catalog of topics split into pages (cached for 24 hours)
Sitemaps are automatically regenerated hourly by a scheduled job and include last modified timestamps. You can configure the number of topics per sitemap page via the sitemap_page_size setting (default: 10,000).
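To estimate how many paginated sitemaps a site will have, divide the number of topics by the page size and round up. The sketch below assumes every counted topic is eligible for the sitemap. `sitemap_page_count` is a hypothetical helper, not a Discourse function.

```python
import math

DEFAULT_SITEMAP_PAGE_SIZE = 10_000  # default of sitemap_page_size stated in the text


def sitemap_page_count(topic_count, page_size=DEFAULT_SITEMAP_PAGE_SIZE):
    """Return how many sitemap pages are needed to list `topic_count` topics."""
    if topic_count < 0 or page_size < 1:
        raise ValueError("topic_count must be >= 0 and page_size must be >= 1")
    return math.ceil(topic_count / page_size)


print(sitemap_page_count(25_000))                   # 3
print(sitemap_page_count(25_000, page_size=5_000))  # 5
```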
Web crawlers
Web crawlers, also known as robots or bots, are essential for indexing web pages and making your content discoverable. Discourse uses sophisticated crawler detection to serve optimized content and manage bot traffic effectively.
Crawler detection
Discourse automatically detects and handles various types of crawlers, including:
- Search engines: Googlebot, Bingbot, DuckDuckBot, and others
- Social media: Facebookbot, Twitterbot, LinkedInBot, Discordbot
- AI crawlers: GPTBot, ClaudeBot, Anthropic-AI, BrightBot
- Archive services: Wayback Machine, Archive.org
- Monitoring services: Lighthouse, Google Inspection Tool
When a crawler is detected, Discourse serves optimized content and adds special response headers:
- X-Discourse-Crawler-View: true - Indicates crawler-optimized content
- Last-Modified headers - Enables efficient re-crawling
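When debugging, you can check a response for the crawler header. The sketch below works on a plain dictionary of response headers, however you obtained them. The sample values are invented, and `is_crawler_view` is a hypothetical helper.

```python
def is_crawler_view(headers):
    """Return True when the response headers mark a crawler-optimized view."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return normalized.get("x-discourse-crawler-view", "").strip().lower() == "true"


# Invented sample headers for illustration.
crawler_response = {
    "X-Discourse-Crawler-View": "true",
    "Last-Modified": "Mon, 02 Feb 2026 19:24:02 GMT",
}
browser_response = {"Content-Type": "text/html; charset=utf-8"}

print(is_crawler_view(crawler_response))  # True
print(is_crawler_view(browser_response))  # False
```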
Managing crawler traffic
Some crawlers can be overly enthusiastic, hitting your forum with many requests. Discourse provides several settings to manage crawler behavior:
- blocked_crawler_user_agents - Completely block specific crawlers (default blocklist includes: mauibot, semrushbot, ahrefsbot, blexbot, seo spider)
- slow_down_crawler_user_agents - Rate limit crawlers rather than block them (default includes AI crawlers like GPTBot, ClaudeBot)
- allowed_crawler_user_agents - Allowlist specific crawlers when you want stricter control
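One way to reason about how the three settings interact is the decision function sketched below. It is an illustration only: the case-insensitive substring matching and the order of precedence are assumptions, not Discourse's actual implementation, and the input is assumed to be a request already identified as a crawler.

```python
def crawler_policy(user_agent, blocked, slowed, allowed=()):
    """Classify a crawler user agent as "block", "slow_down", or "allow"."""
    ua = user_agent.lower()

    def matches(patterns):
        return any(pattern.lower() in ua for pattern in patterns)

    # Assumption: a non-empty allowlist rejects every crawler not on it.
    if allowed and not matches(allowed):
        return "block"
    if matches(blocked):
        return "block"
    if matches(slowed):
        return "slow_down"
    return "allow"


blocked = ["mauibot", "semrushbot", "ahrefsbot", "blexbot", "seo spider"]
slowed = ["gptbot", "claudebot"]

print(crawler_policy("Mozilla/5.0 (compatible; SemrushBot/7~bl)", blocked, slowed))  # block
print(crawler_policy("GPTBot/1.0", blocked, slowed))                                 # slow_down
print(crawler_policy("Googlebot/2.1", blocked, slowed))                              # allow
```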
Crawler analytics
Administrators can monitor crawler activity at Admin → Reports → Web Crawlers to see which bots are accessing your site and how frequently.
Additional SEO features
OpenSearch integration
Discourse provides an OpenSearch description at /opensearch.xml, enabling browser search integration. Users can add your forum directly to their browser’s search engine list.
RSS feeds
Each topic has an RSS feed available at /t/{slug}/{id}.rss, which includes:
- Topic title and description
- All posts/replies with author information
- Publication dates and last update times
- Categories and tags
Note: While RSS feeds are publicly accessible, they are intentionally disallowed in robots.txt to prevent duplicate content indexing.
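The feed URL follows directly from the pattern above. The helper `topic_rss_url` below is hypothetical and only shows the string construction, using a topic from this site as the example.

```python
def topic_rss_url(base_url, slug, topic_id):
    """Build the RSS feed URL for a topic from its slug and numeric id."""
    return f"{base_url.rstrip('/')}/t/{slug}/{topic_id}.rss"


feed = topic_rss_url(
    "https://meta.discourse.org/",
    "try-out-the-new-sidebar-and-notification-menus",
    238821,
)
print(feed)
```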
Google site verification
Use the google_site_verification_token setting to add Google Search Console verification meta tags without editing theme templates.
Progressive web app (PWA)
Discourse generates a Web App Manifest at /manifest.json for mobile app discovery and installation prompts.
Migrations and URL redirections
The permalink feature redirects old URLs to preserve SEO, prevent “Page Not Found” errors, and give search engines the right metadata for easier indexing.
If your community site is migrated to Discourse by our team, the URL redirections are included unless there are valid reasons not to do so.
If you are using one of the existing importer scripts, you should ensure that the script handles this[2]. You can manually add permalinks from your admin panel, in Customize → Permalinks.
De-indexing methods
To get pages out of Google’s index, you can either remove content or block access to a page. Depending on your needs, you can make your whole site private[3]. You can exclude topics by deleting them or putting them in restricted categories. Hidden topics aren’t indexed by default, but they can be if there’s a public link somewhere that points to them.
For a lasting removal, use the Removals tool in Google Search Console to keep pages out of search results. Learn more at Remove information on your website from Google - Search Console Help.
[2] Looking for the permalink string in the import script should give you this info.
[3] Look for the login required setting.
Last edited by @MarkDoerr 2026-02-02T19:24:02Z
Last checked by @MarkDoerr 2026-02-02T19:23:41Z