Google Not Indexing Discourse Forum – Sitemap Not Approved

Hi everyone,

I’m running a Discourse forum (forum.evteam.pl), and I’m struggling with getting my pages indexed by Google. While a few pages have been indexed, most remain unindexed despite submitting a sitemap.

Here’s the current situation:

  • Only 8 pages indexed out of 180+.
  • The number of indexed pages briefly increased but then dropped again.
  • Google Search Console shows 172 pages as non-indexed.
  • The sitemap has not been approved for a long time.
  • Search performance is very low, with barely any clicks from Google.

I have checked the following:
✅ Robots.txt – No obvious restrictions.
✅ Sitemap.xml – Submitted, but still not approved.
✅ Noindex tags – Not present on key pages.
✅ Google Search Console – No manual penalties or security issues.
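If it helps anyone re-running these checks, here is a minimal sketch of how to verify the robots.txt and noindex points programmatically with only the Python standard library. The robots.txt rules and URLs below are illustrative examples, not the live files from forum.evteam.pl:

```python
import re
from urllib import robotparser

def googlebot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow Googlebot to fetch url."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch("Googlebot", url)

def has_noindex(page_html: str) -> bool:
    """Rough check for a <meta name="robots" ... noindex ...> tag in page HTML."""
    return bool(re.search(
        r'<meta\s[^>]*name=["\']robots["\'][^>]*content=["\'][^"\']*noindex',
        page_html, re.IGNORECASE))

# Example rules in the style of Discourse's default robots.txt (illustrative):
rules = """User-agent: *
Disallow: /admin/
Disallow: /auth/
"""

print(googlebot_allowed(rules, "https://forum.evteam.pl/t/example-topic/42"))  # True
print(googlebot_allowed(rules, "https://forum.evteam.pl/admin/"))              # False
print(has_noindex('<meta name="robots" content="noindex, nofollow">'))         # True
```

This only confirms what the crawler is *permitted* to do; it cannot tell you why Google chooses not to index a page it is allowed to crawl ("Crawled – currently not indexed" in Search Console is a quality/priority decision, not a blocking issue).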

Has anyone experienced similar issues with Discourse forums? Could this be due to Google’s indexing policies, or is there something I might be missing? Any tips on how to resolve this?

Thanks in advance!

Can you check:

  1. `<yoursite>/admin/reports/web_crawlers` to see if Googlebot is in the list?
  2. the site setting `allowed_crawler_user_agents` to make sure you are not blocking Google by accident (share it here if possible)

The Discourse SEO overview (sitemap / robots.txt) topic may be useful for you.


Thanks for your suggestions!

  1. I checked `/admin/reports/web_crawlers`, and Googlebot is on the list, so it is crawling the forum.
  2. The `allowed_crawler_user_agents` list was empty, so I added:
Googlebot  
bingbot  
DuckDuckBot  

I’ve also resubmitted the sitemap in Google Search Console and will monitor if indexing improves over the next few days.

That could potentially be a bad idea, unless you are totally sure every other bot can be disallowed. Google uses a lot of crawlers that don't declare the Googlebot string.
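To sketch why this backfires: `allowed_crawler_user_agents` is an allowlist, and (assuming simple case-insensitive substring matching, which is how the setting is generally described) any crawler whose user agent doesn't contain one of the listed strings is turned away. Google's URL Inspection tool, for example, identifies itself as `Google-InspectionTool`, not `Googlebot`:

```python
def allowed(user_agent: str, allowlist: list[str]) -> bool:
    """If the allowlist is empty, every crawler is allowed; otherwise only
    user agents containing one of the listed strings get through."""
    if not allowlist:
        return True
    ua = user_agent.lower()
    return any(entry.lower() in ua for entry in allowlist)

allowlist = ["Googlebot", "bingbot", "DuckDuckBot"]

# The classic Googlebot crawler matches and gets through:
print(allowed("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)", allowlist))  # True
# Google's URL Inspection crawler does NOT contain "Googlebot" and is blocked:
print(allowed("Mozilla/5.0 (compatible; Google-InspectionTool/1.0)", allowlist))  # False
# With an empty allowlist (the default), everything is allowed:
print(allowed("Mozilla/5.0 (compatible; Google-InspectionTool/1.0)", []))         # True
```

So leaving the setting empty is the safe default; populate it only if you deliberately want to lock out every crawler you haven't listed.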


Thanks for pointing that out! I wasn't aware that Google uses other crawlers that don't explicitly declare the Googlebot string.

I'll clear the `allowed_crawler_user_agents` list to avoid accidentally blocking anything.