Google Search Console can't read the sitemap

Hi!

It looks like Google Search Console can’t read one of the sitemaps. The main sitemap, sitemap.xml (https://forum.pragmaticentrepreneurs.com/sitemap.xml), is read correctly, but of the sitemaps it references, only sitemap_recent.xml is imported.

If I try to add sitemap_1.xml manually in Google Search Console, I get an error. If I run it through a sitemap validator, it looks good.

Do you have any idea why GSC can’t read sitemap_1.xml?

Thanks for the help.

EDIT:
Bing is reading both sitemaps correctly, so it looks specific to Google.

Also, everything is fine when I try to inspect sitemap_1.xml.

It looks like you had the same issue last year: Troubleshooting sitemap indexing issues in Google Search Console.

Did it eventually work?

I just saw that I already posted a question about this :slight_smile: But no, I didn’t get a solution.

Let me know if I can give more info to help.

I have no idea, either.

The URL is accessible to me. I tried various tools to validate the XML and found no issues. According to “Manage your sitemaps using the Sitemaps report” (Search Console Help), you should see a detail page below the error that tells you what’s wrong (that doesn’t look like the case for you, though).

At the very least, I would encourage you to read this article. You might find a clue.

Indeed, I have no details about the error. I tried to inspect the URL as they suggest, but there is no error there either :frowning:

Let me know if I can do something else to help.

Update

I found the correct sitemap endpoint and it’s behaving normally for Googlebot:

  • https://forum.pragmaticentrepreneurs.com/sitemap.xml

It’s a valid sitemap index and it references:

  • https://forum.pragmaticentrepreneurs.com/sitemap_recent.xml
  • https://forum.pragmaticentrepreneurs.com/sitemap_1.xml
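
For anyone who wants to repeat that check, here is a minimal sketch of listing the child sitemaps from a sitemap index using only the Python standard library. The function name child_sitemaps and the SAMPLE string are made up for illustration; SAMPLE only mirrors the two URLs listed above and is not the actual response body from the forum.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def child_sitemaps(index_xml: str) -> list[str]:
    """Return the <loc> URLs listed in a sitemap index document."""
    root = ET.fromstring(index_xml)
    if root.tag != SITEMAP_NS + "sitemapindex":
        raise ValueError(f"not a sitemap index: {root.tag}")
    # Each <sitemap> entry carries one <loc> with the child sitemap URL.
    return [
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text
    ]


# Illustrative sample only (assumption): same shape as a sitemap index,
# containing the two child URLs mentioned in this thread.
SAMPLE = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://forum.pragmaticentrepreneurs.com/sitemap_recent.xml</loc></sitemap>
  <sitemap><loc>https://forum.pragmaticentrepreneurs.com/sitemap_1.xml</loc></sitemap>
</sitemapindex>"""

for url in child_sitemaps(SAMPLE):
    print(url)
```

If the root element is not a sitemapindex (for example, an HTML error page that happens to parse, or a plain urlset), the function raises instead of silently returning an empty list.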

What I tested

  • Googlebot access: sitemap.xml, sitemap_recent.xml, and sitemap_1.xml all return HTTP/2 200 with a Googlebot user-agent, and the body is real XML (not an HTML challenge page).
  • Headers / content type:
    • sitemap.xml: Content-Type: application/xml; charset=utf-8
    • sitemap_recent.xml + sitemap_1.xml: Content-Type: text/xml; charset=utf-8
    • Responses include x-discourse-route: sitemap/* and x-discourse-crawler-view: true (served by Discourse in crawler mode).
  • IPv4 + IPv6: both return 200 on sitemap.xml.
  • Stability: I fetched each sitemap 20 times in a row with a Googlebot UA — no 403/429/5xx.
    • Typical response times were ~0.17–0.28s for sitemap.xml, ~0.19–0.60s for sitemap_recent.xml, and mostly ~0.45–0.99s for sitemap_1.xml (one slower response at ~2.9s, still 200).
  • robots.txt: includes Sitemap: https://forum.pragmaticentrepreneurs.com/sitemap.xml and doesn’t block /sitemap*.xml.
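
The access and stability checks above can be approximated with a short script. This is a sketch under a few assumptions, not the exact commands I ran: the names looks_like_xml and probe are made up, the user-agent string is the commonly published Googlebot one, and urllib speaks HTTP/1.1 rather than HTTP/2. Sending a Googlebot user-agent only imitates the header; it does not come from Google's IP ranges, so a server that verifies crawler IPs could still treat real Googlebot differently.

```python
import time
import urllib.error
import urllib.request

# Commonly published Googlebot user-agent string (assumption: the server
# only looks at this header, not at the client IP).
GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def looks_like_xml(content_type: str, body: bytes) -> bool:
    """True if both the Content-Type and the body start look like sitemap XML."""
    media_type = content_type.split(";")[0].strip().lower()
    head = body.lstrip()[:100].lower()
    return media_type in ("application/xml", "text/xml") and head.startswith(
        (b"<?xml", b"<urlset", b"<sitemapindex")
    )


def probe(url: str, attempts: int = 20, timeout: float = 10.0):
    """Fetch url repeatedly; return a list of (status, is_xml, seconds)."""
    results = []
    for _ in range(attempts):
        req = urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})
        start = time.monotonic()
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                status = resp.status
                is_xml = looks_like_xml(
                    resp.headers.get("Content-Type", ""), resp.read()
                )
        except urllib.error.HTTPError as exc:
            # 403/429/5xx end up here; connection failures (URLError) propagate.
            status, is_xml = exc.code, False
        results.append((status, is_xml, time.monotonic() - start))
    return results


# Example use (performs real network requests, so it is left commented out):
# for status, is_xml, secs in probe(
#     "https://forum.pragmaticentrepreneurs.com/sitemap_1.xml"
# ):
#     print(status, is_xml, f"{secs:.2f}s")
```

Any row with a non-200 status or is_xml set to False would point at a challenge page, rate limiting, or a server error rather than a Search Console-side problem.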

Search Console status

In Google Search Console, the sitemap index processing shows as successful, but only https://forum.pragmaticentrepreneurs.com/sitemap_recent.xml is currently listed/recognized under “Sitemaps read”. sitemap_1.xml is still not listed there.

Where this leaves things

From the server side everything looks fine, so this feels like a Search Console-side lag or partial processing: Google is reading the index and at least one child sitemap, but hasn’t surfaced the second one yet in the UI.