URL ?page=2 issue


(Mariano Rodriguez) #1

I have a lot of “not found” errors in Google Search Console. Some of my topics link to a ?page=2 URL, and clicking such a link returns a 404 error.

example.com/t/title-of-topic/id?page=2

I’ve been trying to write a regex to eliminate any ?... ending, but the substitution doesn’t seem to work.

/(t\/.*\/.*)\?(.*)\1
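
For comparison, here is a minimal Python sketch of the intended substitution: capture the topic path and drop everything from the ? onward. The pattern and the name strip_topic_query are hypothetical and assume topic URLs shaped like /t/slug/id; where such a rule would actually be applied (forum setting, web server, etc.) is not covered here.

```python
import re

# Hypothetical pattern: /t/<slug>/<numeric id> followed by a query string.
# [^/?]+ keeps the slug from swallowing slashes or the "?" itself.
TOPIC_WITH_QUERY = re.compile(r"^(/t/[^/?]+/\d+)\?.*$")


def strip_topic_query(path):
    """Return the topic path without its query string.

    Paths that do not look like a topic URL are returned unchanged.
    """
    return TOPIC_WITH_QUERY.sub(r"\1", path)


if __name__ == "__main__":
    print(strip_topic_query("/t/title-of-topic/4472?page=2"))
```

The capture group plays the role of \1 in the attempt above, but here it is passed as the replacement string rather than being part of the pattern.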

(Robert McIntosh) #2

I’m not sure I understand - where does the original ?page=2 come from?

Discourse doesn’t have pagination on topics; URLs follow the format /topic_ID/post_ID.

Do you have an example?


(Mariano Rodriguez) #3

Yes, I know. That’s the weird part: it looks like it has pagination.

Some of the pages with errors in Google Search Console

In this image, you can see it says that the /4472 is supposedly linking to /4472?page=2


(Kris) #4

?page=2 is only visible in our crawler view when JavaScript is disabled… I wonder if Google’s trying to crawl those pages with JavaScript enabled and running into the issue?


(Rafael dos Santos Silva) #5

Do these topics have deleted or moved posts?


(Mariano Rodriguez) #6

I wouldn’t know for sure; there are a lot of posts with that error.

What might be happening is that the forum was transferred from phpBB.


(Jeff Atwood) #7

You will need to disable JavaScript and change your user-agent to the Google crawler user agent to see Discourse as the Google crawler does.
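
A rough way to script that check, as a sketch only: a plain HTTP client never runs JavaScript, so sending a Googlebot-style User-Agent header approximates the crawler view. The helper names and the example URL are hypothetical, and the User-Agent string is the commonly published Googlebot one; the site may respond differently to the real crawler.

```python
import re
import urllib.request

# Commonly published Googlebot User-Agent string (assumed here, not verified
# against what Google currently sends).
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_crawler_request(url):
    """Build a request that identifies itself with the Googlebot User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})


def fetch_as_crawler(url):
    """Fetch the raw HTML; no JavaScript is executed by urllib."""
    with urllib.request.urlopen(build_crawler_request(url), timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def find_page_links(html):
    """Return every href value that carries a ?page=N query string."""
    return re.findall(r'href="([^"]*\?page=\d+[^"]*)"', html)


# Example usage (hypothetical URL, requires network access):
# html = fetch_as_crawler("https://example.com/t/title-of-topic/4472")
# print(find_page_links(html))
```

If find_page_links returns any ?page=2 links for an affected topic, requesting those URLs the same way shows whether they really answer with a 404.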

Alternatively:

Use Fetch as Google for websites - Search Console Help

The Fetch as Google tool enables you to test how Google crawls or renders a URL on your site. You can use Fetch as Google to see whether Googlebot can access a page on your site, how it renders the page, and whether any page resources (such as images or scripts) are blocked to Googlebot.