Pagination URL scheme not passed through when topic is renamed


#1

The non-JavaScript version of the site (for the likes of Googlebot, et al.) uses request parameters for pages (?page=2). This works fine for the bot doing the navigating, but if the first displayed post id were in the URL, searchers would get better links when they come from Google (or wherever), since they’d at least get close to the bit they searched for.


#2

Agreed. This really makes bringing up a Google search result in a Discourse forum take longer than it should:

  1. Search Google.
  2. Click link which leads to a Discourse forum.
  3. Realize that you are nowhere near the intended post.
  4. Search the topic you landed in.
  5. Finally find the post that was referenced in the Google search.

Since the web these days is all about getting what you want quickly, people are going to hit the back button after step 3 instead of proceeding with step 4. Especially people who are unfamiliar with Discourse.


(Mittineague) #3

Interesting idea, though I’m not sure how Google would treat having literally millions of posts to index for a site. Maybe if the posts had a canonical pointing to the topic?

Seriously though, I have never had a problem with landing on a long content heavy page and using my browser’s “find” to home in on what I searched for.

But with infinite scroll that would be useless if what I wanted was in post number 100++ and I landed on post number 1.

Using the Discourse Search would be required and that might not be clear to those unfamiliar with Discourse.

As a topic loads in 20 posts at a time, maybe it could be worked out to give bots a per 20 URL to index?


#4

That’s what I’m saying. Instead of using the request parameter, just start the page with the post number. Problem solved.


(PJH) #5

No. It doesn’t always. Not on the board 3 of us come from:

Though if I took that to 1, it would more closely resemble Blakey’s suggestion else-board.
 
I can hear the screams from everyone else now…


(Jeff Atwood) #6

Yeah, this is really a side effect of having a larger page size for traditional paging – only the JS-off view of the site uses traditional pages. So when you end up on “page 20” you have to scan through all (n) posts on that page to figure out where what you wanted is. Probably not too difficult when the page size is at the default of 20, but when the page size is set to 50, that’s 2.5× as much stuff to look through.

Pagination sure is awkward for us poor users, isn’t it? :wink:

On the other hand, this isn’t materially different from ending up on a giant 5,000 word single page review for, say, the Samsung Note 4 and having to figure out where the specific text you were searching for is in the article.

Try searching in Google for this:

samsung note “incredibly finicky”

Now find that section of the review and read it. See what I mean?

Since we’re talking about traditional pagination here, as seen on the JS-off view of the site, is there any other traditional paginated forum that does what you’re proposing here @boomzilla? I suspect the answer is no, because it’d heavily pollute search results to have one “page” for every post in a topic, even if that post is only a sentence.


(Kane York) #7

The NeoGAF forums have a single-post view: NeoGAF is linked from Durante Presents: FFXIII resolution unlocking (GeDoSaTo plugin) released - pre-alpha


(Sam Saffron) #8

User-Agent: *
Disallow: /

so not a particularly spectacular example :slight_smile:


(jaming) #9

I think @boomzilla’s idea is to keep the pages the same but have each page in the sans JavaScript view have a URL of /t/topic-name/topic-id/first-post-id-of-page; thus allowing Google to get you in the neighborhood of the post that hit your search terms.
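To make the proposed scheme concrete, here is a minimal Python sketch of how a paged URL ending in the first post number could be built. It assumes posts are numbered contiguously from 1 and a page size of 20; the function names `first_post_of_page` and `page_url` are hypothetical and are not part of Discourse.

```python
def first_post_of_page(page: int, page_size: int = 20) -> int:
    """Return the post number that opens a 1-indexed page.

    Assumes posts are numbered contiguously starting at 1.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return (page - 1) * page_size + 1


def page_url(slug: str, topic_id: int, page: int, page_size: int = 20) -> str:
    """Build /t/<slug>/<topic_id>/<first-post-of-page> for the given page."""
    return f"/t/{slug}/{topic_id}/{first_post_of_page(page, page_size)}"
```

With these assumptions, page 2 of a topic would be linked by its first post number (21) rather than by `?page=2`, so a search hit lands in the neighborhood of the matching post.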


(Sam Saffron) #10

We already have sane canonicals in batches of 20 when you hit a particular post.
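As a rough illustration of what "canonicals in batches of 20" could mean, the Python sketch below maps a post number to its batch and to a canonical URL. The `?page=N` form of the canonical and the names `batch_for_post` and `canonical_for_post` are assumptions made for the example, not confirmed Discourse behavior.

```python
def batch_for_post(post_number: int, batch_size: int = 20) -> int:
    """Return the 1-indexed batch that contains the given post number."""
    if post_number < 1 or batch_size < 1:
        raise ValueError("post_number and batch_size must be positive")
    return (post_number - 1) // batch_size + 1


def canonical_for_post(slug: str, topic_id: int, post_number: int,
                       batch_size: int = 20) -> str:
    """Assumed canonical URL for the batch holding post_number.

    The first batch maps to the bare topic URL; later batches carry ?page=N.
    """
    batch = batch_for_post(post_number, batch_size)
    base = f"/t/{slug}/{topic_id}"
    return base if batch == 1 else f"{base}?page={batch}"
```

Under these assumptions, every post in the same batch shares one canonical URL, so a crawler indexes one URL per 20 posts instead of one per post.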


#11

No, it isn’t. In paginated forums, Google is able to take you to the page and (usually) the post on that page that matches your search. This is because Google sees basically the same thing that the user sees. On a Discourse forum, this is not possible because there is a huge disconnect between what the user sees and what the Google bots see.


(Jeff Atwood) #12

Google sees the posts, in batches, just like you do. Same exact content, same delivery format. The only difference is that webcrawlers have to explicitly request a URL to get another batch of posts, whereas that happens automatically for users as they scroll down.


(TechnoBear) #13

If you search Google for the text of this fairly banal post (chosen because the text is unlikely to occur elsewhere)


the resulting link will take you into that topic (of 221 posts) around post #162, just a few posts above that one - much the same as the top of the page results in a more conventional forum.


#14

I think that something has broken. This search:

https://www.google.com/?gws_rd=ssl#q=jellypotato+site:what.thedailywtf.com

Gives me as the first result:

Login to your account - What the Daily WTF?

IIRC, this used to do what you’d expect and dump you at that “page” in the topic. Now you end up at the top. Originally noted here.


(Jeff Atwood) #15

Sounds like a legit bug, can you have a look Monday @eviltrout?


(Robin Ward) #16

The page redirect seems to work on other topics for me. That one is weird: it seems topic id 1000 is not that slug, so Discourse redirects to the appropriate one. I also don’t see it in my Google results the way you do.

Was a topic renamed or something, and Google hadn’t updated properly?


#17

Ah, yes, it looks like a problem when the slug has changed. That topic changes title quite a bit. More than others, but that’s not an uncommon thing for us in general. I guess the redirect is dropping the query string. When I use the current slug, it works.
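A minimal Python sketch of the behavior one would want from the slug redirect: swap the stale slug for the current one while carrying the query string across. It assumes topic URLs of the form `/t/<slug>/<topic_id>`; the function name `redirect_target` is hypothetical and this is not Discourse's actual redirect code.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(requested_url: str, current_slug: str) -> str:
    """Replace the slug in a /t/<slug>/<topic_id> URL, keeping the query string."""
    parts = urlsplit(requested_url)
    segments = parts.path.split("/")
    # Expected path shape: ["", "t", slug, topic_id, ...]
    if len(segments) < 4 or segments[1] != "t":
        raise ValueError("not a topic URL")
    segments[2] = current_slug
    # Reassemble with the original query and fragment untouched.
    return urlunsplit(
        (parts.scheme, parts.netloc, "/".join(segments), parts.query, parts.fragment)
    )
```

For example, a request for `/t/old-name/1000?page=3` would be redirected to `/t/new-name/1000?page=3` instead of to the top of the topic.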


(Jeff Atwood) #18

@eviltrout we should make sure this is handled for renamed topics as well, at least for the “current” name.


(Robin Ward) #19

It does work for the “current” name. The issue is trying the previous name, which redirects and loses the query string.


(Jeff Atwood) #20

Since we’ve established this only happens for

  • renamed topics

  • when the old title is used (not the current title of the topic)

I don’t see this as a relevant problem, unless Google is unable to resolve the correct title over time.

I guess there is some pathological case you guys are running into when you guys keep changing the title of the topic every few minutes. Maybe… stop doing that?