Struggling with pagination within search/query.json

I’m trying to grab the posts within a date interval from our Discourse server using the API, and I’m struggling quite a lot with pagination.

Firstly I tried using search.json but that limits us to 50 results:
https://meta.discourse.org/search.json?q=after:2016-01-01%20before:2017-01-01

I tried to use the paginated query instead (but page 2 and page 3 end up the same for some queries):
https://meta.discourse.org/search/query.json?term=after:2016-01-01%20before:2017-01-01
https://meta.discourse.org/search/query.json?term=after:2016-01-01%20before:2017-01-01?page=2
https://meta.discourse.org/search/query.json?term=after:2016-01-01%20before:2017-01-01?page=3

That’s odd because simpler queries appear to paginate correctly:
https://meta.discourse.org/search/query.json?term=API
https://meta.discourse.org/search/query.json?term=API?page=2
https://meta.discourse.org/search/query.json?term=API?page=3
https://meta.discourse.org/search/query.json?term=API?page=51 (more_pages becomes null here, although we can fetch higher pages)

Is it possible to combine advanced search and pagination?

The second (3rd, 4th, etc.) parameter in the query string should be joined with the & symbol instead of ?, like this:

https://meta.discourse.org/search/query.json?term=after:2016-01-01%20before:2017-01-01&page=2
https://meta.discourse.org/search/query.json?term=API&page=2

Not like this:

https://meta.discourse.org/search/query.json?term=after:2016-01-01%20before:2017-01-01?page=2
https://meta.discourse.org/search/query.json?term=API?page=2
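
To make the difference concrete, here is a minimal Python sketch using only the standard library. The helper name `search_url` is hypothetical, and `page` is simply the parameter used in the URLs above. It builds the query string with `urlencode` so the separators are always right, then parses both forms to show that in the malformed URL the `?page=2` part is swallowed into the search term rather than treated as a separate parameter.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE = "https://meta.discourse.org/search/query.json"


def search_url(term, page=1):
    """Build a search URL with term and page as separate query parameters.

    urlencode joins parameters with '&' and escapes the term for us.
    """
    return BASE + "?" + urlencode({"term": term, "page": page})


good = search_url("after:2016-01-01 before:2017-01-01", page=2)
bad = BASE + "?term=API?page=2"  # second '?' instead of '&'

# Parse both query strings the way a server would split them.
good_params = parse_qs(urlsplit(good).query)
bad_params = parse_qs(urlsplit(bad).query)

print(good)
print(good_params)  # term and page are two separate keys
print(bad_params)   # {'term': ['API?page=2']}: no page key at all
```

Since the malformed URL never sends a `page` parameter, every request asks for the same first page with a slightly different search term.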

I don’t know if it’ll help, but I wrote a little Ruby program that will download all of a topic or category. (It doesn’t pay attention to dates, though.)

Oops. Getting the links wrong is what made me believe Discourse supports pagination using page=X in the first place (my own bad conversion from curl to a web browser).

Perhaps I should have asked a more general question: if search/query.json has a grouped_search_result with more_posts set to true, how do I get to see the additional posts?

I know this is quite late, but yes, we certainly support pagination now. Search for something with lots of results, then use How to reverse engineer the Discourse API to find all the endpoints.

I’m trying to understand how to use the Discourse API.
In the URLs below, I’m guessing that term refers to a period or duration. This is based on term=after:2016-01-01%20before:2017-01-01 etc. But I’m unclear what term=API refers to. Can anyone clarify?

There are actually two search endpoints you can hit: /search.json?q= and /search/query?term=

term just means the search term; it does not refer to a period or duration. In term=API the search term is the word "API", and in the date example the search term is the advanced-search filter after:2016-01-01 before:2017-01-01.

To understand the search API, it would be best to follow How to reverse engineer the Discourse API: perform the searches you intend to run in the UI and see which API requests the UI makes for them.

Hi @blake,
Thank you for the reply.

Actually, I’d like to grab the entire contents of a page (in my case, a list of topics in a category, see Obtaining a list of topics from a category). I don’t want to go page by page if it’s not necessary. And I don’t think reverse engineering this would work, because Discourse just expands the page as you scroll down. There is no option that I am aware of to show the whole page. Is there a way to do this?

There is no single API call that will fetch ALL of the topics in a category. You will have to make multiple API calls. This is because a category could have 1 million+ topics, and fetching them all at once could be a huge query that affects the performance of your site. When you reverse engineer the scrolling, you can see how the Discourse UI makes the API calls for more topics, and you can replicate that behavior in your own API calls.

Depending on what you are trying to achieve, you can also use a Data Explorer query to get the number of topics in a category and make an API call to that saved query.
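
The "multiple API calls" approach described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not a confirmed description of the API: the URL shape (`<category path>.json?page=N`), the zero-based page numbering, and the `topic_list` / `topics` keys are all assumptions that should be checked against what your browser's network tab shows while scrolling. The function names `fetch_json` and `iter_category_topics` are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Download a URL and decode the JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)


def iter_category_topics(base_url, category_path, fetch=fetch_json, max_pages=1000):
    """Yield topics from a category one page at a time.

    Assumptions (verify against your own site): pages are requested as
    <category_path>.json?page=N starting at 0, and each response holds
    its topics under data["topic_list"]["topics"].
    """
    for page in range(max_pages):
        data = fetch(f"{base_url}{category_path}.json?page={page}")
        topics = data.get("topic_list", {}).get("topics", [])
        if not topics:
            break  # an empty page is treated as the end of the list
        yield from topics


# Example usage (performs real HTTP requests; the category path is a placeholder):
# for topic in iter_category_topics("https://meta.discourse.org", "/c/some-category/1"):
#     print(topic["id"], topic.get("title"))
```

The `fetch` argument is injectable so the loop can be exercised without a network connection, and `max_pages` is a safety cap so a misbehaving endpoint cannot make the loop run forever. On a real site you would probably also want authentication headers and a short delay between requests.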
