Paginating API search results


(mbajur) #1

Hello there!

I’m running a forum where, in one of the categories, users post a lot of YouTube links. I would love to build a third-party YouTube player on top of that, but I can’t quite figure out how to fetch that data from a third-party app.

The most obvious way of doing that would be to use the search API and search for the “youtube.com” string in the desired category but, if I understand the Discourse code correctly, there is a limit on the number of search results that can be fetched. It’s hardcoded to 50 results, I think.

Given that, the only way that comes to mind is to fetch search results periodically with a cron job and store them on the third-party app’s side. But any other ideas are welcome :slight_smile:
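To make the storage half of that idea concrete, here is a minimal sketch. It assumes each search result is a dict with an `id` field; that field name and the `merge_results` helper are my own illustration, not anything taken from the Discourse code. Since every run can only see the newest 50 matches, consecutive runs will overlap, so the store is keyed by post id:

```python
def merge_results(store, posts):
    """Add posts not seen before to `store` (a dict keyed by post id).

    Returns the ids that were new in this run, in the order received.
    The "id" key is an assumed field name for a search result.
    """
    new_ids = []
    for post in posts:
        post_id = post["id"]
        if post_id not in store:
            store[post_id] = post
            new_ids.append(post_id)
    return new_ids


# Two overlapping runs: the second one only contributes the unseen post.
store = {}
first_run = merge_results(store, [{"id": 1}, {"id": 2}])
second_run = merge_results(store, [{"id": 2}, {"id": 3}])
```

If more than 50 matching posts appear between two runs, the ones in the gap are still lost, so the cron interval would have to be short enough for the category's traffic.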

Anyway - are you guys planning to support search API pagination?


(Eli the Bearded) #2

I can’t help you with the problem, but let me see if I even understand the question:

“I want to find a lot of posts with youtube URLs (for complicated digression), can the API search response be paginated to get more than 50 results?”


(mbajur) #3

Yeah, that’s exactly what I am asking (I asked it in the last line of my original post).

I included the explanation in the hope that someone might suggest a solution other than the one I suggested.


(Kevin Wildenradt) #4

@mbajur I assume it is hard-limited for performance reasons. Because search allows you to get multiple types of data (posts, topics) matching numerous arbitrary constraints (has tag, is in category, contains string), search results can be quite expensive for the back end to generate. Paginated results usually mean the data is lying around in memory. Paginating search results would involve either storing previous search results or regenerating the preceding pages (getting page 50 would mean either having pages 1-49 lying around or regenerating them). Stopping after 50 matches have been found keeps fruitful queries from being horrendously expensive. It would be convenient if the API provided a way for, say, a client with a master key to get more than 50 results for a search and let the developer deal with whatever performance costs are incurred, but I think the idea is that you should be able to find some more efficient way to get your data.

The obvious way to get more results is to modify the source code and raise the limit, perhaps conditioned on which API key is being used so your users don’t inadvertently bring down your server.

A better solution to your root goal, though, is to use Post Event Webhooks, located in Admin -> API -> Webhooks. Once configured, Discourse will send a POST request to whatever URL you specify, with a body containing the post id and the cooked post text. You can search this text for youtube.com, then do whatever you like, including editing the post in Discourse using the provided post id. Instructions for setting up webhooks are here. This will not work for links that have already been posted, but at least you would be notified of all incoming links going forward.
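As a rough sketch of the receiving side, the function below takes the raw request body and pulls out any YouTube links. The payload shape (a top-level `post` object with `id` and `cooked` fields), the function name, and the regular expression are all assumptions for illustration; check a real webhook delivery from your own instance before relying on them.

```python
import json
import re

# Assumed pattern: any URL on youtube.com or youtu.be, ending at whitespace,
# a quote, or an angle bracket (cooked text is HTML, so links sit in attributes).
YOUTUBE_LINK = re.compile(
    r"https?://(?:www\.)?(?:youtube\.com|youtu\.be)/[^\s\"'<>]+"
)


def youtube_links_from_webhook(body):
    """Return (post_id, links) for a webhook request body given as a JSON string.

    Assumes the payload looks like {"post": {"id": ..., "cooked": "..."}}.
    Links are de-duplicated while keeping their first-seen order.
    """
    payload = json.loads(body)
    post = payload.get("post", {})
    links = YOUTUBE_LINK.findall(post.get("cooked", ""))
    return post.get("id"), list(dict.fromkeys(links))
```

Whatever web framework receives the POST would pass the request body to this function and hand the resulting links to the player app. A post with no YouTube links simply yields an empty list.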


(mbajur) #5

Excellent answer. Thanks for this post! :slight_smile:

That’s exactly the same conclusion I reached some time ago: search is probably expensive, so it would be a better idea to use webhooks and a third-party system.


(Jeff Atwood) #6

Search now loads more than 50 results if you scroll down, so this should be possible with the API.
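For anyone wanting to try this, a client-side loop might look like the sketch below. It assumes the search endpoint is `/search.json`, that it accepts `q` and `page` query parameters, and that matching posts come back under a `posts` key; none of that is confirmed in this topic, so verify it against your own instance. Authentication headers are left out, and `make_fetcher` and `collect_all_posts` are hypothetical names.

```python
import json
import urllib.parse
import urllib.request


def make_fetcher(base_url, query):
    """Build a fetch_page(page) function for one search query.

    The URL layout and the "posts" key are assumptions, not confirmed API.
    """
    def fetch_page(page):
        params = urllib.parse.urlencode({"q": query, "page": page})
        with urllib.request.urlopen(f"{base_url}/search.json?{params}") as resp:
            return json.load(resp).get("posts", [])
    return fetch_page


def collect_all_posts(fetch_page, max_pages=10):
    """Request pages 1, 2, ... until a page adds nothing new or max_pages is hit."""
    posts = []
    seen = set()
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        new = [p for p in batch if p["id"] not in seen]
        if not new:
            break  # empty or fully repeated page: assume we reached the end
        seen.update(p["id"] for p in new)
        posts.extend(new)
    return posts
```

Keeping the HTTP call behind `fetch_page` means the loop can be exercised with canned data first. The `max_pages` cap is there so a broad query cannot hammer the server indefinitely.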