How to export more than the 10,000 limit?

Dumb question - that ceiling is looming closer for one of our forums. Can we exceed it without forking the plugin?

It looks like there is a hidden setting for the default 1000 display limit, allowing it to go up to 10,000. But I don’t need that, and 1000 makes a lot of sense for the UI within Data Explorer.

However, a spreadsheet can handle a lot more. What I want to do is be able to Export to a higher limit (say 20,000). Is that doable?

I am coincidentally tonight also up against this 10,000 record limit. Is there a way to breach it? I need another order of magnitude.

A way to test would be to enter the container, edit /var/www/discourse/plugins/discourse-data-explorer/plugin.rb, change QUERY_RESULT_MAX_LIMIT = 10000 to a higher value, run sv restart unicorn, and see what happens. Of course there are reasons not to allow such huge queries; if you’re confident your infrastructure can handle it, or don’t mind a bit of instability, this could work.
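As a rough sketch of that edit, the following Python helper rewrites the constant in the text of plugin.rb. It assumes the constant appears as `QUERY_RESULT_MAX_LIMIT = <digits>`, as quoted above; the function name `raise_limit` is made up for illustration, and the real file may format the line differently.

```python
import re

# Assumption: the constant is written as `QUERY_RESULT_MAX_LIMIT = <digits>`
# (underscores such as 10_000 are tolerated). Check the real file first.
LIMIT_PATTERN = re.compile(
    r"^(\s*QUERY_RESULT_MAX_LIMIT\s*=\s*)\d[\d_]*", re.MULTILINE
)

def raise_limit(source: str, new_limit: int) -> str:
    """Return `source` with the limit constant set to `new_limit`.

    Raises ValueError unless exactly one assignment is found, so an
    unexpected file layout is never patched silently.
    """
    # A function replacement avoids ambiguity between the group
    # backreference and the digits that follow it.
    patched, count = LIMIT_PATTERN.subn(
        lambda m: m.group(1) + str(new_limit), source
    )
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return patched
```

You would read plugin.rb into a string, pass it through `raise_limit`, write the result back, and then restart unicorn as described above.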

If the world doesn’t explode, then you could add some stuff to app.yml that would modify that file after the plugins get cloned at every rebuild. It would probably conflict with upgrading with docker_manager. (You can look at some other templates that modify files to figure out what that stuff might look like.)

@DonH if you need that query a lot and it’s just one then you might want a plugin that would deliver it somehow. It could, say, write a file in chunks that you could retrieve via some path.
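To illustrate the "write a file in chunks" idea, here is a minimal Python sketch. It is not a Discourse plugin (that would be Ruby); `fetch_chunk` is a hypothetical callable standing in for whatever runs the query with an offset and a row count, and `export_in_chunks` is an illustrative name.

```python
import csv
from typing import Callable, Sequence

Row = Sequence[object]

def export_in_chunks(
    fetch_chunk: Callable[[int, int], list[Row]],  # hypothetical: (offset, size) -> rows
    header: Row,
    out,                                           # any writable text file object
    chunk_size: int = 10_000,
) -> int:
    """Write all rows to `out` as CSV, pulling `chunk_size` rows at a time.

    Returns the number of data rows written.
    """
    writer = csv.writer(out)
    writer.writerow(header)
    offset = 0
    while True:
        rows = fetch_chunk(offset, chunk_size)
        if not rows:
            break
        writer.writerows(rows)
        offset += len(rows)
        if len(rows) < chunk_size:
            break  # a short chunk means the data is exhausted
    return offset
```

Only one chunk is held in memory at a time, which is the point of doing it this way rather than raising the limit and fetching everything in one query.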


Thanks for this advice, Jay. I have indeed thought about writing a plugin, but also that it should be doable from the rails console, as for the various bulk operations. But I don’t know Rails, and that appears to require triggering embedded functions anyway. That would create the same overwrite problem unless the jobs were committed to core.

I’ve been managing so far with the Data Explorer on the download side and (very careful) psql on the upload side, but I would much rather do things by the book, fate being what it is.

I should probably be explicit about what I’m up to.

I run a forum that is in its third or fourth manifestation, having been through Phorum and phpBB software in my hands and something else before I got to it. The subject matter is narrow and the user base is tiny, but the content has carried forward at each move and represents a lot of institutional knowledge. Discourse, with its categorization, tagging and interface features, seemed like a great solution to knowledge accessibility.

So I’ve ported the forum, which was never categorized or tagged in its previous incarnations. Rather than wade through 100K posts/8K threads, I have been using some natural language processing software to help with the categorization and tagging. I then update the topics, categories and topic_tag tables directly, being, as I said, quite careful.

The process is still ongoing, but I have a stable workflow and can easily finish up with the tools at hand. Moving forward, though, there will be regular periodic updates to wrap in new categorizations and tags, which may or may not differ from the prior data. So you can see the need.

Obviously the Data Explorer is a one-way flow, but it’s been very convenient. I can overcome the size limit by doing batches and, now, by raising the limit setting, so thanks for that.
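For the batch route, stitching the exported files back together can be sketched roughly like this in Python. It assumes each batch is a CSV export with the same header row; `merge_csv_batches` is just an illustrative name.

```python
import csv
import io
from typing import Iterable

def merge_csv_batches(batches: Iterable[str]) -> str:
    """Concatenate CSV exports that share a header, keeping the header once."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    expected_header = None
    for text in batches:
        reader = csv.reader(io.StringIO(text, newline=""))
        header = next(reader, None)
        if header is None:
            continue  # completely empty batch, nothing to merge
        if expected_header is None:
            expected_header = header
            writer.writerow(header)
        elif header != expected_header:
            # Refuse to mix batches that came from different queries.
            raise ValueError(f"header mismatch: {header} != {expected_header}")
        writer.writerows(reader)
    return out.getvalue()
```

Each batch would come from running the same query over a different slice (for example, a different id range), so the rows do not overlap.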
