Have Data Explorer Show number of results?

pfaffman · September 7, 2018, 10:34pm

Data explorer returns only a certain number (500?) of items in the web interface. Too see them all you need to download them and look at them some other way (i.e., download a csv and open it in a spreadsheet).

It would be nice if it were easy to know how many items had been returned. What I’ve done in the past is to generate a second query that just did the count, which I find rather painful.

Feature Request

When the query is completed, would it be easy to pull out the number of results and print it next to (or below) the query completion time?

sam · September 10, 2018, 6:52am

Sure, we should do something here @rishabh

When you run a query it should show Query completed in 12.7 ms, 102 results

IF we are truncating due to our enforced limit eg try SELECT id, raw from posts on meta. we should say Query completed in 12.7 ms, partial results, top 500 returned … something like that.

pfaffman · September 10, 2018, 11:18am

That’s an improvement, but the specific problem I’m trying to solve is that if there are more than 500,the only way to find out how many there are is to write another query or download the data set to a CSV and count the rows (and subtract one).

I’m not sure what the solution there would be.

Mittineague · September 10, 2018, 6:43pm

I think what you are wanting is a SELECT COUNT() query.

I think for many queries it could be a matter of simply replacing the main SELECT fields with COUNT. But I don’t know if it would get messy with queries that had aggregates and GROUP BYs.

IMHO, a type of “results were limited to 500” would be good enough and would indicate that a separate COUNT query might be needed.

I quite often put together COUNT queries when I work up a query so I won’t be taken by surprise by a long running query. eg. from this table with mega rows join another table with mega rows join yet another table with mega rows …

In fact, I would recommend running heavy queries on a copy of the live database to avoid having the forum hang while the query runs.

pfaffman · September 10, 2018, 6:47pm

That’s what I want, but the only way that I know to do that is to create a whole separate query that does the SELECT COUNT. So I’d need to create a query for (e.g., “closed topics” and “closed topics count”).

codinghorror · September 10, 2018, 8:43pm

Just so you know, this is quite standard in SQL databases. You generally don’t get a count unless you explicitly ask for it, because counting “all the results” can be more expensive than getting the first {x} records.

We’re not secretly withholding information that we regularly get back from the database, because we’re jerks

riking · September 13, 2018, 3:37am

I think a button for “Wrap the query with count() and get the result” might be viable.

WITH query AS (
#{sql}
) SELECT COUNT(*) FROM query
-- do not include LIMIT clause inside `sql`

pfaffman · September 13, 2018, 12:00pm

That’s brilliant! That might be a PR that I could pull off (though there is a fairly long list of things that I need to submit).

Mittineague · September 13, 2018, 5:25pm

Would this in effect be running the query twice? Once to get the row count without the built-in resource use safety measures, and then again with the limit.

EDIT
AFAIK, though the Data Explorer plugin has a built in LIMIT, it does not have an OFFSET.

I think if there was some way to return the 500 result rows in “batches” - maybe with a OFFSET or a “not included in prior results” condition? - it would be safer. TBH, I’m thinking the better way would be to simply show a type of “LIMIT met, if you want more do it from the CLI” message where the admin could do a SELECT into an outfile query.

Mittineague · September 13, 2018, 6:43pm

This looks promising

PostgreSQL: Documentation: 9.1: Value Expressions

the window function is able to scan all the rows that would be part of the current row’s group according to the grouping specification

EDIT
I just had a brief play on my localhost as a POC

SELECT users.username 
 , COUNT(users.username) OVER ()  
FROM users 
WHERE users.id > 4 
LIMIT 5

over-count

rishabh · September 18, 2018, 6:20pm

For a first phase, the result limit has been updated from 250 to 1000 and number of results is displayed via:

https://github.com/discourse/discourse-data-explorer/commit/316923f1906d811d8547cea99bde8e44d58dd15e

When the limit is reached, the UI indicates that only partial results are shown:

When the limit is not reached, the exact number of results is shown:

rishabh · September 18, 2018, 6:27pm

Thanks for that tip, OVER() is a pretty cool function!
It works awesome but hiding the count column with the same identical values for every row was somewhat complicated.

sam · November 3, 2020, 7:02am

We are showing counts now, closing this

Topic		Replies	Views
Working around the 10,000 result limit of data explorer? Data & reporting data-explorer , sql-query	3	96	June 26, 2024
How to export more than the 10,000 limit? Dev data-explorer	3	939	September 14, 2022
How to bump 10k records csv export limit in Data Explorer? Dev data-explorer	8	1202	October 14, 2022
Discourse Data Explorer Plugin official , data-explorer , included-in-core	24	78872	July 16, 2025
Run Data Explorer queries with the Discourse API Integrations data-explorer , rest-api , how-to	14	12825	May 6, 2025

Have Data Explorer Show number of results?

Feature Request

Related topics