Have Data Explorer Show number of results?


(Jay Pfaffman) #1

Data explorer returns only a certain number (500?) of items in the web interface. Too see them all you need to download them and look at them some other way (i.e., download a csv and open it in a spreadsheet).

It would be nice if it were easy to know how many items had been returned. What I’ve done in the past is to generate a second query that just did the count, which I find rather painful.

Feature Request

When the query is completed, would it be easy to pull out the number of results and print it next to (or below) the query completion time?

image


(Sam Saffron) #2

Sure, we should do something here @rishabhn

When you run a query it should show Query completed in 12.7 ms, 102 results

IF we are truncating due to our enforced limit eg try SELECT id, raw from posts on meta. we should say Query completed in 12.7 ms, partial results, top 500 returned … something like that.


(Jay Pfaffman) #4

That’s an improvement, but the specific problem I’m trying to solve is that if there are more than 500,the only way to find out how many there are is to write another query or download the data set to a CSV and count the rows (and subtract one).

I’m not sure what the solution there would be.


(Mittineague) #5

I think what you are wanting is a SELECT COUNT() query.

I think for many queries it could be a matter of simply replacing the main SELECT fields with COUNT. But I don’t know if it would get messy with queries that had aggregates and GROUP BYs.

IMHO, a type of “results were limited to 500” would be good enough and would indicate that a separate COUNT query might be needed.

I quite often put together COUNT queries when I work up a query so I won’t be taken by surprise by a long running query. eg. from this table with mega rows join another table with mega rows join yet another table with mega rows …

In fact, I would recommend running heavy queries on a copy of the live database to avoid having the forum hang while the query runs.


(Jay Pfaffman) #6

That’s what I want, but the only way that I know to do that is to create a whole separate query that does the SELECT COUNT. So I’d need to create a query for (e.g., “closed topics” and “closed topics count”).


(Jeff Atwood) #7

Just so you know, this is quite standard in SQL databases. You generally don’t get a count unless you explicitly ask for it, because counting “all the results” can be more expensive than getting the first {x} records.

We’re not secretly withholding information that we regularly get back from the database, because we’re jerks :wink:


(Kane York) #8

I think a button for “Wrap the query with count() and get the result” might be viable.

WITH query AS (
#{sql}
) SELECT COUNT(*) FROM query
-- do not include LIMIT clause inside `sql`

(Jay Pfaffman) #9

That’s brilliant! That might be a PR that I could pull off (though there is a fairly long list of things that I need to submit).


(Mittineague) #10

Would this in effect be running the query twice? Once to get the row count without the built-in resource use safety measures, and then again with the limit.

EDIT
AFAIK, though the Data Explorer plugin has a built in LIMIT, it does not have an OFFSET.

I think if there was some way to return the 500 result rows in “batches” - maybe with a OFFSET or a “not included in prior results” condition? - it would be safer. TBH, I’m thinking the better way would be to simply show a type of “LIMIT met, if you want more do it from the CLI” message where the admin could do a SELECT into an outfile query.


(Mittineague) #11

This looks promising

PostgreSQL: Documentation: 9.1: Value Expressions

the window function is able to scan all the rows that would be part of the current row’s group according to the grouping specification

EDIT
I just had a brief play on my localhost as a POC

SELECT users.username 
 , COUNT(users.username) OVER ()  
FROM users 
WHERE users.id > 4 
LIMIT 5

over-count


(Rishabh Nambiar) #12

For a first phase, the result limit has been updated from 250 to 1000 and number of results is displayed via:

When the limit is reached, the UI indicates that only partial results are shown:


When the limit is not reached, the exact number of results is shown:


(Rishabh Nambiar) #13

Thanks for that tip, OVER() is a pretty cool function!
It works awesome but hiding the count column with the same identical values for every row was somewhat complicated.