What cool data explorer queries have you come up with?

data-explorer

(Joshua Rosenfeld) #46

I can help with that query, but I’m away from a computer for most of the day. If someone else doesn’t beat me to it, I can look at this tonight.


#47

A post was split to a new topic: Collecting data on registration for display later


(Joshua Rosenfeld) #48

OK, here is the query to recreate the user’s directory.

-- [params]
-- null int :period

SELECT users.username AS "Username",
directory_items.likes_received AS "Likes Received",
directory_items.likes_given AS "Likes Given",
directory_items.topic_count AS "Topics Created",
directory_items.post_count AS "Replied",
directory_items.days_visited AS "Vists",
directory_items.topics_entered AS "Viewed",
directory_items.posts_read AS "Read"
FROM users
JOIN  directory_items ON users.id =  directory_items.user_id
WHERE directory_items.period_type = :period
ORDER BY directory_items.likes_received DESC

Once the query is saved you can enter the period into the field below the query to determine what data you get. The periods are as follows:

1: all
2: yearly
3: monthly
4: weekly
5: daily
6: quarterly

You can also change the sort by adjusting directory_items.likes_received on the last line.


Find the users which are more likely to become TL3
#49

Wow, it works like a charm. Thanks a lot!


#50

Hello Super Discourse Users

I wanted to tell you that I started to “play” with the Data-Explorer plugin and I was impressed by the possibilities it offers, I’m super happy with the tool.

I am grateful to the genius who has come up with the creation of the plugin.

As I’m debugging almost every query I’ve been reading, I’ve created a repository and a list of queries.

When I can, I add improvements to the query as parameters, to make it more dynamic.

You can check the list here:

I have maintained the sources of each query in each file, so that when in doubt about the queries, they can send their doubts to the user who created it.

Thank you for any contributions you want to make, we can also use the repository as a backup of the queries you have.

Thank you in advance for making this possible.

A hug from Argentina,
SidV


(David Taylor) #51

It’s great to have these all in one place, rather than spread across meta!

I see you’ve organised them into your fork of the discourse-data-explorer repository. Did you have any plans to improve the plugin to access the pre-made queries directly?

I think it would be really cool to have a “load from library” button in the plugin, which allows browsing these queries. Adding new queries to the library would then just need a PR. Would that be #pr-welcome @team?


(Sam Saffron) #52

Yeah a nice way to share queries would be nice, I worry about vetting.

I think the best way is to ship data explorer with a large amount of built in queries, then we do PRs whenever we want to add queries.

Can you do mockups of the UI for that? I totally support adding something here.


(David Taylor) #53

I’m imagining a new button on the main plugin interface “Load from Library”

Which then opens a popup with a scrollable and filterable table of available queries

In terms of the data itself, I quite like how @SidV has stored them in their own folder, each in their own .sql file here

We would need to add some metadata to the top of the sql files. We already have [params], so maybe have something like

-- [title] Solved Answers
-- [description] Lists users with most solved topics
-- [source] https://meta.discourse.org/t/63981/4?u=sidv

-- [params]
-- string :interval = 1 year
SELECT  ua.acting_user_id, 
        count(case t.user_id when ua.acting_user_id then 1 else null end) as is_op, 
        count(case t.user_id when ua.acting_user_id then null else 1 end) as not_op,
        count(*) as total
FROM user_actions ua
LEFT JOIN topics t ON target_topic_id = t.id
WHERE action_type=15
AND ua.created_at >= CURRENT_DATE - INTERVAL :interval
GROUP BY ua.acting_user_id

(Sam Saffron) #54

I find the whole selection of queries from a combobox deeply flawed, and it breaks down big time when you have a lot of queries.

What we should simply do it display a full list of linkable queries on the front page, we can add an icon next to builtin to show that they are built in and not editable.

The whole “clone from library” workflow feels like busywork to me.


(David Taylor) #55

I agree about the combobox.

I was trying to keep within the existing structure of the UI, but rethinking it from scratch will probably give a better result. It does significantly increase the amount of work that this is though :wink:


(Tobias Eigen) #57

I’m loving this initiative to make the best queries more readily available. But I wonder if it would not be safer and easier to just create a #queries category here on meta akin to the #plugin with a topic per query. Safe, approved queries could be provided there and broken queries could be moved to a #broken-queries category.

The advantage is that we can discuss, ask questions about and refine the queries better right here on meta. It’s also much simpler to implement though of course takes some gardening. :seedling:


(Sam Saffron) #58

Possibly, trouble is we would not have safety. How would we vet that a query or edit to a query is “safe”.


(David Taylor) #59

When you say “safe”, what do you mean?

I thought that the data-explorer plugin is inherently safe - it only has read-only access to the database, and there are timeouts in place to prevent performance issues.

Or are you simply talking about a query being working/not working?


(Sam Saffron) #60

Working/not working I guess. There is is also the question of if it works as described.

But yeah, I guess a category on meta can work as the source of truth, would make it very easy to add queries and discuss them.

We would have to be super careful with naming conventions and kicking out dupes.


(Christoph) #61

Is there an easy way of modifying these queries so as to exclude

  • myself
  • certain staff
  • all staff?

Thinking a bit more along these lines, wouldn’t a comparison between the metrics for all_users with all_users_minus_staff be an interesting measurement in itself? Call it “community independence” or so. If your community develops well, you’d want to see the the role of staff gradually disappear. This will be most relevant in small communities but it should be possible to normalize the measure by only looking at the 100 or 1000 most active users…


#62

Can anyone tell me the query to find:

  1. Total posts of a user with username “xyz” ?
  2. Total topics created by a user with username “xyz” ?
  3. Total solutions created by a user with username “xyz” ?

(David Taylor) #63

Have a look at the queries I wrote in the “Data explorer queries” section of this topic:


#64

@david

Can you tell me the query for this (this one is more important for me)
Total posts of a user with username “xyz” ?


(David Taylor) #65

If you open up data explorer, there’s a section on the right which lets you explore where data is stored in the database. For post count you can do

SELECT user_id, post_count from user_stats

If you just want it for a specific user, then you’ll need to add something like WHERE user_id = 2


(Alex Armstrong) #66

I’m looking for a data explorer query that counts “things posted” (by which I mean topics, posts & messages) and displays them by month.

I think it would be close to this one:

But would output posts instead of users. My SQL-fu isn’t great and I haven’t managed to revise it accordingly (or perhaps I’m on the wrong path entirely). Any help would be much appreciated!