What cool data explorer queries have you come up with?

I answered myself just used this and it’s amazing!
Extracted group_id from group.json

Just a question. How can I modify the script to select a specified range of dates?

3 posts were split to a new topic: Updating likes in user_stats via SQL

Is there a way or query where we can see the most valuable topics and most valued users for our forum?

I couldn’t find ‘Score’ and ‘Percent Ranks’ in any default reports, though I’m sure they are somewhere (my searching ‘Meta’ also didn’t work).

Check this: discourse-data-explorer/score-threads.sql at queries · SidVal/discourse-data-explorer · GitHub

3 Likes

Thanks.
It told about most valuable posts (after a minor edit).
Now I’m looking for most valuable users in the given/chosen period of time.

1 Like
-- coverage: 'week', 'all', or 'date'
-- [params]
-- date :start_date = 2019-08-27
-- date :end_date = 2019-09-30
-- text :coverage = week

WITH date_range AS (
SELECT date_trunc('day', dd):: date AS "date", EXTRACT(week from date_trunc('day', dd):: date) AS "week"
FROM generate_series
        ( :start_date::timestamp 
        , :end_date::timestamp
        , '1 day'::interval) dd
), likes_given AS (
SELECT u.id, dr.date, dr.week, count(pa.*) AS "likes"
FROM date_range dr
FULL JOIN users u ON (1=1)
LEFT JOIN post_actions pa ON (pa.created_at:: date = dr.date and post_action_type_id=2 AND user_id = u.id)
GROUP BY dr.date, dr.week, u.id
ORDER BY u.id, dr.date
), posts_summary AS (SELECT u.id, u.username, u.created_at, dr.*, count(p.id) - count(t.id) AS replies, count(t.id) AS topics, COALESCE(sum(p.like_count),0) AS likes_received
from date_range dr
FULL OUTER JOIN users u ON (1=1)
LEFT JOIN posts p ON (p.user_id = u.id AND p.created_at::date=dr.date AND p.deleted_at IS NULL)
LEFT JOIN topics t ON (t.user_id = u.id AND t.created_at::date = dr.date AND p.topic_id = t.id AND t.deleted_at IS NULL)
GROUP BY u.id, dr.date, dr.week
ORDER BY u.id, dr.date), 
visits AS (SELECT u.id, dr.*, COALESCE(sum(posts_read),0) AS posts_read, COALESCE(sum(time_read),0) AS time_read, COUNT(uv.*) AS visits
FROM date_range dr
FULL OUTER JOIN users u ON (1=1)
LEFT JOIN user_visits uv ON (uv.user_id = u.id AND visited_at = dr.date)
GROUP BY u.id, dr.date, dr.week
ORDER BY u.id, dr.date
)

SELECT ps.id, ps.username, ps.created_at, CASE
WHEN :coverage::text = 'week' THEN ps.week::text
WHEN :coverage::text = 'all' THEN '-1'
ELSE ps.date::text
END  AS period, sum(ps.replies) AS replies, sum(ps.topics) AS topics, sum(ps.likes_received) AS likes_received, sum(lg.likes) AS likes_given, COALESCE(sum(posts_read),0) AS posts_read, COALESCE(sum(time_read),0) AS time_read, SUM(visits) AS visits
FROM posts_summary ps
LEFT JOIN likes_given lg ON (ps.date = lg.date AND ps.id = lg.id)
LEFT JOIN visits v ON (v.id = ps.id AND v.date = ps.date)
GROUP BY ps.id, ps.username, ps.created_at, CASE 
WHEN :coverage::text = 'week' THEN ps.week::text
WHEN :coverage::text = 'all' THEN '-1'
ELSE ps.date::text
END  
ORDER BY ps.id, CASE
WHEN :coverage::text = 'week' THEN ps.week::text
WHEN :coverage::text = 'all' THEN '-1'
ELSE ps.date::text
END

You can use this one to pull daily, weekly, or total stats by user over a specified time range.
Input params:

  • start_date (string) e.g. ‘2019-12-31’
  • end_date (string) e.g. ‘2019-12-31’
  • coverage (enum) one of 1) ‘date’ for daily 2) ‘week’ for weekly or 3) ‘all’ for total

Output Fields:

  • id (user_id)
  • username
  • created_at (user created at)
  • period (date string for ‘date’ coverage, week number (e.g. dec 26-31 = 52 for ‘week’ coverage, or -1 for ‘all’ coverage)
  • replies (number of posts that are not the first post in a topic)
  • topics
  • likes_received
  • likes_given
  • posts_read
  • time_read
  • visits

You’ll likely need to use the API to access the full results of this query (especially for daily data over a long period of time). See this post for more details on how to trigger queries via the API.

You can use this query to generate a data set of daily user engagement.

5 Likes

Thank you very much.
I hope that topics means topics created (it couldn’t mean ‘topics participated in’).
And was also in doubt whether replies mean the replies given my be? Whether in my own topics/to my own posts or someone else’s?
Or does replies mean the replies received by the user on the topic created by him?

Who knows answers to these, could think these questions to be very naive.

1 Like

@Bathinda good questions, let me clarify:

  • topics means topics created by the user, not topics participated in
  • replies refers to all posts written by the user in that time period, but it filters out posts that correspond to the creation of a topic (e.g. the first post in a topic). Importantly, I believe this definition includes all private messages as well. If desired, one could likely filter out private messages when joining the posts table in the creation of the posts_summary temporary view, though I did not need this.

So for example, I open a new topic. This action adds one to my topics count, but it does not increment my replies count. However, someone posts a reply to my topic, and I reply back to them. This second action will increment my replies count, but not my topics count.

I hope this helps!

1 Like

A post was merged into an existing topic: Data explorer plugin - filter users by sign-up ‘custom fields’

Thank you very much.
The query sent by you helped me very much.

Thanks once again for clarifications.

1 Like

Would it be possible to save the queries under given names?

Category lurkers:

See for a category (or all categories by default) the users’ notification level.
rationale: we use categories for “working groups” that everyone can “join” (track or watch). Therefore, a working group member is someone that has a notification level set for this category, and this query lists the members

-- [params]
-- null int :category 

select category_id, cu.user_id,
CASE notification_level 
  WHEN 1 THEN 'regular'
  WHEN 2 THEN 'tracking'
  WHEN 3 THEN 'watching'
  WHEN 4 THEN 'watching_first_post'
END as level

from category_users cu
join categories on category_id=categories.id
where (:category is null OR category_id= :category)
order by category_id, notification_level desc
3 Likes

I agree! This data explorer is blowing my mind. So impressive!

I wanted to import @jomaxro’s User’s Directory but can’t figure out how to add this SQL query.

In the OP, of the Data Explorer Plugin thread, @riking shows how to add his .json file to import a plugin, but how do we grab one of the SQL queries from your GIT repository, and add it to the Data Explorer?

Any chance you could tell me step-by-step what I need to do, here?

3 Likes

For the bare .sql files, simply copy & paste the entire content into the editor.

5 Likes

@riking or @jomaxro - is there a way to add the “name” to this query instead of just the “username”?

This is for the User Directory Query:

-- https://meta.discourse.org/t/43516/48?u=sidv

-- [params]
-- null int :period

SELECT users.username AS "Username",
directory_items.likes_received AS "Likes Received",
directory_items.likes_given AS "Likes Given",
directory_items.topic_count AS "Topics Created",
directory_items.post_count AS "Replied",
directory_items.days_visited AS "Vists",
directory_items.topics_entered AS "Viewed",
directory_items.posts_read AS "Read"
FROM users
JOIN  directory_items ON users.id =  directory_items.user_id
WHERE directory_items.period_type = :period
ORDER BY directory_items.likes_received DESC
2 Likes

Thank you @riking!

For any other newbs out there:

  1. Visit the Data Explorer link, within the Plugins menu and then click the " + " button
  2. Add a title for your data explorer query
  3. Add a description for your data explorer querty
  4. After finding and copying a data explorer query to your clipboard, paste it here
  5. Save it!

5 Likes

Add:
users.name AS "Name", after SELECT users.username AS "Username",

3 Likes

Many thanks @SidV!

I also wanted to add the users id (to be able to sort by when the user joined the community ツ

Here’s what my query looks like now, and it’s EXACLTY what I wanted! (note, I added the different “period” params to remind me what they are without having to come back to @jomaxro’s post.

-- https://meta.discourse.org/t/43516/48?u=sidv

-- [params]
-- null int :period

-- 1: all
-- 2: yearly
-- 3: monthly
-- 4: weekly
-- 5: daily
-- 6: quarterly

SELECT users.id AS "User ID",
users.username AS "Username",
users.name AS "Name",
directory_items.likes_received AS "Likes Received",
directory_items.likes_given AS "Likes Given",
directory_items.topic_count AS "Topics Created",
directory_items.post_count AS "Replied",
directory_items.days_visited AS "Vists",
directory_items.topics_entered AS "Viewed",
directory_items.posts_read AS "Read"
FROM users
JOIN  directory_items ON users.id =  directory_items.user_id
WHERE directory_items.period_type = :period
ORDER BY directory_items.likes_received DESC

Thanks again, y’all!!!

2 Likes

Hot trick: See those “posts”, “users”, “topics”, dropdowns on the right side? If you expand the “users” segment, you can see what’s legal to put in the users.____ slot. It also includes the numeric explanation you’ve put as a comment in your query.

image image

6 Likes

Ahhhhh! I love it! Thanks for the extra instruction. I added “I’d” and hoped it would work and it did… but I’m glad to know the “source” of what will work or not!

The syntax of this query wasn’t too hard, but i don’t understand most of the other ones. Do you know of some SQL basics tutorial or something that would help me understand and know how to write these?

1 Like