What cool data explorer queries have you come up with?


(Felix Freiberger) #26

I don’t think this is strictly possible, because a custom title could be character-by-character identical to a badge granted one (possible of a badge the user doesn’t have).

Here’s a query that might do what you want:

SELECT username FROM users

    SELECT count(*) FROM badges
    WHERE allow_title = true
    AND name = title
    ) = 0

    SELECT count(*) FROM groups
    WHERE groups.title = users.title
    ) = 0

It selects every title any user has, minus every title where a corresponding badge exists, minus every title that could come from a group membership. It does not test whether the user has the badge or the group membership.

Maybe it does help you, at least as a starting point :slight_smile:

(Joshua Rosenfeld) #27

Thanks @fefrei, that seems to work, but as you mentioned it might not cover all the edge cases. I did have to change the first line SELECT title FROM users to SELECT username FROM users as I was looking for the usernames, not the custom titles.

@Mittineague was able to create a query that seems to work as well in this topic.

, users.title 
FROM users 
JOIN user_profiles 
ON user_profiles.user_id = users.id 
WHERE users.title IS NOT NULL 
AND users.title NOT LIKE '' 
AND user_profiles.badge_granted_title IS NOT TRUE

(Felix Freiberger) #28

Oops, I forgot to copy the query again after fixing that :blush:

That query looks better – I didn’t find the badge_granted_title flag. I’m not sure how this query handles titles granted by group membership, though.

(Joshua Rosenfeld) #29

Nor am I. However, this accomplishes our goal. For some background: we currently don’t have any badge_granted_titles available to users, as we have no custom badges, the only titles are regular and leader, and we’ve “disabled” TL3. We have granted titles in the past to exceptional community members. The community started asking about titles, so the mod team started discussing options. One thing that was agreed upon is that we wanted the custom titles to stand out from the “everyday” badge_granted_titles. To apply CSS to them, we needed to add all the users to a group, hence the request for a query to select them.

(Chris Beach) #30

Top quality users in last six months

Top 20 users by average post score.

Post scores are calculated based on reply count, likes, incoming links, bookmarks, average time (reading?) and read count.

    sum(p.score) / count(p) as "average score per post", 
    count(p.id) as post_count, 
FROM posts p
JOIN users u ON u.id = p.user_id
WHERE p.created_at >= CURRENT_DATE - INTERVAL '6 month'
  AND NOT u.admin
  AND NOT u.silenced
  AND u.active
GROUP by user_id, u.views
HAVING count(p.id) > 50
ORDER BY sum(p.score) / count(p) DESC

Sorting all users by total posts?
(Felix Freiberger) #31

Who has been sending the most messages in the last week?

SELECT user_id, count(*) AS message_count
FROM topics
WHERE archetype = 'private_message' AND subtype = 'user_to_user'
AND age(created_at) < interval '7 days'
GROUP BY user_id
ORDER BY message_count DESC

Useful for tracking down suspicious PM activity.

(Andrew Waugh) #32

I’m sure this is a trivial one.

How do I get a list of similar usernames?


but not Bill_Fellows

(i.e. how to I get the list of users who would be candidates for a user merge?)

(Felix Freiberger) #33

I think it isn’t, because the fuzzystrmatch module doesn’t seem to be included in the Discourse Docker base image.

(Andrew Waugh) #34

Ok, another one:

Users where:

User has uploaded an avatar
User has filled in custom field "X"
Custom field “X” includes a category name as a substring
User has Posted in that category at least every 3rd day since they joined.

Group by Category name, sort by post frequency.

(I’m looking for “category moderator” candidates)

(Hosein Naseri) #35

Can someone help me with a query to find topmost email domains, users use as their email? perhaps it would be very good if it sorts the domains like:

gmail … 1234
yahoo … 543

(Tobias Eigen) #36

You could just export your user list and find that out in excel. Would be an interesting query to have handy nontheless.

(Joshua Rosenfeld) #37

I’ve got a quick Data Explorer Query request. Does anyone know what table stores user login date information? We’re trying to determine which users logged into the forums before a certain date. Ideally the query would give usernames, emails, and the date they signed in before by closest to a chosen date.

(David Taylor) #38

Are you after the first time that a user ever logged in? If so there’s a first_seen_at field on the users table.

I don’t think every login is logged in the database (happy to be corrected though!).

last_seen_at is updated regularly, but the previous values are not stored. You might be able to deduce some information from the user_auth_tokens and user_auth_token_logs tables, but I’m not exactly sure what each row in those tables actually represents. Also those tables appear to be cleared out fairly frequently.

(Joshua Rosenfeld) #39

Ideally I was looking for a table with login dates, but first login works. I’ve got at the moment, and it will work for what I need:

SELECT users.username, users.created_at
FROM users
WHERE users.created_at < '2017-05-10'::timestamp
ORDER BY users.created_at

(Joshua Rosenfeld) #40

And I’m back with another request (sorry, SQL really isn’t my strong suit). I’ve been asked to look at the number of topics and post created last year vs. this year. The problem is we delete most topics (and posts) during the off-season. Is there a table in the database that tracks numbers like this, or is there a way to include deleted topics/posts in a query?

(David Taylor) #41

I think that by default deleted topics/posts will be included in data-explorer statistics.

Doing some quick experimentation on a category where we delete a lot of stuff:

SELECT count(*) from topics 
WHERE category_id=40

returns 799

And then, deliberately excluding deleted topics

SELECT count(*) from topics 
WHERE category_id=40
AND deleted_at is null

returns 416.

So to get the statistics you want, I think you can just run the query without checking deleted_at

(Joshua Rosenfeld) #42

Interesting…I assumed (likely incorrectly) that deleted topics were not included. Thanks!


Can someone give me a hint how to make a Data Explorer query to create output in the way as in Discourse Meta including time frames?

(Joshua Rosenfeld) #44

Which one of the data points from /u are you looking for? All the queries are different.


I want to get a list of users with Likes Received, Likes Given, Topics Created, Replies, Viewed, Read, Visits for Week/Month and other time frames.