What cool data explorer queries have you come up with?


(Dave McClure) #6

Recently Read Topics by User

Show the topics with that have been opened by a given user in the past N days, sorted by the amount of time the user has spent in that topic. (requested on feverbee)

-- [params]
-- integer :user = 1
-- integer :since_days_ago = 7

with topic_timing as (
  select user_id, topic_id, sum(msecs) / 1000 as seconds
  from post_timings
  where user_id = :user
  group by user_id, topic_id
SELECT tv.topic_id,
from topic_views tv
left join topic_timing tt
on tv.topic_id = tt.topic_id
and tv.user_id = tt.user_id
where tv.user_id = :user
and viewed_at + :since_days_ago > CURRENT_TIMESTAMP
order by seconds desc

Data Explorer - Find users who read specific topics
(Erlend Sogge Heggen) split this topic #7

6 posts were split to a new topic: “Recently Read Topics by User” query returns Parameter declaration error

(Mittineague) #13

While working on this

I was looking for a way to find topics that were created as a result of a moderator splitting posts into a new topic. (i.e. off-topic posts that deserved their own topic)

It wasn’t as easy as I had hoped. I looked at various “topic” and “post” tables with no success.

Being sure I could find a lead to what I needed in Admin -> Logs -> Staff Actions I was disappointed to find that “split” data was not there (not that it should, hopefully it doesn’t happen all that often)

The only place I could find the id of a topic created this way is in “small-action” posts as part of a link.

Getting the id portion of the string to an integer that can be used is a bit involved.

regexp_matches returns a string array
array_to_string converts the array to a string
CAST converts the string to an integer

WITH new_topics AS (
 CAST( array_to_string(regexp_matches(posts.raw, '([\d]+)(?:\))$', 'g'), '') AS integer) AS new_topic_id
 FROM posts 
 WHERE action_code LIKE 'split_topic'
 AND raw LIKE '%posts were split to a new topic%'
SELECT topics.title 
FROM topics, new_topics
WHERE topics.id = new_topics.new_topic_id

(Mittineague) #14

Member Uploads

I put together this query to help find members that post a lot of uploads that potentially might lead to a problem.

Ordered by total upload weight per member

WITH heavy_uploads AS ( SELECT 
    ( SUM(uploads.filesize) / 1024) AS sum_kb
  , COUNT(uploads.filesize) AS upload_count
  , ( (SUM(uploads.filesize) / COUNT(uploads.filesize)) / 1024) AS avg_weight_kb
  , uploads.user_id 
  FROM uploads
  GROUP BY uploads.user_id
SELECT heavy_uploads.sum_kb
 , heavy_uploads.upload_count
 , heavy_uploads.avg_weight_kb
 , heavy_uploads.user_id
FROM heavy_uploads 
WHERE heavy_uploads.sum_kb > 100
ORDER BY heavy_uploads.sum_kb DESC

(the three Grimm members uploaded animated woge gifs)

Would it be worth resizing uploaded images (to save space)?
(Alessio Fattorini) #15

Users active in last 30 days? Active = at least a post in the last 30 days.

(Kane York) #16
select username
from users
where last_posted_at > current_timestamp - interval '30' day

How to measure active users?
SQL query for users who have posted on a date (or date range)
(Tom Newsom) #17

Or just /users?period=monthly surely?

EDIT: No that also counts people who just read. sorry!


How would you adjust this with parameters so that previous months can also be checked? Eg: read between 62 and 31 days ago (or similar)?

(Mittineague) #19

I haven’t figured out a “between” way of doing this, but using “params” can provide a selectable “back to”

-- [params]
-- int :interval = 30

select username
from users
where last_posted_at > current_timestamp - interval ':interval' day

(Dave McClure) #20

should be able to do something like this (starting where @Mittineague left off):

-- [params]
-- int :until_days_ago = 30
-- int :since_days_ago = 60

select username
from users
where last_posted_at > current_timestamp - interval ':since_days_ago' day 
and last_posted_at < current_timestamp - interval ':until_days_ago' day

(Mittineague) #21

Thanks, the call of Labor Day burgers was distracting me and all I could think of was finding a way to put current_timestamp into the params and getting to the burgers. At times my stomach overpowers my brain.


Ok, so what about ‘posted during that period’ – not last posted during that period.
i.e. a count of active (made at least one post) members during that period?


Got what I need (thanks @meglio) so updating this for future posterity.

-- [params]
-- date :date_from
-- date :date_to
-- int  :min_posts = 1

WITH user_activity AS (
    SELECT p.user_id, count (p.id) as posts_count
    FROM posts p
    LEFT JOIN topics t ON t.id = p.topic_id
    WHERE p.created_at::date BETWEEN :date_from::date AND :date_to::date
        AND t.deleted_at IS NULL
        AND t.visible = TRUE
        AND t.closed = FALSE
        AND t.archived = FALSE
        AND t.archetype = 'regular'
        AND p.deleted_at IS NULL
    GROUP BY p.user_id
FROM user_activity
WHERE posts_count >= :min_posts

Active users for specific months
Is there a Data Explorer for Dummies article somewhere?
How can I count posts in last month by a specific group of users?
What's the difference between "User Visits" and "Active Users"
Strange problem with Data Explorer
Strange problem with Data Explorer
(Felix Freiberger) #24

List of all Members of a Group with Custom Field

We use this query to get a list of the name and matriculation number of all team members:

SELECT users.name, user_custom_fields.value as matriculation
FROM users
JOIN group_users ON users.id = group_users.user_id
JOIN groups ON groups.id = group_users.group_id
JOIN user_custom_fields ON users.id = user_custom_fields.user_id
WHERE groups.name = 'Team'
AND user_custom_fields.name = 'user_field_2'

Simply insert the group name and the custom field id in the last two names.

(Joshua Rosenfeld) #25

Anyone able to provide the query for all users with a custom title (not a badge granted title)?

(Felix Freiberger) #26

I don’t think this is strictly possible, because a custom title could be character-by-character identical to a badge granted one (possible of a badge the user doesn’t have).

Here’s a query that might do what you want:

SELECT username FROM users

    SELECT count(*) FROM badges
    WHERE allow_title = true
    AND name = title
    ) = 0

    SELECT count(*) FROM groups
    WHERE groups.title = users.title
    ) = 0

It selects every title any user has, minus every title where a corresponding badge exists, minus every title that could come from a group membership. It does not test whether the user has the badge or the group membership.

Maybe it does help you, at least as a starting point :slight_smile:

(Joshua Rosenfeld) #27

Thanks @fefrei, that seems to work, but as you mentioned it might not cover all the edge cases. I did have to change the first line SELECT title FROM users to SELECT username FROM users as I was looking for the usernames, not the custom titles.

@Mittineague was able to create a query that seems to work as well in this topic.

, users.title 
FROM users 
JOIN user_profiles 
ON user_profiles.user_id = users.id 
WHERE users.title IS NOT NULL 
AND users.title NOT LIKE '' 
AND user_profiles.badge_granted_title IS NOT TRUE

(Felix Freiberger) #28

Oops, I forgot to copy the query again after fixing that :blush:

That query looks better – I didn’t find the badge_granted_title flag. I’m not sure how this query handles titles granted by group membership, though.

(Joshua Rosenfeld) #29

Nor am I. However, this accomplishes our goal. For some background: we currently don’t have any badge_granted_titles available to users, as we have no custom badges, the only titles are regular and leader, and we’ve “disabled” TL3. We have granted titles in the past to exceptional community members. The community started asking about titles, so the mod team started discussing options. One thing that was agreed upon is that we wanted the custom titles to stand out from the “everyday” badge_granted_titles. To apply CSS to them, we needed to add all the users to a group, hence the request for a query to select them.

(Chris Beach) #30

Top quality users in last six months

Top 20 users by average post score.

Post scores are calculated based on reply count, likes, incoming links, bookmarks, average time (reading?) and read count.

    sum(p.score) / count(p) as "average score per post", 
    count(p.id) as post_count, 
FROM posts p
JOIN users u ON u.id = p.user_id
WHERE p.created_at >= CURRENT_DATE - INTERVAL '6 month'
  AND NOT u.admin
  AND NOT u.silenced
  AND u.active
GROUP by user_id, u.views
HAVING count(p.id) > 50
ORDER BY sum(p.score) / count(p) DESC

Sorting all users by total posts?