What cool data explorer queries have you come up with?

Number of active users per login method

Facebook vs Twitter vs Google vs Github

WITH target_user_ids AS (
SELECT
id
FROM users
WHERE staged = false
AND active = true
AND last_seen_at IS NOT NULL
)

SELECT (
       SELECT Count(DISTINCT user_id)
       FROM user_associated_accounts
       WHERE provider_name = 'google_oauth2'
       AND user_id IN(SELECT id FROM target_user_ids)) AS google,
       (
       SELECT Count(DISTINCT user_id)
       FROM user_associated_accounts
       WHERE provider_name = 'facebook'
       AND user_id IN(SELECT id FROM target_user_ids)) AS facebook,
       (
       SELECT Count(DISTINCT user_id)
       FROM   github_user_infos
       WHERE user_id IN(SELECT id FROM target_user_ids)) AS github,
       (
       SELECT Count(DISTINCT user_id)
       FROM user_associated_accounts
       WHERE provider_name = 'twitter'
       AND user_id IN(SELECT id FROM target_user_ids)) AS twitter
Output

8 Likes

how many members open the Welcome PM

To get a count of the number of non-admin users who have read the welcome PM sent by the discobot user, you could try this:

SELECT
COUNT(1) AS number_of_opens
FROM topics t
JOIN topic_users tu
ON tu.topic_id = t.id
JOIN users u
ON u.id = tu.user_id
WHERE t.user_id = -2
AND u.admin = false
AND tu.last_read_post_number IS NOT NULL
11 Likes

Using a CASE expression to order results

I don’t think it’s possible to pass a keyword as a parameter, but a boolean :desc parameter could be used in a CASE expression.

--[params]
-- boolean :desc = false

SELECT
*
FROM generate_series(1, 10) AS num
ORDER BY
  CASE WHEN :desc THEN num END DESC,
  CASE WHEN NOT :desc THEN num END ASC
5 Likes

Number of non-admin users who opened the welcome PM in the last N months

Try this:

-- [params]
-- int :months_ago = 1
WITH query_period as (
    SELECT
        date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
        date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)
SELECT
COUNT(1) AS number_of_opens
FROM topics t
JOIN topic_users tu
ON tu.topic_id = t.id
RIGHT JOIN query_period qp
    ON t.created_at >= qp.period_start
JOIN users u
ON u.id = tu.user_id
WHERE t.user_id = -2
AND u.admin = false
AND tu.last_read_post_number IS NOT NULL
AND t.created_at <= qp.period_end
4 Likes

List all open PMs

sorted by recent activity

SELECT
  t.id AS topic_id, 
  t.user_id 
FROM
  topics t 
  JOIN posts p ON t.id = p.topic_id 
WHERE
  t.archetype = 'private_message' 
  AND t.user_id > 0 
  AND t.reply_count > 1 
  AND NOT t.closed 
GROUP BY
  t.id 
ORDER BY
  t.updated_at DESC
4 Likes

Users who have accessed a topic in a protected category

This query will return all topic_users (users who have viewed or posted in a topic) who are not members of the group that is given for the query’s :group_name, for topics in categories that have the group in their security settings. This can be used to find admins who have viewed a topic in a category that has group permissions added to it.

--[params]
-- string :group_name

WITH allowed_users AS (
SELECT user_id
FROM group_users gu
JOIN groups g
ON g.id = gu.group_id
WHERE g.name = :group_name
),
group_categories AS (
SELECT category_id
FROM category_groups cg
JOIN groups g
ON g.id = cg.group_id
WHERE g.name = :group_name
)

SELECT
tu.user_id,
tu.topic_id,
tu.last_visited_at::date
FROM topic_users tu
JOIN topics t
ON t.id = tu.topic_id
WHERE t.category_id IN (SELECT category_id FROM group_categories)
AND tu.user_id NOT IN (SELECT user_id FROM allowed_users)
ORDER BY t.category_id
3 Likes

How to query the View counts of entire Topics with Particular Tag?

I’m not sure, try this:

-- [params]
-- text :tag_name
SELECT COUNT(tt.tag_id)
FROM topic_tags tt, tags t, topic_views tv, topics tp
WHERE t.id = tt.tag_id
AND tp.id = tt.topic_id
AND tv.topic_id = tp.id 
AND t.name = :tag_name
GROUP BY tt.tag_id

topics where user mentioned and hasn’t responded after being mentioned

-- [params]
-- user_id :user

WITH mentions AS (
  SELECT target_topic_id, target_post_id, created_at
    FROM user_actions
   WHERE action_type = 7 -- mentions
     AND user_id = :user
), replies AS (
  SELECT target_topic_id, MAX(created_at) created_at
    FROM user_actions
   WHERE action_type = 5 -- replies
     AND user_id = :user
   GROUP BY target_topic_id    
)
SELECT DATE(m.created_at) mentioned_at, target_post_id post_id
FROM mentions m
JOIN replies r ON r.target_topic_id = m.target_topic_id
JOIN topics t ON t.id = m.target_topic_id
WHERE m.created_at > r.created_at
AND t.deleted_at IS NULL
AND NOT t.archived
AND NOT t.closed
ORDER BY m.created_at DESC
6 Likes

Hi Guys,

We have a scheme within our community in which we promote certain regulars to PosBuddy status, these are regulars who share our brands tone of voice and have show an aptitude for answering questions, driving conversation and welcoming new members within the community.

This scheme grants these users to TL4, they have their own private group within the community and have the ability to close, merge and move posts: generally keeping the community tidy. We also give these members lots of great benefits like free products, merch and exclusive invites to our development showcases. So as you can imagine there is great incentives to work their way up to TL3 and contribute to our community.

Often we will want to know which users are almost at TL3, but not quite there yet, to do this we go through hundreds of TL2 users each month and see how far they are away from reaching TL3 and check their requirements for Trust Level 3 table.

We want to make this process easier so we have a data explorer query which shows most of the information we need in a single view, however I can’t seem to work out how to include the following fields into the Query listed below:

  • Topics Replied To
  • Flagged Posts
  • Users Who Flagged
  • Likes Received: unique days
  • Likes Received: unique users
  • Silenced (All time)
  • Suspended (All time)

If anyone could help I would be forever grateful.

-- [params]
-- int :from_days_ago = 0
-- int :duration_days = 100

with
t as (
  select 
    CURRENT_TIMESTAMP - ((:from_days_ago + :duration_days) * (INTERVAL '1 days')) as start,
    CURRENT_TIMESTAMP - (:from_days_ago * (INTERVAL '1 days')) as end
),
pr as (
    select user_id, 
        count(1) as visits,
        sum(posts_read) as posts_read
    from user_visits, t
    where posts_read > 0
    and visited_at > t.start
    and visited_at < t.end
    group by user_id
),
pc as (
  select user_id, 
      count(1) as posts_created
  from posts, t
  where created_at > t.start
  and created_at < t.end
  group by user_id
),
ttopics as (
 select user_id, posts_count
  from topics, t
  where created_at > t.start
  and created_at < t.end
),
tc as (
  select user_id, 
      count(1) as topics_created
  from ttopics
  group by user_id
),
twr as (
  select user_id, 
      count(1) as topics_with_replies
  from ttopics
  where posts_count > 1
  group by user_id
),
tv as (
 select user_id, 
        count(distinct(topic_id)) as topics_viewed
  from topic_views, t
  where viewed_at > t.start
  and viewed_at < t.end
  group by user_id
),
likes as (
  select 
      post_actions.user_id as given_by_user_id, 
      posts.user_id as received_by_user_id
  from t, post_actions
  left join posts
  on post_actions.post_id = posts.id
  where post_actions.created_at > t.start
  and post_actions.created_at < t.end
  and post_action_type_id = 2

),
lg as (
  select given_by_user_id as user_id, 
      count(1) as likes_given
  from likes
  group by user_id
),
lr as (
  select received_by_user_id as user_id, 
      count(1) as likes_received
  from likes
  group by user_id
),
e as (
  select email, user_id
  from user_emails u
  where u.primary = true
)
select pr.user_id,
       coalesce(visits,0) as "Visits",
       coalesce(topics_viewed,0) as "Topics Viewed",
       coalesce(posts_read,0) as "Posts Read", 
       coalesce(posts_created,0) as "Posts Created",
       coalesce(topics_created,0) as "Topics Created",
       coalesce(topics_with_replies,0) as "Topics with Replies",
       coalesce(likes_given,0) as "Likes Given",
       coalesce(likes_received,0) as "Likes Received"
from pr
left join tv using (user_id)
left join pc using (user_id)
left join tc using (user_id)
left join twr using (user_id)
left join lg using (user_id)
left join lr using (user_id)
left join e using (user_id)
left join users on pr.user_id = users.id
order by 
  visits desc, 
  posts_read desc, 
  posts_created desc
1 Like

In the course of working for @quimgil on improving his use of Discourse groups as a support system, we’ve developed some queries (more to follow). Let us know if you can spot any issues or improvements :slight_smile:

Time to first response for a support group

  • Average number of days for a set period: (2019-04-04 to 2019-04-06 in example)
  • Only counts business days (i.e. excluding Saturday and Sunday)
  • Does not count selected public holidays (2019-04-11 in example)
  • Group name in example is ‘support’
SELECT AVG(t.days)::float AS "Average business days to first response"
FROM (
  SELECT t.id, t.title, t.created_at, MIN(p.created_at) as "first_reply_created_at", (
    SELECT count(*) FILTER (
      WHERE d not in ('2019-04-11')
      AND extract('ISODOW' FROM d) < 6
    )
    FROM generate_series(t.created_at::timestamp::date
                        , MIN(p.created_at)::timestamp::date
                        , interval '1 day') as s(d)
  ) as "days"
  FROM topics t
  INNER JOIN posts p ON p.topic_id = t.id AND (
    CASE WHEN p.post_number = 1
    THEN p.via_email IS TRUE 
    ELSE true 
    END
  )
  WHERE t.archetype = 'private_message'
  AND t.id IN (
    SELECT topic_id FROM topic_allowed_groups
    WHERE group_id IN (SELECT id FROM groups WHERE name ilike 'support')
  )
  AND t.deleted_at IS NULL
  AND t.created_at::timestamp::date >= '2019-04-04'
  AND t.created_at::timestamp::date <= '2019-04-06'
  AND p.deleted_at IS NULL
  AND p.post_number > 1
  GROUP BY t.id
) t

To test the query is working properly, you can change the SELECT from an average to

t.days as "business days", t.title, t.created_at, t.first_reply_created_at

For example, on my Sandbox when I do this, I get

Notes:

  • Do you want to count replies on the same day as 0 or 1? (currently it’s 1)
  • How do you want to handle messages with no replies? (i.e. should they be excluded, or how many days should they be ascribed?). Currently messages with no replies are excluded from the count.

Number of messages received in a given group.

  • Group name in example is ‘support’
  • Count for a set period: (2019-04-04 to 2019-04-06 in example)
SELECT count(t) as "Number of messages"
FROM topics t
WHERE t.archetype = 'private_message'
AND t.id IN (
  SELECT topic_id FROM topic_allowed_groups
  WHERE group_id IN (SELECT id FROM groups WHERE name ilike 'support')
)
AND (
  SELECT via_email FROM posts
  WHERE topic_id = t.id AND post_number = 1 
)
AND t.deleted_at IS NULL
AND t.created_at::timestamp >= '2019-04-04'::timestamp
AND t.created_at::timestamp <= '2019-04-06'::timestamp
6 Likes

Got this sorted promptly with the professional talent of @pfaffman, our thanks to him. The script I’ll turn over for the community. They way we will be using it is by letting in 100 registrants or so, locking public access down then using the script to identify potential conversions and direct someone to say hi to them.

In this manner we can better predict our conversions, make better use of staff time and adjust the public intake up or down to suit assimilation onto our board.

Change your group ID to suit.

-- [params]
-- int :from_days_ago = 0
-- int :duration_days = 30
-- int :group_id = 73
--
WITH t AS (
    SELECT CURRENT_TIMESTAMP - ((:from_days_ago + :duration_days) * (INTERVAL '1 days')) AS first,
        CURRENT_TIMESTAMP - (:from_days_ago * (INTERVAL '1 days')) AS last
),
group_members as (
    SELECT user_id FROM
    group_users gu
    WHERE gu.group_id= :group_id
    ),
pr AS (
    SELECT user_id, COUNT(1) AS visits,
        SUM(posts_read) AS posts_read
    FROM user_visits, t
    WHERE posts_read > 0
        AND visited_at > t.first
        AND visited_at < t.last
    GROUP BY
        user_id
),
pc AS (
    SELECT user_id, COUNT(1) AS posts_created
    FROM posts, t
    WHERE
        created_at > t.first
        AND created_at < t.last
    GROUP BY
        user_id
),
ttopics AS (
    SELECT user_id, posts_count
    FROM topics, t
    WHERE created_at > t.first
        AND created_at < t.last
),
tc AS (
    SELECT user_id, COUNT(1) AS topics_created
    FROM ttopics
    GROUP BY user_id
),
twr AS (
    SELECT user_id, COUNT(1) AS topics_with_replies
    FROM ttopics
    WHERE posts_count > 1
    GROUP BY user_id
),
tv AS (
    SELECT user_id,
        COUNT(DISTINCT(topic_id)) AS topics_viewed
    FROM topic_views, t
    WHERE viewed_at > t.first
        AND viewed_at < t.last
    GROUP BY user_id
),
likes AS (
    SELECT post_actions.user_id AS given_by_user_id,
        posts.user_id AS received_by_user_id
    FROM t,
        post_actions
        LEFT JOIN
        posts
        ON post_actions.post_id = posts.id
    WHERE
        post_actions.created_at > t.first
        AND post_actions.created_at < t.last
        AND post_action_type_id = 2
),
lg AS (
    SELECT given_by_user_id AS user_id,
        COUNT(1) AS likes_given
    FROM likes
    GROUP BY user_id
),
lr AS (
    SELECT received_by_user_id AS user_id,
        COUNT(1) AS likes_received
    FROM likes
    GROUP BY user_id
),
e AS (
    SELECT email, user_id
    FROM user_emails u
    WHERE u.PRIMARY = TRUE
)
SELECT
    pr.user_id,
    username,
    name,
    email,
    visits,
    COALESCE(topics_viewed, 0) AS topics_viewed,
    COALESCE(posts_read, 0) AS posts_read,
    COALESCE(posts_created, 0) AS posts_created,
    COALESCE(topics_created, 0) AS topics_created,
    COALESCE(topics_with_replies, 0) AS topics_with_replies,
    COALESCE(likes_given, 0) AS likes_given,
    COALESCE(likes_received, 0) AS likes_received
FROM group_members, pr
LEFT JOIN tv USING (user_id)
LEFT JOIN pc USING (user_id)
LEFT JOIN tc USING (user_id)
LEFT JOIN twr USING (user_id)
LEFT JOIN lg USING (user_id)
LEFT JOIN lr USING (user_id)
LEFT JOIN e USING (user_id)
LEFT JOIN users ON pr.user_id = users.id
WHERE group_members.user_id=pr.user_id
ORDER BY
    visits DESC,
    posts_read DESC,
    posts_created DESC
4 Likes

Users (In Specific Group) Last Seen Since N Days Ago

-- [params] 
-- int :member_group 
-- int :days_since_last_activity 
SELECT u.id                AS user_id, 
       Age(u.last_seen_at) AS last_seen, 
       g.id                AS GROUP_ID 
FROM   users u 
       join group_users gu 
         ON gu.user_id = u.id 
       join GROUPS g 
         ON g.id = gu.group_id 
WHERE  Age(u.last_seen_at) >= ( :days_since_last_activity * '1 day' :: interval ) 
       AND g.id = :member_group 
ORDER  BY u.last_seen_at

Users In Specific Group(s) BUT NOT In Other Group(s)

Info: The Parameters have to be written as arrays. For example: {1,2,3}

-- [params]  
-- string :opt_in_groups
-- string :opt_out_groups
SELECT
   u.id AS user_id,
   g.id AS GROUP_ID 
FROM
   users u 
   join
      group_users gu 
      ON gu.user_id = u.id 
   join
      GROUPS g 
      ON g.id = gu.group_id 
WHERE
   g.id = ANY (:opt_in_groups::int[]) 
   AND u.id NOT IN 
   (
      SELECT
         u.id AS user_id 
      FROM
         users u 
         join
            group_users gu 
            ON gu.user_id = u.id 
         join
            GROUPS g 
            ON g.id = gu.group_id 
      WHERE
         g.id = ANY (:opt_out_groups::int[]) 
   )
ORDER BY
   u.primary_group_id
7 Likes

I’ve adapted @HAWK’s Solved Count report to created this query to see which users have posted the most X months ago & how many users in total have posted within that time period.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
ua.user_id,
count(1) AS post_count
FROM user_actions ua
INNER JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
INNER JOIN users u
ON u.id = ua.user_id
AND u.admin = 'f'
AND u.moderator = 'f'
GROUP BY ua.user_id
ORDER BY post_count DESC

But I’m struggling to get the latter - the total number of users who posted within the time period. I think I just need a count of the number of users who’re included in the report. I’d be grateful if anyone has some pointers.

Edit - Scrap that, I can see that I can just get this from the total number of results -

I’ve noticed that List of Queries doesn’t include any queries including location information. Is this because location is a hard thing to determine without some external data provider helping out?

I’m trying to find out where our new members/registrations are from. Thoughts on that?

Can anyone help me write a query that tells me what actions my members are taking in the first 7 days they’re part of the community?

So - a member accepts their invite and logs on for the first time. What do they do in the first 7 days? Could this be aggregated to tell me what students spend the most time doing? Are they posting or creating new topics? Liking things? Lurking?

I’m really struggling to put together something that reflects this.

1 Like

I don’t think this is possible via Data Explorer.
I think you need to analyze something like google analytics, or hotjar to track your members.

Does anyone have a query to select the top ten posts by number of Likes, and include the number of likes along side the post titles/links?

Please @Richie, can you open a new topic and use #plugin:data-explorer category?
BTW, check the “Fetch top 10 posts by likes received in the last month” before open that topic :+1:

2 Likes

With a bit of help from a developer (@aluxian), I’ve adapted another one of Hawk’s reports here, to calculate what % of users, who joined within a certain timeframe, have posted X times, within a certain timeframe.

You could use this to see whether you’re improving onboarding for new users enough to make them feel comfortable posting, for example. @Richard_Millington shares some stats for benchmarking your community using these metrics (at 2:30) here.

Users who’ve become active

-- [params]
-- date :user_date_from = 01.06.2019
-- date :user_date_to = 30.06.2019
-- date :post_date_from = 01.06.2019
-- date :post_date_to = 30.06.2019
-- int  :min_posts = 1


WITH user_activity AS (
    SELECT p.user_id, COUNT(p.id) as posts_count
    FROM posts p
	LEFT JOIN users u ON u.id = p.user_id
    LEFT JOIN topics t ON t.id = p.topic_id
    WHERE u.created_at::date BETWEEN :user_date_from::date AND :user_date_to::date
	  AND p.created_at::date BETWEEN :post_date_from::date AND :post_date_to::date
      AND p.deleted_at IS NULL
	  AND t.deleted_at IS NULL
      AND t.visible = TRUE
      AND t.closed = FALSE
      AND t.archived = FALSE
      AND t.archetype = 'regular'
        
    GROUP BY p.user_id
)

SELECT (t1.new_users_with_posts::float) / (t2.new_users::float) * 100 as percent_users_with_posts

FROM ( SELECT COUNT(user_id) as new_users_with_posts
       FROM user_activity
       WHERE posts_count >= :min_posts )
       as t1 
cross join
     ( SELECT COUNT(id) as new_users
       FROM users u
       WHERE u.created_at::date BETWEEN :user_date_from::date AND :user_date_to::date )
       as t2
9 Likes