What cool data explorer queries have you come up with?

data-explorer

(Jeremy M) #87

I am wondering if there is a way to track how many visits and post views come from logged in users vs. anonymous users?


(Robert McIntosh) #88

there are stats on the admin dashboard for pageviews split between web crawlers, logged in and anonymous already - are you looking for other details?


(Jeremy M) #89

Dang it… you’re right…

So where would I go to see the queries / API calls for those in case we need to programmatically add those (or any of them on the Admin Dashboard) to an external dashboard?


(Chris Beach) #90

A pattern I’ve spotted on my forum: one or two users habitually “like” posts that break our community guidelines.

These users may not intentionally be trolling, but their “likes” are serving to encourage bad behaviour.

Here’s a query to identify the users who have “liked” the most flagged posts, where the flags have been “agreed” by a moderator:

SELECT likes.user_id, count(*) as count 
FROM post_actions pa 
    JOIN post_action_types pat ON pa.post_action_type_id = pat.id
    JOIN post_actions likes ON likes.post_id = pa.post_id AND
        likes.post_action_type_id = 2
WHERE 
    pat.is_flag AND
    pat.name_key NOT IN ('notify_user') AND
    pa.agreed_by_id IS NOT NULL
GROUP BY 
    likes.user_id
ORDER BY 
    count DESC
LIMIT 100

#91

Here are some incredibly useful queries that @Simon_Cossar wrote:

New topics

Lists all new topics created with a given month, ordered by category and creation_date. The query accepts a ‘months_ago’ parameter. It defaults to 0 to give you the stats for the current month.

-- [params]
-- int :months_ago = 1

WITH query_period as (
    SELECT
        date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
        date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
    t.id as topic_id,
    t.category_id
FROM topics t
RIGHT JOIN query_period qp
    ON t.created_at >= qp.period_start
        AND t.created_at <= qp.period_end
WHERE t.user_id > 0
    AND t.category_id IS NOT NULL
ORDER BY t.category_id, t.created_at DESC

New topics per category (by month)

Returns the number of new topics created in each category for a given month, ordered by topic_count. The query accepts a ‘months ago’ parameter, defaults to 0 to give the results for the current month.

-- [params]
-- int :months_ago = 1

WITH query_period as (
    SELECT
        date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
        date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
    t.category_id,
    count(1) as topic_count
FROM topics t
RIGHT JOIN query_period qp
    ON t.created_at >= qp.period_start
        AND t.created_at <= qp.period_end
WHERE t.user_id > 0
    AND t.category_id IS NOT NULL
GROUP BY t.category_id
ORDER BY topic_count DESC

Top 50 active topics (and their associated categories)

Returns the top 50 active topics per month. It’s based on the number of replies created for a topic in a given month. The query accepts a ‘months_ago’ parameter, defaults to 0 to give results for the current month.

This is from @vinothkannans

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
t.id as topic_id,
t.category_id,
COUNT(p.id) as reply_count
FROM topics t
JOIN posts p
ON t.id = p.topic_id
JOIN query_period qp
ON p.created_at >= qp.period_start
AND p.created_at <= qp.period_end
WHERE t.archetype = 'regular'
AND t.user_id > 0
GROUP BY t.id
ORDER BY COUNT(p.id) DESC, t.score DESC
LIMIT 50

Solved count by members (monthly)

Count of topics solved by regular members (defaults to last month)

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
ua.user_id,
count(1) AS solved_count
FROM user_actions ua
INNER JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
INNER JOIN users u
ON u.id = ua.user_id
WHERE ua.action_type = 15
AND u.admin = 'f'
AND u.moderator = 'f'
GROUP BY ua.user_id
ORDER BY solved_count DESC

Solved count by staff (monthly)

Count of topics solved by staff(defaults to last month)

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
ua.user_id,
count(1) AS solved_count
FROM user_actions ua
INNER JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
INNER JOIN users u
ON u.id = ua.user_id
WHERE ua.action_type = 15
AND (u.admin = 't' OR u.moderator = 't')
GROUP BY ua.user_id
ORDER BY solved_count DESC

Topics solved by regular users (lists topics)

Returns topics solved by regular users over a given monthly period, ordered by solution_date. The query accepts a ‘months_ago’ parameter, defaults to 0 to give the results for the current month.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
ua.target_topic_id,
ua.target_post_id
FROM user_actions ua
INNER JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
INNER JOIN users u
ON u.id = ua.user_id
WHERE ua.action_type = 15
AND (u.admin = 'f' AND u.moderator = 'f')
ORDER BY ua.created_at DESC

Topics solved by staff (lists topics)

Returns topics solved by staff over a given monthly period, ordered by solution_date. The query accepts a ‘months_ago’ parameter, defaults to 0 to give the results for the current month.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
ua.target_topic_id,
ua.target_post_id
FROM user_actions ua
INNER JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
INNER JOIN users u
ON u.id = ua.user_id
WHERE ua.action_type = 15
AND (u.admin = 't' OR u.moderator = 't')
ORDER BY ua.created_at DESC

Top 50 posters

Returns the top 50 posters for a given monthly period. Results are ordered by post_count. It accepts a ‘months_ago’ parameter, defaults to 0 to give results for the current month.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
),

user_posts_in_period AS (
SELECT
p.user_id
FROM posts p
INNER JOIN query_period qp
ON p.created_at >= qp.period_start
AND p.created_at <= qp.period_end
WHERE p.user_id > 0
)

SELECT
up.user_id,
count(1) as post_count
FROM user_posts_in_period up
GROUP BY up.user_id
ORDER BY post_count DESC
LIMIT 50

Top 50 likers

Returns the top 50 likers for a given monthly period. Results are ordered by like_count. It accepts a ‘months_ago’ parameter, defaults to 0 to give results for the current month.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
ua.user_id,
count(1) AS like_count
FROM user_actions ua
INNER JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
WHERE ua.action_type = 1
GROUP BY ua.user_id
ORDER BY like_count DESC
LIMIT 50

Revised queries

Time to respond by staff (monthly)

Average time to first staff response for topics created by regular users in a given time period for a hard-coded array of categories. The categories array can be changed by altering this line: AND t.category_id = ANY ('{46,25,43,40,44,35,22,7,20,17,6,12}'::int[]). It’s possible to alter the query so that the category array can be supplied as a string parameter, written in the form {1, 2, 3}.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
),
staff_responses AS (
SELECT
DISTINCT ON (p.topic_id)
p.topic_id,
p.created_at,
t.category_id,
DATE_TRUNC('minute', p.created_at - t.created_at) AS response_time
FROM posts p
JOIN topics t
ON t.id = p.topic_id
AND t.category_id = ANY ('{46,25,43,40,44,35,22,7,20,17,6,12}'::int[])
JOIN users u
ON u.id = p.user_id
WHERE p.post_number > 1
AND u.admin = 't' OR u.moderator = 't'
ORDER BY p.topic_id, p.created_at
),
user_topics AS (
SELECT
t.id
FROM topics t
JOIN users u
ON u.id = t.user_id
WHERE u.admin = 'f' AND u.moderator = 'f'
)

SELECT
sr.category_id,
AVG(sr.response_time) AS "Average First Response Time",
COUNT(1) AS "Topics Responded to"
FROM staff_responses sr
JOIN query_period qp
ON sr.created_at >= qp.period_start
AND sr.created_at <= qp.period_end
JOIN user_topics t
ON t.id = sr.topic_id
GROUP BY sr.category_id

Topics solved by staff

The number of topics solved by staff for a given time period, broken down by categories. The categories array can be altered by editing this line: WHERE t.category_id = ANY ('{46,25,43,40,44,35,22,7,20,17,6,12}'::int[]). It is possible to alter the query so that it returns results for all categories, or so that the categories array is supplied as a string parameter in the form {1, 2, 3}.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
t.category_id,
COUNT(1) AS solved_count
FROM user_actions ua
JOIN topics t
ON t.id = ua.target_topic_id
JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
JOIN users u
ON u.id = ua.user_id
WHERE t.category_id = ANY ('{46,25,43,40,44,35,22,7,20,17,6,12}'::int[])
AND ua.action_type = 15
AND (u.admin = 't' OR u.moderator = 't')
GROUP BY t.category_id
ORDER BY solved_count DESC

Topics solved by regular users

(There is probably a way to combine this query with the staff query.)

The number of topics solved by staff for a given time period, broken down by categories. The categories array can be altered by editing this line: WHERE t.category_id = ANY ('{46,25,43,40,44,35,22,7,20,17,6,12}'::int[]). It is possible to alter the query so that it returns results for all categories, or so that the categories array is supplied as a string parameter in the form {1, 2, 3}.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
t.category_id,
COUNT(1) AS solved_count
FROM user_actions ua
JOIN topics t
ON t.id = ua.target_topic_id
JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
JOIN users u
ON u.id = ua.user_id
WHERE t.category_id = ANY ('{46,25,43,40,44,35,22,7,20,17,6,12}'::int[])
AND ua.action_type = 15
AND (u.admin = 'f' AND u.moderator = 'f')
GROUP BY t.category_id
ORDER BY solved_count DESC

Query for Unsolved Topics
(Robert McIntosh) #92

I realise I posted this in the middle of a busy conversation about a different issue, so it might have been lost, … but does anyone know if it is possible to extract a list of links that have been oneboxed (or links to a specific domain)? THe site seems to track clicks, so I assume the data resides somewhere

any help appreciated


(Carson) #93

Hi, I would love help building the following queries… my gratitude and thanks to you in advance!!

In the [past thirty days / 31-60 days ago / 61-90 days ago]

  1. Out of the total number of users who have visited the site what percentage have
    a. Liked a post
    b. Shared a post
    c. Responded to a post
    d. Liked OR shared OR responded to a post
    e. created a new topic
    f. created a new topic that included a “?” within the first post

  2. Out of all the active topics, what percentage have had more than 9 replies?

The goal of the query is to see these percentages for each time period as well as to see the trend line over the previous three months.


(Carson) #94

Another request…

A list of who has shared a link to a particular topic, ranked by the people whose shares have led to the highest numbers of views of that topic.


#95

@Simon_Cossar can probably help with these.


(Robert McIntosh) #97

It may not be the best way to do it (I still need to find a way to tell it to convert minutes to “Daily”, “Weekly” etc) but I find this information useful in order to know what digests my members are receiving - plus I analyse the results in excel to monitor the % who turn off digests.

SELECT users.id AS "ID",
users.username AS "Username",
user_options.email_digests AS "Digests",
user_options.digest_after_minutes AS "Timing"
FROM users
JOIN  user_options ON users.id =  user_options.user_id
ORDER BY ID

(Jp Robotshop) #98

Indeed :slight_smile: Kinda struggled with it but found a way to make it work. So, here you go!

-- [params]
-- int :months_ago = 0

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
t.category_id,

SUM (CASE WHEN (u.admin = 'false' OR u.moderator = 'false') THEN 1 ELSE 0 END) AS "Solved by Members",
SUM (CASE WHEN (u.admin = 't' OR u.moderator = 't') THEN 1 ELSE 0 END) AS "Solved by Staff"

FROM user_actions ua
JOIN topics t
ON t.id = ua.target_topic_id
JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
JOIN users u
ON u.id = ua.user_id
WHERE t.category_id = ANY ('{[LIST CATEGORY IDS]}'::int[])
AND ua.action_type = 15

GROUP BY t.category_id ORDER BY t.category_id ASC

#99

Hello everyone, Would you please help me building a query for daily tracking the total number of visits for users from each group on discourse, also I need it to include that group’s total activities (topics created, posts created, time read, likes given and likes received) as I need to track the daily progress of 4 different Discourse groups in the first 3 weeks of enrollment.

My gratitude and thanks to you in advance!!


(Kyle Selby) #100

Hello all,

I am attempting to filter out a group I created named “Employees” from this data explorer tab, what am I doing wrong? How do I exclude group that I have created?:

    SELECT 
    sum(p.score) / count(p) as "average score per post", 
    count(p.id) as post_count, 
    p.user_id
FROM posts p
JOIN users u ON u.id = p.user_id
WHERE p.created_at >= CURRENT_DATE - INTERVAL '6 month'
  AND NOT u.admin
  AND NOT u.employees
  AND u.active
GROUP by user_id, u.views
HAVING count(p.id) > 10
ORDER BY sum(p.score) / count(p) DESC
LIMIT 20

(Brendan Schlagel) #101

The users table has primary_group_id (integer) so you could use that in your query if “Employees” is set as primary group for these users. Otherwise I think you’d have to do a separate join w/ the groups table. (Note — I don’t know the details of table structure off the top of my head but the little search thing on the right when editing a Data Explorer query is super useful for this!)


(Simon Cossar) #102

To get a list of users that excludes the members of a group, you could try something like this. This will exclude the members of the ‘employees’ group. It should be possible to rework your query to use this.

WITH group_users AS (
SELECT user_id
FROM group_users gu
JOIN groups g
ON g.id = gu.group_id
WHERE g.name = 'employees'
)

SELECT
u.id AS user_id
FROM users u
WHERE u.id NOT IN (SELECT * FROM group_users)
ORDER BY user_id

(Aaron Bell) #103

Is there a way for me to add profile attributes that get set up on sign up and can later be modified by the user?

Then I would like to present these statistics on the landing page to visualize the community.


#104

I’d like to get a better understanding of what percentage of posts are made by email vs. online. Same for private messages.

Does anyone have such a query?


(Carson) #105

Any thoughts on how to write a query to find out exactly how many minutes a member has been logged into the site? For all-time? For a particular week?


(Jeff Atwood) #106

If you are looking for read time per week that is already available in /users


(Carson) #107

I’m trying to track how much time moderators are spending on the site. Would ‘read time’ be within, say, 10% of total time they have been logged on and viewing the site?