You still have the Trust Level groups, no?
I don’t have the time to mess with the query now, but I think you should be able to get what you want from tables other than the badges table. eg.
users.trust_level
groups.id
- groups.user_count
You still have the Trust Level groups, no?
I don’t have the time to mess with the query now, but I think you should be able to get what you want from tables other than the badges table. eg.
users.trust_level
groups.id
- groups.user_count
Thanks for the suggestion - but do the trust level groups keep track of when they were acquired? If so that would work. With badges we can see when they were granted.
Another thought: instead of 1 above, I could also benefit from tracking the number of posts every month but excluding posts in private staff-only categories. Perhaps that’s easier to get. In fact, it would be great to be able to filter by user on the /admin/reports/posts
stats page like you can with preview digest.
For that I think a JOIN to the group_users table to get created_at / updated_at would work
Here is a “Topic Participation” query, someone else is going to want to check it though.
Data Explorer: topic-participation.dcquery (v4).json (7.9 KB)
-1
) for responses or check for replies within X days.impersonate
system
user to see everything.exclude_groups_topics
,
) to not exclude any.exclude_groups_replies
,
) to not exclude any.exclude_category_slugs
,
) to not exclude any.reply_within_x_days
-1
to restrict responses check to that month (default).0
to restrict responses to the same day topic was created.15
to count as response up to 15 day after initial topic creation even if 15 days is outside of that month.pct_multiplier
and pct_places
pct_multiplier
to 1
and pct_places
to 6
to get percent values ready for Excel.pct_multiplier
to 100
and pct_places
to 1
to display as readable percentage values (default).-- [params]
-- string :impersonate = system
-- string :exclude_groups_topics = staff
-- string :exclude_groups_replies = ,
-- string :exclude_category_slugs = staff,dev,simply-thank-you,was-it-you,wanted
-- int :reply_within_x_days = -1
-- int :pct_multiplier = 100
-- int :pct_places = 1
WITH
year_months AS (
SELECT
to_char(date(day),'YYYY-MM') as year_month
FROM
generate_series(
(date_trunc('month', CURRENT_DATE) - INTERVAL '1 year' ),
CURRENT_DATE,
interval '1 month'
) AS day
),
impersonated_usernames AS (
SELECT unnest(string_to_array( :impersonate, ',')) AS username
),
impersonated_users AS (
SELECT id
FROM users AS u
JOIN impersonated_usernames iun ON iun.username = u.username
WHERE iun.username = u.username
),
excluded_group_names_topics AS (
SELECT unnest(string_to_array( :exclude_groups_topics, ',')) AS name
),
excluded_group_names_replies AS (
SELECT unnest(string_to_array( :exclude_groups_replies, ',')) AS name
),
excluded_users_topics AS (
SELECT user_id AS id
FROM users AS u
JOIN group_users AS gu ON gu.user_id = u.id
JOIN groups AS g ON g.id = gu.group_id
JOIN excluded_group_names_topics eg ON eg.name = g.name
WHERE g.name = eg.name
),
excluded_users_replies AS (
SELECT user_id AS id
FROM users AS u
JOIN group_users AS gu ON gu.user_id = u.id
JOIN groups AS g ON g.id = gu.group_id
JOIN excluded_group_names_replies eg ON eg.name = g.name
WHERE g.name = eg.name
),
excluded_category_slugs AS (
SELECT unnest(string_to_array( :exclude_category_slugs, ',')) AS slug
),
excluded_categories AS (
SELECT id
FROM categories AS c
JOIN excluded_category_slugs ec ON ec.slug = c.slug
WHERE ec.slug = c.slug
),
restricted_topics AS (
SELECT
t.id,
t.created_at,
t.user_id
FROM
topics AS t
JOIN users u on u.id IN (SELECT id FROM impersonated_users)
JOIN user_stats AS us ON us.user_id = u.id
JOIN user_options AS uo ON uo.user_id = u.id
JOIN categories c ON c.id = t.category_id
LEFT JOIN topic_users tu ON tu.topic_id = t.id AND tu.user_id = u.id
WHERE
u.id IN (SELECT id FROM impersonated_users)
/** test **
AND t.created_at > (date_trunc('day', CURRENT_DATE) - INTERVAL '1 day' )
AND t.created_at < (date_trunc('day', CURRENT_DATE) )
** test **/
AND t.created_at > (date_trunc('month', CURRENT_DATE) - INTERVAL '1 year' )
AND t.archetype <> 'private_message'
AND t.deleted_at IS NULL
AND (t.visible OR u.admin OR u.moderator)
AND (
NOT c.read_restricted OR u.admin OR category_id IN (
SELECT c2.id FROM categories c2
JOIN category_groups cg ON cg.category_id = c2.id
JOIN group_users gu ON gu.user_id IN (SELECT id FROM impersonated_users)
AND cg.group_id = gu.group_id
WHERE c2.read_restricted
)
)
AND t.user_id NOT IN (SELECT id FROM excluded_users_topics)
AND t.category_id NOT IN (SELECT id from excluded_categories)
),
restricted_posts AS (
SELECT
p.id,
p.created_at
FROM
posts AS p
LEFT JOIN topics AS t ON p.topic_id = t.id
JOIN users u on u.id IN (SELECT id FROM impersonated_users)
JOIN user_stats AS us ON us.user_id = u.id
JOIN user_options AS uo ON uo.user_id = u.id
JOIN categories c ON c.id = t.category_id
LEFT JOIN topic_users tu ON tu.topic_id = t.id AND tu.user_id = u.id
WHERE
u.id IN (SELECT id FROM impersonated_users)
/** test **
AND p.created_at > (date_trunc('day', CURRENT_DATE) - INTERVAL '1 day' )
AND p.created_at < (date_trunc('day', CURRENT_DATE) )
** test **/
AND p.created_at > (date_trunc('month', CURRENT_DATE) - INTERVAL '1 year' )
AND t.archetype <> 'private_message'
AND t.deleted_at IS NULL
AND (t.visible OR u.admin OR u.moderator)
AND (
NOT c.read_restricted OR u.admin OR category_id IN (
SELECT c2.id FROM categories c2
JOIN category_groups cg ON cg.category_id = c2.id
JOIN group_users gu ON gu.user_id IN (SELECT id FROM impersonated_users)
AND cg.group_id = gu.group_id
WHERE c2.read_restricted
)
)
AND t.user_id NOT IN (SELECT id FROM excluded_users_topics)
AND t.category_id NOT IN (SELECT id from excluded_categories)
),
qualifying_topics AS (
SELECT
rt.id,
rt.created_at::date AS created_at,
MIN(p.post_number) AS first_reply,
COUNT(p) AS new_topic_posts
FROM
restricted_topics AS rt
LEFT JOIN posts p ON
p.topic_id = rt.id
AND p.deleted_at IS NULL
AND p.user_id != rt.user_id
AND p.user_id NOT IN (SELECT id FROM excluded_users_replies)
AND (
( :reply_within_x_days >= 0
AND p.created_at < ( rt.created_at::date + ( :reply_within_x_days + 1 ) )
)
OR
(
:reply_within_x_days < 0
AND to_char(date(p.created_at),'YYYY-MM') = to_char(date(rt.created_at),'YYYY-MM')
)
)
WHERE
rt.created_at > (date_trunc('month', CURRENT_DATE) - INTERVAL '1 year' )
GROUP BY rt.id, rt.created_at
),
topic_year_month_rate AS (
SELECT
to_char(date(tt.created_at),'YYYY-MM') as year_month,
COUNT(*) AS topics,
SUM(new_topic_posts) AS new_topic_posts,
SUM(CASE WHEN tt.first_reply IS NULL THEN 1 ELSE 0 END) AS no_response,
SUM(CASE WHEN tt.first_reply IS NOT NULL THEN 1 ELSE 0 END) AS response
FROM qualifying_topics AS tt
GROUP BY
year_month
ORDER BY
year_month
),
posts_year_month_rate AS (
SELECT
to_char(date(rp.created_at),'YYYY-MM') as year_month,
COUNT(*) as posts
FROM restricted_posts as rp
GROUP BY
year_month
ORDER BY
year_month
),
raw_year_month_rate AS (
SELECT
year_month,
SUM(posts) AS posts,
SUM(new_topic_posts) AS new_topic_posts,
(SUM(posts) - SUM(new_topic_posts)) AS old_topic_posts,
SUM(topics) AS topics,
SUM(no_response) AS no_response,
SUM(response) AS response
FROM (
SELECT year_month, topics, 0 AS posts, new_topic_posts, 0 AS old_topic_posts, no_response, response FROM topic_year_month_rate
UNION ALL
SELECT year_month, 0 AS topics, posts, 0 AS new_topic_posts, 0 AS old_topic_posts, 0 AS no_response, 0 AS response FROM posts_year_month_rate
) AS combined
GROUP BY year_month
ORDER BY year_month
),
year_month_rate AS (
SELECT
rymr.*,
(CASE
WHEN topics <= 0 THEN 0
ELSE (
ROUND( (:pct_multiplier + 0.0) * (
(response * 1.0) / (topics * 1.0)
), :pct_places)
) END
) AS response_rate
FROM
raw_year_month_rate rymr
)
SELECT
*
FROM
year_months AS d
LEFT JOIN (
SELECT * FROM year_month_rate
) AS t USING (year_month)
ORDER BY
year_month
If you don’t want to use any exclusions just put a comma (,
) in that input box.
So it should look a little like this:
So the columns are:
posts
new_topic_posts
old_topic_posts
topics
no_response
staff
member might have replied, but if you excluded staff it would be counted as no_response
if that topic wasn’t created itself by an excluded user.response
response_rate
So for #2 and #3:
Here is a 2nd version based on group_users
table, I added “Trust Level 3” as well:
Data Explorer: count-new-tl1-or-tl2-users-past-12-months-v2.dcquery.json (1.3 KB)
WITH
year_months AS (
SELECT
to_char(date(day),'YYYY-MM') as year_month
FROM
generate_series(
(date_trunc('month', CURRENT_DATE) - INTERVAL '1 year' ),
CURRENT_DATE,
interval '1 month'
) AS day
),
qualifying_users AS (
SELECT
gu.user_id,
to_char(date(gu.created_at),'YYYY-MM') as year_month,
SUM(CASE WHEN g.name = 'trust_level_1' THEN 1 ELSE 0 END) AS tl1,
SUM(CASE WHEN g.name = 'trust_level_2' THEN 1 ELSE 0 END) AS tl2,
SUM(CASE WHEN g.name = 'trust_level_3' THEN 1 ELSE 0 END) AS tl3
FROM group_users AS gu
JOIN groups AS g ON g.id = gu.group_id
WHERE
gu.created_at > ( date_trunc('month', CURRENT_DATE) - INTERVAL '1 year' )
AND ( g.name = 'trust_level_1' OR g.name = 'trust_level_2' OR g.name = 'trust_level_3' )
GROUP BY
gu.user_id,
year_month
),
year_month_rate AS (
SELECT
year_month,
SUM(tl1) AS tl1,
SUM(tl2) AS tl2,
SUM(tl3) AS tl3
FROM qualifying_users q
GROUP BY
year_month
)
SELECT
*
FROM
year_months AS d
LEFT JOIN (
SELECT * FROM year_month_rate
) AS t USING (year_month)
ORDER BY
year_month
It should be noted that for the Topic Participation query:
The query could be improved greatly, it currently checks for a reply within that calendar month for the topic.
So if someone creates a topic on April 28th there are only a couple of days left in that month to get a reply.
But if if someone creates a topic on April 1st they have the entire month to get a reply and have it counted.
Some kind of defined period for checking for replies would be good, perhaps 15 days.
EDIT
Topic Participation query has been updated.
Now includes reply_within_x_days
parameter to allow the topic to be counted if replies are within that period.
If you set reply_within_x_days
to 0
you can get figures on how many topics got replies the same day as the original topic was posted.
Hmm… so maybe my second idea of just tracking the number of posts every month to public community categories is more sensible/meaningful.
Thanks for pulling this together - I really appreciate it!
I believe as long as you don’t “use” last months figures until at least reply_within_x_days
after the last day of the month these are good numbers.
Actually I’m seeing 1
days (78.6%), 4
days (78.6%) and 15
days (79.5%) for one sample set of data - so you could pick a smaller number of days and measure that, Yes 1
day and 4
days are the same.
this is awesome - works like a charm!
This topic participation by month query is pretty neat, and will really help to give insight into activity over time by different groups.
I think what I need is actually something else, something more simple, for my monthly post activity indicator. I’d just like to track the number of posts per month total, excluding private categories or excluding categories that a representative member has access to.
Has anyone done this already?
Loving the way you share queries, btw, this is extremely handy. Maybe we should set up a wiki topic to gather shared queries and maintain them.
So the "topic_count
per month" is already in the “Topic Participation” query results.
These results already exclude categories by slug.
I have no idea what this means:
But it certainly sounds “more complicated” and not “more simple”.
Hi Dean! Sorry I’m not making any sense. To try to clarify:
I’ve updated the “Topic Participation” query:
system
to see everything.-1
) for responses or check for replies within X days.It would be nice if someone would sanity check the query please.
To check and:
I think it works - but only if I limit to a user who doesn’t have access to every single post… otherwise it times out.
So great to be able to get the total number of posts per month in community discussions. Many thanks.
Here, you deserve a beer. Let me pour you one.
Yeah - it’s not optimised, you might want to make a change to search a shorter period.
Thanks for sharing this one!
I also noticed the “within X days” parameter not only defines what counts as a response
for the response_rate
, but also determines what is new_topic_post
and what is an old_topic_post
.
Considering whether it’s useful to separate those two parameters… looks like it should be pretty easy to do if I want to.
Currently it’s “as designed”, but if you wish to separate that into a parameter, happy to make the post a wiki post so it can be updated to add that feature.
Somebody else reviewing it to sanity check things is always helpful.
I have to admit, this query is quite impressive but also quite overwhelming
Could you offer some guidance how would one go about adding one more column, mainly the ratio of topics marked as solved (with customizable period, by default within 30 days from being created) to all topics created in that month?
I have hard time modifying the query