Calculate total participation moments by month using Data Explorer

Hey there Data Explorer wizards!

Has anyone ever tried creating a query that will give the count for total participation moments by moth using Data Explorer queries?

I mean total number of all posts, solutions, likes by month.

Thanks in advance!

2 Likes

Look if you like this.
https://github.com/SidVal/discourse-data-explorer/blob/queries/querys.md

1 Like

I don’t see the exact one I’m looking for

I’m not a Data Explorer expert, but I like writing queries when I’ve got time. Before I get too carried away with this, I’m assuming you’re looking for a month by month breakdown of the stats. If so, something like this might work:

--[params]
-- date :start_date

WITH month_starts AS (
SELECT generate_series(date_trunc('month', :start_date::date), CURRENT_DATE, interval '1 month')::date AS month_start
),
monthly_posts AS (
SELECT
month_start,
COUNT(1) AS posts_count
FROM posts p
JOIN month_starts
ON p.created_at::date >= month_start AND p.created_at::date <= month_start + interval '1 month - 1 day'
WHERE p.deleted_at IS NULL
AND p.post_type = 1
AND p.created_at >= :start_date
GROUP BY month_start
),
monthly_total_users AS (
SELECT
month_start,
COUNT(1) AS total_users_count
FROM users u
JOIN month_starts
ON u.created_at::date <= month_start + interval '1 month - 1 day'
WHERE u.id > 0
GROUP BY month_start
),
monthly_active_users AS (
SELECT
month_start,
COUNT(DISTINCT user_id) AS active_users_count
fROM user_visits uv
JOIN month_starts
ON uv.visited_at >= month_start AND uv.visited_at <= month_start + interval '1 month - 1 day'
WHERE uv.visited_at >= :start_date
GROUP BY month_start
),
monthly_solutions AS (
SELECT
month_start,
COUNT(1) AS solutions_count
FROM user_actions ua
JOIN month_starts ms
ON ua.created_at::date >= month_start AND ua.created_at::date <= month_start + interval '1 month - 1 day'
WHERE ua.action_type = 15
AND ua.created_at >= :start_date
GROUP BY month_start
),
monthly_likes AS (
SELECT
month_start,
COUNT(1) AS likes_count
FROM user_actions ua
JOIN month_starts ms
ON ua.created_at::date >= month_start AND ua.created_at::date <= month_start + interval '1 month - 1 day'
WHERE ua.action_type = 2
AND ua.created_at >= :start_date
GROUP BY month_start
)

SELECT
ms.month_start,
COALESCE(posts_count, 0) AS posts_count,
COALESCE(total_users_count, 0) AS total_users_count,
COALESCE(active_users_count, 0) AS active_users_count,
COALESCE(solutions_count, 0) AS solutions_count,
COALESCE(likes_count, 0) AS likes_count
FROM month_starts ms
LEFT JOIN monthly_posts mp ON mp.month_start = ms.month_start
LEFT JOIN monthly_total_users mtu ON mtu.month_start = ms.month_start
LEFT JOIN monthly_active_users mau ON mau.month_start = ms.month_start
LEFT JOIN monthly_solutions mts ON mts.month_start = ms.month_start
LEFT JOIN monthly_likes ml ON ml.month_start = ms.month_start
ORDER BY month_start DESC

Before running the query, you need to supply a value for the start_date parameter. It should be in the form yyyy-mm--dd. The query is only extracting the month part from that date though. The reason for needing a start date parameter is because the query will time out on a large site if it’s run against the entire time period that the site has been online. Using the start date parameter, I’m able to run it on Meta for a period of a couple of years without it timing out.

One thing to note with the query is that the active_users_count column is returning the number of unique users who logged into the site during a month - it’s not checking if these users performed any actions on the site (for example, liked a post, or created a post.) Possibly the query could do that, but I’m concerned about timeout issues.

Let me know if this isn’t the type of data that you’re looking for. If the general idea is correct, let me know if you need additional data added to the query, or if you notice anything that seems wrong in its results.

5 Likes

Thanks @simon for sharing that that is really helpful and it’s mostly what I’m looking for! Can I ask you if it’s possible to make one change to the query? I don’t like to specify the start date. I would like to sum up all those values (posts, users, solutions, likes) and have the query result like that, by month:

This might be possible. I’ll give it a try. The start date parameter was added to the query at the last minute when I found the query would time out if I ran it on Meta for the full time that the site has been online. There might be ways to improve the query’s efficiency to get rid of the timeout issue. If not, the query should allow a time frame to be set instead of just asking for a start date. That way you’d be able to pull in all data for a site by running the query a few times with different time frames.

3 Likes

Sure! Waiting then for you once you have time to make that adjustment

Hi Konrad,

Below is the adjusted query.


WITH monthly_users AS (
    SELECT 
        date_part('year', created_at) AS year, 
        date_part('month', created_at) AS month,
        COUNT(*) AS "new_users_month"
    FROM users
    WHERE id > 0
    GROUP BY date_part('year', created_at), date_part('month', created_at)
    ORDER BY date_part('year', created_at) ASC, date_part('month', created_at)
),

monthly_posts AS (
	SELECT
        date_part('year', created_at) AS year, 
        date_part('month', created_at) AS month,
		COUNT(*) AS "posts_count"
	FROM posts p
	WHERE p.deleted_at IS NULL
		AND p.post_type = 1
	GROUP BY date_part('year', created_at), date_part('month', created_at)
	ORDER BY date_part('year', created_at) ASC, date_part('month', created_at)
),

monthly_active_users AS (
	SELECT
        date_part('year', visited_at) AS year, 
        date_part('month', visited_at) AS month,
		COUNT(DISTINCT user_id) AS "active_users_count"
	FROM user_visits uv
	GROUP BY date_part('year', visited_at), date_part('month', visited_at)
	ORDER BY date_part('year', visited_at) ASC, date_part('month', visited_at)
),

monthly_solutions AS (
	SELECT
        date_part('year', created_at) AS year, 
        date_part('month', created_at) AS month,
		COUNT(*) AS "solutions_count"
	FROM user_actions ua
	WHERE ua.action_type = 15
	GROUP BY date_part('year', created_at), date_part('month', created_at)
	ORDER BY date_part('year', created_at) ASC, date_part('month', created_at)
),

monthly_likes AS (
	SELECT
        date_part('year', created_at) AS year, 
        date_part('month', created_at) AS month,
		COUNT(*) AS "likes_count"
	FROM user_actions ua
	WHERE ua.action_type = 2
	GROUP BY date_part('year', created_at), date_part('month', created_at)
	ORDER BY date_part('year', created_at) ASC, date_part('month', created_at)
)

SELECT
    mu.year,
    mu.month,
    SUM(new_users_month) over (ORDER BY mu.year, mu.month rows between unbounded preceding AND current row) AS total_users,
    posts_count,
    COALESCE(active_users_count, 0) AS active_users_count,
    COALESCE(solutions_count, 0) AS solutions_count,
    COALESCE(likes_count, 0) AS solutions_count
FROM monthly_users mu
LEFT JOIN monthly_posts mp ON mp.year = mu.year AND mp.month = mu.month
LEFT JOIN monthly_active_users mau ON mau.year = mu.year AND mau.month = mu.month
LEFT JOIN monthly_solutions ms ON ms.year = mu.year AND ms.month = mu.month
LEFT JOIN monthly_likes ml ON ml.year = mu.year AND ml.month = mu.month
ORDER BY mu.year, mu.month 

If you need to add all the columns, to be exactly the same as the image, this is the query:

SQL details
WITH monthly_users AS (
    SELECT 
        date_part('year', created_at) AS year, 
        date_part('month', created_at) AS month,
        COUNT(*) AS new_users_month
    FROM users
    WHERE id > 0
    GROUP BY date_part('year', created_at), date_part('month', created_at)
    ORDER BY date_part('year', created_at) ASC, date_part('month', created_at)
),

monthly_posts AS (
	SELECT
        date_part('year', created_at) AS year, 
        date_part('month', created_at) AS month,
		COUNT(*) AS posts_count
	FROM posts p
	WHERE p.deleted_at IS NULL
		AND p.post_type = 1
	GROUP BY date_part('year', created_at), date_part('month', created_at)
	ORDER BY date_part('year', created_at) ASC, date_part('month', created_at)
),

monthly_active_users AS (
	SELECT
        date_part('year', visited_at) AS year, 
        date_part('month', visited_at) AS month,
		COUNT(DISTINCT user_id) AS active_users_count
	FROM user_visits uv
	GROUP BY date_part('year', visited_at), date_part('month', visited_at)
	ORDER BY date_part('year', visited_at) ASC, date_part('month', visited_at)
),

monthly_solutions AS (
	SELECT
        date_part('year', created_at) AS year, 
        date_part('month', created_at) AS month,
		COUNT(*) AS solutions_count
	FROM user_actions ua
	WHERE ua.action_type = 15
	GROUP BY date_part('year', created_at), date_part('month', created_at)
	ORDER BY date_part('year', created_at) ASC, date_part('month', created_at)
),

monthly_likes AS (
	SELECT
        date_part('year', created_at) AS year, 
        date_part('month', created_at) AS month,
		COUNT(*) AS likes_count
	FROM user_actions ua
	WHERE ua.action_type = 2
	GROUP BY date_part('year', created_at), date_part('month', created_at)
	ORDER BY date_part('year', created_at) ASC, date_part('month', created_at)
)

SELECT
    mu.year,
    mu.month,   
    SUM(new_users_month + COALESCE(posts_count,0) + 
        COALESCE(active_users_count, 0) + 
        COALESCE(solutions_count, 0) + 
        COALESCE(likes_count, 0)) 
        over (ORDER BY mu.year, mu.month rows between unbounded preceding AND current row) AS sum_total
FROM monthly_users mu
LEFT JOIN monthly_posts mp ON mp.year = mu.year AND mp.month = mu.month
LEFT JOIN monthly_active_users mau ON mau.year = mu.year AND mau.month = mu.month
LEFT JOIN monthly_solutions ms ON ms.year = mu.year AND ms.month = mu.month
LEFT JOIN monthly_likes ml ON ml.year = mu.year AND ml.month = mu.month
ORDER BY mu.year, mu.month 
6 Likes

That’s exactly what I was looking for! Thanks again @michebs! Have you folks thought about creating some sort of open-source collection of Data Explorer queries and sharing it with Discourse users?

1 Like

Yes, that’s the built in list of queries that are included with Data Explorer :wink:

2 Likes

yep I know but is it something that you expand? for example based on the questions here from the forum

We can accept PRs to add queries, here’s an example:

https://github.com/discourse/discourse-data-explorer/pull/43

Besides that, What cool data explorer queries have you come up with? and @SidV’s list https://github.com/SidVal/discourse-data-explorer/blob/queries/querys.md have a ton of useful queries too!

4 Likes

Great! Thanks for sharing that!

1 Like

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.