Understanding what our users are posting about

I would like to create a sql based report that can provide an understanding as to what type of thing our users are posting about.

Categories, topics, tags - these are the fields I have ascertained that will provide this to me, and postcount would be the metric (I think, essentially wanting to use this to understand popularity)

Are there any other fields that would support this? Is there an existing query that I can use as I would assume others have requested this?

2 Likes

not a SQL query, but presume you’ve noticed that:

/categories gives you a number of new Topics/month for each Category, which can be quite useful.

/tags gives you a count of Topics by tag

Just in case you weren’t aware (but might have been!)

1 Like

I’m not sure what you are referring to here.

Are these the reports on the admin part of a community?

I have seen this but want to essentially have them all in one - so you can see how many topics or tags a category has etc

they are URL paths.

I’m not sure I can quite envisage what you have in mind for including categories, tags, and topics, with a post count for each, in the same report/query.

I think you could likely do it in two - one for categories and one for tags, with a count of new topics and new posts within a certain timeframe for each one. Maybe even include the number of users posting?

So it would give something like this as the result table:

category new topics new posts users
category a 9 15 4
category b 56 167 32
2 Likes

Ok, so I’ve input Categories - Discourse Meta as I assume that’s what you meant.

Essentially I want an output of categories/tags/topics in one output for these reasons:

  • Identify duplicates - a topic as far as I understand is the word used for the start of a conversation/thread. A user could have added a similar topic to 2 different categories, be good to understand this to understand user behaviour/potentially tweak the categories to make them clearer
  • Understand what kinds of topics we get per category - a category could be something like “cars” but naturally that could house a lot of different types of topics, wanting to see what users are really talking about
  • Tags - in the instance we have I can see that tags have been used as some kind of cross over multiple different categories, therefore it would be good to understand the posts from this lens also
    Essentially from what I can gather, I need to join categories, tags and topics together using sql and possibly posts and was wondering if this had already been done to view the code/is there some kind of code/query library?
category topic tag posts users
cars love cars wheels 44 1
cars hate cars wheels 32 3
cars hate cars doors 39 4
cars how do cars work? doors 32 1
how do things work how do cars work? how tos 32 3

This is an example of the vision. This would allow for further analysis to understand things such as “what % of users that post in cars category talk about wheels”

I undertand it would be in data explorer combining posts, topics, tags, categories, possibly another table - I just posted in here to understand if this had already been done/where sql queries exist that have been created previously (not on our account, I mean in general)

Hoping this makes sense, only started investigating yesterday

Aah, I see now. You want a topic list with extra details rather than a summary overview. I think that should be possible. :+1:

I think the only quirk would be that a topic can have multiple tags, but let’s whip something up and see where we land. :slight_smile:

But for browsing other existing queries, there are the ones available as standard from your dashboard (you can see the SQL versions here grouped under dashboard-sql), as well as some stock reports bundled with the data explorer, and there’s a whole host of custom ones here on meta grouped under the sql-query tag.


@SStrong - Perhaps something along these lines:

-- [params]
-- date :start_date = 2024-01-01
-- date :end_date = 2024-12-31


WITH tag_names AS (
   
    SELECT 
        t.id AS topic_id,
        string_agg(tags.name, ', ' ORDER BY tags.name) AS "tags"
    FROM topics t
      JOIN topic_tags tt ON tt.topic_id = t.id
      JOIN tags ON tags.id = tt.tag_id
    WHERE t.created_at BETWEEN :start_date AND :end_date
    GROUP BY t.id
    ),
    
user_count AS (

    SELECT
        p.topic_id,
        COUNT(DISTINCT p.user_id) AS users
    FROM posts p
      JOIN topics t ON t.id = p.topic_id
    WHERE t.created_at BETWEEN :start_date AND :end_date
      AND t.deleted_at IS NULL
      AND p.deleted_at IS NULL
      AND t.archetype = 'regular'
      AND p.post_type = 1
      AND p.user_id > 0
    GROUP BY p.topic_id
        
)

SELECT 
    t.category_id,
    t.id AS topic_id,
    tn.tags,
    t.posts_count,
    uc.users
FROM topics t
  JOIN tag_names tn ON tn.topic_id = t.id
  JOIN user_count uc ON uc.topic_id = t.id
WHERE t.created_at BETWEEN :start_date AND :end_date
  AND t.archetype = 'regular'
  AND t.deleted_at IS NULL
ORDER BY t.category_id, t.title
2 Likes

Apologises, thought I had already responded to this.

Is it possible to amend the code to hold the category name and the topic name please? I don’t know if “post name” is a thing or if that is actually topic name?

I attempted to amend the code myself but I don’t understand the way the tables behave together enough currently as it came out with 0 results instead of just the same volume of records but with words not IDs.

That’s not a problem. :slight_smile: When viewed in the data explorer the category_id and topic_id (and lots of others) automatically get converted into usable onsite links, but if exporting to analyse somewhere else you can use the category names and topic titles instead:

-- [params]
-- date :start_date = 2024-01-01
-- date :end_date = 2024-12-31


WITH tag_names AS (
   
    SELECT 
        t.id AS topic_id,
        string_agg(tags.name, ', ' ORDER BY tags.name) AS "tags"
    FROM topics t
      JOIN topic_tags tt ON tt.topic_id = t.id
      JOIN tags ON tags.id = tt.tag_id
    WHERE t.created_at BETWEEN :start_date AND :end_date
    GROUP BY t.id
    ),
    
user_count AS (

    SELECT
        p.topic_id,
        COUNT(DISTINCT p.user_id) AS users
    FROM posts p
      JOIN topics t ON t.id = p.topic_id
    WHERE t.created_at BETWEEN :start_date AND :end_date
      AND t.deleted_at IS NULL
      AND p.deleted_at IS NULL
      AND t.archetype = 'regular'
      AND p.post_type = 1
      AND p.user_id > 0
    GROUP BY p.topic_id
        
)

SELECT 
    c.name AS category_name,
    t.title,
    tn.tags,
    t.posts_count,
    uc.users
FROM topics t
  JOIN tag_names tn ON tn.topic_id = t.id
  JOIN user_count uc ON uc.topic_id = t.id
  JOIN categories c ON c.id = t.category_id
WHERE t.created_at BETWEEN :start_date AND :end_date
  AND t.archetype = 'regular'
  AND t.deleted_at IS NULL
ORDER BY t.category_id, t.title
1 Like