Solved and Unsolved Topic Stats with Date and Tag Parameters

This Data Explorer report provides a comprehensive analysis of solved and unsolved topics on a site, within a specified date range, and optionally filtered by a specific tag.

:discourse: This report requires the Discourse Solved plugin to be enabled.

This report is particularly useful for administrators and moderators looking to understand community responsiveness and to identify areas for improvement in user support and engagement.

Solved and Unsolved Topic Stats with Date and Tag Parameters

--[params]
-- date :start_date = 2022-01-01
-- date :end_date = 2024-01-01
-- text :tag_name = all

WITH valid_topics AS (
    SELECT 
        t.id,
        t.user_id,
        t.title,
        t.views,
        (SELECT COUNT(*) FROM posts WHERE topic_id = t.id AND deleted_at IS NULL AND post_type = 1) - 1 AS "posts_count", 
        t.created_at,
        (CURRENT_DATE::date - t.created_at::date) AS "total_days",
        STRING_AGG(tags.name, ', ') AS tag_names, -- Aggregate tags for each topic
        c.name AS category_name
    FROM topics t
    LEFT JOIN topic_tags tt ON tt.topic_id = t.id
    LEFT JOIN tags ON tags.id = tt.tag_id
    LEFT JOIN categories c ON c.id = t.category_id
    WHERE t.deleted_at IS NULL
        AND t.created_at::date BETWEEN :start_date AND :end_date
        AND t.archetype = 'regular'
    GROUP BY t.id, c.name
),

solved_topics AS (
    SELECT 
        vt.id,
        tcf.created_at
    FROM topic_custom_fields tcf
    INNER JOIN valid_topics vt ON vt.id = tcf.topic_id
    WHERE tcf.name = 'accepted_answer_post_id'
),

last_reply AS (
    SELECT p.topic_id, p.user_id FROM posts p
    INNER JOIN (SELECT topic_id, MAX(id) AS post FROM posts p
                WHERE deleted_at IS NULL
                AND post_type = 1
                GROUP BY topic_id) x ON x.post = p.id
),

first_reply AS (
    SELECT p.topic_id, p.user_id, p.created_at FROM posts p
    INNER JOIN (SELECT topic_id, MIN(id) AS post FROM posts p
                WHERE deleted_at IS NULL
                AND post_type = 1
                AND post_number > 1
                GROUP BY topic_id) x ON x.post = p.id
)

SELECT
    CASE 
        WHEN st.id IS NOT NULL THEN 'solved'
        ELSE 'unsolved'
    END AS status,
    vt.tag_names, 
    vt.category_name,
    vt.id AS topic_id,
    vt.user_id AS topic_user_id,
    ue.email,
    vt.title,
    vt.views,
    lr.user_id AS last_reply_user_id,
    ue2.email AS last_reply_user_email,
    vt.created_at::date AS topic_create,
    COALESCE(TO_CHAR(fr.created_at, 'YYYY-MM-DD'), '') AS first_reply_create,
    COALESCE(TO_CHAR(st.created_at, 'YYYY-MM-DD'), '') AS solution_create,
    COALESCE(fr.created_at::date - vt.created_at::date, 0) AS "time_first_reply(days)",
    COALESCE(CEIL(EXTRACT(EPOCH FROM (fr.created_at - vt.created_at)) / 3600.00), 0) AS "time_first_reply(hours)",
    COALESCE(st.created_at::date - vt.created_at::date, 0) AS "time_solution(days)",
    COALESCE(CEIL(EXTRACT(EPOCH FROM (st.created_at - vt.created_at)) / 3600.00), 0) AS "time_solution(hours)",
    vt.created_at::date,
    vt.posts_count AS number_of_replies,
    vt.total_days AS total_days_without_solution
FROM valid_topics vt
LEFT JOIN last_reply lr ON lr.topic_id = vt.id
LEFT JOIN first_reply fr ON fr.topic_id = vt.id
LEFT JOIN solved_topics st ON st.id = vt.id
INNER JOIN user_emails ue ON vt.user_id = ue.user_id AND ue."primary" = true
LEFT JOIN user_emails ue2 ON lr.user_id = ue2.user_id AND ue2."primary" = true
WHERE (:tag_name = 'all' OR vt.tag_names ILIKE '%' || :tag_name || '%')
GROUP BY st.id, vt.tag_names, vt.category_name, vt.id, vt.user_id, ue.email, vt.title, vt.views, lr.user_id, ue2.email, vt.created_at, fr.created_at, st.created_at, vt.posts_count, vt.total_days
ORDER BY topic_create, vt.total_days DESC

SQL Query Explanation

The report is generated through a complex SQL query that utilizes Common Table Expressions (CTEs) to organize and process the data efficiently. The query is structured as follows:

  • valid_topics: This CTE filters topics by the specified date range and archetype (‘regular’), excluding deleted topics. It also aggregates tags associated with each topic for later filtering by tag name if specified.
  • solved_topics: Identifies topics that have been marked as solved by looking for a custom field named ‘accepted_answer_post_id’. This indicates that a solution has been provided and accepted for the topic.
  • last_reply: Determines the user who made the last reply on each topic by finding the maximum post ID (indicating the most recent post) that is not deleted and is of post type 1 (indicating a regular post).
  • first_reply: Similar to last_reply, but identifies the first user to reply to the topic after the original post.

The main query then combines these CTEs to compile a detailed report on each topic, including whether it’s solved or unsolved, tag names, category name, topic and user IDs, emails, views, reply counts, and timings for the first reply and solution.

Parameters

  • start_date: The beginning of the date range for which to generate the report.
  • end_date: The end of the date range for which to generate the report.
  • tag_name: The specific tag to filter topics by. Use ‘all’ to include topics with any tag.

Results

The report provides the following information for each topic within the specified parameters:

  • status: Indicates whether the topic has been solved or remains unsolved.
  • tag_names: Shows the tags associated with the topic
  • category_name: Shows the category associated with the topic
  • topic_id: The unique identifier for the topic.
  • topic_user_id: The ID of the user who created the topic.
  • user_email: The email address of the topic creator.
  • title: The title of the topic.
  • views: The number of views the topic has received.
  • last_reply_user_id: The ID of the user who made the last reply on the topic.
  • last_reply_user_email: The email address of the user who made the last reply.
  • topic_create: The date the topic was created.
  • first_reply_create: The date of the first reply to the topic.
  • solution_create: The date a solution was marked for the topic (if applicable).
  • time_first_reply(days/hours): The time taken to receive the first reply, in days and hours.
  • time_solution(days/hours): The time taken to solve the topic, in days and hours.
  • created_at: The creation date of the topic.
  • number_of_replies: The total number of replies to the topic.
  • total_days_without_solution: The total number of days the topic has been active without a solution.

Example Results

status tag_names category_name topic_id topic_user_id user_email title views last_reply_user_id last_reply_user_email topic_create first_reply_create solution_create time_first_reply(days) time_first_reply(hours) time_solution(days) time_solution(hours) created_at number_of_replies total_days_without_solution
solved support, password category1 101 1 user1@example.com How to reset my password? 150 3 user3@example.com 2022-01-05 2022-01-06 2022-01-07 1 24 2 48 2022-01-05 5 2
unsolved support, account category2 102 2 user2@example.com Issue with account activation 75 4 user4@example.com 2022-02-10 2022-02-12 2 48 0 0 2022-02-10 3 412
solved support category3 103 5 user5@example.com Can’t upload profile picture 200 6 user6@example.com 2022-03-15 2022-03-16 2022-03-18 1 24 3 72 2022-03-15 8 3
unsolved NULL category4 104 7 user7@example.com Error when posting 50 8 user8@example.com 2022-04-20 0 0 0 0 2022-04-20 0 373