Dashboard Report - Overall Sentiment

This is an SQL version of the Dashboard Report for Overall Sentiment.

:discourse: This report requires the Discourse AI plugin and Sentiment Analysis to be enabled.

This dashboard report compares the number of posts classified either positive or negative with the “Sentiment” AI, over a specified period.

By leveraging sentiment analysis, the report categorizes posts as either positive or negative based on a defined threshold, which can be adjusted. This categorization is then used to calculate the overall sentiment for each day within the given date range. This report is particularly useful for understanding the general mood or sentiment of the community discussions during specific times.

-- [params]
-- date :start_date = 2024-01-13
-- date :end_date = 2024-02-14
-- int :threshold = 60

SELECT
    DATE_TRUNC('day', p.created_at)::DATE AS date,
    COUNT(CASE WHEN (classification->>'positive')::integer > :threshold THEN 1 ELSE NULL END) AS positive_sentiment_posts,
    COUNT(CASE WHEN (classification->>'negative')::integer > :threshold THEN 1 ELSE NULL END) AS negative_sentiment_posts,
    COUNT(CASE WHEN (classification->>'positive')::integer > :threshold THEN 1 ELSE NULL END) - COUNT(CASE WHEN (classification->>'negative')::integer > :threshold THEN 1 ELSE NULL END) AS overall_sentiment
FROM
    classification_results AS cr
    INNER JOIN posts p ON p.id = cr.target_id AND cr.target_type = 'Post'
    INNER JOIN topics t ON t.id = p.topic_id
    INNER JOIN categories c ON c.id = t.category_id
WHERE
    t.archetype = 'regular' AND
    p.user_id > 0 AND
    cr.model_used = 'sentiment' AND
    (p.created_at > :start_date AND p.created_at < :end_date)
GROUP BY
    DATE_TRUNC('day', p.created_at)

SQL Query Explanation

The SQL query operates by performing the following steps:

  • Parameter Definition: It starts by defining three parameters:
    • :start_date and :end_date to specify the date range for the analysis.
    • :threshold to set the minimum score for classifying sentiments as positive or negative. The default value for the :threshold is set to 60 to match the dashboard report.
  • Data Selection and Joining:
    • The query selects data from the classification_results table, which stores the results of sentiment analysis on posts.
    • It joins the classification_results with the posts table to filter posts based on their creation date and ensures the analysis is only for posts (cr.target_type = 'Post').
    • Further joins with the topics and categories tables ensure the analysis is limited to regular posts in specific categories.
  • Sentiment Classification:
    • For each post, it checks the sentiment score from the classification_results table. If the score for positive sentiment is greater than the threshold, it counts the post as positive. Similarly, it counts a post as negative if its negative sentiment score exceeds the threshold.
  • Aggregation:
    • The query then groups the results by day, based on the created_at timestamp of each post. For each day, it calculates:
      • The total number of positive posts.
      • The total number of negative posts.
      • The overall sentiment, which is the difference between the total number of positive and negative posts.

Example Results

date positive_sentiment_posts negative_sentiment_posts overall_sentiment
2024-01-13 10 21 -11
2024-01-14 11 20 -9
2024-01-15 23 7 16
2024-01-16 27 10 17
2024-01-17 47 22 25
5 Likes

9 posts were split to a new topic: Problems with Sentiment Backfill