This is an SQL version of the Dashboard Report for Overall Sentiment.
This report requires the Discourse AI plugin and Sentiment Analysis to be enabled.
This dashboard report compares the number of posts classified either positive or negative with the “Sentiment” AI, over a specified period.
By leveraging sentiment analysis, the report categorizes posts as either positive or negative based on a defined threshold, which can be adjusted. This categorization is then used to calculate the overall sentiment for each day within the given date range. This report is particularly useful for understanding the general mood or sentiment of the community discussions during specific times.
-- [params]
-- date :start_date = 2024-01-13
-- date :end_date = 2024-02-14
-- double :threshold = 0.60
SELECT
DATE_TRUNC('day', p.created_at)::DATE AS date,
COUNT(CASE WHEN (classification->>'positive')::float > :threshold THEN 1 ELSE NULL END) AS positive_sentiment_posts,
COUNT(CASE WHEN (classification->>'negative')::float > :threshold THEN 1 ELSE NULL END) AS negative_sentiment_posts,
COUNT(CASE WHEN (classification->>'positive')::float > :threshold THEN 1 ELSE NULL END) - COUNT(CASE WHEN (classification->>'negative')::float > :threshold THEN 1 ELSE NULL END) AS overall_sentiment
FROM
classification_results AS cr
INNER JOIN posts p ON p.id = cr.target_id AND cr.target_type = 'Post'
INNER JOIN topics t ON t.id = p.topic_id
INNER JOIN categories c ON c.id = t.category_id
WHERE
t.archetype = 'regular' AND
p.user_id > 0 AND
cr.model_used = 'cardiffnlp/twitter-roberta-base-sentiment-latest' AND
(p.created_at > :start_date AND p.created_at < :end_date)
GROUP BY
DATE_TRUNC('day', p.created_at)
SQL Query Explanation
The SQL query operates by performing the following steps:
- Parameter Definition: It starts by defining three parameters:
:start_date
and:end_date
to specify the date range for the analysis.:threshold
to set the minimum score for classifying sentiments as positive or negative. The default value for the:threshold
is set to 60 to match the dashboard report.
- Data Selection and Joining:
- The query selects data from the
classification_results
table, which stores the results of sentiment analysis on posts. - It joins the
classification_results
with theposts
table to filter posts based on their creation date and ensures the analysis is only for posts (cr.target_type = 'Post'
). - Further joins with the
topics
andcategories
tables ensure the analysis is limited to regular posts in specific categories.
- The query selects data from the
- Sentiment Classification:
- For each post, it checks the sentiment score from the
classification_results
table. If the score for positive sentiment is greater than the threshold, it counts the post as positive. Similarly, it counts a post as negative if its negative sentiment score exceeds the threshold.
- For each post, it checks the sentiment score from the
- Aggregation:
- The query then groups the results by day, based on the
created_at
timestamp of each post. For each day, it calculates:- The total number of positive posts.
- The total number of negative posts.
- The overall sentiment, which is the difference between the total number of positive and negative posts.
- The query then groups the results by day, based on the
Example Results
date | positive_sentiment_posts | negative_sentiment_posts | overall_sentiment |
---|---|---|---|
2024-01-13 | 10 | 21 | -11 |
2024-01-14 | 11 | 20 | -9 |
2024-01-15 | 23 | 7 | 16 |
2024-01-16 | 27 | 10 | 17 |
2024-01-17 | 47 | 22 | 25 |