Discourse AI 插件包含情感分析功能,可帮助您更深入地了解社区讨论中的情感基调。本文档讨论了两个详细的数据探索器查询示例,它们利用这些 AI 功能来提供对 Discourse 社区的洞察。
- 按类别和信任等级统计的 AI 情感总量:一种时间序列分析,跟踪特定类别和信任等级内每周的情感趋势
- AI 情感异常话题:识别在 Discourse 站点中引发强烈情感反应的讨论话题
先决条件
要使用这些报告,您需要:
- 安装并启用 Discourse AI 插件:您的实例上必须安装 Discourse AI 插件
- 启用情感分析:必须配置并激活 情感分析 模块
- Data Explorer 插件:运行这些 SQL 查询所必需
- 历史情感数据:需要有足够多的已进行情感分析的帖子以得出有意义的结果(可能需要执行回填操作)
AI 情感模型及其工作原理
在深入探讨报告之前,了解情感模型正在分析社区帖子中的哪些内容会有所帮助:
- 整体情感:cardiffnlp/twitter-roberta-base-sentiment-latest 模型将帖子分类为积极、消极或中性
- 情绪检测:SamLowe/roberta-base-go_emotions 模型识别帖子中的具体情绪,如喜悦、悲伤、愤怒等
这些模型分析每篇帖子的文本,并将分类结果存储在您的数据库中,随后 Data Explorer 插件即可对这些数据进行查询。
按类别和信任等级统计的 AI 情感总量报告
-- [params]
-- date :start_date = 2025-01-01
-- date :end_date = 2025-12-31
-- category_id :category_id = 6
-- int :min_trust_level = 0
-- boolean :exclude_staff = false
-- 创建一个临时结果集,按周聚合指定类别的情感指标
WITH sentiment_counts AS (
SELECT
c.id as category_id,
c.name as category_name,
-- 按周对帖子进行分组以进行时间序列分析
DATE_TRUNC('week', p.created_at) as week_starting,
EXTRACT(YEAR FROM p.created_at) as year,
EXTRACT(WEEK FROM p.created_at) as week_number,
-- 统计积极情感帖子数量(阈值 > 0.6)
COUNT(CASE WHEN (cr.classification::jsonb->'positive')::float > 0.6 THEN 1
ELSE NULL END) as positive_count,
-- 统计消极情感帖子数量(阈值 > 0.6)
COUNT(CASE WHEN (cr.classification::jsonb->'negative')::float > 0.6 THEN 1
ELSE NULL END) as negative_count,
-- 统计中性情感帖子数量(积极和消极均 <= 0.6)
COUNT(CASE WHEN (cr.classification::jsonb->'positive')::float <= 0.6
AND (cr.classification::jsonb->'negative')::float <= 0.6 THEN 1
ELSE NULL END) as neutral_count,
-- 已进行情感分析的帖子总数
COUNT(*) as total_classifications
FROM classification_results cr
-- 连接帖子数据以获取创建日期和元数据
JOIN posts p ON p.id = cr.target_id AND cr.target_type = 'Post'
-- 连接话题数据以按类别过滤
JOIN topics t ON t.id = p.topic_id
-- 连接用户数据以按信任等级过滤
JOIN users u ON u.id = p.user_id
-- 连接类别数据以获取类别名称
JOIN categories c ON c.id = t.category_id
WHERE
-- 仅包含来自此特定模型的情感结果
cr.model_used = 'cardiffnlp/twitter-roberta-base-sentiment-latest'
-- 仅包含常规话题(排除私信等)
AND t.archetype = 'regular'
-- 排除系统帖子
AND p.user_id > 0
-- 按选定类别过滤
AND c.id = :category_id
-- 按最低信任等级过滤
AND u.trust_level >= :min_trust_level
-- 如果参数被选中,则排除工作人员用户
AND (:exclude_staff = false OR (u.admin = false AND u.moderator = false))
-- 按日期范围过滤
AND p.created_at BETWEEN :start_date AND :end_date
-- 按周和类别分组
GROUP BY c.id, c.name, week_starting, year, week_number
)
-- 格式化最终结果以供显示
SELECT
category_id,
category_name,
-- 转换为日期格式以便更清晰地显示
week_starting::Date,
-- 格式化为 ISO 周表示法 (YYYY-WXX)
year || '-W' || LPAD(week_number::text, 2, '0') as year_week,
-- 计算净情感值(积极减去消极)
positive_count - negative_count as sentiment_balance,
positive_count,
negative_count,
neutral_count,
-- 计算积极帖子百分比(保留两位小数)
ROUND(
(positive_count::float / NULLIF(total_classifications, 0) * 100)::numeric,
2
) as positive_percentage
FROM sentiment_counts
-- 按时间顺序排序以显示情感趋势
ORDER BY week_starting ASC
此报告提供了特定类别内每周情感趋势的分析,展示内容如下:
- 每周的积极、消极和中性帖子数量
- 情感平衡计算(积极帖子数减去消极帖子数)
- 积极帖子占已分析帖子总数的百分比
- 按用户信任等级过滤,并可选择排除工作人员帖子
此报告对于以下方面非常有价值:
- 跟踪特定类别中社区情感随时间的变化趋势
- 识别可能与特定事件或变更相关的社区情绪波动
- 比较不同用户群体(按信任等级划分)之间的情感差异
- 衡量管理干预措施对社区整体情感的影响
参数
该查询接受多个参数以自定义您的分析:
- 日期范围:设置分析周期的开始和结束日期
- 类别:选择要分析的类别
- 最低信任等级:仅过滤来自达到或超过特定信任等级的用户的帖子
- 排除工作人员:可选择从分析中移除工作人员帖子(以便专注于普通社区成员)
结果
结果以表格形式呈现,每一行代表一周的数据:
- 类别信息:已分析类别的 ID 和名称
- 时间段:周开始日期和 ISO 周表示法 (YYYY-WXX)
- 情感指标:
- 情感平衡:积极与消极帖子之间的差值(正值表示整体情感为积极)
- 积极/消极/中性计数:各情感类别中的帖子数量
- 积极百分比:被分类为积极的帖子所占的百分比
示例结果
| category_name | week_starting | year_week | sentiment_balance | positive_count | negative_count | neutral_count | positive_percentage |
|---|---|---|---|---|---|---|---|
| Product Discussion | 2025-01-06 | 2025-W01 | -8 | 24 | 32 | 145 | 11.94 |
| Product Discussion | 2025-01-13 | 2025-W02 | -11 | 30 | 41 | 210 | 10.68 |
| Product Discussion | 2025-01-20 | 2025-W03 | -9 | 28 | 37 | 220 | 9.82 |
| Product Discussion | 2025-01-27 | 2025-W04 | -13 | 33 | 46 | 260 | 9.74 |
| Product Discussion | 2025-02-03 | 2025-W05 | -15 | 22 | 37 | 180 | 9.21 |
| Product Discussion | 2025-02-10 | 2025-W06 | -6 | 37 | 43 | 195 | 13.45 |
AI 情感异常话题报告
-- [params]
-- date :start_date = 2025-01-01
-- date :end_date = 2025-12-31
-- category_id :category_id = 6
-- int :min_trust_level = 1
-- int :emotion_threshold = 10
-- 首先,创建一个公共表表达式 (CTE),按话题聚合情绪反应
WITH topic_emotions AS (
SELECT
topics.id AS topic_id, -- 存储话题 ID 以便后续连接/过滤
topics.title, -- 包含话题标题以生成可读结果
topics.created_at::date AS topic_date, -- 存储话题创建日期
-- 对于每种情绪类型,统计该情绪置信度超过 0.1 的帖子数量
-- classification_results 表将情绪分数存储为 JSON 值
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'admiration')::float > 0.1) AS admiration_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'amusement')::float > 0.1) AS amusement_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'anger')::float > 0.1) AS anger_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'annoyance')::float > 0.1) AS annoyance_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'approval')::float > 0.1) AS approval_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'caring')::float > 0.1) AS caring_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'confusion')::float > 0.1) AS confusion_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'curiosity')::float > 0.1) AS curiosity_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'desire')::float > 0.1) AS desire_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'disappointment')::float > 0.1) AS disappointment_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'disapproval')::float > 0.1) AS disapproval_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'disgust')::float > 0.1) AS disgust_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'embarrassment')::float > 0.1) AS embarrassment_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'excitement')::float > 0.1) AS excitement_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'fear')::float > 0.1) AS fear_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'gratitude')::float > 0.1) AS gratitude_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'grief')::float > 0.1) AS grief_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'joy')::float > 0.1) AS joy_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'love')::float > 0.1) AS love_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'nervousness')::float > 0.1) AS nervousness_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'neutral')::float > 0.1) AS neutral_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'optimism')::float > 0.1) AS optimism_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'pride')::float > 0.1) AS pride_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'realization')::float > 0.1) AS realization_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'relief')::float > 0.1) AS relief_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'remorse')::float > 0.1) AS remorse_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'sadness')::float > 0.1) AS sadness_count,
COUNT(*) FILTER (WHERE (classification_results.classification::jsonb->'surprise')::float > 0.1) AS surprise_count,
-- 计算情绪反应总数以便排名
COUNT(*) AS total_emotional_reactions
FROM
classification_results
-- 连接 posts 表以获取帖子元数据并过滤已删除帖子
INNER JOIN
posts ON posts.id = classification_results.target_id AND
posts.deleted_at IS NULL -- 排除已删除帖子
-- 连接 topics 表以获取话题元数据并按话题类型/状态过滤
INNER JOIN
topics ON topics.id = posts.topic_id AND
topics.archetype = 'regular' AND -- 仅包含标准话题(排除私信或系统消息)
topics.deleted_at IS NULL -- 排除已删除话题
-- 连接 users 表以获取用户信任等级用于过滤
INNER JOIN
users ON users.id = posts.user_id
WHERE
-- 仅包含帖子的情绪分类(排除其他内容类型)
classification_results.target_type = 'Post' AND
-- 仅使用来自此特定情绪检测模型的结果
classification_results.model_used = 'SamLowe/roberta-base-go_emotions' AND
-- 使用参数化值按日期范围过滤
posts.created_at BETWEEN :start_date AND :end_date AND
-- 按指定类别过滤
(topics.category_id = :category_id) AND
-- 仅包含来自具有足够信任等级的用户的帖子
(users.trust_level >= :min_trust_level)
-- 按话题对所有计数进行分组
GROUP BY
topics.id, topics.title, topics.created_at::date
)
-- 主查询,用于格式化并过滤来自 CTE 的聚合数据
SELECT
topic_id, -- 显示话题 ID(在 Discourse 中将渲染为链接)
--title, -- 显示话题标题
topic_date, -- 显示话题创建日期
total_emotional_reactions, -- 显示检测到的情绪总数
-- 将显著情绪数组转换为格式化字符串
-- 仅包含超过阈值的情绪,其他情绪变为 NULL 并被省略
-- 每种情绪格式化为 "情绪名称 (计数)"
ARRAY_TO_STRING(ARRAY[
CASE WHEN admiration_count >= :emotion_threshold THEN 'Admiration(' || admiration_count || ')' ELSE NULL END,
CASE WHEN amusement_count >= :emotion_threshold THEN 'Amusement(' || amusement_count || ')' ELSE NULL END,
CASE WHEN anger_count >= :emotion_threshold THEN 'Anger(' || anger_count || ')' ELSE NULL END,
CASE WHEN annoyance_count >= :emotion_threshold THEN 'Annoyance(' || annoyance_count || ')' ELSE NULL END,
CASE WHEN approval_count >= :emotion_threshold THEN 'Approval(' || approval_count || ')' ELSE NULL END,
CASE WHEN caring_count >= :emotion_threshold THEN 'Caring(' || caring_count || ')' ELSE NULL END,
CASE WHEN confusion_count >= :emotion_threshold THEN 'Confusion(' || confusion_count || ')' ELSE NULL END,
CASE WHEN curiosity_count >= :emotion_threshold THEN 'Curiosity(' || curiosity_count || ')' ELSE NULL END,
CASE WHEN desire_count >= :emotion_threshold THEN 'Desire(' || desire_count || ')' ELSE NULL END,
CASE WHEN disappointment_count >= :emotion_threshold THEN 'Disappointment(' || disappointment_count || ')' ELSE NULL END,
CASE WHEN disapproval_count >= :emotion_threshold THEN 'Disapproval(' || disapproval_count || ')' ELSE NULL END,
CASE WHEN disgust_count >= :emotion_threshold THEN 'Disgust(' || disgust_count || ')' ELSE NULL END,
CASE WHEN embarrassment_count >= :emotion_threshold THEN 'Embarrassment(' || embarrassment_count || ')' ELSE NULL END,
CASE WHEN excitement_count >= :emotion_threshold THEN 'Excitement(' || excitement_count || ')' ELSE NULL END,
CASE WHEN fear_count >= :emotion_threshold THEN 'Fear(' || fear_count || ')' ELSE NULL END,
CASE WHEN gratitude_count >= :emotion_threshold THEN 'Gratitude(' || gratitude_count || ')' ELSE NULL END,
CASE WHEN grief_count >= :emotion_threshold THEN 'Grief(' || grief_count || ')' ELSE NULL END,
CASE WHEN joy_count >= :emotion_threshold THEN 'Joy(' || joy_count || ')' ELSE NULL END,
CASE WHEN love_count >= :emotion_threshold THEN 'Love(' || love_count || ')' ELSE NULL END,
CASE WHEN nervousness_count >= :emotion_threshold THEN 'Nervousness(' || nervousness_count || ')' ELSE NULL END,
CASE WHEN optimism_count >= :emotion_threshold THEN 'Optimism(' || optimism_count || ')' ELSE NULL END,
CASE WHEN pride_count >= :emotion_threshold THEN 'Pride(' || pride_count || ')' ELSE NULL END,
CASE WHEN realization_count >= :emotion_threshold THEN 'Realization(' || realization_count || ')' ELSE NULL END,
CASE WHEN relief_count >= :emotion_threshold THEN 'Relief(' || relief_count || ')' ELSE NULL END,
CASE WHEN remorse_count >= :emotion_threshold THEN 'Remorse(' || remorse_count || ')' ELSE NULL END,
CASE WHEN sadness_count >= :emotion_threshold THEN 'Sadness(' || sadness_count || ')' ELSE NULL END,
CASE WHEN surprise_count >= :emotion_threshold THEN 'Surprise(' || surprise_count || ')' ELSE NULL END
], ', ', '') AS significant_emotions -- 用逗号分隔符连接,如果不需要分隔符则使用空字符串
FROM
topic_emotions
WHERE
-- 仅包含至少有一种情绪超过阈值的话题
-- 这将识别出具有显著情感影响的话题
(
admiration_count >= :emotion_threshold OR
amusement_count >= :emotion_threshold OR
anger_count >= :emotion_threshold OR
annoyance_count >= :emotion_threshold OR
approval_count >= :emotion_threshold OR
caring_count >= :emotion_threshold OR
confusion_count >= :emotion_threshold OR
curiosity_count >= :emotion_threshold OR
desire_count >= :emotion_threshold OR
disappointment_count >= :emotion_threshold OR
disapproval_count >= :emotion_threshold OR
disgust_count >= :emotion_threshold OR
embarrassment_count >= :emotion_threshold OR
excitement_count >= :emotion_threshold OR
fear_count >= :emotion_threshold OR
gratitude_count >= :emotion_threshold OR
grief_count >= :emotion_threshold OR
joy_count >= :emotion_threshold OR
love_count >= :emotion_threshold OR
nervousness_count >= :emotion_threshold OR
optimism_count >= :emotion_threshold OR
pride_count >= :emotion_threshold OR
realization_count >= :emotion_threshold OR
relief_count >= :emotion_threshold OR
remorse_count >= :emotion_threshold OR
sadness_count >= :emotion_threshold OR
surprise_count >= :emotion_threshold
)
-- 按情绪反应总数降序排序结果
ORDER BY
total_emotional_reactions DESC
此报告识别出在您的社区中引发强烈情感反应的话题,依据包括:
- 话题内帖子中检测到的每种情绪类型的计数
- 可配置的阈值,用于确定何种程度的情感反应被视为“显著”
- 按类别、日期范围和用户信任等级进行过滤
此报告可帮助您:
- 识别可能产生强烈负面情绪的潜在问题讨论
- 发现能够引起社区情感共鸣的高参与度内容
- 检测可能需要管理关注以防止升级的话题
- 发现触发特定情感反应的内容主题
- 更好地了解驱动社区情感参与的因素
参数
该查询接受多个参数:
- 日期范围:设置分析周期的开始和结束日期
- 类别:选择要分析的类别
- 最低信任等级:仅过滤来自达到或超过特定信任等级的用户的帖子
- 情绪阈值:设置某种情绪需要出现多少次才被视为显著
结果
结果显示如下:
- 话题 ID:直接链接到该话题(在 Data Explorer 中可点击)
- 话题日期:话题创建时间
- 总情绪反应:检测到的情绪反应总数
- 显著情绪:格式化的情绪列表,显示超过您阈值的情绪及其计数(括号内)
检测到的情绪范围广泛,包括:钦佩、娱乐、愤怒、烦恼、认可、关怀、困惑、好奇、渴望、失望、不认可、厌恶、尴尬、兴奋、恐惧、感激、悲痛、喜悦、爱、紧张、中性、乐观、自豪、领悟、宽慰、懊悔、悲伤和惊讶。
示例结果
| topic | topic_date | total_emotional_reactions | significant_emotions |
|---|---|---|---|
| Feature Request: Increased API Rate Limits | 2025-03-06 | 42 | Approval(15), Confusion(9), Curiosity(7), Gratitude(8) |
| Authentication Error with Third-Party Integration | 2025-01-07 | 33 | Curiosity(6), Gratitude(5), Disapproval(8), Frustration(9) |
| Best Practices for Configuration Settings | 2025-02-16 | 31 | Curiosity(9), Excitement(6), Gratitude(5), Optimism(5) |
| Troubleshooting Database Connection Issues | 2025-01-15 | 29 | Curiosity(7), Confusion(8), Disappointment(6), Frustration(5) |
| Critical Bug in Latest Beta Release | 2025-02-02 | 26 | Confusion(7), Concern(6), Disapproval(5), Urgency(6) |
在社区管理中的实际应用
这些报告可以通过多种方式增强您的社区管理工作流程:
- 早期干预:识别可能需要管理介入的情感化话题,以防其演变成问题
- 内容规划:利用关于什么能引发积极情绪的洞察来指导内容策略
- 衡量影响:评估政策变更、新功能或事件对社区情感的影响
- 针对性互动:将工作人员的关注集中在那些可能受益于官方回应的、具有强烈情感反应的话题上