当用户需要超出其偏好设置页面上提供的基本用户数据导出的自定义或详细数据导出时,管理员可以使用Data Explorer插件执行用户数据导出。
此过程涉及编写和执行针对特定数据需求的 SQL 查询,同时确保符合站点策略和数据保护法规。
管理员可能还出于其他原因想要导出用户数据,例如:
- 用户参与度分析:
- 分析用户如何与站点互动,哪些帖子受欢迎,哪些主题获得最多的回复。
- 报告生成:
- 为利益相关者创建自定义报告,以提供社区指标和用户参与度的见解。
- 数据备份:
- 保留用户数据的备份以用于恢复目的或历史分析。
- 数据迁移:
- 迁移到另一个平台或创建数据仓库时导出用户数据。
- 内容审核:
- 在发生违规或争议时调查用户活动或内容。
用户数据导出查询的关键注意事项
在编写导出用户数据的查询时,重要的是要同时考虑导出的具体要求以及您的站点关于用户数据导出的策略。
以下是一些可能影响用户数据导出查询编写方式的因素和注意事项:
-
用户属性:
- 如果您需要特定的用户属性,例如用户名、电子邮件或注册日期,您的查询应从数据库中明确选择这些列。例如:
SELECT username, email, created_at AS registration_date FROM users WHERE id = :user_id -
活动数据:
- 要导出用户活动数据(如帖子、点赞或登录历史记录),需要与其他表(例如
posts、likes、user_actions)进行其他连接:
SELECT p.id AS post_id, p.topic_id, p.created_at AS post_date, u.username FROM posts p JOIN users u ON u.id = p.user_id WHERE u.id = :user_id - 要导出用户活动数据(如帖子、点赞或登录历史记录),需要与其他表(例如
-
日期范围:
- 如果您需要特定时间段的数据,请在查询中包含日期过滤器以将结果限制在相关期间内:
SELECT username, email, created_at AS registration_date FROM users WHERE created_at BETWEEN :start_date AND :end_date -
数据隐私:
- 确保导出的数据符合数据隐私法律和站点策略。除非绝对必要,否则在包含敏感数据时要小心。
-
匿名化:
- 在某些情况下,您可能需要匿名化用户数据以保护用户隐私。例如,您可以用唯一标识符替换姓名。
-
用户同意:
- 如果数据导出涉及共享个人或敏感用户信息,请确保已到位明确的同意机制。
示例查询
以下是一些常见数据导出需求的示例查询。这些查询的设计与从用户偏好设置页面获得的数据导出类似:
导出用户信息:
-- [params]
-- int :user_id
SELECT id, username, email, created_at AS registration_date
FROM users
WHERE id = :user_id
导出帖子
-- [params]
-- int :user_id
SELECT
id,
topic_id,
post_number,
created_at,
updated_at,
reply_to_post_number,
reply_count,
quote_count,
deleted_at,
like_count,
incoming_link_count,
bookmark_count,
score,
reads,
post_type,
sort_order,
last_editor_id,
hidden,
hidden_reason_id,
notify_moderators_count,
spam_count,
illegal_count,
inappropriate_count,
last_version_at,
user_deleted,
reply_to_user_id,
percent_rank,
notify_user_count,
like_score,
deleted_by_id,
word_count,
version,
cook_method,
wiki,
baked_at,
baked_version,
hidden_at,
self_edits,
reply_quoted,
via_email,
public_version,
locked_by_id,
image_upload_id,
qa_vote_count,
raw,
cooked,
edit_reason,
raw_email,
action_code,
outbound_message_id
FROM
posts
WHERE
user_id = :user_id
ORDER BY
created_at DESC
导出认证令牌日志:
-- [params]
-- int :user_id
SELECT
path,
action,
user_auth_token_id,
client_ip,
user_agent,
auth_token,
created_at
FROM
user_auth_token_logs
WHERE
user_id = :user_id
导出认证令牌:
-- [params]
-- int :user_id
SELECT
auth_token,
client_ip,
rotated_at,
created_at,
updated_at,
seen_at,
auth_token_seen,
user_agent,
prev_auth_token
FROM
user_auth_tokens
WHERE
user_id = :user_id
导出用户徽章:
-- [params]
-- int :user_id
SELECT
notification_id,
badge_id,
granted_at,
granted_by_id,
post_id,
seq,
featured_rank,
created_at,
is_favorite
FROM
user_badges
WHERE
user_id = :user_id
导出用户书签:
-- [params]
-- int :user_id
SELECT
name,
reminder_at,
created_at,
updated_at,
reminder_last_sent_at,
reminder_set_at,
auto_delete_preference,
pinned,
bookmarkable_id,
bookmarkable_type
FROM
bookmarks
WHERE
user_id = :user_id
导出类别偏好设置:
-- [params]
-- int :user_id
SELECT
category_id,
notification_level,
last_seen_at
FROM
category_users
WHERE
user_id = :user_id
导出标志:
-- [params]
-- int :user_id
SELECT
id,
name,
name_key,
description,
notify_type,
auto_action_type,
custom_type,
applies_to,
position,
enabled,
created_at,
updated_at,
score_type
FROM
flags
WHERE
id IN (
SELECT
post_action_type_id
FROM
post_actions
WHERE
user_id = :user_id
)
导出点赞:
-- [params]
-- int :user_id
SELECT
post_id,
post_action_type_id,
created_at,
updated_at,
deleted_at,
deleted_by_id,
related_post_id,
staff_took_action,
deferred_by_id,
targets_topic,
agreed_at,
agreed_by_id,
deferred_at,
disagreed_at,
disagreed_by_id
FROM
post_actions
WHERE
user_id = :user_id
AND post_action_type_id = 1
导出用户偏好设置:
-- [params]
-- int :user_id
SELECT
mailing_list_mode,
email_digests,
external_links_in_new_tab,
enable_quoting,
dynamic_favicon,
automatically_unpin_topics,
digest_after_minutes,
auto_track_topics_after_msecs,
new_topic_duration_minutes,
last_redirected_to_top_at,
email_previous_replies,
email_in_reply_to,
like_notification_frequency,
include_tl0_in_digests,
mailing_list_mode_frequency,
notification_level_when_replying,
theme_key_seq,
allow_private messages,
homepage_id,
theme_ids,
hide_profile_and_presence,
text_size_key,
text_size_seq,
email_level,
email_messages_level,
title_count_mode_key,
enable_defer,
timezone,
enable_allowed_pm_users,
dark_scheme_id,
skip_new_user_tips,
color_scheme_id,
default_calendar,
chat_enabled,
only_chat_push_notifications,
oldest_search_log_date,
chat_sound,
dismissed_channel_retention_reminder,
dismissed_dm_retention_reminder,
bookmark_auto_delete_preference,
ignore_channel_wide_mention,
chat_email_frequency,
enable_experimental_sidebar,
seen_popups,
chat_header_indicator_preference,
sidebar_link_to_filtered_list,
sidebar_show_count_of_new_items,
watched_precedence_over_muted,
chat_separate_sidebar_mode,
topics_unread_when_closed,
policy_email_frequency,
show_thread_title_prompts,
auto_image_caption
FROM
user_options
WHERE
user_id = :user_id
导出排队帖子:
-- [params]
-- int :user_id
SELECT
id,
user_id,
topic_id,
post_number,
raw,
created_at,
updated_at,
state,
error,
queue
FROM
queued_posts
WHERE
user_id = :user_id
导出用户存档:
-- [params]
-- int :user_id
SELECT
id,
topic_id,
created_at,
updated_at
FROM
user_archived_messages
WHERE
user_id = :user_id
导出用户访问记录:
-- [params]
-- int :user_id
SELECT
visited_at,
posts_read,
mobile,
time_read
FROM
user_visits
WHERE
user_id = :user_id
有关创建和运行数据导出查询的更多信息,请参阅Discourse Data Explorer 插件文档和 #data-reporting::category 类别。