Tracking pageviews and unique users at the category and tag level

I am curious if you have advice on tracking community health metrics at the category and/or tag level.
For example, I’d love to have a clearer sense of the daily number of pageviews and user visits (both known and anonymous users, and ideally by specific group-members) on topic-threads under each category and tag.

In /admin/dashboard/reports, the new topics and new posts reports can be filtered by category, user profile_views can be filtered by group, and there are sitewide metrics on pageviews and other useful info.

I suspect some of this can be set up in Google Analytics, but it's not obvious to me how to associate topic URLs with tags and categories there. I also suspect it could be done via /admin/plugins/explorer, but I haven't had any success.

Your advice would be greatly appreciated.

Do you have any advice here @hawk? It seems like a reasonable question, and maybe the new dashboard / reports cover this a bit?

It does seem like a reasonable request, yes. At the moment the reports on the dashboard aren’t segment-able but if we can work out a way to do that I think it would be an excellent addition to v2. We’ve also talked about exposing approved Data Explorer queries on the dashboard.

cc @j.jaffeux – do you have plans to revisit the dashboard any time soon?

Yes, it will be step by step but might come to something close to this. I don't want it to become too complex though, and it will never be a "combine any segment you want"; there's SQL (and Data Explorer) for this. But I will probably try to provide this on a per-report basis where we think it makes sense.

This query is specifically about pageviews per category, which I don't think can be handled with Data Explorer, can it?

Yes, I would also be interested in a report about the unique users on my site.

Yes, this specific report uses the ApplicationRequest table, which doesn't have any notion of category. So it's not possible through reports, Data Explorer, or raw SQL.

Can we pull a report on the number of anonymous users instead of the number of pages viewed anonymously?

Do you mean the number of people that have used the anonymous posting feature? They all get unique anonymous usernames, so it will be very easy to pull a report from Data Explorer.

I’m trying to find out how many people are visiting the site who have not logged in at all and do not have an account.

I'd love to see this too! Any suggestions on how to build the query? Thanks!

We also need this number (pageviews per category, excluding crawler activity). Can it be retrieved with Data Explorer?

I don't think so. We don't track pageviews at that level of granularity. You'd need to hook up your GA and use that.

@HAWK: So you think this isn't possible? Counting Pageviews for users (non-staff, non-crawler)

Has anyone found a way to count pageviews for all the topics in a specific category? Can it be done with GA?

You can get _total_ views per category from Data Explorer by summing the views of every topic in the category:

SELECT
    c.id AS category_id,
    SUM(t.views) AS "total views"
FROM categories c
JOIN topics t ON t.category_id = c.id
WHERE c.read_restricted IS FALSE  -- exclude read-restricted categories
GROUP BY c.id
ORDER BY SUM(t.views) DESC

Total views by tag works the same way:

SELECT
    tags.name,
    SUM(t.views)
FROM topics t
JOIN topic_tags tt ON tt.topic_id = t.id
JOIN tags ON tags.id = tt.tag_id
GROUP BY tags.name
ORDER BY SUM(t.views) DESC

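There is also a topic_views table, which in theory lets you break views down by user and date. However, I haven't found it very useful, because queries time out when topics have a lot of views. It also shows far fewer views per topic because, if I understand correctly, it doesn't count anonymous views. I also suspect that topics.views includes not only anonymous views but bot traffic too; it is much higher than what I see in GA.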

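Speaking of GA, it has all the data you need, but grouping by category or tag is hard. The best I've managed is parsing pageTitle to find the category. I do this in R with googleAnalyticsR. Follow the instructions in its documentation to authorize your Google account and fetch the metrics you want. Be sure to include pageTitle as a dimension. My API call looks roughly like this: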

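library(googleAnalyticsR)

# ga_id is your GA4 property ID; authorize with ga_auth() before calling
ga_this_year <- ga_data(ga_id,
                        date_range = c("2023-01-01", "2023-05-30"),
                        metrics = c("screenPageViews", "averageSessionDuration", "sessions"),
                        dimensions = c("pageTitle", "deviceCategory")
                       )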

The key to understanding the next part is that pageTitles follow this general format:

Topic title - Category - Site title

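If the topic is uncategorized, "Category" will be missing. There are also a lot of utility pages ("Latest topics", "New topics", etc.) that have no category. (I didn't count "Latest [category] topics" as part of a category, though it might be better to include them.) Finally, the home page uses Site title - short site description as its page title. We don't care about any of those, though. So the regular expression I use is: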

str_extract(pageTitle,
            str_glue(".*? - (.*) - {sitename}"),
            group=1)

Putting it together into a function that groups by category, I get something like this:

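library(dplyr)
library(stringr)

# Sum GA pageviews per category, parsed from "Topic title - Category - Site title"
category_views <- function(data, sitename = "Meta Jon") {
  data %>%
    mutate(category = str_extract(
      pageTitle,
      str_glue(".*? - (.*) - {sitename}"),
      group = 1  # the group argument needs stringr >= 1.5.0
    )) %>%
    group_by(category) %>%
    summarise(views = sum(screenPageViews)) %>%
    arrange(desc(views)) # %>% head(20)
}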

category_views(ga_this_year)

(Obviously, replace "Meta Jon" with your own site's name.)
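For readers who don't use R, the same title-parsing idea can be sketched in Python. This is only an illustration under the page-title format described above: `extract_category`, `category_views` and the sample rows are made-up names and data, not part of any GA client library. Unlike the R pattern, this sketch uses a greedy prefix so that a topic title containing " - " doesn't leak into the category, and it escapes the site name before putting it into the pattern.

```python
import re
from collections import Counter


def extract_category(page_title, site_name):
    """Return the category from 'Topic title - Category - Site title', else None."""
    # Greedy prefix: a topic title that itself contains " - " stays in the prefix.
    # re.escape keeps punctuation in the site name from being read as regex syntax.
    pattern = r"^.* - (.*) - " + re.escape(site_name) + r"$"
    match = re.match(pattern, page_title)
    return match.group(1) if match else None


def category_views(rows, site_name):
    """Sum views per category for (page_title, views) pairs, largest first."""
    totals = Counter()
    for page_title, views in rows:
        totals[extract_category(page_title, site_name)] += views
    return totals.most_common()


# Made-up rows standing in for the GA export.
sample_rows = [
    ("How do I track views? - Support - Meta Jon", 10),
    ("Dashboard ideas - Support - Meta Jon", 5),
    ("Latest topics - Meta Jon", 7),
    ("Tags - what are they for? - Feature - Meta Jon", 3),
]
print(category_views(sample_rows, "Meta Jon"))
```

Pages without a category (utility pages, the home page, uncategorized topics) end up under `None`, which mirrors the `NA` group the R function produces.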

I don't currently know how to pull GA data by tag, though.
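One untested possibility for tags is to request the page path instead of the page title from GA, pull the topic id out of each path, and join that against topic id / tag pairs exported from Data Explorer (the topic_tags join in the tag query earlier in the thread gives those pairs). The Python sketch below only illustrates the join. It assumes topic URLs look like /t/<slug>/<topic_id>, optionally followed by a post number; `topic_id_from_path`, `views_by_tag` and the sample data are hypothetical.

```python
import re
from collections import defaultdict

# Assumes topic URLs of the form /t/<slug>/<topic_id>[/<post_number>].
TOPIC_PATH = re.compile(r"^/t/[^/]+/(\d+)(?:/\d+)?/?$")


def topic_id_from_path(page_path):
    """Return the topic id from a topic URL path, or None for other pages."""
    match = TOPIC_PATH.match(page_path.split("?", 1)[0])
    return int(match.group(1)) if match else None


def views_by_tag(ga_rows, topic_tags):
    """Sum views per tag; a topic with several tags counts toward each one."""
    totals = defaultdict(int)
    for page_path, views in ga_rows:
        topic_id = topic_id_from_path(page_path)
        for tag in topic_tags.get(topic_id, []):
            totals[tag] += views
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up data: (page path, views) rows from GA and topic id -> tags from Data Explorer.
ga_rows = [
    ("/t/tracking-views/101", 12),
    ("/t/tracking-views/101/3", 4),
    ("/t/dashboard-ideas/102?u=jon", 6),
    ("/latest", 9),
]
topic_tags = {101: ["analytics", "reports"], 102: ["reports"]}
print(views_by_tag(ga_rows, topic_tags))
```

Because a topic can carry several tags, the per-tag totals add up to more than the total pageviews, just as they do in the SQL tag query.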

Do these total view counts include bots, i.e. anonymous users?

According to its documentation, GA excludes bot traffic. I couldn't find documentation for topic_views, but a comment in the code says:

# Only store a view once per day per thing per (user || ip)

I'm not sure, but topics.views seems to include bot pageviews, since it shows far more views than GA does for the same page.

Yes, but what is the count from Discourse when using Data Explorer?

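GA has claimed that before too, and it may be true when the bots are legitimate. But most of the hits come from malicious bots, badly behaved SEO bots, knockers and the like, and then the one-visit rule breaks down. If I'm honest, that's what I believe.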

Anyway, I'm using Matomo. I'm not a big enough player to need GA.

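FYI, this is actually because the topic_views table only counts a new view once per day, whereas the topics.views field allows a new view to be counted every 8 hours (by default; this can be changed with the topic view duration hours setting). It does include both logged-in and anonymous users. :+1: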

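Update
It turns out the topic_views table doesn't even count a new view once per day... it actually records only the first time someone views a topic and nothing after that. So there will be only one record per topic for each user or IP.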
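The gap between the two counters can be illustrated with a small model. This is a toy Python sketch of the rules as described in this thread, not Discourse's implementation: the function names are made up, and it assumes the 8-hour window is measured from the last view that was counted.

```python
from datetime import datetime, timedelta


def count_with_window(view_times, window_hours=8):
    """Model of topics.views: a visitor's view counts again once the window has passed."""
    counted = 0
    last_counted = None
    for viewed_at in sorted(view_times):
        if last_counted is None or viewed_at - last_counted >= timedelta(hours=window_hours):
            counted += 1
            last_counted = viewed_at
    return counted


def count_first_view_only(view_times):
    """Model of topic_views: one record per visitor per topic, however often they return."""
    return 1 if view_times else 0


# One visitor opening the same topic 0, 1, 9 and 30 hours after their first visit.
start = datetime(2023, 1, 1)
visits = [start + timedelta(hours=h) for h in (0, 1, 9, 30)]
print(count_with_window(visits), count_first_view_only(visits))
```

Under these assumptions a single returning visitor adds several views to the topic's counter but only one record to the table, which fits the observation that topic_views shows far fewer views per topic.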