It’s possible to get total category views from the Data Explorer by summing up the views of all topics under a category:
SELECT
c.id as category_id,
SUM(views) as "total views"
FROM categories c
JOIN topics t ON t.category_id = c.id
WHERE read_restricted is false
GROUP BY c.id
order by sum(views) desc
Getting total views by tag is similar:
SELECT tags.name,
sum(views)
from topics t
join topic_tags tt on t.id = topic_id
join tags on tags.id = tt.tag_id
group by tags.name
order by sum(views) desc
There’s also a topic_views
table which could, in theory, be used to divide views by user and date. I haven’t found that table particularly useful, however, because queries time out when topics have a lot of views. It also shows a lot fewer views per topic because, if I understand it correctly, it’s not counting anonymous views. I also think maybe topic.views
includes not only anonymous views but also bot traffic? It’s a lot more than what I see on GA.
Speaking of GA, it has all the data you need, but it’s not easy to group it by category or tag. The best I can do try to parse the pageTitle to find categories. I’m doing this in R using googleAnalyticsR. Follow the instructions in the manual to authorize your Google account and get the metrics you want. Be sure to include pageTitle
as a dimension. My API calls look a bit like this:
ga_this_year <- ga_data(ga_id,
date_range = c("2023-01-01", "2023-05-30" ),
metrics=c("screenPageViews","averageSessionDuration", "sessions"),
dimensions = c("pageTitle", "deviceCategory")
)
The key to understanding the next bit is to see that pageTitles fit this general format:
Topic title - Category - Site title
If the topic is not categorized, the Category is missing. There are also a bunch of utility pages (“Latest topics”, “New topics”, etc.) that don’t have categories. (I’m not counting “Latest [category] topics” as being part of the category, though it might be better to include them.) Finally the homepage uses Site title - short site description
for the page title. We don’t care about any of those, however. So the regex I’m using is:
str_extract(pageTitle,
str_glue(".*? - (.*) - {sitename}"),
group=1)
Putting it together into a function that groups by category, I get something like this:
category_views <- function(data, sitename = "Meta Jon") {
data %>%
mutate(category = str_extract(
pageTitle,
str_glue(".*? - (.*) - {sitename}"),
group=1
)) %>%
group_by(category) %>%
summarise(views = sum(screenPageViews)) %>%
arrange(desc(views)) #%>% head(20)
}
category_views(ga_this_year)
(Obviously change “Meta Jon” for your own site’s name.)
I don’t currently know how to extract GA data based on tag, however.