Better pageview metrics with the new site traffic report

We’re excited to announce a significant improvement in how we handle pageviews and present this crucial data to you. Our new site traffic report offers a more comprehensive and accurate view of your community’s engagement. Let’s dive into what’s new and how it benefits you!

:information_source: Please note that we are in the process of rolling the new pageview tracking out to hosted customers, so not all sites will be switched over right away.

What has changed

We’ve revamped our approach to tracking and reporting pageviews to provide you with more reliable and actionable data. We now monitor the sources of individual pageviews and are able to detect if they came from a real browser or a crawler.

The new site traffic report combines data from various sources to give you a holistic view of your site’s traffic.

What is included in the report

The site traffic report includes the following four types of pageviews:

  1. Pageviews (logged in): Pageviews from users who are logged into your Discourse instance.
  2. Pageviews (anonymous): Pageviews from users who are not logged in but are using a web browser.
  3. Known crawlers: Pageviews from identified web crawlers or bots (e.g., search engine crawlers).
  4. Other traffic: Various types of requests that don’t fall into the other three categories, including other crawlers.

The default report view hides the known crawlers and other traffic metrics, so that it aligns with the pageview metrics displayed elsewhere in the dashboard.
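
To make the default view concrete, here is a minimal sketch of how the two views differ. It assumes the report boils down to four daily counters. The dictionary keys, the `visible_total` helper, and the sample numbers are all illustrative, not Discourse's actual data model.

```python
# Category keys mirror the four report categories; the layout is an assumption.
DEFAULT_VISIBLE = ("logged_in", "anonymous")
ALL_CATEGORIES = ("logged_in", "anonymous", "known_crawlers", "other_traffic")


def visible_total(counts: dict[str, int], show_all: bool = False) -> int:
    """Sum only the categories displayed in the chosen report view."""
    categories = ALL_CATEGORIES if show_all else DEFAULT_VISIBLE
    return sum(counts.get(name, 0) for name in categories)


# Made-up numbers for a single day, purely for illustration.
day = {"logged_in": 120, "anonymous": 300, "known_crawlers": 80, "other_traffic": 150}

print(visible_total(day))                 # default view: 420
print(visible_total(day, show_all=True))  # all four categories: 650
```

In this sketch the default total is the figure that would line up with the pageview metrics elsewhere in the dashboard, while enabling the two hidden categories brings back the full request count.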

Why this matters

This now gives you a far more accurate gauge of actual traffic on your forums. Many crawlers are not easily detectable using user agent strings, so this report helps you gain a clearer understanding of who is visiting your forum.

This allows for better decision-making, easier growth-tracking, and an improved ability to identify trends in users and page views.

:information_source: For hosted customers, this also means the pageviews that count toward your monthly limits are more accurate and realistic.

How to access the new report

You can find the new consolidated pageviews report in your admin dashboard. For a detailed guide on how to interpret and make the most of this new report, please refer to our comprehensive documentation: Understanding pageviews and the site traffic report

We value your feedback

As always, we’re committed to improving your experience with Discourse. We’d love to hear your thoughts on this new report and how it’s helping you understand your community better. Please share your feedback and any questions you may have in the comments below.

34 Likes

On two self-hosted installs I’m missing data prior to ~4 months ago now, although stats from that time period still appear in the other graphs. Do you see this in your reports?

5 Likes

Hey Jonah, what you’re seeing is expected as we only started collecting data in this new format recently.

Others may have slightly different experiences, depending on when they first updated to a version that began to collect traffic data in this new format.

Other data is not impacted by this change, which is why the other charts you see have data going back much further.

5 Likes

I think at minimum the previous data should appear as logged in and “other traffic” for historical purposes.

8 Likes

Just noticed this and it has completely broken our pageview stats, showing a reduction in total views by around a fifth/sixth. Should this be reported as a bug?

As @JonahAragon1 and @darkpixlz also pointed out we’ve now lost years of page view stats including the total since the forum started - can these be brought back? Or could you give us an option to continue using the old system, either as an alternative or alongside this new one until any teething issues are dealt with?

3 Likes

Yeah tbh this is a big misstep. Don’t know why it wasn’t even considered to keep it as “other” but here we are

3 Likes

Not following; all the information is there. Just click “Other traffic” and “Known crawlers” and you will see all the numbers you saw before.

No information was lost. We did not purge any of the old tables, so you can still get at the old data.

4 Likes

@AstonJ @darkpixlz I’ve just merged a PR that will show the old pageview and consolidated pageview reports as “Legacy Pageviews” and “Legacy Consolidated Pageviews” respectively, hope this helps:

11 Likes

Thanks Martin!

Just upgraded two forums (3.4.0.beta3-dev) however All reports | Legacy Pageviews is empty on both of them - are you already aware?

2 Likes

Sorry no I was not aware of this, thanks for letting me know, this is my fault. Working on a fix now.

4 Likes

@AstonJ the problem was a typo, I’ve just merged a fix:

5 Likes

Is there any way to learn more about where the “other traffic” comes from?

Here are pageviews on my website for a single day

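| Algorithm | Logged in | Guests | Crawlers | Other | Total |
|-----------|-----------|--------|----------|-------|-------|
| Old       | 4348      | 7092   | 4430     | -     | 15870 |
| New       | 3954      | 1848   | 4430     | 5638  | 15870 |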

From what I understand, the difference between old and new is that you learned to better differentiate real anonymous users from crawler agents. However, why would the number of pageviews attributed to logged-in users decrease? Also, on the same day we had 1.6k visits from search, so it’s somewhat hard to believe that those 1.6k visits resulted in just 1.8k pageviews. I guess I wonder how confident you are that the “other traffic” comes from non-humans.
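
For anyone who wants to check the arithmetic, here is a small sketch using only the numbers from the table in this post. It treats the old algorithm's missing “Other” value as 0, which is an assumption, and it says nothing about whether the reclassified views were human or not. It only shows where the counts moved.

```python
# Figures copied from the table; old "other" is assumed to be 0.
old = {"logged_in": 4348, "guests": 7092, "crawlers": 4430, "other": 0}
new = {"logged_in": 3954, "guests": 1848, "crawlers": 4430, "other": 5638}

# Both algorithms account for the same total number of requests.
assert sum(old.values()) == sum(new.values()) == 15870

# Per-category change from old to new.
delta = {key: new[key] - old[key] for key in old}
print(delta)  # {'logged_in': -394, 'guests': -5244, 'crawlers': 0, 'other': 5638}

# Everything removed from logged in + guests reappears under "other".
moved = -(delta["logged_in"] + delta["guests"])
print(moved == new["other"])  # True

# Share of each old category that was reclassified.
guest_share_moved = -delta["guests"] / old["guests"]
logged_in_share_moved = -delta["logged_in"] / old["logged_in"]
print(f"{guest_share_moved:.1%} of guest views moved")
print(f"{logged_in_share_moved:.1%} of logged-in views moved")
```

So the 5638 “other” views are exactly the 5244 former guest views plus the 394 former logged-in views, and crawler counts are unchanged.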

Also, are LLM crawlers counted within “Known crawlers” or “other traffic”? I don’t see any in the crawler agents report, but it could be that they just respect my robots.txt settings[1]


  1. which I’d be surprised if they do ↩︎

4 Likes

What’s “Other traffic”?

Are these visitors coming from Google, Social Media and other sources?

2 Likes

I think those usually appear as anonymous pageviews

2 Likes

Thank you - it would be nice to know the “known crawlers” like a list…

Google
Bing
Yahoo
A
B
Others

3 Likes

We don’t have anything more granular at the moment to show what “other traffic” is. It’s just anything that isn’t a “real/human/browser” page view whether logged in or anonymous and not a known crawler.

We are fairly confident, but it’s an uphill battle these days to separate the badly behaving bots and crawlers from the legitimate users of the site. We may need to make further tweaks to our systems in future. As it stands, we count the view as “real” if the entire Ember app is booted, which is hard to replicate outside of an actual browser.
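
As a rough illustration of that rule (not the actual Discourse implementation), the decision could be sketched like this. The `classify` function, the `app_booted` flag, the two-entry crawler list, and the order of the checks are all assumptions made for the example.

```python
from enum import Enum


class Category(Enum):
    LOGGED_IN = "Pageviews (logged in)"
    ANONYMOUS = "Pageviews (anonymous)"
    KNOWN_CRAWLER = "Known crawlers"
    OTHER = "Other traffic"


# Hypothetical, deliberately tiny list; the real list is much longer.
CRAWLER_TOKENS = ("googlebot", "bingbot")


def classify(user_agent: str, app_booted: bool, logged_in: bool) -> Category:
    """Assign one request to a report category.

    Assumed order: known crawler by User-Agent first, then "real" views
    (the full client app booted), and everything else is other traffic.
    """
    if any(token in user_agent.lower() for token in CRAWLER_TOKENS):
        return Category.KNOWN_CRAWLER
    if app_booted:
        return Category.LOGGED_IN if logged_in else Category.ANONYMOUS
    return Category.OTHER


print(classify("Mozilla/5.0 (X11; Linux x86_64)", app_booted=True, logged_in=False))
print(classify("Mozilla/5.0 (compatible; Googlebot/2.1)", app_booted=False, logged_in=False))
print(classify("python-requests/2.31", app_booted=False, logged_in=False))
```

The point of the sketch is the last branch: a request with a browser-like User-Agent that never boots the app still ends up in “Other traffic”, which is how undeclared bots get filtered out of the pageview counts.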

It depends. Sometimes they respect robots.txt and identify themselves as a crawler, and other times they don’t, which was part of the reason page views were so inflated and why we switched to this new system.

Here are our known crawlers at this time, which is based on User-Agent:

We also rate limit certain crawlers/bots, mainly AI ones:

4 Likes