How to deal with suddenly high "Other traffic" in site analytics?

Monikas · August 22, 2025 at 04:43

Hi everyone,

I recently noticed a huge spike in “Other traffic” on my forum’s community health page (Discourse admin dashboard → Reports → Community health).

Here are the details:

•	Period: around early August 2025

•	Daily traffic: jumped to 100k+ “Other traffic” per day

•	Example: Aug 16, 2025

	•	Logged-in pageviews: 12,531

	•	Anonymous pageviews: 2,753

	•	Known crawlers: 6,865

	•	Other traffic: 102,054 (the majority of my total of 124k)

This “Other traffic” seems abnormal and is much higher than real user activity. Signups are stable, so it doesn’t look like genuine growth.
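To put the Aug 16 figures in proportion, here is a minimal sketch that computes each category's share of the day's total. The numbers are the ones quoted above; the function name `traffic_shares` is only illustrative.

```python
# Figures copied from the Aug 16, 2025 example above.
pageviews = {
    "Logged-in pageviews": 12_531,
    "Anonymous pageviews": 2_753,
    "Known crawlers": 6_865,
    "Other traffic": 102_054,
}


def traffic_shares(counts):
    """Return each category's share of the total, as a percentage."""
    total = sum(counts.values())
    return {name: 100 * value / total for name, value in counts.items()}


total = sum(pageviews.values())
shares = traffic_shares(pageviews)

print(f"Total: {total:,}")
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

On these numbers, “Other traffic” is roughly 82% of the 124,203 total, while logged-in and anonymous pageviews together are about 12%.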

My questions are:

1. What does “Other traffic” usually mean in Discourse?

2. Could this be bots, spam, or a misconfigured reverse proxy/CDN?

3. How can I reduce or filter this traffic (e.g. Nginx, firewall, Discourse settings)?

4. Is it safe to just ignore it, or will it affect performance/cost?

Any suggestions or best practices to properly handle this kind of third-party/bot traffic would be very helpful.

Thanks in advance!

awesomerobot · August 25, 2025 at 19:00

“Other traffic” is likely bots or crawlers. There is some more detail in: Understanding pageviews and the site traffic report

You can check the crawler report on your dashboard for a possible source and, if you’d like, slow it down or block it. There are more details on doing that in: Controlling Web Crawlers For a Site

Canapin · August 26, 2025 at 20:49

I had issues in July with tons of requests from Singapore. I blocked an IP range, which worked for a while, but the problem came back harder in August (from Singapore, Hong Kong and Mexico), with high and unexpected CDN costs.

I noticed high pageviews from Amazonbot, DataForSeoBot, meta-externalagent, SeekportBot, etc…
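One way to see which user agents are responsible is to count them in the web server's access log. The sketch below assumes the standard nginx/Apache "combined" log format, in which the user agent is the last quoted field; the log format on a given Discourse install may differ, and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Tail of a "combined" log line: "request" status bytes "referer" "user-agent"
COMBINED_TAIL = re.compile(
    r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"\s*$'
)


def count_user_agents(lines):
    """Count requests per user agent string; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = COMBINED_TAIL.search(line)
        if match:
            counts[match.group("agent")] += 1
    return counts


# Hypothetical sample lines (documentation IP ranges, simplified agent strings).
sample_lines = [
    '203.0.113.7 - - [16/Aug/2025:10:00:00 +0000] "GET /t/example/1 HTTP/1.1" 200 5120 "-" "Amazonbot"',
    '203.0.113.8 - - [16/Aug/2025:10:00:01 +0000] "GET /t/example/2 HTTP/1.1" 200 4096 "-" "Amazonbot"',
    '198.51.100.3 - - [16/Aug/2025:10:00:02 +0000] "GET /latest HTTP/1.1" 200 2048 "-" "meta-externalagent"',
    '203.0.113.9 - - [16/Aug/2025:10:00:03 +0000] "GET /t/example/3 HTTP/1.1" 200 1024 "-" "Amazonbot"',
    '192.0.2.10 - - [16/Aug/2025:10:00:04 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    "a malformed line that is ignored",
]

agent_counts = count_user_agents(sample_lines)
for agent, hits in agent_counts.most_common():
    print(f"{hits:>6}  {agent}")
```

To run it against a real log, pass an open file object instead of `sample_lines`, since the function only iterates over lines.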

This documentation Controlling Web Crawlers For a Site says:

This list doesn’t contain some of the bots that visit my site most, but I have a question nonetheless.
Would it be advisable to add this whole list to the Blocked crawler user agents setting?
Is there a way to bulk add bot names from a .txt file?
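As a rough sketch for the bulk-add question, the snippet below turns a list of bot names (for example, the lines of a .txt file) into a single delimited string. It assumes, without confirmation from this thread, that the setting stores its entries as a pipe-separated list; the helper name `build_blocklist_value` is hypothetical, and how the resulting value is applied to the setting is left open.

```python
def build_blocklist_value(lines, separator="|"):
    """Join bot names into one string.

    Blank lines and lines starting with '#' are skipped, and duplicates are
    dropped case-insensitively while keeping the first spelling seen.
    The '|' separator is an assumption about how the setting is stored.
    """
    names = []
    seen = set()
    for raw in lines:
        name = raw.strip()
        if not name or name.startswith("#"):
            continue
        key = name.lower()
        if key in seen:
            continue
        seen.add(key)
        names.append(name)
    return separator.join(names)


# Hypothetical contents of a bots.txt file, one name per line.
sample = ["Amazonbot", "DataForSeoBot", "", "# comment", "amazonbot", "  SeekportBot  "]
value = build_blocklist_value(sample)
print(value)
```

With a real file, `build_blocklist_value(open("bots.txt", encoding="utf-8"))` would work the same way, where `bots.txt` is a placeholder file name.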

opcourdis · August 26, 2025 at 21:46

Crawlers visit in order to index your site for search engines, so the increase in traffic from them should be minimal, unless those are bots that disguise themselves as crawlers. Many forums don’t want to be indexed by crawlers, and there is an option for that: crawlers carry an identity/reference to their source, so here you can add any name you want, and only that source will be allowed to crawl (haha, such a strange word).

The most probable source of your traffic increase is bots, and you should check your server logs to confirm that. If you know someone with even very basic Linux knowledge, I would suggest this tool, which takes about two minutes to set up, to block countries with a reputation for bad bots (you can easily find it online). Once it is set up, it is still good to let your community know that they may need a VPN to reach your site if they end up on holiday in one of those countries. Here is the tool. It’s efficient and will cut unnecessary requests to your server by 80-90%. It has two modes and you must choose one: allowed countries or prohibited countries.

https://github.com/friendly-bits/geoip-shell
You can also use the Geo Blocking plugin, but it only blocks page views, not the direct requests made to your server, as the tool above does.

Canapin · August 27, 2025 at 13:46

Well, I guess that wouldn’t solve my issue, because the bots will consume CDN bandwidth anyway.

Jagster · August 27, 2025 at 14:58

Are you totally sure about that? Because if that is true I’ll start a reverse proxy right away.

Edit

The AI here said the same. So, a reverse proxy it will be.

AI answer

The GeoBlock plugin for Discourse uses the MaxMindDB database to determine a user’s country or network (ASN) based on their IP address, but the actual blocking occurs at the application level (inside the Discourse app), not at the server or network/firewall level.

In practice:

•	If a visitor’s IP matches a blocked country or network, the Discourse application returns an error page to the visitor instead of forum content.
•	Blocking does not happen until the HTTP request reaches the Discourse application. In other words, requests still pass through your web server (e.g., nginx) and Docker container and reach the Discourse software before the user is blocked.
•	This means you will still see these requests in your server and proxy/nginx logs, even if the user is ultimately blocked by Discourse.
•	If you require a “hard” block (preventing access even before the request reaches the Discourse app), you would need a server-level GeoIP solution (such as nginx/iptables-level blocking or an external tool).

Sources and more info:

Geo Blocking plugin - meta.discourse.org
Plugin GitHub documentation: communiteq/discourse-geo-blocking

Summary:
The Discourse GeoBlock plugin does not block requests at the network/server level, but only after the Discourse application processes the request. If you need to prevent any access before your application sees the request, you must use a server-level GeoIP approach.
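Whichever layer does the blocking, the core check is the same: test the client IP against a set of blocked networks. The sketch below shows only that matching step, using Python's standard `ipaddress` module. The networks are reserved documentation ranges used as placeholders, and a real GeoIP block would also need a country/ASN database, which is not shown.

```python
import ipaddress

# Placeholder networks (reserved documentation ranges), not real bot sources.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("203.0.113.0/24", "2001:db8::/32")
]


def is_blocked(client_ip, networks=BLOCKED_NETWORKS):
    """Return True if client_ip falls inside any of the given networks."""
    address = ipaddress.ip_address(client_ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed lists are safe to check this way.
    return any(address in network for network in networks)


for ip in ("203.0.113.42", "198.51.100.1", "2001:db8::1"):
    print(ip, "blocked" if is_blocked(ip) else "allowed")
```

The difference between the approaches discussed here is where this check runs: in a firewall the request is dropped before nginx sees it, while in the plugin it runs after the request has reached the application.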

I didn’t use the share conversation feature because I asked in Finnish and you guys probably can’t read it.

opcourdis · August 27, 2025 at 15:13

That implies that your page is reached, so yes, you are at a layer closer to the server than with a block done at the firewall level. However, it does not mean that it is a security issue that necessitates a reverse proxy.

The tool I proposed already gives you 80% fewer requests, and Discourse is a secure app. If you have other things hosted on your server, like a website, a reverse proxy may be useful. Meanwhile, there are other solutions for blocking IPs with bad reputations, like CrowdSec; ask your AI about CrowdSec light.

RGJ · August 28, 2025 at 21:43

(author of the geoblocking plugin here)
Yes, the geoblocking plugin stops requests at the application level, although it does so at a very early stage. The reason is that it is designed to show a user-friendly error page, so it needs to be able to load the Discourse assets and render that page. It also logs all blocks to /logs if configured to do so.

Other advantages of this approach are the ability to configure the blocked countries and networks from within Discourse, and the ability to not only block access but also enforce moderation.

If you are worried about log inflation or CDN bandwidth consumption, the plugin is not for you, but honestly I don’t think those two things matter much.
