• Period: Around early August 2025
• Daily traffic: jumped to 100k+ “Other traffic” per day
• Example: On Aug 16, 2025
• Logged-in pageviews: 12,531
• Anonymous pageviews: 2,753
• Known crawlers: 6,865
• Other traffic: 102,054 (majority of my total 124k)
This “Other traffic” seems abnormal and is much higher than real user activity. Signups are stable, so it doesn’t look like genuine growth.
My questions are:
1. What does “Other traffic” usually mean in Discourse?
2. Could this be bots, spam, or a misconfigured reverse proxy/CDN?
3. How can I reduce or filter this traffic? (e.g. Nginx, firewall, Discourse settings)
4. Is it safe to just ignore, or will it affect performance/cost?
Any suggestions or best practices to properly handle this kind of third-party/bot traffic would be very helpful.
You can check the crawler report on your dashboard for a possible source and, if you’d like, slow it down or block it. There are more details on doing that in: Controlling Web Crawlers For a Site
I had issues in July with tons of requests from Singapore. I blocked an IP range, which worked for a while, but the problem came back harder in August (from Singapore, Hong Kong and Mexico), with high and unexpected CDN costs.
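For anyone blocking ranges by hand as described above, here is a minimal Python sketch of turning a list of offending IPs into nginx `deny` lines. The function name `deny_rules`, the default /24 grouping and the sample addresses (documentation ranges) are my own assumptions for illustration. Blocking a whole /24 because of one address can be too broad, so review the output before using it.

```python
import ipaddress

def deny_rules(ips, prefix=24):
    """Group IPv4 addresses into /prefix networks and return nginx deny lines."""
    # strict=False lets us pass a host address and get its enclosing network
    networks = {ipaddress.ip_network(f"{ip}/{prefix}", strict=False) for ip in ips}
    # collapse_addresses merges adjacent or overlapping networks and sorts them
    merged = ipaddress.collapse_addresses(networks)
    return [f"deny {net};" for net in merged]

if __name__ == "__main__":
    # Hypothetical sample addresses taken from documentation ranges
    for rule in deny_rules(["203.0.113.5", "203.0.113.77", "198.51.100.9"]):
        print(rule)
```

The resulting lines could go into an nginx `http`, `server` or `location` block, or be translated into firewall rules if you prefer to block before nginx.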
I noticed high pageviews from Amazonbot, DataForSeoBot, meta-externalagent, SeekportBot, etc…
This list doesn’t contain some of the bots that visit my site most often, but I have a question nonetheless.
Would it be advisable to add this whole list to the Blocked crawler user agents setting?
Is there a way to bulk add bot names from a .txt file?
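I don’t know of a built-in import from a .txt file, but as a rough sketch, a few lines of Python can turn a file with one bot name per line into a single delimited string. This assumes the setting stores its values as a pipe-delimited list, which you should verify on your own install before pasting anything. The function name `to_list_setting` and the file name `bots.txt` are made up for the example.

```python
from pathlib import Path

def to_list_setting(text, sep="|"):
    """Turn one-name-per-line text into a single delimited string.

    Blank lines, lines starting with '#', and duplicates are skipped.
    The '|' separator is an assumption about how the setting is stored.
    """
    names = []
    for line in text.splitlines():
        name = line.strip()
        if name and not name.startswith("#") and name not in names:
            names.append(name)
    return sep.join(names)

if __name__ == "__main__":
    path = Path("bots.txt")  # hypothetical file name
    if path.exists():
        print(to_list_setting(path.read_text(encoding="utf-8")))
```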
Crawlers come with the intention of indexing your site for search engines, so the rise in traffic from them should be minimal, unless those are bots disguising themselves as crawlers. Many forums don’t want to be indexed by crawlers, and this is the option to do that: crawlers carry an identity/reference to their source, so here you can add any name you want, and only that source will be allowed to do the crawling (haha, such a strange word).
The most probable source of your traffic increase is bots, and you should check your server logs to confirm that. If you know someone with even very basic Linux knowledge, I would suggest this 2-minute-setup tool to block countries with a bad bot reputation (you can find it easily online). After it is set up, it is still a good idea to let your community know that they may need a VPN to reach your site if they end up on holiday in those countries. Here is the tool; it’s efficient and will cut unnecessary requests to your server by 80-90%. It has two modes and you must choose one: allowed countries or prohibited countries.
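As a starting point for checking the logs, here is a small Python sketch that counts requests per user agent and per client IP. It assumes the standard nginx “combined” log format; your install may use a different `log_format`, in which case the regex needs adjusting. The names `LINE` and `top_sources` and the log path are hypothetical.

```python
import re
from collections import Counter

# Assumes nginx "combined" format:
# ip - user [time] "request" status bytes "referer" "user agent"
LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"')

def top_sources(lines, n=10):
    """Return the n most common user agents and client IPs in the given log lines."""
    agents, ips = Counter(), Counter()
    for line in lines:
        m = LINE.match(line)
        if m:  # lines that do not match the assumed format are ignored
            ips[m.group(1)] += 1
            agents[m.group(2)] += 1
    return agents.most_common(n), ips.most_common(n)

if __name__ == "__main__":
    import os
    path = "access.log"  # hypothetical path, adjust to your setup
    if os.path.exists(path):
        with open(path, encoding="utf-8", errors="replace") as fh:
            agents, ips = top_sources(fh)
        for name, count in agents:
            print(count, name)
        for ip, count in ips:
            print(count, ip)
```

If one user agent or a handful of IPs dominate the counts, that points to what to slow down or block.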
Are you totally sure about that? Because if that is true I’ll start a reverse proxy right away.
Edit
The AI here said the same. So, a reverse proxy it will be.
AI answer
The GeoBlock plugin for Discourse uses the MaxMindDB database to determine a user’s country or network (ASN) based on their IP address, but the actual blocking occurs at the application level (inside the Discourse app), not at the server or network/firewall level.
In practice:
If a visitor’s IP matches a blocked country or network, the Discourse application returns an error page to the visitor instead of forum content.
Blocking does not happen until the HTTP request reaches the Discourse application. In other words, requests still pass through your web server (e.g., nginx) and Docker container and reach the Discourse software before the user is blocked.
This means you will still see these requests in your server and proxy/nginx logs, even if the user is ultimately blocked by Discourse.
If you require a “hard” block (preventing access even before the request reaches the Discourse app), you would need a server-level GeoIP solution (such as nginx/iptables-level blocking or an external tool).
Summary:
The Discourse GeoBlock plugin does not block requests at the network/server level, but only after the Discourse application processes the request. If you need to prevent any access before your application sees the request, you must use a server-level GeoIP approach.
I didn’t use the share conversation feature because I asked in Finnish and you guys probably can’t read it.
That implies that your page is reached, so yes, the block happens at a layer closer to the server than a block done at the firewall level. However, it does not mean there is a security issue that necessitates a reverse proxy.
The tool I proposed already cuts requests by 80%, and Discourse is a secure app. If you have other things hosted on your server, like a website, a reverse proxy may be useful. Meanwhile, there are other solutions for blocking IPs with bad reputations, like CrowdSec; ask your AI about crowdsec light.
(author of the geoblocking plugin here)
Yes, the geoblocking plugin stops requests at the application level, although it does so at a very early stage. The reason is that it is designed to show a user-friendly error page, so it needs to be able to load the Discourse assets and render that page. It also logs all blocks to /logs if configured to do so.
Other advantages of this approach are the ability to configure the blocked countries and networks from within Discourse, and the ability not only to block access but also to enforce moderation.
If you are worried about log inflation or CDN bandwidth usage, the plugin is not for you, but honestly I don’t think those two things matter much.