How do I deal with a sudden spike in “Other” traffic in site analytics?

Hi everyone,

I recently noticed a huge spike in “Other traffic” on my forum’s community health page (Discourse admin dashboard → Reports → Community health).

Here are the details:

•	Period: Around early August 2025

•	Daily traffic: jumped to 100k+ “Other traffic” per day

•	Example: On Aug 16, 2025

•	Logged-in pageviews: 12,531

•	Anonymous pageviews: 2,753

•	Known crawlers: 6,865

•	Other traffic: 102,054 (majority of my total 124k)

This “Other traffic” seems abnormal and is much higher than real user activity. Signups are stable, so it doesn’t look like genuine growth.

My questions are:

1. What does “Other traffic” usually mean in Discourse?

2. Could this be bots, spam, or a misconfigured reverse proxy/CDN?

3. How can I reduce or filter this traffic (e.g. Nginx, firewall, Discourse settings)?

4. Is it safe to just ignore, or will it affect performance/cost?

Any suggestions or best practices to properly handle this kind of third-party/bot traffic would be very helpful.

Thanks in advance!


“Other traffic” is most likely bots or crawlers; there’s more detail in Understanding pageviews and the site traffic report.

You can check the crawler report on your dashboard for a possible source and, if you’d like, slow them down or block them; there are more details on doing that in Controlling Web Crawlers For a Site.
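For well-behaved crawlers, blocking or slowing them via site settings ultimately gets expressed as robots.txt rules. A hand-written equivalent looks like the sketch below (the bot names are just examples from this thread, and note that Crawl-delay is honored by some crawlers but not all):

```text
# Polite crawlers honor robots.txt; abusive bots ignore it and must
# be blocked at the server or firewall level instead.
User-agent: DataForSeoBot
Disallow: /

# Slow a crawler down instead of blocking it outright
User-agent: SeekportBot
Crawl-delay: 10
```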


I had issues in July with tons of requests from Singapore. I blocked an IP range, which worked for a while, but the problem came back harder in August (from Singapore, Hong Kong, and Mexico) with high, unexpected CDN costs :face_with_steam_from_nose:

I noticed high pageviews from Amazonbot, DataForSeoBot, meta-externalagent, SeekportBot, etc…
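In case it helps anyone find their own noisy bots: a quick way to rank user agents by request count is the one-liner below (a sketch assuming nginx’s default “combined” log format, where the user agent is the third quoted field, and a log file named access.log):

```shell
# Count requests per user agent and list the heaviest hitters first;
# aggressive crawlers usually dominate the top of this list.
awk -F'"' '{print $6}' access.log | sort | uniq -c | sort -rn | head
```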

This documentation Controlling Web Crawlers For a Site says:

That list doesn’t include some of the bots that visit me most, but I have a question nonetheless.
Would it be advisable to add that whole list to the Blocked crawler user agents setting?
Is there a way to bulk-add bot names from a .txt file?
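On the bulk question: I don’t know of a built-in file import, but list-type site settings are stored as pipe-separated values, so if you edit the setting via the console or API (an assumption worth verifying on your install), you can build a paste-ready value from a one-name-per-line .txt file:

```shell
# Join bot names (one per line in bots.txt) into a single
# pipe-separated string suitable for a Discourse list setting.
tr '\n' '|' < bots.txt | sed 's/|$//'
```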

  1. Crawlers come with the intention of indexing your site in search engines, so the rise in traffic from them should be minimal, unless these are bots hiding themselves as crawlers. Many forums don’t want to be indexed by crawlers at all, and there is an option for that: crawlers carry an identity referencing their source (the user agent), so in the allowed-crawlers setting you can add whichever names you want, and only those sources will be permitted to crawl (ahah, such a strange word :slight_smile:).

  2. The most probable source of your traffic increase is bots, and you should check your server logs to confirm it. If you know someone with even the very basics of Linux, I would suggest this two-minute-setup tool to block countries with a bad bot reputation (you can find such lists easily online). Once it is set up, it is still good to let your community know that they may need a VPN to reach your site if they end up on holiday in those countries. Here is the tool; it is efficient and will cut 80-90% of the unnecessary requests made to your server. It has two modes and you must choose one: allowed countries or prohibited countries.

    GitHub - friendly-bits/geoip-shell: User-friendly and versatile geoblocker for Linux

  3. You can also use the Geo Blocking plugin, but it only blocks the page view, not the direct requests made to your server, the way the tool above does.
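For the curious, the firewall-level blocking that the tool in point 2 automates boils down to loading a country’s CIDR ranges into an ipset and dropping matches before they ever reach nginx or Discourse. A minimal sketch (requires root; sg_cidrs.txt is a hypothetical file of Singapore ranges pulled from a GeoIP feed):

```shell
# Create a set of networks, fill it with the country's CIDR ranges,
# then drop any packet whose source address matches the set.
ipset create blocked_sg hash:net
while read -r cidr; do
  ipset add blocked_sg "$cidr"
done < sg_cidrs.txt
iptables -I INPUT -m set --match-set blocked_sg src -j DROP
```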


Well, I guess that wouldn’t solve my issue, because the bots will consume CDN bandwidth anyway.


Are you totally sure about that? Because if that is true, I’ll set up a reverse proxy right away.

Edit

The AI said the same. So, a reverse proxy it will be.

AI answer

The GeoBlock plugin for Discourse uses the MaxMindDB database to determine a user’s country or network (ASN) based on their IP address, but the actual blocking occurs at the application level (inside the Discourse app), not at the server or network/firewall level.

In practice:

  • If a visitor’s IP matches a blocked country or network, the Discourse application returns an error page to the visitor instead of forum content.
  • Blocking does not happen until the HTTP request reaches the Discourse application. In other words, requests still pass through your web server (e.g., nginx) and Docker container and reach the Discourse software before the user is blocked.
  • This means you will still see these requests in your server and proxy/nginx logs, even if the user is ultimately blocked by Discourse.
  • If you require a “hard” block (preventing access even before the request reaches the Discourse app), you would need a server-level GeoIP solution (such as nginx/iptables-level blocking or an external tool).


Summary:
The Discourse GeoBlock plugin does not block requests at the network/server level, but only after the Discourse application processes the request. If you need to prevent any access before your application sees the request, you must use a server-level GeoIP approach.

I didn’t use the share-conversation feature because I asked in Finnish and you guys probably can’t read it :winking_face_with_tongue:
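If anyone wants the “hard” server-level block the answer above describes, one common approach is a sketch like the following (assuming the ngx_http_geoip2 module and a MaxMind GeoLite2 country database are installed; the country codes are just this thread’s examples, and the server block is abbreviated):

```nginx
# Map each client IP to a country code, then refuse requests from
# blocked countries before they ever reach the Discourse app.
geoip2 /etc/nginx/GeoLite2-Country.mmdb {
    $geoip2_country_code country iso_code;
}

map $geoip2_country_code $blocked_country {
    default 0;
    SG 1;
    HK 1;
    MX 1;
}

server {
    listen 443 ssl;
    server_name forum.example.com;

    if ($blocked_country) {
        return 403;
    }

    # ... existing Discourse proxy configuration ...
}
```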


It implies that your page is reached, so yes, the block happens at a layer closer to the server than a block done at the firewall level. However, that does not mean it is a security issue that necessitates a reverse proxy.

The tool I proposed already cuts requests by about 80%, and Discourse is a secure app. Now, if you host other things on your server, like a website, a reverse proxy may be useful; meanwhile, there are other solutions for blocking IPs with a bad reputation, like CrowdSec. Ask your AI about a light CrowdSec setup :wink:


(geoblocking plugin author here)
Yes, the geoblocking plugin stops requests at the application level, although it does so at a very early stage. The reason for this is that it was designed to show a user-friendly error page, so it must be able to load the Discourse assets and render that page. It also logs any blocks to /logs if configured to do so.

Other advantages of this approach are the ability to configure the blocked countries and networks from within Discourse and the ability to not just block the access but to force moderation as well.

If you’re concerned about log inflation or CDN bandwidth consumption, the plugin is not for you, but TBH I don’t think either of those matters a lot.
