this is stale and out-of-date historic information
Our official Docker container ships with a custom NGINX config that logs very rich information that can be used to debug server performance issues.
Create a report
To run it (you will need to be on a very recent install, you must rebuild to update NGINX config):
cd /var/discourse
./launcher enter app
cd /var/www/discourse/plugins/discourse-nginx-performance-report/script
ruby nginx_analyze.rb
Report contents
The report provides the following:
Overview of request counts for the duration of the log
Total Requests: 11105 ( MessageBus: 6757 )
We can see that we processed 11105 requests total, 6757 were consumed by message_bus. To learn more about message bus see this post or search meta
Top 30 IPs by server load
Which IPs are responsible for the largest amount of CPU time?
IP Address Duration
---------- --------
100.43.90.12 16.71
10.66.0.21 7.45
10.66.0.22 7.31
... etc ...
So in the above we can see that Yandex crawling is causing a fair amount of load. As well as a couple of private IP addresses.
Top 30 Users by Server Load
Top 30 users by Server Load
Username Duration Reqs Routes
-------- -------- ---- ------
[Anonymous] 3746.81 132094 topics/show(1110.60) user_avatars/show(814.98)
user_avatars/show_letter(713.23)
forums/status(418.98) static/cdn_asset(206.63)
alep 232.11 1723 list/latest(232.11)
sam 36.82 518 topics/show(14.12) list/latest(6.86)
topics/timings(4.48) posts/create(1.69)
draft/update(1.09)
codinghorror 33.05 618 topics/wordpress(11.91) topics/show(11.14)
topics/timings(3.33) post_actions/create(1.88)
posts/create(1.75)
You can see counts of requests, total times and a breakdown by route cost.
For example … in this report @alep hit list/latest
our front page 1723 times resulting in 232 seconds of processing time.
Top 30 routes by Server Load
Route Duration
----- --------
topics/show 59.24
user_avatars/show_letter 21.07
user_avatars/show 14.92
forums/status 14.77
list/latest 4.94
topics/timings 4.10
static/cdn_asset 3.46
list/category_latest 0.98
topics/feed 0.97
This information is particularly useful for quite a few reasons and can be used for diagnostics.
Routes at the top of the list are ones that have the most impact on global performance so it makes sense for us to focus on improving them (or cutting down calls to them) first.
Top 30 URLs by server load
Url Duration
--- --------
GET /srv/status HTTP/1.1 14.77
HEAD /t/what-do-user-trust-levels-do/4924 HTTP/1.1 5.34
POST /topics/timings HTTP/1.1 4.10
GET /t/what-do-user-trust-levels-do/4924 HTTP/1.1 2.68
GET / HTTP/1.1 2.16
This shows us which URL are responsible for the most amount of load on the server. Perhaps there is a topic that has performance issues? Perhaps there is a page that is visited too often by bots?
Top 30 not found URLs (404)
Url Count
--- -----
GET //assets-meta-cdck-prod-meta.s3.dualstack.us-west-1.amazonaws.com/original/3X/e/1/e1fc4c5714ea45bc473feea787021d39f7133d22.png HTTP/1.1 9
GET //assets-meta-cdck-prod-meta.s3.dualstack.us-west-1.amazonaws.com/original/3X/7/2/72d2687d1d00c52c546727efbbc6d726f6e153de.png HTTP/1.1 2
GET /node/add HTTP/1.1 1
We can use this report to find broken links, in some cases it may make sense to place a redirect in place to help recover the traffic.
Automatic daily performance reports
You can also enable automatic daily performance reports on your site by checking:
daily performance report
in site settings.
This will give you daily reports delivered to a single discourse topic (under staff category)
You can shift it to another category and watch it and then you get the reports via email. I would strongly recommend not moving the topic to a public category as you will leak out ips and sensitive info.
These reports also enable you to discuss the results in a topic more easily
How it works
To achieve rich diagnostics, Discourse adds 2 additional headers to all requests in application_controller.rb
def set_current_user_for_logs
if current_user
Logster.add_to_env(request.env,"username",current_user.username)
response.headers["X-Discourse-Username"] = current_user.username
end
response.headers["X-Discourse-Route"] = "#{controller_name}/#{action_name}"
end
Furthermore we define a custom NGINX log format:
log_format log_discourse '[$time_local] $remote_addr "$request" "$http_user_agent" "$sent_http_x_discourse_route" $status $bytes_sent "$http_referer" $upstream_response_time $request_time "$sent_http_x_discourse_username"';
This means that we track: User Agent, Duration, Status, Rails Duration, Referrer and more for every request.
Out of the box NGINX does not log the time it spends in the “upstream” proxy, this is added to our logs using $upstream_response_time
Given this rich information I took a stab at creating a “reporting tool” that analyzes the logs and gives you some interesting reports, the tool is early in development, but already useful: https://github.com/discourse/discourse/blob/master/script/nginx_analyze.rb