Prometheus exporter plugin for Discourse

official

(Sam Saffron) #1

Official Prometheus Exporter for Discourse

Repo: https://github.com/discourse/discourse-prometheus

The Discourse Prometheus plugin collects key metrics from Discourse and exposes them at the /metrics path so Prometheus can consume them.

These metrics can be used to graph all sorts of data, such as:

Median and 99th percentile times for the topic, categories, top and latest pages, with execution time broken down between SQL, Redis and the app.

Page view tracking

Error tracking

Ruby object space tracking including allocation rate, heaps and so on.

Hosted V8 memory statistics

Scheduled Job Queue and Sidekiq job durations and executions.

To see a full list of metrics available, install the plugin and visit SITENAME/metrics as an admin.

Out of the box, the metrics route is accessible to admins and to requests from private IPs.
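
To illustrate what "private IPs" means in practice, the sketch below checks an address against the RFC 1918 ranges plus loopback, using Ruby's standard IPAddr class. The helper name private_ip? and the exact list of ranges are assumptions made for this example; the plugin's real allow-list logic may differ.

```ruby
require "ipaddr"

# Assumed list of "private" ranges for illustration only:
# RFC 1918 blocks plus IPv4 and IPv6 loopback.
PRIVATE_RANGES = %w[
  10.0.0.0/8
  172.16.0.0/12
  192.168.0.0/16
  127.0.0.0/8
  ::1/128
].map { |cidr| IPAddr.new(cidr) }

# Hypothetical helper: true when the address falls inside one of the ranges above.
def private_ip?(address)
  ip = IPAddr.new(address)
  PRIVATE_RANGES.any? { |range| range.include?(ip) }
rescue IPAddr::InvalidAddressError
  # Anything that does not parse as an IP address is treated as not private.
  false
end
```

For example, private_ip?("192.168.0.5") is true, while private_ip?("8.8.8.8") is false, so a scrape arriving from the second address would need admin credentials under this reading.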


Discourse Prometheus is smart enough to aggregate data for all forked unicorn processes and present it as cohesive metrics on a single endpoint. We use it internally to keep track of our sites.
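
As a rough illustration of the aggregation idea (not the plugin's actual implementation), the sketch below sums counter snapshots reported by several worker processes into a single set of totals. The metric names, the pids and the aggregate_counters helper are all made up for this example.

```ruby
# Hypothetical helper: takes { pid => { metric_name => count } } and
# returns one hash with the counts summed across all processes.
def aggregate_counters(per_process)
  per_process.each_with_object(Hash.new(0)) do |(_pid, counters), totals|
    counters.each { |name, value| totals[name] += value }
  end
end

# Made-up snapshots from two forked workers.
samples = {
  101 => { "page_views" => 40, "http_requests" => 120 },
  102 => { "page_views" => 25, "http_requests" => 90 },
}

totals = aggregate_counters(samples)
# totals is { "page_views" => 65, "http_requests" => 210 }
```

Summing only makes sense for counters; gauges and percentile summaries need a different merge strategy, which this sketch does not attempt.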

Sample dashboard at: Discourse Stats dashboard for Grafana | Grafana Labs


(Jay Pfaffman) #2

This looks very cool. Thanks.

Looks like if login_required is set you can’t pull without an API key.


#3

@sam
Could you please add the “official” tag to the topic?
I find it really helpful while installing official plugins.


(Sam Saffron) #4

No repro here; we are getting stats just fine from our login_required dev instance.

Sure, done.


(Jay Pfaffman) #5

The domain name resolved to an elastic IP, so the request came from a non-local address.

Putting the host name in /etc/hosts solved the problem.
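
A quick way to spot this situation is to check what the host name resolves to from the machine doing the scrape. The sketch below uses Ruby's standard Addrinfo class, which goes through the system resolver and therefore honours /etc/hosts. The helper name resolved_addresses is hypothetical, and treating private or loopback addresses as "local" is an assumption about how the plugin decides.

```ruby
require "socket"

# Hypothetical helper: lists every address the host name resolves to and
# flags whether each one looks local (private or loopback).
def resolved_addresses(host)
  Addrinfo.getaddrinfo(host, nil, nil, :STREAM).map do |info|
    {
      address: info.ip_address,
      local: info.ipv4_private? || info.ipv4_loopback? || info.ipv6_loopback?,
    }
  end
end
```

Calling resolved_addresses with the forum's host name before and after editing /etc/hosts should show the local flag flip from false to true if the hosts entry points at a private or loopback address.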


(Jay Pfaffman) #6

Does this work as expected on a multisite instance? (It isn’t working for me. Am I doing something silly, or is something different needed for multisite?)

That is, Grafana doesn’t see any data; the error is: Cannot read property 'replace' of undefined.

Apparently I (almost?) solved this before, in “Prometheus exporter problem with SecureRandom (multisite?)”.


(Sam Saffron) #7

Works great for us in multisite.


(Chris Beach) #8

Plugin is working well, thank you!

However, I see a lot of errors in the /admin/upgrade console during an upgrade, with repeated error messages of the following form:

2018-09-13 22:22:08 +0000: Starting Prometheus Collector pid: 7919 port: 9405
2018-09-13 22:22:08 +0000: Prometheus Collector is monitoring 2526
/usr/local/lib/ruby/2.5.0/socket.rb:201:in `bind': Address already in use - bind(2) for [::]:9405 (Errno::EADDRINUSE)
	from /usr/local/lib/ruby/2.5.0/socket.rb:201:in `listen'
...
	from /var/www/discourse/plugins/discourse-prometheus/gems/2.5.1/gems/prometheus_exporter-0.3.0/lib/prometheus_exporter/server/web_server.rb:39:in `new'
	from /var/www/discourse/plugins/discourse-prometheus/gems/2.5.1/gems/prometheus_exporter-0.3.0/lib/prometheus_exporter/server/web_server.rb:39:in `initialize'
	from /var/www/discourse/plugins/discourse-prometheus/bin/collector:57:in `new'
	from /var/www/discourse/plugins/discourse-prometheus/bin/collector:57:in `<main>'
Detected dead worker 7944, restarting...
Attempting to kill pid 7954
2018-09-13 22:22:09 +0000: Starting Prometheus Collector pid: 7947 port: 9405
2018-09-13 22:22:09 +0000: Prometheus Collector is monitoring 2526
/usr/local/lib/ruby/2.5.0/socket.rb:201:in `bind': Address already in use - bind(2) for [::]:9405 (Errno::EADDRINUSE)
	from /usr/local/lib/ruby/2.5.0/socket.rb:201:in `listen'
	from /usr/local/lib/ruby/2.5.0/socket.rb:764:in `block in tcp_server_sockets'
...
	from /usr/local/lib/ruby/2.5.0/webrick/httpserver.rb:47:in `initialize'
	from /var/www/discourse/plugins/discourse-prometheus/gems/2.5.1/gems/prometheus_exporter-0.3.0/lib/prometheus_exporter/server/web_server.rb:39:in `new'
	from /var/www/discourse/plugins/discourse-prometheus/gems/2.5.1/gems/prometheus_exporter-0.3.0/lib/prometheus_exporter/server/web_server.rb:39:in `initialize'
	from /var/www/discourse/plugins/discourse-prometheus/bin/collector:57:in `new'
	from /var/www/discourse/plugins/discourse-prometheus/bin/collector:57:in `<main>'
gzip -f -c -9 /var/www/discourse/public/assets/locales/th-398ba920acd1dc2ab6bf42fe9150f4826c58e58b5e203f2ccd2ba73513af53ee.js > /var/www/discourse/public/assets/locales/th-398ba920acd1dc2ab6bf42fe9150f4826c58e58b5e203f2ccd2ba73513af53ee.js.gz

I have several Discourse instances running on the same host - perhaps this is causing problems?

chris@gbyk1:/var/discourse$ docker ps
CONTAINER ID        IMAGE                  COMMAND             CREATED             STATUS              PORTS                                         NAMES
6d88aa2b39fe        local_discourse/app    "/sbin/boot"        2 weeks ago         Up 2 weeks          0.0.0.0:9092->80/tcp, 0.0.0.0:9093->443/tcp   app
33c4f8eccf9b        local_discourse/se6    "/sbin/boot"        2 weeks ago         Up 2 weeks          0.0.0.0:9098->80/tcp, 0.0.0.0:9099->443/tcp   se6
56adc9eb1b9d        local_discourse/se26   "/sbin/boot"        6 months ago        Up 2 weeks          0.0.0.0:9094->80/tcp, 0.0.0.0:9095->443/tcp   se26
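
The Errno::EADDRINUSE in the trace above means something was already bound to port 9405 when the collector tried to start. A minimal sketch for checking that from Ruby follows. The helper name port_in_use? is hypothetical, and it probes 127.0.0.1 by default rather than the [::] wildcard address seen in the trace.

```ruby
require "socket"

# Hypothetical helper: tries to bind the port and reports whether
# the bind failed because another process already holds it.
def port_in_use?(port, host = "127.0.0.1")
  server = TCPServer.new(host, port)
  server.close
  false
rescue Errno::EADDRINUSE
  true
end
```

Running port_in_use?(9405) inside the affected container, since containers normally have their own network namespace, would show whether an earlier collector process is still holding the port.
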

(Sam Saffron) #9

Interesting, there could be some edge cases with the UI-based upgrade.

After it is done, does the plugin continue to work?


(Chris Beach) #10

It did, the last time.

The current upgrade does seem to be taking a while, though, and hasn’t finished yet. I closed the upgrade page and re-opened it, and the console output is currently blank, so I’m not sure what state it’s in :fearful:


(Sam Saffron) #11

If you see that, wait 20-30 minutes. If it does not come good, I recommend resetting the upgrade and rebuilding from the console.


(Chris Beach) #12

Thanks Sam. I’ve reset the upgrade and set up a cron job to rebuild the app at a quiet time in the middle of the night.


#13

I also see those repeated errors when upgrading from the UI with this plugin. I have removed the plugin for now so that UI upgrades continue to work. I only have one instance running on this server.