Discourse health check function

JagWaugh · June 14, 2017, 7:24am

Further to this, it might be worth considering a built-in health check which runs at some configurable interval.

It would be enough if it posted a message to admins indicating:

Your instance is at release X
The source is at release Y

Your disk space usage is X%
It has grown by Y% over the last month
At this rate your disk will be full in Z days.

The swap file has been used X times in the last month
Your RAM usage was at 100% for Y% of the time.

OS Info: Blah Blah
Docker info: Blah Blah

Your instance sent X mails in the last month, and received Y, Z bounced or were rejected. (I’m not sure about this as a health check parameter)

The logs for the last month are here: URL
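The disk part of that message is simple arithmetic, so here is a minimal sketch of it in Python. It assumes the check has stored last month's usage somewhere (the `used_month_ago` argument is hypothetical), that "grown by Y%" means growth relative to last month's usage, and that growth continues linearly. None of this is an existing Discourse feature.

```python
import shutil


def disk_report(used_now, used_month_ago, total, days=30):
    """Return (percent_used, percent_growth, days_until_full or None)."""
    percent_used = 100.0 * used_now / total
    growth = used_now - used_month_ago
    # Growth relative to last month's usage; 0 if there is no baseline.
    percent_growth = 100.0 * growth / used_month_ago if used_month_ago else 0.0
    # Linear projection: remaining space divided by average growth per day.
    days_left = (total - used_now) * days / growth if growth > 0 else None
    return percent_used, percent_growth, days_left


def report_lines(path, used_month_ago):
    """Build the disk lines of the admin message for one mount point."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    pct, growth, days_left = disk_report(usage.used, used_month_ago, usage.total)
    lines = [
        f"Your disk space usage is {pct:.0f}%",
        f"It has grown by {growth:.0f}% over the last month",
    ]
    if days_left is not None:
        lines.append(f"At this rate your disk will be full in {days_left:.0f} days.")
    return lines
```

For example, `disk_report(60, 50, 100)` gives 60% used, 20% growth, and 120 days until full. When usage has not grown, the projection is skipped rather than reporting a meaningless number.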

It wouldn’t have to include the generation of a bug report for upload, but that might be worth considering… If an admin had a text file that included all the info that support here on Meta generally needs to know, with sensitive data already redacted, the handling of basic problems would become easier and faster.
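As a rough illustration of the "sensitive data already redacted" part, a report generator could run its text through a few regular expressions before writing the file. The patterns below are assumptions about what counts as sensitive (email addresses, IPv4 addresses, and values of settings whose names contain `password`, `secret`, or `api_key`); a real tool would need a more careful list.

```python
import re

# Illustrative patterns only; they are not exhaustive.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
SECRET_RE = re.compile(
    r"(\w*(?:password|secret|api_key)\w*)(\s*[:=]\s*)(\S+)",
    re.IGNORECASE,
)


def redact(text):
    """Replace emails, IPv4 addresses and secret-looking values with placeholders."""
    text = EMAIL_RE.sub("[email redacted]", text)
    text = IPV4_RE.sub("[ip redacted]", text)
    # Keep the setting name and separator, hide only the value.
    text = SECRET_RE.sub(r"\g<1>\g<2>[redacted]", text)
    return text
```

With this, `redact("DISCOURSE_SMTP_PASSWORD: hunter2")` returns `DISCOURSE_SMTP_PASSWORD: [redacted]`, so whoever reads the report can still see that the setting exists without seeing its value.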

codinghorror · June 14, 2017, 7:27am

Most of this is not Discourse specific, so it would fall under “normal Linux monitoring”, more or less.

JagWaugh · June 14, 2017, 7:51am

I agree, Jeff.

The thing is… the 30-minute install is so damned straightforward and simple that you can get Discourse up and running with almost zero knowledge of Linux - the hardest part of the whole process is getting keys and SSH to work, the rest is copy/paste. (That’s more or less where I was when I installed Discourse - I’d worked with Unix/Linux in the past, but I’d forgotten most of it.)

It would mean that users who really aren’t up to “normal Linux monitoring” have something to present to those here who are. Have a look at how many of the incoming support/bug threads follow this general scheme:

“My discourse stopped working”
“What release are you on?”
“How do I check that, it doesn’t run?”
“Do the following voodoo”
“Here is the screenshot”
“How much space do you have?”
“How do I check that?”
etc, etc.

For a developer it’s obvious what most problems are… once they’ve got a picture of where things are going wrong. For many users their installation is binary: either it works, or it doesn’t. Why it works, or how well it works, is a bit of a mystery. Giving those users (and the support team) an info package with a quick overview would speed up the loop a bit.

pfaffman · June 14, 2017, 2:50pm

Perhaps what should happen is that the installer could set up a cron job that did something like that. It has credentials to send mail…
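A minimal sketch of what such a cron-driven script might do, assuming it is written in Python and is handed the same SMTP host and credentials the forum already uses. The function names, subject line, script path, and schedule are all hypothetical; the installer does nothing like this today as far as this thread says.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical crontab entry (06:00 on the first day of each month):
#   0 6 1 * * /usr/bin/python3 /path/to/healthcheck.py


def build_report_email(report_lines, sender, recipient):
    """Wrap the health check lines in a plain-text email message."""
    msg = EmailMessage()
    msg["Subject"] = "Discourse health check"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(report_lines))
    return msg


def send_report(msg, host, port, user, password):
    """Send the message over SMTP with STARTTLS (assumed to be supported)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)
```

Keeping message building separate from sending means the report text can be inspected or saved to a file even when outgoing mail is the thing that is broken.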

The notion of some kind of health check for debugging has come up a number of times.

| Topic | Category / tags | Replies | Views | Activity |
|---|---|---|---|---|
| Discourse-doctor :woman_health_worker: | Dev, docker | 26 | 5070 | July 21, 2018 |
| What URL should we monitor to be sure Discourse is up | Support | 3 | 1534 | April 25, 2016 |
| MKJ's Opinionated Discourse Deployment Configuration | Sysadmins, explanation, install | 30 | 6472 | March 23, 2024 |
| Discourse Docker HW reserved/used (CPU, RAM, Disk) and how to manage it | Installation, server-resources | 5 | 806 | May 16, 2023 |
| Discourse having momentary "downs" - How to get more info from the logs | Installation | 8 | 1200 | May 24, 2023 |
