Discourse health check function

Further to this, it might be worth considering a built-in health check that runs at a configurable interval.

It would be enough if it posted a message to admins indicating:

Your instance is at release X
The source is at release Y

Your disk space usage is X%
It has grown by Y% over the last month.
At this rate, your disk will be full in Z days.

The swap file has been used X times in the last month
Your RAM usage is at 100% Y% of the time.

OS Info: Blah Blah
Docker info: Blah Blah

Your instance sent X mails in the last month, and received Y, Z bounced or were rejected. (I’m not sure about this as a health check parameter)

The logs for the last month are here: URL
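To make the disk part of that report concrete, here is a rough sketch of the "full in Z days" arithmetic in Python. It assumes the growth figure is measured in percentage points of the disk over the period and that growth is roughly linear. The function names are made up for illustration and are not part of Discourse.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def days_until_full(used_now_pct, used_before_pct, days_elapsed=30):
    """Linear projection of when the disk reaches 100%.

    Assumption: both readings are percentages of the same disk, taken
    `days_elapsed` days apart. Returns None if usage is flat or shrinking.
    """
    growth_per_day = (used_now_pct - used_before_pct) / days_elapsed
    if growth_per_day <= 0:
        return None
    return (100.0 - used_now_pct) / growth_per_day


if __name__ == "__main__":
    # The reading from a month ago would have to be stored somewhere;
    # 60.0 here is only a placeholder value.
    now = disk_usage_percent("/")
    print(f"Your disk space usage is {now:.0f}%")
    print("Projected days until full:", days_until_full(now, 60.0))
```

The catch is that the check needs last month's reading, so it has to keep a small history file or database row between runs.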

It wouldn’t have to include the generation of a bug report for upload, but that might be worth considering… If an admin had a text file which included all the info that support here on meta generally needs to know, with sensitive data already redacted, the handling of basic problems would become easier and faster.
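As a rough idea of what "sensitive data already redacted" could mean in practice, the sketch below masks email addresses and the values of any setting whose name looks like a credential. The patterns and the list of key names are my own assumptions, not anything Discourse defines, and a real report would need a more careful list.

```python
import re

# Assumption: secrets appear as "name: value" or "name=value" lines whose
# name contains one of these words. This list is illustrative only.
SECRET_RE = re.compile(
    r"(?im)^([ \t]*[\w.-]*(?:password|secret|token|api_key)[\w.-]*[ \t]*[:=][ \t]*).+$"
)
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(text):
    """Mask credential values and email addresses in a report."""
    text = SECRET_RE.sub(r"\1[REDACTED]", text)
    text = EMAIL_RE.sub("[REDACTED EMAIL]", text)
    return text


if __name__ == "__main__":
    sample = "smtp_password: hunter2\nadmin: bob@example.com\nrelease: X"
    print(redact(sample))
```

Anything the patterns miss would still leak, so an admin should read the file before posting it.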

Most of this is not Discourse-specific, so it would fall under “normal Linux monitoring”, more or less.

I agree, Jeff.

The thing is… the 30-minute install is so damned straightforward and simple that you can get Discourse up and running with almost zero knowledge of Linux. The hardest part of the whole process is getting keys and SSH to work; the rest is copy/paste. (That’s more or less where I was when I installed Discourse: I’d worked with Unix/Linux in the past, but I’d forgotten most of it.)

It would mean that users who really aren’t up to “normal Linux monitoring” have something to present to those here who are. Have a look at how many of the incoming support/bug threads follow this general scheme:

“My discourse stopped working”
“What release are you on?”
“How do I check that, it doesn’t run?”
“Do the following voodoo”
“Here is the screenshot”
“How much space do you have?”
“How do I check that?”
etc, etc.

For a developer it’s obvious what most problems are… once they’ve got a picture of where things are going wrong. For many users their installation is binary: either it works or it doesn’t. Why it works, or how well it works, is a bit of a mystery. Giving those users (and the support team) an info package with a quick overview would speed up the loop a bit.


Perhaps what should happen is that the installer could set up a cron job that did something like that. It has credentials to send mail…
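A minimal sketch of what such a cron-driven script might look like, in Python. It only shows building and sending the message; the metric names, addresses, SMTP details and file path are all placeholders, and nothing here is what the Discourse installer actually does.

```python
import smtplib
from email.message import EmailMessage


def build_report(metrics, sender, recipient):
    """Turn a dict of metric name -> value into a plain-text email."""
    msg = EmailMessage()
    msg["Subject"] = "Discourse health check"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(f"{name}: {value}" for name, value in metrics.items()))
    return msg


def send_report(msg, host, port, user, password):
    """Send the report over SMTP with STARTTLS (assumes the server supports it)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)


# Hypothetical crontab entry to run this at 06:00 on the first of each month:
#   0 6 1 * * /usr/bin/python3 /path/to/health_check.py
if __name__ == "__main__":
    report = build_report(
        {"Disk space usage": "X%", "Instance release": "X"},
        "noreply@example.com",
        "admin@example.com",
    )
    print(report)  # call send_report(...) with real SMTP settings instead
```

The credentials would come from wherever the install already keeps its mail settings rather than being hard-coded in the script.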

The notion of some kind of health check for debugging has come up a number of times.