Server not responding (DO SGP1)


(Richard Soutar) #1

Hi guys,

For the past few days, the CPU on my DO server has been spiking for no apparent reason.

Looking at the output of top, it seems the postmaster process was busy.

I only have 509 users on a DO droplet with 2GB of memory.


(Richard Soutar) #3

No. I use Mailgun.

What I did after that was run ./launcher rebuild app. Now it returns error 502 Bad Gateway.


(Michael Downey) #4

Sorry, I totally misread your issue as postfix and not postmaster. :slight_smile:


(Richard Soutar) #5

I got this error from my log after running the rebuild again.

FAILED
--------------------
RuntimeError: su postgres -c 'psql discourse -c "alter schema public owner to discourse;"' failed with return #<Process::Status: pid 91 exit 2>
Location of failure: /pups/lib/pups/exec_command.rb:105:in `spawn'
exec failed with the params "su postgres -c 'psql $db_name -c \"alter schema public owner to $db_user;\"'"
8c5331f9905f986b28e3c6fe278f3a4a876ccb3caea9c42341cba440295a1e71
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one
root@forum:/var/discourse# ./launcher rebuild app
Ensuring launcher is up to date
Fetching origin
Launcher is up-to-date
Stopping old container

(Jeff Atwood) #6

Yep, you will need to read what is on the screen there, “scroll up and look for error”.

Edit: check free disk space and free memory.
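A quick way to run both checks from the shell is sketched below. It assumes a Linux host with the standard df and free tools; the helper name fs_use_pct is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print how full a filesystem is, as a bare number (e.g. 33).
# `df -P` forces the POSIX output format, so the percentage is always in column 5.
fs_use_pct() {
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

echo "Disk used on /: $(fs_use_pct /)%"

# `free -m` reports RAM and swap in MiB; skipped quietly where it is unavailable.
if command -v free >/dev/null 2>&1; then
    free -m
fi
```

If the disk figure is close to 100 or the "free" and "available" memory columns are near zero, that is a likely reason for a bootstrap to fail partway through.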


(Richard Soutar) #7

I have found that the disk fills up every couple of days.

The culprit was /var/discourse/shared/standalone/log/rails/unicorn.stderr.log, which grew to 20+ GB in 9 hours. I deleted the log, but some of its contents were:

Failed to report error: EXECABORT Transaction discarded because of previous errors. 3 Job exception: MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.
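For spotting a runaway log like this one, a minimal sketch follows. The directory is the one quoted in the post; the helper names largest_files and truncate_log are hypothetical. Emptying the file in place is shown instead of deleting it because, in general, space from a deleted file is not released while a process still holds it open.

```shell
#!/bin/sh
# Path taken from the post above; override LOG_DIR for other layouts.
LOG_DIR="${LOG_DIR:-/var/discourse/shared/standalone/log/rails}"

# Hypothetical helper: list the N largest files under a directory (sizes in KiB).
largest_files() {
    find "$1" -type f -exec du -k {} + 2>/dev/null | sort -rn | head -n "${2:-5}"
}

# Hypothetical helper: empty a log in place instead of deleting it.
truncate_log() {
    : > "$1"
}

# Only runs where the directory actually exists.
if [ -d "$LOG_DIR" ]; then
    largest_files "$LOG_DIR" 5
fi
```

Before truncating, it is worth saving the last few hundred lines with tail, since they usually contain the error that is being repeated.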


(Matt Palmer) #8

… and what did the Redis logs have to say?


(Richard Soutar) #9

Sorry, I have no idea where the Redis log is.
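For anyone in the same position, one way to look for it is to search the shared log volume for anything with "redis" in its path. This is only a sketch: the directory below is inferred from the unicorn log path quoted earlier in the thread, the exact Redis log location is not stated anywhere here, and find_redis_logs is a made-up helper name.

```shell
#!/bin/sh
# Assumed location of the shared log volume, inferred from the unicorn log path
# quoted earlier in this thread. It may differ on other setups.
SHARED_LOG_DIR="${SHARED_LOG_DIR:-/var/discourse/shared/standalone/log}"

# Hypothetical helper: list regular files whose path mentions "redis".
find_redis_logs() {
    find "$1" -type f -path '*redis*' 2>/dev/null
}

# Only runs where the directory actually exists.
if [ -d "$SHARED_LOG_DIR" ]; then
    find_redis_logs "$SHARED_LOG_DIR"
fi
```

Running tail -n 50 on whichever file turns up should show why Redis could not write its snapshot.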


(Jeff Atwood) #10

It looks like there is something broken about your install. Did you follow our official install guide, to the letter? Did you deviate from any steps in the install in any way, however small?


(Richard Soutar) #11

I use a one-click installation from Digital Ocean.

It’s been running fine since December '15. This problem only started in the past 2 weeks.


(Jeff Atwood) #12

Ok, then try the following to free up some disk space:

apt-get autoclean
apt-get autoremove
cd /var/discourse
./launcher cleanup

See how much space that clears first.

Something is causing massive numbers of errors in your logs, I presume? Hard to say without looking at them. You’ll need to poke around.
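One way to poke around a very repetitive log is to count how often each distinct message occurs. The sketch below assumes Ruby Logger-style lines of the form "E, [timestamp #pid] ERROR -- : message"; that format and the helper name top_errors are assumptions, so adjust the sed pattern to whatever the log actually contains.

```shell
#!/bin/sh
# Hypothetical helper: strip the "E, [timestamp #pid] ERROR -- : " prefix,
# then count identical messages and print the most frequent ones first.
top_errors() {
    sed 's/^E, \[[^]]*\] ERROR -- : //' "$1" |
        sort | uniq -c | sort -rn | head -n "${2:-10}"
}

# Example invocation, guarded so it only runs if the file exists.
LOG_FILE="${LOG_FILE:-/var/discourse/shared/standalone/log/rails/unicorn.stderr.log}"
if [ -f "$LOG_FILE" ]; then
    top_errors "$LOG_FILE" 10
fi
```

On a multi-gigabyte file it is probably better to run this on a sample, for example the output of tail -n 100000 saved to a temporary file.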


(Richard Soutar) #13

Here’s the disk usage before cleanup:

root@forum:/var/discourse/shared/standalone/log/rails# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        40G   13G   26G  33% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
udev            991M  4.0K  991M   1% /dev
tmpfs           201M  352K  200M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none           1001M  864K 1001M   1% /run/shm
none            100M     0  100M   0% /run/user
root@forum:/var/discourse/shared/standalone/log/rails#

After cleanup

Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        40G  4.7G   33G  13% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
udev            991M  4.0K  991M   1% /dev
tmpfs           201M  352K  200M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none           1001M  868K 1001M   1% /run/shm
none            100M     0  100M   0% /run/user

I will keep monitoring the log again.


(Richard Soutar) #14

It happened again.

E, [2016-05-10T18:35:04.540407 #62] ERROR -- : config/unicorn.conf.rb:99:in `out_of_memory?'
E, [2016-05-10T18:35:04.540448 #62] ERROR -- : config/unicorn.conf.rb:122:in `check_sidekiq_heartbeat'
E, [2016-05-10T18:35:04.540489 #62] ERROR -- : config/unicorn.conf.rb:146:in `master_sleep'
E, [2016-05-10T18:35:04.540529 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/unicorn-5.0.1/lib/unicorn/http_server.rb:284:in `join'
E, [2016-05-10T18:35:04.540571 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/unicorn-5.0.1/bin/unicorn:126:in `<top (required)>'
E, [2016-05-10T18:35:04.540615 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/bin/unicorn:22:in `load'
E, [2016-05-10T18:35:04.540656 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/bin/unicorn:22:in `<main>'
E, [2016-05-10T18:35:04.540908 #62] ERROR -- : master loop error: Cannot allocate memory - ps -eo rss,args | grep sidekiq | grep -v grep | awk '{print $1}' (Errno::ENOMEM)
E, [2016-05-10T18:35:04.540975 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/activesupport-4.2.6/lib/active_support/core_ext/kernel/agnostics.rb:7:in ``'
E, [2016-05-10T18:35:04.541021 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/activesupport-4.2.6/lib/active_support/core_ext/kernel/agnostics.rb:7:in ``'
E, [2016-05-10T18:35:04.541072 #62] ERROR -- : config/unicorn.conf.rb:84:in `max_rss'
E, [2016-05-10T18:35:04.541120 #62] ERROR -- : config/unicorn.conf.rb:99:in `out_of_memory?'
E, [2016-05-10T18:35:04.541162 #62] ERROR -- : config/unicorn.conf.rb:122:in `check_sidekiq_heartbeat'
E, [2016-05-10T18:35:04.541213 #62] ERROR -- : config/unicorn.conf.rb:146:in `master_sleep'
E, [2016-05-10T18:35:04.541264 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/unicorn-5.0.1/lib/unicorn/http_server.rb:284:in `join'
E, [2016-05-10T18:35:04.541309 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/unicorn-5.0.1/bin/unicorn:126:in `<top (required)>'
E, [2016-05-10T18:35:04.541367 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/bin/unicorn:22:in `load'
E, [2016-05-10T18:35:04.541412 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/bin/unicorn:22:in `<main>'
E, [2016-05-10T18:35:04.541663 #62] ERROR -- : master loop error: Cannot allocate memory - ps -eo rss,args | grep sidekiq | grep -v grep | awk '{print $1}' (Errno::ENOMEM)
E, [2016-05-10T18:35:04.541729 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/activesupport-4.2.6/lib/active_support/core_ext/kernel/agnostics.rb:7:in ``'
E, [2016-05-10T18:35:04.541775 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/activesupport-4.2.6/lib/active_support/core_ext/kernel/agnostics.rb:7:in ``'
E, [2016-05-10T18:35:04.541819 #62] ERROR -- : config/unicorn.conf.rb:84:in `max_rss'
E, [2016-05-10T18:35:04.541862 #62] ERROR -- : config/unicorn.conf.rb:99:in `out_of_memory?'
E, [2016-05-10T18:35:04.541908 #62] ERROR -- : config/unicorn.conf.rb:122:in `check_sidekiq_heartbeat'
E, [2016-05-10T18:35:04.541950 #62] ERROR -- : config/unicorn.conf.rb:146:in `master_sleep'
E, [2016-05-10T18:35:04.541991 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/unicorn-5.0.1/lib/unicorn/http_server.rb:284:in `join'
E, [2016-05-10T18:35:04.542033 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/unicorn-5.0.1/bin/unicorn:126:in `<top (required)>'
E, [2016-05-10T18:35:04.542076 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/bin/unicorn:22:in `load'
E, [2016-05-10T18:35:04.542117 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/bin/unicorn:22:in `<main>'
E, [2016-05-10T18:35:04.542359 #62] ERROR -- : master loop error: Cannot allocate memory - ps -eo rss,args | grep sidekiq | grep -v grep | awk '{print $1}' (Errno::ENOMEM)
E, [2016-05-10T18:35:04.542426 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/activesupport-4.2.6/lib/active_support/core_ext/kernel/agnostics.rb:7:in ``'
E, [2016-05-10T18:35:04.542472 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/activesupport-4.2.6/lib/active_support/core_ext/kernel/agnostics.rb:7:in ``'
E, [2016-05-10T18:35:04.542515 #62] ERROR -- : config/unicorn.conf.rb:84:in `max_rss'
E, [2016-05-10T18:35:04.542557 #62] ERROR -- : config/unicorn.conf.rb:99:in `out_of_memory?'
E, [2016-05-10T18:35:04.542597 #62] ERROR -- : config/unicorn.conf.rb:122:in `check_sidekiq_heartbeat'
E, [2016-05-10T18:35:04.542638 #62] ERROR -- : config/unicorn.conf.rb:146:in `master_sleep'
E, [2016-05-10T18:35:04.542684 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/unicorn-5.0.1/lib/unicorn/http_server.rb:284:in `join'
E, [2016-05-10T18:35:04.542732 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/unicorn-5.0.1/bin/unicorn:126:in `<top (required)>'
E, [2016-05-10T18:35:04.542776 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/bin/unicorn:22:in `load'
E, [2016-05-10T18:35:04.542817 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/bin/unicorn:22:in `<main>'
E, [2016-05-10T18:35:04.543081 #62] ERROR -- : master loop error: Cannot allocate memory - ps -eo rss,args | grep sidekiq | grep -v grep | awk '{print $1}' (Errno::ENOMEM)
E, [2016-05-10T18:35:04.543158 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/activesupport-4.2.6/lib/active_support/core_ext/kernel/agnostics.rb:7:in ``'
E, [2016-05-10T18:35:04.543207 #62] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/activesupport-4.2.6/lib/active_support/core_ext/kernel/agnostics.rb:7:in ``'
E, [2016-05-10T18:35:04.543251 #62] ERROR -- : config/unicorn.conf.rb:84:in `max_rss'
E, [2016-05-10T18:35:04.543293 #62] ERROR -- : config/unicorn.conf.rb:99:in `out_of_memory?'
E, [2016-05-10T18:35:04.543334 #62] ERROR -- : config/unicorn.conf.rb:122:in `check_sidekiq_heartbeat'
E, [2016-05-10T18:35:04.543375 #62] ERROR -- : config/unicorn.conf.rb:146:in `master_sleep'

This message repeats in an infinite loop.


(Jeff Atwood) #15

Are you using any third-party plugins? Any unusual settings? I am not aware of any persistent out of memory issues on our Docker based DO installs.

Do you have a swapfile? It shouldn’t matter except for upgrades, but given the problems you’re having, I would set a 2GB swapfile up quickly.
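Setting up the swapfile can be sketched as follows. This assumes a Linux host with dd, mkswap and swapon available and must be run as root; the path /swapfile and the helper names create_swapfile and enable_swapfile are illustrative choices, not something prescribed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: create and format a swapfile of SIZE_MB mebibytes.
create_swapfile() {
    path=$1
    size_mb=$2
    # dd writes real blocks, which avoids sparse-file problems with swap.
    dd if=/dev/zero of="$path" bs=1048576 count="$size_mb" 2>/dev/null || return 1
    # Swap contents must not be readable by other users.
    chmod 600 "$path" || return 1
    mkswap "$path"
}

# Hypothetical helper: turn the swapfile on and keep it across reboots (root only).
enable_swapfile() {
    swapon "$1" || return 1
    echo "$1 none swap sw 0 0" >> /etc/fstab
}
```

For the 2GB suggested here, that would be `create_swapfile /swapfile 2048 && enable_swapfile /swapfile`, after which free -m should show a non-zero swap total.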


(Richard Soutar) #16

No, I did not install any third-party plugins. I do not have a swapfile. I’ll set one up now.


(Sam Saffron) #17

Please try

cd /var/discourse
./launcher rebuild app
./launcher cleanup

We had a rather urgent fix for a tagging-related bug that could cause disks to fill up.