Unicorn log files using up all the space on my box

Hey all,

I can’t find a thread about this. I installed a fresh Discourse on a box last week, and today I noticed that all available disk space was used up. Looking further, the largest file on the box was a 1.3 GB log file.

ubuntu@<hostname>:/var/discourse/shared/web-only/log/rails$ ls -lrt
total 1314412
-rw-r--r-- 1 ubuntu www-data          0 Oct 20 21:41 production_errors.log
-rw-r--r-- 1 ubuntu www-data      73182 Oct 21 22:29 production.log-20151022
-rw-r--r-- 1 ubuntu www-data       1494 Oct 22 03:10 unicorn.stdout.log-20151022
-rw-r--r-- 1 ubuntu www-data          0 Oct 22 07:35 production.log
-rw-r--r-- 1 ubuntu www-data          0 Oct 22 07:35 unicorn.stdout.log
-rw-r--r-- 1 ubuntu www-data        538 Oct 22 07:35 unicorn.stderr.log-20151022.gz
-rw-r--r-- 1 ubuntu www-data 1345859584 Oct 23 07:37 unicorn.stderr.log-20151023
-rw-r--r-- 1 ubuntu www-data          0 Oct 26 18:24 unicorn.stderr.log

I didn’t have enough free disk space left to open and read the logs. I’m currently on version v1.5.0.beta3 +19.
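(For anyone who lands here later: a few stream-based commands will show what’s eating the disk without needing to open anything in an editor. The path below assumes the standard Docker install layout from my listing above; adjust if yours differs.)

```shell
# how full is the disk?
df -h /

# standard Docker install log path
LOG_DIR=/var/discourse/shared/web-only/log/rails

# ten largest files under the log directory
du -ah "$LOG_DIR" | sort -rh | head -n 10

# peek at the end of a huge rotated log without loading it into an editor
tail -n 50 "$LOG_DIR/unicorn.stderr.log-20151023"
```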

Am I missing some config that rotates these logs, or is this a bug?

I think there is something wrong with your install. Did you follow our official guide?

Yup, as close as I could, using discourse docker. It completed successfully, and it looks like Discourse booted up just fine.

I had to run ./launcher bootstrap app multiple times due to bad database configs, not sure if that changes anything.

The only difference from the official guide should be that this is running on EC2 instead of Digital Ocean. Am on Ubuntu 14.04.

You will need to look at that Unicorn error log. Something is broken in your install.

Unfortunately, I’ve already deleted it. I’ll monitor this instance to see if this happens again.

What’s the process that rotates these logs? Is there anything I can look into in the meantime that could be busted?

Here’s the log from the previous day, if that reveals anything.

I, [2015-10-21T20:36:03.152600 #50]  INFO -- : Refreshing Gem list
I, [2015-10-21T20:36:08.416051 #50]  INFO -- : listening on addr=0.0.0.0:3000 fd=15
Loading Sidekiq in process id 73
I, [2015-10-21T20:36:10.588168 #142]  INFO -- : worker=0 ready
I, [2015-10-21T20:36:11.578290 #278]  INFO -- : worker=1 ready
I, [2015-10-21T20:36:12.571632 #50]  INFO -- : master process ready
I, [2015-10-21T20:36:12.583514 #305]  INFO -- : worker=2 ready
find: `/var/www/discourse/public/tombstone/default': No such file or directory
E, [2015-10-21T22:12:52.377261 #50] ERROR -- : reaped #<Process::Status: pid 73 SIGKILL (signal 9)> worker=unknown
E, [2015-10-21T22:12:52.402294 #50] ERROR -- : reaped #<Process::Status: pid 278 SIGKILL (signal 9)> worker=1
E, [2015-10-21T22:12:52.402356 #50] ERROR -- : reaped #<Process::Status: pid 305 SIGKILL (signal 9)> worker=2
Detected dead worker 73, restarting...
I, [2015-10-21T22:12:58.139406 #7622]  INFO -- : worker=1 ready
I, [2015-10-21T22:12:58.139752 #7626]  INFO -- : worker=2 ready
Loading Sidekiq in process id 7632
D, [2015-10-21T22:13:27.161680 #50] DEBUG -- : waiting 16.0s after suspend/hibernation
I, [2015-10-22T07:35:12.745462 #50]  INFO -- : master reopening logs...
I, [2015-10-22T07:35:12.812351 #142]  INFO -- : worker=0 reopening logs...
I, [2015-10-22T07:35:12.817665 #7622]  INFO -- : worker=1 reopening logs...
I, [2015-10-22T07:35:12.821608 #7626]  INFO -- : worker=2 reopening logs...

Yes, we have log rotation configured, and it did work in this case – the huge file wasn’t the one “currently” in use, but rather was rotated a few days ago. Something was causing Unicorn to puke up huge amounts of error logs for less than a day, and then it stopped (could have been because the disk filled up, I guess). Without knowing what it was, it’s hard to give a solid diagnosis, but if the problem isn’t happening now, I wouldn’t worry about it. If it starts happening again, keep the logs!

If that’s the case, then it’s kind of alarming that a single day of logging can use up all my disk space. It brought the forum down, and nginx started responding with 502s. Perhaps you could also rotate the logs based on file size, to avoid these kinds of situations?

Like I said, there was something catastrophically wrong with your specific install. Never seen this happen before across hundreds of installs…

If we had the log file, we could look, but… we don’t

Ok… interesting.

I’ll report back if this happens again. I’m planning on re-doing the install.

Logrotate still only runs once a day. Size-based rotation wouldn’t have saved you here. Perhaps you could monitor your server for free space, so you could catch these kinds of situations?
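For what it’s worth, logrotate does support size triggers via its maxsize directive. A sketch of what that could look like for these logs (not a drop-in fix: the Discourse container manages its own rotation config, and maxsize is only checked when logrotate actually runs, so a daily cron still can’t catch a file that balloons within hours):

```
/var/discourse/shared/web-only/log/rails/*.log {
    daily
    maxsize 100M    # rotate early if the file grows past 100 MB
    rotate 7
    compress
    missingok
    notifempty
}
```

To make the size check bite sooner you’d also have to run logrotate more often than the default daily cron, e.g. hourly.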

I’m not too familiar with logrotate, as I don’t use it, but it seems like you might be able to configure it to run hourly. I rotate logs on my own servers based on size, but I’ve always used my own logging module within my server rather than a separate process.

Anyway. If this has never happened to anybody else before I can understand that it’s not a big deal, but I will report back if it does happen to me again, plus whatever was being logged out.

Ok, I’ve got the same issue occurring on my Discourse install. I followed the Docker install instructions originally and everything seemed fine, but this trouble started recently. I did have several beta updates fall over partway through, which I had to cancel.

Is doing an upgrade to the latest version likely to resolve this problem (I’ll be doing it via SSH as that’s always been successful for me), or am I looking at backing up my config file and reinstalling?

What does the unicorn log file actually say, if it is so large?

In short, I’m struggling to find out, as nano/vi won’t actually open a file that big.

Try tailing the file?
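For a multi-gigabyte log, anything that streams the file instead of loading it whole will work, for example:

```shell
# last 200 lines of the big log
tail -n 200 unicorn.stderr.log

# follow it live, to catch the flood while it is happening
tail -f unicorn.stderr.log

# or page through it from the top; less streams the file
# rather than loading it all the way nano/vi would
less unicorn.stderr.log
```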

I’ve just deleted the log again in order to get the forum back up, but it seems to suddenly leap in size about once every 2 days, so I’ll do that next time. Thanks for the tip!

I have also encountered this issue, and it seems to be caused by running out of memory. 1 GB of RAM plus 2 GB of swap doesn’t seem to be enough for hosting Discourse; I have since upgraded to 2 GB of RAM.

The tail of the error log:

E, [2016-04-19T07:53:47.411276 #54] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/activesupport-4.2.6/lib/active_support/core_ext/kernel/agnostics.rb:7:in ``'
E, [2016-04-19T07:53:47.411301 #54] ERROR -- : config/unicorn.conf.rb:84:in `max_rss'
E, [2016-04-19T07:53:47.411325 #54] ERROR -- : config/unicorn.conf.rb:99:in `out_of_memory?'
E, [2016-04-19T07:53:47.411349 #54] ERROR -- : config/unicorn.conf.rb:122:in `check_sidekiq_heartbeat'
E, [2016-04-19T07:53:47.411381 #54] ERROR -- : config/unicorn.conf.rb:146:in `master_sleep'
E, [2016-04-19T07:53:47.495358 #54] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/unicorn-5.0.1/lib/unicorn/http_server.rb:284:in `join'
E, [2016-04-19T07:53:47.495412 #54] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/unicorn-5.0.1/bin/unicorn:126:in `<top (required)>'
E, [2016-04-19T07:53:47.495464 #54] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/bin/unicorn:22:in `load'
E, [2016-04-19T07:53:47.495489 #54] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.0.0/bin/unicorn:22:in `<main>'
I, [2016-04-19T07:53:47.495527 #54]  INFO -- : master reopening logs...
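That backtrace is Discourse’s own memory watchdog in config/unicorn.conf.rb (max_rss / out_of_memory?) firing, which fits the low-memory theory. A couple of quick host-side checks (a sketch; dmesg may need root and its output format varies by kernel):

```shell
# current memory and swap usage, in megabytes
free -m

# did the kernel OOM killer fire too? (may require root)
dmesg | grep -i 'out of memory'
```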

Right, it finally exploded when I had time to do some more diagnostics. I piped the last 5000 lines from unicorn.stderr.log to a new file:

unicornsample.txt (443.8 KB)

Failing anything obvious in the error log, is there a straightforward way to reinstall the software without touching the database and the config file?

Hmm, from what I can see in the logs it looks like it might be related to Redis, but the instructions I can see for reinstalling that are definitely not part of what I did to set up Discourse!