Discourse in Docker goes down automatically

Hi everyone. My problem: my Discourse forum goes down automatically, and sometimes I also get a 502 Bad Gateway. Here is my unicorn.stderr.log:

D, [2020-07-15T16:29:57.037389 #32767] DEBUG -- : waiting 16.0s after suspend/hibernation
E, [2020-07-15T18:49:48.649399 #32767] ERROR -- : worker=0 PID:8593 timeout (31s > 30s), killing
E, [2020-07-15T18:49:50.220209 #32767] ERROR -- : reaped #<Process::Status: pid 8593 SIGKILL (signal 9)> worker=0
E, [2020-07-15T18:50:25.881312 #32767] ERROR -- : worker=2 PID:13929 timeout (31s > 30s), killing
E, [2020-07-15T18:50:25.881493 #32767] ERROR -- : worker=1 PID:32508 timeout (31s > 30s), killing
E, [2020-07-15T18:50:25.949739 #32767] ERROR -- : reaped #<Process::Status: pid 13929 SIGKILL (signal 9)> worker=2
E, [2020-07-15T18:50:25.949869 #32767] ERROR -- : reaped #<Process::Status: pid 32508 SIGKILL (signal 9)> worker=1
I, [2020-07-15T18:51:00.385865 #19149]  INFO -- : worker=0 ready
I, [2020-07-15T18:51:00.385899 #19193]  INFO -- : worker=2 ready
I, [2020-07-15T18:51:00.385899 #19189]  INFO -- : worker=1 ready
E, [2020-07-15T18:51:44.033303 #32767] ERROR -- : worker=2 PID:19193 timeout (31s > 30s), killing
E, [2020-07-15T18:51:44.051941 #32767] ERROR -- : reaped #<Process::Status: pid 19193 SIGKILL (signal 9)> worker=2
I, [2020-07-15T18:51:49.476608 #19302]  INFO -- : worker=2 ready
E, [2020-07-15T18:51:55.064179 #32767] ERROR -- : worker=1 PID:19189 timeout (31s > 30s), killing
E, [2020-07-15T18:51:55.085863 #32767] ERROR -- : reaped #<Process::Status: pid 19189 SIGKILL (signal 9)> worker=1
I, [2020-07-15T18:52:00.812373 #19324]  INFO -- : worker=1 ready

That means your web process is taking over 30s to respond. Can you remove all custom plugins and rebuild?
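To see how often this is happening, you can count the timeout kills in the log. A minimal sketch (the helper name `count_timeout_kills` is mine, and the 31s/30s message text is taken from the log lines above):

```shell
count_timeout_kills() {
  # Count lines where the unicorn master killed a worker for exceeding the 30s timeout.
  # $1 is the path to unicorn.stderr.log.
  grep -c 'timeout (31s > 30s), killing' "$1"
}
```

For example, `count_timeout_kills /var/discourse/shared/standalone/log/rails/unicorn.stderr.log` (the default log path in a standard discourse_docker install; adjust if yours differs).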


I started `./launcher rebuild app`.
The only plugin installed is Docker Manager.

What is your server? Is it very slow? How much RAM? Do you have SSDs or spinning disks? How big is your database?
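A quick sketch of commands to gather those details on the host (Linux assumed; the device name `sda` is a placeholder for your actual disk):

```shell
nproc                                    # number of CPU cores
df -h /                                  # disk size and usage of /
head -3 /proc/meminfo                    # total and available RAM
# 1 = spinning disk, 0 = SSD; replace sda with your actual block device.
cat /sys/block/sda/queue/rotational 2>/dev/null || true
```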


The system is working normally. Here is the information:

CPU: i3, 4 cores, 50% usage
Disk usage of /: 7.9% of 1.79 TB
Memory usage: 61% of 8 GB
Swap usage: 19% of 4 GB

The app rebuild is done.

 new_subscriber_thread'"] 
I, [2020-07-15T19:56:10.094624 #72]  INFO -- : Refreshing Gem list
I, [2020-07-15T19:56:41.824138 #72]  INFO -- : listening on addr=127.0.0.1:3000 fd=9
I, [2020-07-15T19:57:06.077895 #72]  INFO -- : master process ready
I, [2020-07-15T19:57:17.979526 #229]  INFO -- : worker=2 ready
I, [2020-07-15T19:57:17.979509 #218]  INFO -- : worker=1 ready
I, [2020-07-15T19:57:17.979637 #241]  INFO -- : worker=3 ready
I, [2020-07-15T19:57:17.979868 #211]  INFO -- : worker=0 ready

My problem still continues.

tail -100 unicorn.stderr.log

    I, [2020-07-16T07:51:49.785061 #72] INFO -- : master done reopening logs
    I, [2020-07-16T07:52:05.423701 #18420] INFO -- : worker=3 done reopening logs
    I, [2020-07-16T07:52:05.439574 #10177] INFO -- : worker=2 done reopening logs
    I, [2020-07-16T07:52:06.614121 #11282] INFO -- : worker=1 done reopening logs
    I, [2020-07-16T07:52:06.626403 #30350] INFO -- : worker=0 done reopening logs
    E, [2020-07-16T13:43:49.118620 #72] ERROR -- : worker=1 PID:11282 timeout (31s > 30s), killing
    E, [2020-07-16T13:43:49.325644 #72] ERROR -- : reaped #<Process::Status: pid 11282 SIGKILL (signal 9)> worker=1
    D, [2020-07-16T13:44:19.448200 #72] DEBUG -- : waiting 16.0s after suspend/hibernation
    I, [2020-07-16T13:44:31.441735 #10639] INFO -- : worker=1 ready
    E, [2020-07-16T14:24:40.454209 #72] ERROR -- : worker=1 PID:10639 timeout (31s > 30s), killing
    E, [2020-07-16T14:24:40.611580 #72] ERROR -- : reaped #<Process::Status: pid 10639 SIGKILL (signal 9)> worker=1
    D, [2020-07-16T14:25:10.744135 #72] DEBUG -- : waiting 16.0s after suspend/hibernation
    I, [2020-07-16T14:25:14.973408 #13472] INFO -- : worker=1 ready
    E, [2020-07-16T16:03:01.918109 #72] ERROR -- : worker=2 PID:10177 timeout (31s > 30s), killing
    E, [2020-07-16T16:03:02.200133 #72] ERROR -- : reaped #<Process::Status: pid 10177 SIGKILL (signal 9)> worker=2
    I, [2020-07-16T16:03:51.690756 #20266] INFO -- : worker=2 ready
    E, [2020-07-16T18:29:27.607372 #72] ERROR -- : worker=1 PID:13472 timeout (31s > 30s), killing
    E, [2020-07-16T18:29:27.831050 #72] ERROR -- : reaped #<Process::Status: pid 13472 SIGKILL (signal 9)> worker=1
    I, [2020-07-16T18:29:59.339086 #30397] INFO -- : worker=1 ready
    E, [2020-07-16T18:51:56.470192 #72] ERROR -- : worker=0 PID:30350 timeout (31s > 30s), killing
    E, [2020-07-16T18:51:57.004078 #72] ERROR -- : reaped #<Process::Status: pid 30350 SIGKILL (signal 9)> worker=0
    I, [2020-07-16T18:52:43.150079 #31968] INFO -- : worker=0 ready
    D, [2020-07-16T19:13:52.263197 #72] DEBUG -- : waiting 16.0s after suspend/hibernation

Could you answer the rest of Jay’s questions?

Is this on SSD? 2TB suggests that this might be a conventional spinning SATA disk, which is going to be too slow to use with Discourse.

Yes, it's a 2 TB SATA disk. It normally works fast, but the forum still goes down.

SSD is the minimum and is documented in the Discourse requirements. You're going to need an SSD; we can't help you if you are on a spinning drive.

Can you enter the container and tail some other logs?

My bet is that PostgreSQL is failing to start, start looking into that.
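A minimal sketch of how to get in and check the relevant logs, assuming a standard `/var/discourse` install (these are the default paths used by discourse_docker; adjust if yours differ):

```shell
cd /var/discourse
./launcher enter app                     # opens a shell inside the running app container

# Inside the container:
tail -100 /var/log/nginx/error.log       # nginx errors (where the 502s are reported)
tail -100 /var/log/postgres/current      # PostgreSQL log, to see if it is failing or restarting
```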


Hi, which log file should I look at?

If it helps, the Discourse server I help administer has been getting 502 Bad Gateway errors for about a month now. Both the server and I are located in Germany. It can't be a recent Discourse regression, because we haven't upgraded in months. We run on a really basic hosted platform contract. The server is also now really slow when it does successfully connect. I didn't have a good explanation for this degraded service and assumed it was simply our cheap plan, but after reading this thread, perhaps there are other explanations? R.

Thanks for the answer.
The server has been moved to an SSD, and the problem is solved.


Hi! Can you tell me whether using this type of hard drive can improve performance? Thank you.

An SSD is much faster than spinning magnetic platters. It's widely recognized that an SSD is required, though I'm aware of one largish site that used magnetic disks. That resulted in at least one change to core to support it, and it took weeks to get configured. If you do use magnetic disks, you'll need more RAM to provide more cache. It's really not recommended.
