I have a Discourse install running on a GCE server. Users have reported that the site randomly returns 502 errors. I can reproduce this by clicking through the Latest, New, Unread, Top, and Categories links. Sooner or later one of them returns a 502 error.
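In case it helps anyone reproduce this without clicking by hand, here is a rough sketch of how the same walk through the links could be scripted. The route paths are my assumption based on the usual Discourse URL scheme, `forum.example.com` is a placeholder, and anonymous requests to `/new` and `/unread` may not behave the same as a logged-in click, so treat the counts as indicative only.

```python
import urllib.error
import urllib.request
from collections import Counter

# Routes behind the Latest, New, Unread, Top, and Categories links.
# Assumption: these follow the usual Discourse URL scheme.
ROUTES = ["/latest", "/new", "/unread", "/top", "/categories"]


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (including 502) are raised as HTTPError.
        return err.code


def probe(base_url, rounds, fetch=fetch_status):
    """Request every route `rounds` times and count status codes per route."""
    base = base_url.rstrip("/")
    results = {route: Counter() for route in ROUTES}
    for _ in range(rounds):
        for route in ROUTES:
            results[route][fetch(base + route)] += 1
    return results


def report(results):
    """Print one line per route with the status codes seen."""
    for route, counts in results.items():
        print(route, dict(counts))
```

Something like `report(probe("https://forum.example.com", 50))` would then show which routes returned a 502 and how often. The `fetch` parameter is only there so the loop can be exercised without touching the network.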
I have checked the logs on my proxy server, and it logs entries like this for the failed URLs:
“upstream prematurely closed connection while reading response header from upstream”. There is a very large number of these errors, for seemingly random URLs.
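To check whether the URLs really are random, the entries could be tallied per request path and per minute. This is a minimal sketch that assumes the proxy is nginx writing its default error log layout (a leading `YYYY/MM/DD HH:MM:SS` timestamp and a `request: "GET /path HTTP/1.1"` field); the function name is hypothetical and the patterns would need adjusting for a different proxy or log format.

```python
import re
from collections import Counter

MESSAGE = ("upstream prematurely closed connection "
           "while reading response header from upstream")

# Assumption: default nginx error log layout.
REQUEST_RE = re.compile(r'request: "[A-Z]+ (?P<path>\S+)')
MINUTE_RE = re.compile(r"^(?P<minute>\d{4}/\d{2}/\d{2} \d{2}:\d{2})")


def tally_upstream_closes(lines):
    """Count matching error entries by request path and by minute."""
    by_path = Counter()
    by_minute = Counter()
    for line in lines:
        if MESSAGE not in line:
            continue
        req = REQUEST_RE.search(line)
        ts = MINUTE_RE.match(line)
        # Drop the query string so /latest?page=2 counts as /latest.
        path = req.group("path").split("?")[0] if req else "(unknown)"
        by_path[path] += 1
        by_minute[ts.group("minute") if ts else "(unknown)"] += 1
    return by_path, by_minute
```

Passing an open log file to `tally_upstream_closes` and printing `by_path.most_common(10)` and `by_minute.most_common(10)` should show whether the failures cluster on particular routes or in bursts at particular times.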
Here are the steps I have taken to try to solve the problem, based on posts I have seen:
- Upgraded the OS
- Upgraded Docker
- Upgraded Discourse
- Rebooted the server
The original install was done using the Docker Cloud Setup guide. I then followed a guide to switch backups and images to use S3.
My server is running:
Ubuntu 14.04.6 LTS (GNU/Linux 4.4.0-148-generic x86_64)
Per discourse-doctor:
DOCKER VERSION: Docker version 18.06.3-ce, build d7080c1
==================== MEMORY INFORMATION ====================
RAM (MB): 4820
             total       used       free     shared    buffers     cached
Mem:          4707       2206       2501        140        101        948
-/+ buffers/cache:       1156       3550
Swap:         2047          0       2047
==================== DISK SPACE CHECK ====================
---------- OS Disk Space ----------
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        50G   33G   15G  70% /
/dev/sda1        50G   33G   15G  70% /var/lib/docker
==================== DISK INFORMATION ====================
Disk /dev/sda: 53.7 GB, 53687091200 bytes
255 heads, 63 sectors/track, 6527 cylinders, total 104857600 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *       16065   104856254    52420095   83  Linux
Partition 1 does not start on physical sector boundary.
==================== END DISK INFORMATION ====================
I have run top and watched the CPU and memory numbers, and I’m not seeing anything concerning. I’ve also looked in the logs and am not seeing anything that points me to the problem.
Are there any other details I can provide to help troubleshoot the issue? What steps should I take to track this down?
Thank you,
Stephen