Discourse updated from admin, not working after rebuild. Fatal 502 error after rebuild

My current website Discourse was on image

I did an admin upgrade when it showed me this:

The admin upgrade from the UI worked fine and no issues were found. But after that, when I did a rebuild and went to my site in the browser, it said "Host not found" and the site was not working. Now the site breaks every time we do a rebuild, and we are not able to bring it back.

I had an older DO droplet snapshot, so I restored from it, and now the site is up. But when I try to restore a backup, it gives this error:

When I go to the admin/upgrade page, it shows this:

We have now tried a rebuild again, and again the site is not working.

It looks like the Discourse team made some code changes on 14 and 15 Aug 2021 that are causing this fatal bug. Here are the rebuild logs:

We also did a fresh installation of Discourse in a new folder and ran it on a new port to see if it would work. We did a rebuild there too, but the site still did not come up.

Please help us resolve this, our site is down. It stopped working after the first rebuild, and we are not able to restore it from our old backup.

Our site is down, https://howtodiscuss.com/

This issue is happening on the latest Discourse version: after upgrading from the GUI, when we do a rebuild, the site never works again.

Installed: 2.8.0.beta4 (0e53769f71)

Also, the only change I made in app.yml, 3 weeks ago, was to increase these:

db_shared_buffers: "4096MB"
db_work_mem: "400MB"
UNICORN_WORKERS: 16

because I moved to an AMD droplet with 16 GB RAM and 8 CPUs.

At that time the rebuild worked fine, but now whenever I do a rebuild, the site stops working.

I also ran ./discourse-doctor a few times, but it did not help.

Also, our DO droplet's CPU usage is too high, even though the site is currently on a dedicated Intel droplet with 8 CPUs, 16 GB RAM and a 200 GB SSD.

How can we scale to make efficient use of CPU and RAM? Should we upgrade to more physical CPU cores, or should we tweak these settings in our app.yml?

Our current app.yml file looks like this:

What would be the ideal settings for a server with our specs? Or should we upgrade the hardware, and if so, which app.yml settings should we use to handle the high traffic coming to our site and still make it load faster?

The first thing I would do is switch Cloudflare to DNS only (gray cloud).

Yeah, but I don’t think that’s the problem. The site isn’t loading even on localhost.

We have tried that, and the site still does not work: it says the site cannot be reached. I have even tried going to my dropletIP:PORT, but it still says the site can’t be reached.

I do remember that when everything was working fine, I ran the command below to measure my site's rebuild time, based on a suggestion from @pfaffman that I saw somewhere on Meta:

time ./launcher rebuild app

The rebuild completed fine, but since then my site has not been loading, and I have tried every solution without success. The site you see now is running from an older droplet snapshot with the latest backup restored. I have upgraded it in the UI to the latest Discourse version, but whenever I do a rebuild, the site stops loading, so I am afraid of doing another rebuild.

I did more research, and my problem might be the same as "502 Error after upgrading from 2.8.0 beta2 to beta because of cloudflare template".
Can someone confirm whether it is the same problem, and update us once the PR is merged and the fix is out so we can try a rebuild again?

What should I do?

I did move to an AMD CPU as per your advice, but CPU usage was high again, maybe because the site traffic is increasing. Might I need to move to a 12 or 16 core CPU on DO with 16 or 32 GB RAM?

Could you show us the errors logged during a console rebuild (not restore) attempt? I assume you run something else on your host, as there are PHP and nginx? If you don’t use sockets, could you stop nginx temporarily just to be sure it doesn’t interfere?

P.S. As for Google PageSpeed, I’m sorry, I’ve no idea, but if I remember correctly it had something to do with too many ads and/or some plugin that isn't optimized for that kind of heavy use?

What is the best command to get the rebuild logs and all the error logs of my Discourse?

Or do I have to manually copy and paste all the logs from the terminal output into a raw file at the time of the rebuild?
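
For what it's worth, there is usually no need to copy the output by hand. A minimal sketch, assuming a standard install where the launcher lives in /var/discourse (that path and the helper name run_and_log are assumptions, not something from this thread): pipe the command through tee so the output is shown and saved at the same time.

```shell
# Run a command and save everything it prints (stdout and stderr) to a file,
# while still showing it in the terminal.
run_and_log() {
    log_file="$1"
    shift
    "$@" 2>&1 | tee "$log_file"
}

# Assumed usage on a standard install (the /var/discourse path is an assumption):
#   cd /var/discourse
#   run_and_log rebuild.log ./launcher rebuild app
#   ./launcher logs app > container.log 2>&1
```

The resulting files can then be attached or pasted here. Note that the exit status you get back is the one from tee, not from the wrapped command.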

PHP is only used to serve AMP pages and everything is served via NGINX reverse proxy.

Regarding speed, are you suggesting I should not use any plugins? My server's CPU usage was very high on the AMD droplet with 16 GB RAM and 8 shared CPU cores.

Now my live site is running on a dedicated Intel droplet with 8 vCPUs and 16 GB RAM.

What settings should I use in app.yml for

db_shared_buffers: "4096MB"
db_work_mem: "400MB"
UNICORN_WORKERS: 16

and do I need to increase my physical server specs further? If yes, RAM or CPU? My site has high traffic and is growing, so I need to scale it. Or is there any other change I should make in my app.yml file to speed the site up more?

Is there any way to add clustering or parallel threading to Discourse so that it makes maximum use of my server specs and load balances correctly to cope with heavy traffic?

Oh no, I didn't mean no plugins at all! But you might want to pick ones that are not too resource intensive.

…that would be the enterprise level of service that Discourse Enterprise provides, I think.

@Faizan_Zahid, you asked whether this issue was likely the cause of your bug here. The answer is yes: that was most likely the cause.

You also asked when a fix would be released. I did author a fix as a pull request (PR), but I have no affiliation with Discourse; it was just something I implemented to fix my own site and figured I’d suggest publicly as a patch. The PR has since been accepted by the Discourse team and is now live.

I ran ./launcher logs app and got the output below:

nginx: [emerg] host not found in set_real_ip_from "131.0.72.0/222400:cb00::/32" in /etc/nginx/conf.d/discourse.conf:87

But maybe they have solved it here, so I am going to try a rebuild again and see.

Also, I am on a dedicated droplet with 16 GB RAM and 8 vCPUs. Should I keep the settings below in app.yml?

db_shared_buffers: "4096MB"
db_work_mem: "400MB"
UNICORN_WORKERS: 16

@Benjamin_D are these settings okay for my droplet specs?
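
A quick sanity check on the first value, as a sketch only. It assumes the rule of thumb, which as far as I recall comes from the comments in the sample app.yml, that db_shared_buffers should be at most 25% of total memory. It says nothing about db_work_mem or UNICORN_WORKERS.

```shell
# Total RAM in MB; 16 GB is the droplet size mentioned above.
ram_mb=16384

# Assumed rule of thumb: shared buffers capped at 25% of total memory.
shared_buffers_mb=$((ram_mb / 4))

echo "db_shared_buffers: \"${shared_buffers_mb}MB\""
```

By that rule, "4096MB" is right at the upper limit for a 16 GB machine, so it would not make sense to raise it further without adding RAM.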

That issue happened on my non-Discourse site. Cloudflare's IPs can be retrieved as plain text from the URLs they expose, but since a week or so ago the response no longer ends with a newline, so the last IPv4 range gets concatenated with the first IPv6 range: "131.0.72.0/22" followed by "2400:cb00::/32" becomes the "131.0.72.0/222400:cb00::/32" in your log. (I fixed it on my non-Discourse site by adding a newline in the script that concatenates them.)

You probably can't access Discourse even after disabling the Cloudflare proxy because you have the Cloudflare template, which blocks IPs that are not from Cloudflare, and nginx itself is hitting an error and not starting. If you remove the Cloudflare template, disable the proxy (DNS only) and rebuild, it should work, but I haven’t tried.

It should also work with the fix in the topic you mentioned. In the meantime, you may be able to enter the container, edit the nginx configuration file that holds the Cloudflare IPs (with nano or vi) to add the missing newline, and then restart nginx.
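
To illustrate the kind of fix I mean, here is a minimal sketch. It is not the actual script from my site or from the Discourse template; the function name join_lists and the file names in the usage note are made up for the example.

```shell
# Concatenate list files, adding a newline after any file that lacks one,
# so the last entry of one list cannot fuse with the first entry of the next.
join_lists() {
    for list_file in "$@"; do
        cat "$list_file"
        # $(...) strips a trailing newline, so this is non-empty only when
        # the file's last byte is something other than a newline.
        if [ -n "$(tail -c 1 "$list_file")" ]; then
            printf '\n'
        fi
    done
}
```

With the IPv4 list saved as ips-v4.txt and the IPv6 list as ips-v6.txt (hypothetical names), `join_lists ips-v4.txt ips-v6.txt` gives one range per line whether or not the downloaded files end with a newline.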
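
A rough sketch of that manual workaround, untested on my side. The helper name find_fused_entries is made up, and the restart command assumes the container manages nginx with runit; the config path is the one from the error message in the log.

```shell
# Print (with line numbers) any set_real_ip_from entry where an IPv4 CIDR
# runs straight into an IPv6 address, e.g. 131.0.72.0/222400:cb00::/32.
find_fused_entries() {
    grep -nE 'set_real_ip_from[[:space:]]+[0-9.]+/[0-9]+:' "$1"
}

# Assumed manual steps, run from the Discourse install directory on the host:
#   ./launcher enter app
#   # inside the container, paste the function above (or run the grep directly):
#   find_fused_entries /etc/nginx/conf.d/discourse.conf
#   # split the reported line into two set_real_ip_from lines with nano or vi
#   nginx -t            # check that the configuration parses
#   sv restart nginx    # assumption: nginx runs under runit in the container
```

Keep in mind that a manual edit like this lives inside the container, so it is lost on the next rebuild. It is only a stopgap until you rebuild with the fixed template.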