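Discourse forum hangs and times out, won't load

Recently, our Discourse forum started timing out on most requests: https://caddy.community

We made no updates or changes in the days before the problem appeared. Sometimes it loads fine, but a few minutes later load times stretch to several minutes. (So if the link above loads quickly for you, wait a few minutes and try again. You can also test with curl.)

We are hosted on DigitalOcean.

The machine's resources are not exhausted: there is plenty of CPU, memory, disk, and network I/O available.

There are no errors in the logs, and no unknown processes are running on the machine. The forum has run fine for about three years, but now it won't load.

Does anyone know how to get the forum running faster? We have rebooted the machine, but the forum is still slow.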
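Since the slowdown comes and goes, a single request can easily miss it. Here is a rough sketch of a probe that fetches the page repeatedly and records how long each fetch takes. The function names, the 5-second "slow" threshold, and the 300-second timeout are all my own assumptions for illustration, not anything Discourse or Caddy provides.

```python
import http.client
import time
import urllib.request
from typing import Dict, List, Optional


def time_request(url: str, timeout: float = 300.0) -> Optional[float]:
    """Return the seconds taken to fetch url in full, or None on failure/timeout."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()  # read the whole body so slow transfers are counted too
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are all OSError subclasses
        return None
    return time.monotonic() - start


def summarize(samples: List[Optional[float]], slow_after: float = 5.0) -> Dict[str, object]:
    """Count failed and slow samples; slow_after is an arbitrary threshold in seconds."""
    ok = [s for s in samples if s is not None]
    return {
        "total": len(samples),
        "failed": len(samples) - len(ok),
        "slow": sum(1 for s in ok if s > slow_after),
        "worst": max(ok) if ok else None,
    }


def probe(url: str, rounds: int = 10, pause: float = 60.0) -> Dict[str, object]:
    """Fetch url `rounds` times, sleeping `pause` seconds between attempts."""
    samples: List[Optional[float]] = []
    for i in range(rounds):
        samples.append(time_request(url))
        if i < rounds - 1:
            time.sleep(pause)
    return summarize(samples)
```

Calling `probe("https://caddy.community")` from a machine outside DigitalOcean, and again from the droplet itself, would help show whether the delay depends on where the request comes from.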

Hey @mholt :wave:,

I did try going back and forth between /latest and /categories and it’s pretty fast.

Are you seeing slowdowns on specific routes of the app?

Also, do you have the MiniProfiler enabled?

Please follow this to enable it: Long loading times for user summary page with slow database

After enabling it, when you hit a slowdown you will know exactly where it is.

Huh… dang, it is loading faster today. :thinking: (Others in our community have also experienced it, I know it’s not just me, haha – but it can be intermittent, it seems.)

I’ll try that out when I have a chance, thanks!

Hmm, it seems my ssh connections also time out sometimes.

I wonder if DigitalOcean is having network issues (they haven’t reported anything, though) – maybe I will have to open a ticket with DO to find out. It might not be specific to Discourse.

This has started happening again recently: it takes several minutes for the forum to load.

Ping times to the DigitalOcean droplet are nominal: ~80ms. Server load is also nominal:

The timings in the debug thing in the corner (MiniProfiler) don’t reveal any problems: all times are within ~300ms:

(This page took about 3 minutes to load.)

Is there any part of the loading process between reaching the server and rendering the page that is not counted by the MiniProfiler?

Time spent on Redis isn’t counted and can cause what you see. You will have to dig into the server and check whether Redis is having trouble persisting changes to disk.
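One way to check this, as a sketch only: run `redis-cli INFO persistence` on the server and look at the status fields it reports. The helper below parses that text and flags a few fields that usually point to persistence trouble. The function names and the 30-second threshold are made up for illustration, and `aof_delayed_fsync` only shows up when AOF is enabled.

```python
from typing import Dict, List


def parse_info(text: str) -> Dict[str, str]:
    """Turn the `key:value` lines of Redis INFO output into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # skip blank lines and section headers such as "# Persistence"
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def persistence_warnings(fields: Dict[str, str], slow_save_sec: int = 30) -> List[str]:
    """Return human-readable warnings; slow_save_sec is an arbitrary threshold."""
    warnings: List[str] = []
    if fields.get("rdb_last_bgsave_status", "ok") != "ok":
        warnings.append("last RDB background save failed")
    if fields.get("aof_last_write_status", "ok") != "ok":
        warnings.append("last AOF write failed")
    if int(fields.get("rdb_last_bgsave_time_sec", "-1")) >= slow_save_sec:
        warnings.append("last RDB background save was slow")
    if int(fields.get("aof_delayed_fsync", "0")) > 0:
        warnings.append("AOF fsync has been delayed at least once")
    return warnings
```

An empty list from `persistence_warnings(parse_info(output))` does not prove Redis is healthy, it only means none of these particular fields look off.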

Interesting, any tips on how to do that? Or link to a relevant guide? This is outside my pay grade :sweat_smile:

Oh, let me correct myself a bit here. The time spent on Redis won’t appear in the broken-down part of MiniProfiler, but it will be counted in the overall time in the first column. So, looking at your screenshot, this doesn’t appear to be the case.

Does Caddy log the time spent waiting on the backend and the overall time spent on each request? Is there a possibility that the reverse proxy was waiting?
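To make that comparison concrete, here is a hedged sketch that scans access log lines for slow requests. It assumes Caddy 2 style JSON access logs, where each line is a JSON object with a `duration` field in seconds and a `request.uri` field; the server in this thread may run a different version or log format, in which case the field names would need adjusting. The function name and the 5-second cutoff are invented for the example.

```python
import json
from typing import Iterable, List, Tuple


def slow_requests(lines: Iterable[str], min_seconds: float = 5.0) -> List[Tuple[float, str]]:
    """Return (duration, uri) pairs for log entries at or above min_seconds, slowest first."""
    slow: List[Tuple[float, str]] = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a JSON log line; ignore it
        if not isinstance(entry, dict):
            continue
        duration = entry.get("duration")
        if not isinstance(duration, (int, float)) or duration < min_seconds:
            continue
        request = entry.get("request")
        uri = request.get("uri", "?") if isinstance(request, dict) else "?"
        slow.append((float(duration), uri))
    return sorted(slow, reverse=True)
```

If the proxy records a multi-minute duration for a page where MiniProfiler shows ~300ms, the time is probably being lost between the proxy and the app. If the proxy's duration is short as well, the delay is more likely happening before the request reaches the proxy at all.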

I will add this to the logs and try to find out next time it happens.

This could be the time it takes to grab static assets? Maybe have a look in Chrome DevTools next time this happens?

Thanks for the idea Sam. I have looked at the network inspector before and I do not believe I remember anything too telling – but I’ll check it in more detail next time.

This tends to happen every few days or so, I’ll report back when it does!