Awesome that this topic raised some interest, as I am not a back-end guru. Unfortunately I don’t run a proper server monitoring tool, so the data is limited. From my (non-guru) point of view, hitting the CPU cap was logical: according to Digital Ocean’s panel the typical load hovers around 20-30%, and yesterday we got hit by two concurrent traffic spikes, with 4-5x the usual activity from guests and members alike.
I want to emphasize that I am not putting any blame on Discourse. It was my estimate all along that I would need to upgrade at some point. I am just seeking advice on hosting options, specifically regarding CPU cores/power.
The first thing to understand is that this is a sports fan forum, and the user behavior patterns are nothing like what you see on tech forums. People get emotional, things escalate. This is hockey, our national sport, so think of it as a Super Bowl on a miniature scale. Positive news raises interest at the national level, and lots of guests come to read. Negative news causes strong reactions from members, who come to the forum to share their frustration. Yesterday we had both. First there was a big player transaction in the morning (which knocked the team’s official site offline). That caused a massive spike in traffic (the first, minor lag) and also increased guest traffic (2-3x) throughout the day. Then in the evening the team was being destroyed in an away game, which activated the frustrated fan base and caused a more serious period of lag.
Google Analytics reports for yesterday:
- 9000 sessions
- 78,000 page views
- 220+ concurrent active sessions. It could have been more at some point when I wasn’t looking.
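For a sense of scale, the figures above can be turned into rough averages. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and a 24h average says nothing about how hard the peaks were.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_session(page_views: int, sessions: int) -> float:
    """Average number of page views in one session."""
    return page_views / sessions


def average_views_per_second(page_views: int, seconds: int = SECONDS_PER_DAY) -> float:
    """Page views spread evenly over the period (hides all spikes)."""
    return page_views / seconds


# Figures from the Google Analytics report above.
print(round(pages_per_session(78_000, 9_000), 1))   # 8.7
print(round(average_views_per_second(78_000), 2))   # 0.9
```

So the daily average is under one page view per second, which suggests the lag came from the short bursts rather than from the total volume.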
This is the 24h graph of Discourse panel. On the left you see typical quiet evening traffic (20% load), the night hours (near zero) and then the two events I mentioned above. Note that the 24h graph shows the load as an average over each time period, while the real-time report showed spikes at 95-100%. The IO graph shows some spiking for reads, which to some extent correlates with the CPU load.
I took this top screenshot around 8:24 PM, slightly after the evening spike and lag, which was at its worst around 8 PM and a little after. At that time I saw load averages >3.50; in the screenshot things are starting to settle down. RAM usage always looks pretty much like that, no matter what the load.
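As far as I understand it, a load average is usually read relative to the number of CPU cores: a value above the core count means processes were waiting for CPU time. Here is a minimal sketch of that rule of thumb. The core count of 2 is an assumption about my droplet, and the helper names are hypothetical.

```python
def load_per_core(load_average: float, cores: int) -> float:
    """Load average divided by core count; above 1.0 means work is queuing."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_average / cores


def is_saturated(load_average: float, cores: int, threshold: float = 1.0) -> bool:
    """True when the per-core load exceeds the threshold (rule of thumb only)."""
    return load_per_core(load_average, cores) > threshold


ASSUMED_CORES = 2  # assumption, not confirmed from the screenshot

print(load_per_core(3.50, ASSUMED_CORES))  # 1.75
print(is_saturated(3.50, ASSUMED_CORES))   # True
```

If that reading is right, a load average of 3.50 on two cores would mean roughly 75% more runnable work than the CPUs could handle at that moment.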
Any thoughts? If RAM is the issue, then I am more than happy to just upgrade one step at Digital Ocean and stay there, even though they are the only VPS provider that has dual-core on their 4GB RAM plan and they seem to have no intentions of increasing that.