Real-time updating of topics freezes under high activity

The games have been scarce (thanks to COVID), so we have had very few opportunities to measure and tinker with this.

What we found out is that even with our improved hardware resources (6+4 vCores and 16+8 GB RAM), a modestly active crowd is enough to produce 429 client freezes. We saw this with the U20 WC games, which attracted about 50% of our regular game audience to the chats.

After measuring and some trial and error, we have settled on the following tweaks:

  DISCOURSE_REJECT_MESSAGE_BUS_QUEUE_SECONDS: 0.4
  DISCOURSE_MAX_REQS_PER_IP_PER_MINUTE: 400
  DISCOURSE_MAX_REQS_PER_IP_PER_10_SECONDS: 100

This seems to eliminate 80% of the 429s, enabling a relatively smooth experience for the majority of users.
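
For anyone wondering where these go: a minimal sketch, assuming a standard discourse_docker install where settings live under the `env:` section of the container definition. The file name below is an assumption about your setup (in a two-container install it would be the web container's file, e.g. `containers/web_only.yml`), and the values are simply the ones that worked for us, not a recommendation.

```yaml
# containers/app.yml (file name is an assumption; use your web container's file)
env:
  ## ... your existing settings stay as they are ...

  ## Values from this post; tune them against your own CPU usage and queuing
  DISCOURSE_REJECT_MESSAGE_BUS_QUEUE_SECONDS: 0.4
  DISCOURSE_MAX_REQS_PER_IP_PER_MINUTE: 400
  DISCOURSE_MAX_REQS_PER_IP_PER_10_SECONDS: 100
```

As far as I know, the container has to be rebuilt or recreated before changed `env:` values take effect.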

The next step would have been buying a different kind of hardware resources, either dedicated boxes for single-threaded speed or a VPS provider that offers plans with a gazillion vCores. For us, however, the next step is to work with the Discourse hosting team, as @sam hinted earlier.

Hopefully these tweaks will be useful for @iceman, @alec or anyone else. Be sure to keep an eye on CPU usage and queuing. What I also learned from this exercise is that two containers are way better than one: tweaks can be applied with near-zero downtime, and hardware resources can be exploited more granularly.

I am still interested in any new tweaks or findings that might help improve the performance/UX of fast-paced discussions driven by real-world events.
