Dropping to 4 workers freed a significant amount of memory.
Then I raised the Sidekiq RSS limit from the default (~500 MB) to 700 MB, which allows Sidekiq a little more breathing room before it’s automatically restarted.
So far Sidekiq has stabilized, and memory usage now sits in a much safer zone, with just over 1 GB moved from used memory to cached and available memory.
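For anyone who wants to watch the same shift on their own box, here is a minimal sketch that reads `/proc/meminfo` and reports used, cached, and available memory in MB. It assumes a Linux host; the function names `parse_meminfo` and `summarize_mb` are made up for illustration, and "used" is only a rough approximation (total minus free, buffers, and cache), so it may not match `free` exactly.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value} dict (values in kB)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines of the form "Name:   12345 kB" (or a bare number).
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def summarize_mb(info):
    """Return approximate used, cached, and available memory in MB."""
    cached_kb = info["Buffers"] + info["Cached"]
    # Assumption: "used" is approximated as total minus free, buffers, and cache.
    used_kb = info["MemTotal"] - info["MemFree"] - cached_kb
    return {
        "used_mb": used_kb // 1024,
        "cached_mb": cached_kb // 1024,
        "available_mb": info["MemAvailable"] // 1024,
    }


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        print(summarize_mb(parse_meminfo(meminfo.read_text())))
```

Running it once before and once after changing the worker count gives two snapshots to compare, which is enough to see whether memory really moved out of "used".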
Leaving this here as a hint of what to look at for anyone else with similar issues. It will be interesting to see whether this holds after a week of uptime. If it does, I will mark the thread solved.
The threads on the forums here (linked above) were helpful in working this out.
What I’ve found is that my forum does not get anywhere near enough traffic to need 8 workers. Even 2 would have worked fine.
That said, memory looks like the main bottleneck on my server going forward, but I plan to keep running the VM at the same size. Since swap is on very fast NVMe in RAID 10, I will probably add zswap if/when traffic requires it, and I will update this thread when that happens.