I’m running a small site on a small server (1 GB RAM): an official Discourse install that has been running for almost 8 years now. There’s more disk swapping than I would like, so I started looking at memory usage.
I noticed that I had set the number of unicorns to only 2 a while ago to limit memory usage (and reduce swapping). I’m running Discourse version 3.1.0.
What am I missing here? Why do I have 20 unicorns running while the app.yml states 2?
Also, are there any other ways to reduce memory usage? For example, would reducing db_shared_buffers to 128MB have any side effects? And can I reduce sidekiq’s memory usage?
I suspect htop is showing threads rather than processes. I see the same in htop as you do, but only two unicorn workers according to:
ps uaxf | grep -E 'unicorn.?worker'
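For a quick cross-check of processes versus threads, something like the sketch below may help. It assumes the procps version of ps, whose nlwp column reports the number of threads in each process; count_unicorn_workers is only an illustrative helper name, not anything Discourse ships.

```shell
# Summarise "ps -eo pid,nlwp,args" output read from stdin:
# count unicorn worker processes and the threads they own.
count_unicorn_workers() {
  awk '/unicorn worker\[/ { procs++; threads += $2 }
       END { printf "processes=%d threads=%d\n", procs, threads }'
}

# Usage (on the host or inside the container):
#   ps -eo pid,nlwp,args | count_unicorn_workers
```

If the thread total is close to the number of unicorn lines htop shows, that would support the threads explanation.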
Also, my free output looks much like yours:
# free -h
total used free shared buff/cache available
Mem: 985M 782M 61M 60M 141M 32M
Swap: 2.0G 992M 1.0G
BTW, seeing some swap in use is not in itself a problem. It’s the actual swapping (paging) that matters. Try vmstat 5 5 and look at the si and so columns (memory swapped in from and out to disk, per second).
# vmstat 5 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 1041832 63176 5716 127408 367 325 601 393 8 10 2 1 95 2 0
0 0 1041576 60976 5724 127408 399 0 399 21 212 653 1 1 96 2 0
0 0 1043544 77036 2296 120688 807 803 807 837 404 1144 1 2 94 3 0
0 0 1043288 65040 3704 129476 254 0 2292 5 255 780 1 1 96 2 0
0 0 1048736 81936 2916 119016 762 1499 919 1565 470 1171 3 2 90 5 0
I’d prefer not to see si or so values over 1000, but I’m not too worried. A second run showed a much calmer picture:
# vmstat 5 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 1048452 82712 2532 120848 367 325 601 393 8 10 2 1 95 2 0
0 0 1047684 74552 2548 124816 285 0 1049 10 230 655 2 1 95 2 0
0 0 1046660 66556 3692 129008 196 0 1261 16 219 672 1 1 96 2 0
1 0 1046404 65812 3700 129284 54 0 97 13 137 364 1 0 98 0 0
0 0 1046148 65280 3700 129288 50 0 50 3 132 344 1 0 98 0 0
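One thing worth knowing when reading these numbers: per the vmstat man page, the first report gives averages since the last reboot, which is why the first row is the same in both runs. A rough sketch for averaging only the live samples follows; swap_rates is a made-up helper name, and it assumes the default column order shown above, with si and so as the 7th and 8th fields.

```shell
# Average the si/so columns of vmstat output read from stdin,
# skipping the first data row (averages since boot).
swap_rates() {
  awk '$1 ~ /^[0-9]+$/ {
         n++
         if (n > 1) { si += $7; so += $8 }
       }
       END {
         if (n > 1) printf "si=%.1f so=%.1f\n", si / (n - 1), so / (n - 1)
       }'
}

# Usage:
#   vmstat 5 5 | swap_rates
```

Fed the first run above, this gives si=555.5 so=575.5, so the live samples are somewhat busier than the since-boot row suggests.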
Edit: the H key (Shift+h) in htop toggles the display of user threads, so that only processes are listed:
CPU[ 0.0%] Tasks: 66; 1 running
Mem[||||||||||||||||||||||||||||||||||||||||||||||||||824M/985M] Load average: 0.19 0.12 0.05
Swp[|||||||||||||||||||||||||||||| 1015M/2.00G] Uptime: 52 days, 00:50:42
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
13246 1000 20 0 966M 362M 6448 S 0.0 36.8 51:01.52 unicorn worker[0] -E production -c config/unicorn.conf.rb
13237 1000 25 5 1004M 194M 3780 S 0.0 19.8 22:38.19 sidekiq 6.5.9 discourse [0 of 5 busy]
13258 1000 20 0 919M 70176 3632 S 0.0 7.0 5:02.87 unicorn worker[1] -E production -c config/unicorn.conf.rb
12412 systemd-r 20 0 212M 60928 56916 S 0.0 6.0 0:00.23 postgres: 13/main: discourse discourse [local] idle
12818 systemd-r 20 0 212M 39228 34868 S 0.0 3.9 0:00.07 postgres: 13/main: discourse discourse [local] idle
12719 systemd-r 20 0 211M 28400 25336 S 0.0 2.8 0:00.03 postgres: 13/main: discourse discourse [local] idle
13117 1000 20 0 541M 13768 2048 S 0.0 1.4 1:08.11 unicorn master -E production -c config/unicorn.conf.rb
Edit: I set db_shared_buffers: "128MB" very early on, and haven’t seen any problem with that.
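To see which values are actually in effect in the container definition, a small check like this may be useful. It is only a sketch: show_memory_settings is an illustrative name, and the default path assumes the standard install location of /var/discourse/containers/app.yml. Commented-out lines are deliberately skipped, since a setting behind a # has no effect.

```shell
# Print the active (uncommented) memory-related settings from app.yml.
# $1: path to the container definition (assumed default shown below).
show_memory_settings() {
  grep -E '^[[:space:]]*(UNICORN_WORKERS|db_shared_buffers):' \
    "${1:-/var/discourse/containers/app.yml}"
}

# Usage:
#   show_memory_settings
#   show_memory_settings /path/to/app.yml
```

Changes to app.yml only take effect after the container is rebuilt (./launcher rebuild app in a standard install).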
Thanks, very helpful. What’s the downside to moving to just one unicorn worker? Would that improve response time or worsen it? Is one unicorn worker limited to one incoming connection, i.e. would 2 workers make a single incoming connection faster to process, assuming I have just a few connections per minute?
When I’m browsing the website, this is what vmstat 5 5 looks like. Any suggestions on how to reduce swapping? (I’ve set the swappiness to 10.)
It certainly looks like your site would be quicker if you had more RAM. But if the response time isn’t a problem, there’s no problem. Just look at your personal cost/benefit equation.
You might be interested in reading MKJ’s Opinionated Discourse Deployment Configuration. There are a couple of system-level kernel tweaks which are a good idea. I don’t know whether or not they will make a difference.
I’m not certain, but I think each unicorn worker deals with one request at a time. So a second worker wouldn’t make a single request any faster, but if you have just one unicorn and enough traffic for a second request to come in before the first is done, that second request will have to wait.
You can see from my htop output that one unicorn has run up 10x the CPU time of the other. I’d take that to mean that 90% of the time my forum needs only one unicorn, and 10% of the time that second unicorn is helpful. I don’t feel any need to add a third, and it mightn’t be a big deal to my forum members if I went down to one. But I see no reason to: it may use memory, but if it’s idle then it will get swapped out. No big deal: let the virtual memory system deal with it.
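The 10x and 90% figures come from the TIME+ column in the htop output earlier in the thread. As a rough check of that arithmetic (cpu_share is an illustrative helper, and it assumes TIME+ values in the M:SS.ss form shown there, not the longer form htop uses once hours accumulate):

```shell
# Compare two htop TIME+ values: ratio of the first to the second,
# and the first one's share of the combined CPU time.
cpu_share() {
  awk -v a="$1" -v b="$2" '
    function secs(t,   p) { split(t, p, ":"); return p[1] * 60 + p[2] }
    BEGIN {
      x = secs(a); y = secs(b)
      printf "ratio=%.1f share=%.0f%%\n", x / y, 100 * x / (x + y)
    }'
}

cpu_share 51:01.52 5:02.87   # prints: ratio=10.1 share=91%
```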
Edit: I’ve never tweaked swappiness. Mine seems to be at 60, the usual default. More aggressive swapping might be useful if it frees more RAM for I/O buffers. I don’t know.
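For anyone wanting to check or experiment with swappiness, here is a minimal sketch. current_swappiness is just an illustrative wrapper around reading /proc/sys/vm/swappiness; the commands for changing the value are left as comments because they need root and alter system behaviour.

```shell
# Print the current vm.swappiness value.
# $1: optional alternative file to read (defaults to the kernel's procfs entry).
current_swappiness() {
  cat "${1:-/proc/sys/vm/swappiness}"
}

# Usage:
#   current_swappiness
# To change it until the next reboot (needs root):
#   sysctl -w vm.swappiness=10
# To keep it across reboots, add "vm.swappiness = 10" to /etc/sysctl.conf
# or to a file under /etc/sysctl.d/, then run: sysctl --system
```

Whether a lower or higher value helps on a 1 GB box is something to measure with vmstat before and after, not something to assume.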