MKJ's Opinionated Discourse Deployment Configuration

If you do the updates from the web UI, eventually you’ll get a message saying that you have to do a command-line update. That depends not on Debian, but on the base Discourse image.

And with the 2-container method there would be no GUI update button at all, correct?

The GUI update comes from the discourse_docker plugin. If you have that plugin, you have the GUI update.

When vulnerabilities are discovered in image manipulation tools, remote code execution has definitely happened in the past, which means you are one image upload away from a compromised system.

Clear Linux set the standard for how fast you can boot on Linux. It’s awesome work, wholeheartedly endorse.

Oh, that changes things a bit then. For some reason I was thinking that the GUI updater wouldn’t work with a non-standard 2-container installation. In that case, as long as the admin is technically competent it seems like there aren’t a lot of downsides to a 2-container installation. I definitely want to have GUI updates, for example if I’m traveling with just my phone and a major Discourse security update comes out I can at least apply that without SSH access.

That’s my belief. You basically need to pay enough attention to know when there is a Postgres or Redis upgrade that requires rebuilding the data container. You also need to know to ./launcher bootstrap web_only && ./launcher destroy web_only; ./launcher start web_only, but that isn’t so hard. You can just do a ./launcher rebuild web_only, but that takes down the site while it’s rebuilding.
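
For anyone who wants to script it, here is a minimal sketch of the bootstrap/destroy/start sequence above. The function name and the LAUNCHER and CONTAINER variables are my own placeholders, not anything that ships with discourse_docker, and it assumes your web container definition is named web_only.

```shell
#!/bin/sh
# Sketch only: LAUNCHER, CONTAINER and update_web_container are illustrative names.
LAUNCHER="${LAUNCHER:-./launcher}"
CONTAINER="${CONTAINER:-web_only}"

update_web_container() {
    # Build the new image while the old container keeps serving requests.
    # Stop here if the bootstrap fails, so the running site is left alone.
    "$LAUNCHER" bootstrap "$CONTAINER" || return 1

    # Swap containers right away; this is the short window of downtime.
    "$LAUNCHER" destroy "$CONTAINER"
    "$LAUNCHER" start "$CONTAINER"
}
```

Source the file from the directory that holds launcher (commonly /var/discourse) and call update_web_container. The only difference from a plain rebuild is that the slow bootstrap step runs before the old container is destroyed.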

Just to be complete: the web UI build normally has zero downtime; the bootstrap/destroy/start sequence does have minimal downtime, and I would normally do it only with a maintenance page provided externally, as with external nginx as documented here. But that is good practice anyway, if only for getting IPv6 addresses into the container.

Very good, thanks. And with a 2-container installation do you still get Discourse dashboard notifications when the container needs to be rebuilt? And then in that case I could determine whether to rebuild just the app or also the data container?

Yes. I see it right now because I haven’t applied the “only the version has changed” 3.1.0.beta1 update. :slightly_smiling_face:

This is a case of “it’s fine until it’s not” — people panic when the update fails in the UI and they don’t know to git pull; ./launcher rebuild app to work around the problem. This happens every time there is a change that invalidates the GUI update, I think. It happened again:

I feel like this panic highlights the value of having a consistent, normal update mechanism that avoids this experience.

At the same time, I encountered the also-infrequent case of the bootstrap breaking the running system: Zero-ish-downtime updates do occasionally break like this, once or twice a year maybe on average? So don’t delay between the bootstrap and the destroy/start.

I should update the text to make that clear, so I’ll do that next.

I haven’t yet deployed LibreTranslate, but I’m considering doing that to make my site more internationally available.

If I successfully do so, I intend to edit that into the top post. :smiling_face:

A lot of this goes over my head, but I want to say thanks, because the few settings you mention, with adjustment suggestions and an explanation of the consequences for how the community runs, were already super valuable to me!

Glad it was helpful even beyond my stated target audience. :tada: I used it a couple days ago to stand up a new Discourse instance, and it helped me too, because I don’t remember all this myself. :smiley:

I think there may be an issue with the THP configuration in the wiki.

I was having issues with Redis on Ubuntu:

Your Redis network connection is performing extremely poorly. Last RTT readings were [96585, 101554, 97189, 99769, 94618], ideally these should be < 1000. Ensure Redis is running in the same AZ or dat

I believed that THP was already disabled (from following the wiki), but it turned out it was still enabled :sweat_smile:. Disabling THP ended up resolving the above for me, iirc (late last year).


Ubuntu 24.04 LTS (following current wiki):

cat /sys/kernel/mm/transparent_hugepage/enabled
# output:
# [always] madvise never

echo 'sys.kernel.mm.transparent_hugepage.enabled=never' > /etc/sysctl.d/10-huge-pages.conf

cat /etc/sysctl.d/10-huge-pages.conf
# output:
# sys.kernel.mm.transparent_hugepage.enabled=never

sudo sysctl --system
# output:
* Applying /usr/lib/sysctl.d/10-apparmor.conf ...
* Applying /etc/sysctl.d/10-bufferbloat.conf ...
* Applying /etc/sysctl.d/10-console-messages.conf ...
* Applying /etc/sysctl.d/10-huge-pages.conf ...
* Applying /etc/sysctl.d/10-ipv6-privacy.conf ...
* Applying /etc/sysctl.d/10-kernel-hardening.conf ...
* Applying /etc/sysctl.d/10-magic-sysrq.conf ...
* Applying /etc/sysctl.d/10-map-count.conf ...
* Applying /etc/sysctl.d/10-network-security.conf ...
* Applying /etc/sysctl.d/10-ptrace.conf ...
* Applying /etc/sysctl.d/10-zeropage.conf ...
* Applying /usr/lib/sysctl.d/50-pid-max.conf ...
* Applying /etc/sysctl.d/99-cloudimg-ipv6.conf ...
* Applying /usr/lib/sysctl.d/99-protect-links.conf ...
* Applying /etc/sysctl.d/99-sysctl.conf ...
* Applying /etc/sysctl.conf ...
kernel.apparmor_restrict_unprivileged_userns = 1
net.core.default_qdisc = fq_codel
kernel.printk = 4 4 1 7
net.ipv6.conf.all.use_tempaddr = 2
net.ipv6.conf.default.use_tempaddr = 2
kernel.kptr_restrict = 1
kernel.sysrq = 176
vm.max_map_count = 1048576
net.ipv4.conf.default.rp_filter = 2
net.ipv4.conf.all.rp_filter = 2
kernel.yama.ptrace_scope = 1
vm.mmap_min_addr = 65536
kernel.pid_max = 4194304
net.ipv6.conf.all.use_tempaddr = 0
net.ipv6.conf.default.use_tempaddr = 0
fs.protected_fifos = 1
fs.protected_hardlinks = 1
fs.protected_regular = 2
fs.protected_symlinks = 1

cat /sys/kernel/mm/transparent_hugepage/enabled
# output:
# [always] madvise never

AlmaLinux 10 (following current wiki):

cat /sys/kernel/mm/transparent_hugepage/enabled
# output:
# [always] madvise never

echo 'sys.kernel.mm.transparent_hugepage.enabled=never' > /etc/sysctl.d/10-huge-pages.conf

cat /etc/sysctl.d/10-huge-pages.conf
# output:
# sys.kernel.mm.transparent_hugepage.enabled=never

sudo sysctl --system
# output:
* Applying /usr/lib/sysctl.d/10-default-yama-scope.conf ...
* Applying /etc/sysctl.d/10-huge-pages.conf ...
* Applying /usr/lib/sysctl.d/10-map-count.conf ...
* Applying /usr/lib/sysctl.d/50-coredump.conf ...
* Applying /usr/lib/sysctl.d/50-default.conf ...
* Applying /usr/lib/sysctl.d/50-libkcapi-optmem_max.conf ...
* Applying /usr/lib/sysctl.d/50-pid-max.conf ...
* Applying /usr/lib/sysctl.d/50-redhat.conf ...
* Applying /etc/sysctl.d/99-sysctl.conf ...
* Applying /etc/sysctl.conf ...
kernel.yama.ptrace_scope = 0
vm.max_map_count = 1048576
kernel.core_pattern = |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
kernel.core_pipe_limit = 16
fs.suid_dumpable = 2
kernel.sysrq = 16
kernel.core_uses_pid = 1
net.ipv4.conf.default.rp_filter = 2
net.ipv4.conf.eth0.rp_filter = 2
net.ipv4.conf.eth1.rp_filter = 2
net.ipv4.conf.lo.rp_filter = 2
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.eth0.accept_source_route = 0
net.ipv4.conf.eth1.accept_source_route = 0
net.ipv4.conf.lo.accept_source_route = 0
net.ipv4.conf.default.promote_secondaries = 1
net.ipv4.conf.eth0.promote_secondaries = 1
net.ipv4.conf.eth1.promote_secondaries = 1
net.ipv4.conf.lo.promote_secondaries = 1
net.ipv4.ping_group_range = 0 2147483647
net.core.default_qdisc = fq_codel
fs.protected_hardlinks = 1
fs.protected_symlinks = 1
fs.protected_regular = 1
fs.protected_fifos = 1
net.core.optmem_max = 81920
kernel.pid_max = 4194304
kernel.kptr_restrict = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.eth1.rp_filter = 1
net.ipv4.conf.lo.rp_filter = 1

cat /sys/kernel/mm/transparent_hugepage/enabled
# output:
# [always] madvise never

Perhaps this would be suitable for the wiki? It seemed to work on Ubuntu 24.04 and AlmaLinux 10 when I tested it just now:

echo 'w /sys/kernel/mm/transparent_hugepage/enabled - - - - never' | sudo tee /etc/tmpfiles.d/10-huge-pages.conf
sudo systemd-tmpfiles --create /etc/tmpfiles.d/10-huge-pages.conf

To confirm:

cat /sys/kernel/mm/transparent_hugepage/enabled

Expected output:

always madvise [never]
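
If you want to check this from a script rather than by eye, something like the following sketch could work. The helper name thp_mode is made up for illustration; it only relies on the kernel marking the active mode with square brackets, as in the outputs above.

```shell
# Sketch only: thp_mode and THP_FILE are hypothetical names.
THP_FILE="/sys/kernel/mm/transparent_hugepage/enabled"

thp_mode() {
    # The kernel marks the active mode with square brackets,
    # e.g. "always madvise [never]" means THP is disabled.
    # An optional argument overrides the file to read.
    sed -n 's/.*\[\(.*\)\].*/\1/p' "${1:-$THP_FILE}"
}
```

For example, [ "$(thp_mode)" = never ] || echo "THP is still enabled" would flag a host where the setting did not stick after a reboot.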

That’s good to hear - if we don’t know whether it helps, we might just be imitating each other!

I don’t think that /etc/sysctl.d has been deprecated. Can you look through the other files listed there and see which one(s) are overriding /etc/sysctl.d/10-huge-pages.conf? Maybe one of those 50-priority files?

The better solution will probably be to change the priority of the huge-pages setting to win. But I am not running either of those versions right now on my systems.

Also check whether tuned is overriding the setting.
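
A rough sketch of how that search might look: the helper name find_thp_settings is hypothetical, and the default paths are my assumption about where sysctl, tmpfiles and tuned configuration usually lives, so adjust them for your distribution. Paths that don't exist are silently skipped.

```shell
# Sketch only: find_thp_settings is a hypothetical helper.
# Lists files that mention THP under the given paths
# (the defaults below are assumed locations; missing ones are ignored).
find_thp_settings() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/sysctl.conf /etc/sysctl.d /usr/lib/sysctl.d \
               /etc/tmpfiles.d /etc/tuned /usr/lib/tuned
    fi
    # -r: recurse, -l: print only file names, -s: hide "no such file" errors.
    grep -rls 'transparent_hugepage' "$@"
}
```

If tuned is installed, tuned-adm active shows which profile is currently applied, which narrows down which of the matching files actually matters.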

I only had an issue with the THP configuration applying; vm.overcommit_memory applied as expected via /etc/sysctl.d. This was noticed and resolved on a server late last year, so I tried checking on a couple of micro VPSs yesterday.

Just tried this on a fresh AlmaLinux 9 micro VPS, in an attempt to see whether any of the default .conf files are affecting the THP configuration:

echo always | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
# output:
# always

cat /sys/kernel/mm/transparent_hugepage/enabled
# output:
# [always] madvise never

sysctl --system
# output:
* Applying /usr/lib/sysctl.d/10-default-yama-scope.conf ...
* Applying /usr/lib/sysctl.d/50-coredump.conf ...
* Applying /usr/lib/sysctl.d/50-default.conf ...
* Applying /usr/lib/sysctl.d/50-libkcapi-optmem_max.conf ...
* Applying /usr/lib/sysctl.d/50-pid-max.conf ...
* Applying /usr/lib/sysctl.d/50-redhat.conf ...
* Applying /etc/sysctl.d/99-sysctl.conf ...
* Applying /etc/sysctl.conf ...
kernel.yama.ptrace_scope = 0
kernel.core_pattern = |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
kernel.core_pipe_limit = 16
fs.suid_dumpable = 2
kernel.sysrq = 16
kernel.core_uses_pid = 1
net.ipv4.conf.default.rp_filter = 2
net.ipv4.conf.eth0.rp_filter = 2
net.ipv4.conf.eth1.rp_filter = 2
net.ipv4.conf.lo.rp_filter = 2
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.eth0.accept_source_route = 0
net.ipv4.conf.eth1.accept_source_route = 0
net.ipv4.conf.lo.accept_source_route = 0
net.ipv4.conf.default.promote_secondaries = 1
net.ipv4.conf.eth0.promote_secondaries = 1
net.ipv4.conf.eth1.promote_secondaries = 1
net.ipv4.conf.lo.promote_secondaries = 1
net.ipv4.ping_group_range = 0 2147483647
net.core.default_qdisc = fq_codel
fs.protected_hardlinks = 1
fs.protected_symlinks = 1
fs.protected_regular = 1
fs.protected_fifos = 1
net.core.optmem_max = 81920
kernel.pid_max = 4194304
kernel.kptr_restrict = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.eth1.rp_filter = 1
net.ipv4.conf.lo.rp_filter = 1

cat /sys/kernel/mm/transparent_hugepage/enabled
# output:
# [always] madvise never

echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
# output:
# never

cat /sys/kernel/mm/transparent_hugepage/enabled
# output:
# always madvise [never]

sysctl --system
# output:
* Applying /usr/lib/sysctl.d/10-default-yama-scope.conf ...
* Applying /usr/lib/sysctl.d/50-coredump.conf ...
* Applying /usr/lib/sysctl.d/50-default.conf ...
* Applying /usr/lib/sysctl.d/50-libkcapi-optmem_max.conf ...
* Applying /usr/lib/sysctl.d/50-pid-max.conf ...
* Applying /usr/lib/sysctl.d/50-redhat.conf ...
* Applying /etc/sysctl.d/99-sysctl.conf ...
* Applying /etc/sysctl.conf ...
kernel.yama.ptrace_scope = 0
kernel.core_pattern = |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
kernel.core_pipe_limit = 16
fs.suid_dumpable = 2
kernel.sysrq = 16
kernel.core_uses_pid = 1
net.ipv4.conf.default.rp_filter = 2
net.ipv4.conf.eth0.rp_filter = 2
net.ipv4.conf.eth1.rp_filter = 2
net.ipv4.conf.lo.rp_filter = 2
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.eth0.accept_source_route = 0
net.ipv4.conf.eth1.accept_source_route = 0
net.ipv4.conf.lo.accept_source_route = 0
net.ipv4.conf.default.promote_secondaries = 1
net.ipv4.conf.eth0.promote_secondaries = 1
net.ipv4.conf.eth1.promote_secondaries = 1
net.ipv4.conf.lo.promote_secondaries = 1
net.ipv4.ping_group_range = 0 2147483647
net.core.default_qdisc = fq_codel
fs.protected_hardlinks = 1
fs.protected_symlinks = 1
fs.protected_regular = 1
fs.protected_fifos = 1
net.core.optmem_max = 81920
kernel.pid_max = 4194304
kernel.kptr_restrict = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.eth1.rp_filter = 1
net.ipv4.conf.lo.rp_filter = 1

cat /sys/kernel/mm/transparent_hugepage/enabled
# output:
# always madvise [never]

That’s why I’m asking you to look through the actual files to find what is overriding it, so that I can make an informed change to the priority I recommend for the override.

My current (limited) understanding is that the commands/outputs from my previous post indicate that there is no override.

Looking through the files on a fresh AlmaLinux 9 instance:

These came up empty:

grep -r "transparent_hugepage" /usr/lib/sysctl.d/ /etc/sysctl.d/ /etc/sysctl.conf
grep -r "transparent" /usr/lib/sysctl.d/ /etc/sysctl.d/ /etc/sysctl.conf
grep -r "huge" /usr/lib/sysctl.d/ /etc/sysctl.d/ /etc/sysctl.conf
grep -r "page" /usr/lib/sysctl.d/ /etc/sysctl.d/ /etc/sysctl.conf

Default values in the conf files:

/usr/lib/sysctl.d/50-redhat.conf:kernel.kptr_restrict = 1
/usr/lib/sysctl.d/50-redhat.conf:net.ipv4.conf.default.rp_filter = 1
/usr/lib/sysctl.d/50-redhat.conf:net.ipv4.conf.*.rp_filter = 1
/usr/lib/sysctl.d/50-redhat.conf:-net.ipv4.conf.all.rp_filter
/usr/lib/sysctl.d/10-default-yama-scope.conf:kernel.yama.ptrace_scope = 0
/usr/lib/sysctl.d/50-libkcapi-optmem_max.conf:net.core.optmem_max = 81920
/usr/lib/sysctl.d/50-coredump.conf:kernel.core_pattern=|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
/usr/lib/sysctl.d/50-coredump.conf:kernel.core_pipe_limit=16
/usr/lib/sysctl.d/50-coredump.conf:fs.suid_dumpable=2
/usr/lib/sysctl.d/50-default.conf:kernel.sysrq = 16
/usr/lib/sysctl.d/50-default.conf:kernel.core_uses_pid = 1
/usr/lib/sysctl.d/50-default.conf:net.ipv4.conf.default.rp_filter = 2
/usr/lib/sysctl.d/50-default.conf:net.ipv4.conf.*.rp_filter = 2
/usr/lib/sysctl.d/50-default.conf:-net.ipv4.conf.all.rp_filter
/usr/lib/sysctl.d/50-default.conf:net.ipv4.conf.default.accept_source_route = 0
/usr/lib/sysctl.d/50-default.conf:net.ipv4.conf.*.accept_source_route = 0
/usr/lib/sysctl.d/50-default.conf:-net.ipv4.conf.all.accept_source_route
/usr/lib/sysctl.d/50-default.conf:net.ipv4.conf.default.promote_secondaries = 1
/usr/lib/sysctl.d/50-default.conf:net.ipv4.conf.*.promote_secondaries = 1
/usr/lib/sysctl.d/50-default.conf:-net.ipv4.conf.all.promote_secondaries
/usr/lib/sysctl.d/50-default.conf:-net.ipv4.ping_group_range = 0 2147483647
/usr/lib/sysctl.d/50-default.conf:-net.core.default_qdisc = fq_codel
/usr/lib/sysctl.d/50-default.conf:fs.protected_hardlinks = 1
/usr/lib/sysctl.d/50-default.conf:fs.protected_symlinks = 1
/usr/lib/sysctl.d/50-default.conf:fs.protected_regular = 1
/usr/lib/sysctl.d/50-default.conf:fs.protected_fifos = 1
/usr/lib/sysctl.d/50-pid-max.conf:kernel.pid_max = 4194304

I’m running AlmaLinux 9 for my Discourse instances, and the configuration I provided successfully disabled THP on all of them. If disabling THP through sysctl.d isn’t working when no overrides are in place, and tuned isn’t overriding it, I’d think that’s a bug.

I thought you were saying that it no longer worked on AlmaLinux 10, which was why I was asking what was stopping it from being applied there.