No secure connection to self-hosted Discourse after latest update

Hi. A few hours ago I updated the Discourse image and Discourse itself using ./launcher rebuild app. During the process I got some errors, which I fixed by removing deprecated plugin installation lines from app.yml. No other configuration changes were made. I now have a running Docker container listening on TCP ports 80 and 443, but nginx in the container refuses to accept SSL/TLS connections. error.log inside the container shows no errors, access.log shows no requests, telnet to port 443 gets connection refused, and HTTP access on TCP port 80 works, but then we have SSO authentication problems (probably an issue with secure-only cookies). A launcher restart and discourse-doctor don’t help. Restarting nginx inside the container doesn’t help either. Where should I look now, and what should I do?
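For reference, this is roughly how I checked things inside the container (commands from memory, so treat them as a sketch; ss may need to be swapped for netstat depending on what’s installed):

./launcher enter app                  # shell into the running container
ss -tlnp                              # see which ports nginx is actually bound to
nginx -t                              # sanity-check the generated nginx config
tail -n 50 /var/log/nginx/error.log   # the log that shows no errors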

It will do that if ufw is enabled and not set up to allow connections on HTTPS port 443.


Edit: you’ll need this port to be open for mail-receiver to work properly
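If you want to rule ufw out, something along these lines should tell you (adjust to your setup; the allow rules are only needed if the ports are actually blocked):

sudo ufw status verbose    # is ufw active, and are 80/443 allowed?
sudo ufw allow 80/tcp      # open HTTP if it is missing
sudo ufw allow 443/tcp     # open HTTPS if it is missing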

I did nothing to the ufw, iptables, etc. configuration. The port was definitely open before the update. Could the configuration have been changed by the Discourse update process?

It absolutely could; the Discourse Docker container should step in front of ufw. However, not having port 443 open would cause issues with cross-talk between different containers, or the kind of issue you’re seeing with telnet.


What does ./launcher logs app return? (You can use MS Word to redact your domain name, etc.)

Do you access your server through PuTTY or another SSH client, i.e. with port 22 open?

I am also seeing this issue on a self-hosted instance after a recent rebuild. No changes in configuration except the rebuild itself. I can access the server through SSH, and this is the output of ./launcher logs app:


run-parts: executing /etc/runit/1.d/00-ensure-links
run-parts: executing /etc/runit/1.d/00-fix-var-logs
run-parts: executing /etc/runit/1.d/01-cleanup-web-pids
run-parts: executing /etc/runit/1.d/anacron
run-parts: executing /etc/runit/1.d/cleanup-pids
Cleaning stale PID files
run-parts: executing /etc/runit/1.d/copy-env
run-parts: executing /etc/runit/1.d/install-ssl
Started runsvdir, PID is 45
ok: run: redis: (pid 55) 0s
supervisor pid: 53 unicorn pid: 76

The Docker container is running, as evidenced by the output of docker ps (container ID redacted):

local_discourse/app “/sbin/boot” 16 minutes ago Up 16 minutes 0.0.0.0:80->80/tcp, [::]:80->80/tcp, 0.0.0.0:443->443/tcp, [::]:443->443/tcp, 0.0.0.0:5432->5432/tcp, [::]:5432->5432/tcp app

An important note to highlight: we don’t use Let’s Encrypt for our certs, because we require a specific issuer. However, this cert has not changed and was working fine before the rebuild (and certs issued in this manner have been working on our instances for years).

There seems to be a mismatch between the IP nginx expects (local IP 127.0.0.1) and the one assigned to the container. It looks like the container may be running in bridge mode? Here are the network settings from the container. (Please note this output is from when I first identified the issue on Friday and started investigating.)

"Labels": {
    "org.opencontainers.image.created": "2025-07-25T21:40:36+00:00"
},
"NetworkSettings": {
    "Bridge": "",
    "SandboxID": "[REDACTED]",
    "SandboxKey": "[REDACTED]",
    "Ports": {
        "443/tcp": [
            {
                "HostIp": "0.0.0.0",
                "HostPort": "443"
            },
            {
                "HostIp": "::",
                "HostPort": "443"
            }
        ],
        "5432/tcp": [
            {
                "HostIp": "0.0.0.0",
                "HostPort": "5432"
            },
            {
                "HostIp": "::",
                "HostPort": "5432"
            }
        ],
        "80/tcp": [
            {
                "HostIp": "0.0.0.0",
                "HostPort": "80"
            },
            {
                "HostIp": "::",
                "HostPort": "80"
            }
        ]
    },
    "HairpinMode": false,
    "LinkLocalIPv6Address": "",
    "LinkLocalIPv6PrefixLen": 0,
    "SecondaryIPAddresses": null,
    "SecondaryIPv6Addresses": null,
    "EndpointID": "[REDACTED]",
    "Gateway": "172.17.0.1",
    "GlobalIPv6Address": "",
    "GlobalIPv6PrefixLen": 0,
    "IPAddress": "172.17.0.2",
    "IPPrefixLen": 16,
    "IPv6Gateway": "",
    "MacAddress": "[REDACTED]",
    "Networks": {
        "bridge": {
            "IPAMConfig": null,
            "Links": null,
            "Aliases": null,
            "MacAddress": "[REDACTED]",
            "DriverOpts": null,
            "GwPriority": 0,
            "NetworkID": "[REDACTED]",
            "EndpointID": "[REDACTED]",
            "Gateway": "172.17.0.1",
            "IPAddress": "172.17.0.2",
            "IPPrefixLen": 16,
            "IPv6Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "DNSNames": null
        }
    }
}
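For context, I pulled those settings with something like this (format string from memory, so double-check before relying on it):

docker inspect app --format '{{json .NetworkSettings}}' | python3 -m json.tool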

There looks to be a recently merged PR that introduced a new variable requirement in app.yml. This doesn’t appear to be documented yet, but you need to add ENABLE_SSL: true to your app.yml file.
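In other words, something roughly like this in app.yml (I’m assuming it belongs under the env: section, alongside your other variables):

env:
  ## ...existing settings unchanged...
  ENABLE_SSL: true

Then rebuild with ./launcher rebuild app for the change to take effect.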

That sounds like a bug. SSL has been on by default for some years. Can you link to the commit?

I see it in the code in the SSL template. I may be missing something, since I’m on my phone and GitHub is having issues, but it looks like it’ll break every self-hosted site.

It’s this one here; definitely an unintended bug introduced by merging the ssl-on-boot configuration:

I’ve updated ENABLE_SSL to default to 1 here:

Thanks for the catch @tanya_byrne

Good save, Jeff! Thanks!

Thanks for the fix @featheredtoast

Thanks, guys. We’ve solved the problem as well.
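For anyone who lands here later, this is roughly how we confirmed HTTPS was back after adding the variable and rebuilding (the domain below is a placeholder):

openssl s_client -connect forum.example.com:443 -servername forum.example.com </dev/null 2>/dev/null | head -n 5
curl -I https://forum.example.com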
