Installation without subdomain not working

Hi there, I’m trying to install Discourse on a fresh Ubuntu 20.04 testbed VM (I also tried CentOS Stream 9, Ubuntu 22.04, and openSUSE MicroOS). I have some experience with Discourse since the early days of the project, and I’m now evaluating it for a migration. In that case it would be to mydomain.tld (the production domain is only a forum, has “forum” in its name, and is well known as such, so I definitely don’t want discourse.mydomain.tld).

All of my recent attempts at installing Discourse without a subdomain have failed. I know it used to be possible because I ran a Discourse forum like that about 6(?) years ago. Now the installation appears to complete successfully, but the site won’t load. On Ubuntu it automatically switches to https:// even when I explicitly enter http://, and it won’t load at all. On CentOS and MicroOS it loads the Nginx welcome page over http://, and nothing loads over https://.

All of my attempts on the above operating systems in the same VM work fine when Discourse is installed to a subdomain at discourse.mydomain.tld, including the Let’s Encrypt autoconfig. As far as I can tell, my DNS records are correct at the domain registrar and I have proper rDNS resolution. The server’s entry in /etc/hosts is 127.0.1.1 mydomain.tld mydomain, and the discourse-setup script passes its domain name resolution check.

Here’s the discourse-doctor output. I also have the full discourse-setup log if anybody wants it:

DISCOURSE DOCTOR Sun Oct 9 13:32:47 UTC 2022
OS: Linux mydomain 5.4.0-125-generic #141-Ubuntu SMP Wed Aug 10 13:42:03 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux


Found containers/app.yml

==================== YML SETTINGS ====================
DISCOURSE_HOSTNAME=mydomain.tld
SMTP_ADDRESS=mail.mydomain.tld
DEVELOPER_EMAILS=REDACTED 
SMTP_PASSWORD=REDACTED 
SMTP_PORT=587
SMTP_USER_NAME=admin@mydomain.tld
LETSENCRYPT_ACCOUNT_EMAIL=REDACTED 

==================== DOCKER INFO ====================
DOCKER VERSION: Docker version 20.10.12, build 20.10.12-0ubuntu2~20.04.1

DOCKER PROCESSES (docker ps -a)

CONTAINER ID   IMAGE                 COMMAND        CREATED          STATUS         PORTS                                                                      NAMES
d6f7f53a81db   local_discourse/app   "/sbin/boot"   10 minutes ago   Up 4 minutes   0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp   app


Discourse container app is running


==================== PLUGINS ====================
          - git clone https://github.com/discourse/docker_manager.git

No non-official plugins detected.

See https://github.com/discourse/discourse/blob/main/lib/plugin/metadata.rb for the official list.

========================================
Discourse version at mydomain.tld: NOT FOUND
Discourse version at localhost: NOT FOUND


==================== MEMORY INFORMATION ====================
OS: Linux
RAM (MB): 2029

              total        used        free      shared  buff/cache   available
Mem:           1935         823         547          30         564         934
Swap:          2047           0        2047

==================== DISK SPACE CHECK ====================
---------- OS Disk Space ----------
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        38G  8.0G   28G  23% /

==================== DISK INFORMATION ====================
Disk /dev/sda: 38.15 GiB, 40961572864 bytes, 80003072 sectors
Disk model: QEMU HARDDISK   
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 6643DB1B-E542-4DE1-A04C-C8EB4DAAD77E

Device      Start      End  Sectors  Size Type
/dev/sda1  528384 80003038 79474655 37.9G Linux filesystem
/dev/sda14   2048     4095     2048    1M BIOS boot
/dev/sda15   4096   528383   524288  256M EFI System

Partition table entries are not in disk order.

==================== END DISK INFORMATION ====================

==================== MAIL TEST ====================
For a robust test, get an address from http://www.mail-tester.com/
Mail test skipped.

==================== DONE! ====================

Hard to help without knowing the domain. What happens when you run a verbose curl against your domain.tld from another machine on the internet?


Hi, thanks for the reply. OK, that’s a good idea. It looks like it’s not accepting the connection:

$ curl -v mydomain.tld
*   Trying 1.2.3.4:80...
* connect to 1.2.3.4 port 80 failed: Connection refused
* Failed to connect to mydomain.tld port 80: Connection refused
* Closing connection 0
curl: (7) Failed to connect to mydomain.tld port 80: Connection refused

$ curl -v https://mydomain.tld
*   Trying 1.2.3.4:443...
* connect to 1.2.3.4 port 443 failed: Connection refused
* Failed to connect to mydomain.tld port 443: Connection refused
* Closing connection 0
curl: (7) Failed to connect to mydomain.tld port 443: Connection refused
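
For anyone repeating this check from a script: "Connection refused" means the host answered but nothing was listening on that port, whereas a timeout more often points to a firewall silently dropping packets. Below is a minimal Python sketch of that distinction, using only the standard library. The helper name `probe_port` is hypothetical and not part of Discourse or its tooling.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 5.0) -> str:
    """Classify a TCP connection attempt to host:port.

    Returns "open", "refused", "timeout", or "error: <reason>".
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host replied with a reset: reachable, but nothing is listening.
        return "refused"
    except socket.timeout:
        # No reply at all within the timeout, e.g. packets dropped en route.
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar problems.
        return f"error: {exc}"
```

Calling `probe_port("mydomain.tld", 80)` and `probe_port("mydomain.tld", 443)` from a machine outside the server's network would be expected to return "refused" for the situation shown in the curl output above.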

Is it possibly due to some limitation in the Discourse setup logic where it expects .tld to be something common like .com or .org? Mine is just a $5 .tech domain I created for testing.

It’s unlikely.

Where’s the server hosted? What sits between it and the client?

Giving us the FQDN helps us do some troubleshooting. As-is you’re asking us to help diagnose this whilst blindfolded, so it may take a while to pinpoint.


The server is a VM hosted in Hetzner’s USA datacenter. The domain is blfdev.tech . I really didn’t think it would make any difference what the specific domain is, and it’s a bog-standard setup.

Was this instance installed using discourse-setup, or did you manually create the YML file?

Have you verified that 80/443 are open at Hetzner?

Let’s Encrypt is enabled as standard these days, hence the redirect to HTTPS.

I used discourse-setup. Yes, the ports are open. Installing to a subdomain works fine, and I also set up a Docker installation of a mailserver with a web frontend on this same VM (but I later reformatted it).

Have you read:

Yet?

Hmm no. The registrar is Hover, they’re normally pretty good. That’s bizarre, in 20 years of setting up servers I’ve never had problems with websites at the root of a domain…

I don’t recall it working for the one domain I had at Hover, but that was a while back.

You could try swapping your NS to CloudFlare and testing whether DNS is the issue from there at no cost.

Thanks a lot for pointing me to this.

You could try swapping your NS to CloudFlare and testing whether DNS is the issue from there at no cost.

Sorry for the dumb question, do you mean setting my local DNS server to Cloudflare? (I’m currently using 8.8.8.8) Or using a different DNS service for my domain?

I asked Hover about it, and they pointed me to this:

What you could do is try using a glue record. This will make your server the DNS manager and route the domain name to a nameserver you can set up using glue records. Basically your server becomes the nameserver.

https://help.hover.com/hc/en-us/articles/217282437-How-to-Add-or-modify-your-own-name-servers-glue-records-

This still feels like a red herring to me. I don’t understand why Discourse wouldn’t work at the root of the domain in the same situation where WordPress or Drupal would work.

No, I mean you don’t need to move your domain between registrars, but you will need to update the NS records at Hover for your domain to point it at a different provider’s DNS to test this theory. At present they’re set to ns1.hover.com and ns2.hover.com.

It’s a very quick and pretty painless process. If you sign up for CloudFlare and then try to add the domain there, they will give you two new name servers which need to be entered at Hover. There’s a guide to the Hover side here:


It’s been a while since I used the apex with anything other than CloudFlare. I’m going to test this in a bit myself to see if I can spot any other gotchas. Most of the issues with the apex apply to CNAMEs, but I can now see that you’re using an A record.


Thanks very much @Stephen for the pointers. OK, I’ll see if I can test it with Cloudflare. And what about pointing my domain at the Hetzner DNS servers? It looks pretty comprehensive, although a lot of what it automatically detected from my current Hover records doesn’t look right:

My best guess is that you did a bunch of rebuilds with something misconfigured and now Let’s Encrypt won’t issue a certificate because of rate limiting.

If that’s the case you can wait a week or try using www as the subdomain, which is really a good idea these days anyway.

You can look at the logs in /var/discourse/shared/log/var-log/nginx/access.log or perhaps

docker logs app

I expect you’ll see errors about the certificate being nonexistent or invalid.


Thanks @pfaffman , that’s what I was suspecting too, although I thought that the cert was created according to the initial setup log. But it looks like that was exactly the problem according to the app log, thanks for the pointer:

[Sun 09 Oct 2022 01:22:49 PM UTC] Create new order error. Le_OrderFinalize not found. {
  "type": "urn:ietf:params:acme:error:rateLimited",
  "detail": "Error creating new order :: too many certificates (5) already issued for this exact set of domains in the last 168 hours: blfdev.tech, retry after 2022-10-10T03:12:09Z: see https://letsencrypt.org/docs/duplicate-certificate-limit/",
  "status": 429
}
[S

Looks like I can retry in a few hours though. Thanks, I’ll report back tomorrow.
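
To work out how long the wait actually is, the "retry after" timestamp can be pulled out of the error detail. The following is a rough Python sketch that assumes the detail string has the same shape as in the log above. The function name `retry_after` and the regular expression are illustrative and not part of any Let's Encrypt or acme.sh API.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Assumption: the timestamp appears as "retry after YYYY-MM-DDTHH:MM:SSZ",
# as it does in the log excerpt above.
RETRY_AFTER = re.compile(r"retry after (\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)")


def retry_after(detail: str) -> Optional[datetime]:
    """Return the UTC time after which a new order may be tried, or None."""
    match = RETRY_AFTER.search(detail)
    if match is None:
        return None
    parsed = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


detail = (
    "Error creating new order :: too many certificates (5) already issued "
    "for this exact set of domains in the last 168 hours: blfdev.tech, "
    "retry after 2022-10-10T03:12:09Z: see "
    "https://letsencrypt.org/docs/duplicate-certificate-limit/"
)

# Timestamp of the log line itself (Sun 09 Oct 2022 01:22:49 PM UTC).
logged_at = datetime(2022, 10, 9, 13, 22, 49, tzinfo=timezone.utc)
wait = retry_after(detail) - logged_at
print(wait)  # 13:49:20
```

So at the time of that log entry, the next certificate request could not succeed for roughly another 14 hours, and any rebuild before then would hit the same error.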

I’m still not hopeful that will fix this core problem though, because curl -v https://blfdev.tech shows that the server is rejecting calls to port 443. If it were just a certificate problem then I should be just getting a certificate warning in the browser when trying to access the site, correct?

The good part about this is that I had trouble with Hover not accepting 2048-bit DKIM keys in TXT records for the mailserver, whereas the Hetzner DNS does accept them, which I found while investigating this issue and playing around with Hetzner’s DNS service.

For now I disabled SSL in the app.yml and did a rebuild, and Discourse is now loading on port 80 without a subdomain.

I’m also now using the Hetzner DNS as authoritative. I’m not sure if this made the difference or if it was the failed Let’s Encrypt cert issue. I’ll report back again after another rebuild once I can create Let’s Encrypt certs again and re-enable SSL.

If you’ve tried rebuilding more than a couple of times, you’re probably being rate-limited by Let’s Encrypt. You can work around this by adding another hostname to your cert.


Yes, but I believe that those directions no longer work.

The reason you’re not connecting to port 443 is that the certificate is broken, and that causes an error in nginx.
