I have been attempting a fresh install of Discourse on a clean installation of Ubuntu Server 16.04, but the installation fails during bootstrapping.
I have uncommented DOCKER_OPTS in /etc/default/docker so Docker uses Google's DNS, but this didn't fix the issue.
I checked with my server host to see if anything was being blocked on their end that could cause this, and I was told no. They did, however, mention that the docker0 interface using an MTU of 1500 instead of their recommended 1400 may be causing the issue. I have noticed that the veth interface that appears only during the bootstrapping process also uses an MTU of 1500.
Can anyone confirm that the MTU may cause this issue? I have attempted to change the MTU that Docker uses but haven't had any luck; I added --mtu=1400 to DOCKER_OPTS but this doesn't appear to change anything. Am I changing the MTU in the wrong location?
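For reference, the line I have in /etc/default/docker now looks something like the sketch below (the DNS addresses come from the file's own commented example). One caveat I'm not sure about: on a systemd-based install, the stock docker.service may not read /etc/default/docker at all unless it is wired up via an EnvironmentFile= line, which could explain why the flag appears to have no effect.

# /etc/default/docker -- only honoured if docker.service actually loads this file
DOCKER_OPTS="--dns 8.8.8.8 --dns 8.8.4.4 --mtu=1400"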
It's entirely possible that my server host was mistaken and there is indeed something on their network causing this issue, but I'd like to rule out the MTU first before I contact them again.
Does anyone have any suggestions? I'm happy to provide more information if requested.
I’m unable to reproduce this on a fresh DO droplet running Ubuntu 16.04. Is there any chance you’re behind some sort of HTTPS proxy, or something else that could be getting in the way? If you’re quite sure your Internet connection is clean, could you do a full-packet capture of the TLS session to GitHub and put it up somewhere I can download it (PM me the link). The command to run on the host is something like sudo apt-get install tcpdump && sudo tcpdump -i eth0 -n port 443 -s 0 -w /tmp/https.pcap -v, and then re-run ./launcher bootstrap app in a separate terminal window. Once the bootstrap has errored out, use Ctrl-C to stop the tcpdump, and copy /tmp/https.pcap somewhere I can get it.
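In other words, something like this (assuming eth0 is the public interface, as above):

# terminal 1: capture all HTTPS traffic on the public interface into a pcap file
sudo apt-get install tcpdump
sudo tcpdump -i eth0 -n port 443 -s 0 -w /tmp/https.pcap -v

# terminal 2: re-run the bootstrap while the capture is running
./launcher bootstrap app

# back in terminal 1: Ctrl-C once the bootstrap has errored out, then share /tmp/https.pcap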
I haven't set up a proxy on my server, but it's possible that the host network is using one; I have contacted my host to check but am still waiting on a reply.
In the meantime I have run the packet capture you requested and will PM you the results shortly.
There’s definitely a proxy, or something else untoward, in the way. It’s dropped the first segment of the server’s response to the TLS handshake, leaving a corrupted session. Time to shake your ISP until some sense falls out – or just use DigitalOcean. We recommend them because this sort of thing doesn’t happen.
I have just heard back from my server host and they have told me there is no proxy set up on their network.
There are no proxies set up on our network. If you're running a 1400 MTU, the only other hindrance stopping it from working would be that their servers are in China or Russia, for which we are currently blocking all incoming and outgoing data due to malicious activity.
The bootstrap fails to access GitHub, which as far as I'm aware has its servers located in the US, so the China/Russia blocks should not be causing an issue.
Once again they have mentioned the MTU needing to be 1400. Are you able to assist me with testing this? I have tried adding --mtu=1400 to DOCKER_OPTS but this doesn't appear to change the MTU when I run ifconfig.
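For what it's worth, this is roughly how I've been checking it (docker0 is the default bridge; the veth name changes on every run):

# check the current MTU on the Docker bridge
ip link show docker0
# or: ifconfig docker0 | grep -i mtu

# temporarily force the bridge down to 1400 for testing
# (does not persist across restarts or daemon reconfiguration)
sudo ip link set dev docker0 mtu 1400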
As this isn’t a Discourse problem, we can’t really help you. I’d suggest getting further assistance from your hosting provider, because the packet capture clearly shows something funky happening in the network.
My main question here is, COULD this issue be the MTU of the network adapter that Discourse is using (the docker interface)?
Because if the issue COULD be with the MTU, I can make further attempts to change it; if the issue CAN'T be the MTU, then I can pressure my hosting provider further.
They will not go any further until I have confirmed that the issue is NOT the MTU, as they are adamant there will be connection issues using an MTU of 1500.
If you can't assist with changing the MTU of the network adapter that Discourse is using, then who would be able to assist? Docker support?
The --mtu=1400 also helped me.
The cause of my problem was a VPN. When running Docker without the VPN it worked.
It seems a VPN (or another layer of virtual networking) reduces the usable MTU, so Docker needs to be told to use a lower MTU as well.
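If it helps anyone, a quick way to compare the relevant interfaces (tun0 is just a typical VPN interface name, so treat it as an assumption; yours may differ):

# compare the MTU of the physical, VPN, and Docker interfaces
ip link show eth0 | grep -o 'mtu [0-9]*'
ip link show tun0 | grep -o 'mtu [0-9]*'
ip link show docker0 | grep -o 'mtu [0-9]*'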
We’ve had a similar issue with a fresh install of Discourse on the UKCloud provider, which we haven’t used for Discourse before. The install or ./launcher rebuild app would fail when it tried to update the gems:
FAILED
--------------------
Pups::ExecError: cd /var/www/discourse && gem update bundler failed with return #<Process::Status: pid 308 exit 1>
Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"web", "cmd"=>["gem update bundler", "find $home ! -user discourse -exec chown discourse {} \\+"]}
ca66e90e8be984c2f8975b788649d387ba7ce5ce743f8b8305ca30f5a90fd3e8
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one
This command helped when finding out what the MTU was on the cloud server (thanks to this blog post for some help): ping 8.8.8.8 -s 2000 -M do
It sets the packet size to 2000 bytes using the -s flag and forces the ping not to fragment packets using -M do. This causes the ping to fail because the packet is too large. It will tell you what the MTU is, or you can find it by trial and error, reducing the -s 2000 gradually until it works; the path MTU is then the largest working -s value plus 28 bytes of IP and ICMP headers.
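A rough sketch of automating that trial and error (the sizes are chosen to test the common 2000/1500/1400/1300 MTUs; 28 bytes is the IPv4 + ICMP header overhead):

# walk the ICMP payload size down until a non-fragmenting ping succeeds;
# the path MTU is then the payload size plus 28 bytes of headers
for size in 1972 1472 1372 1272; do
  if ping -c 1 -s "$size" -M do 8.8.8.8 > /dev/null 2>&1; then
    echo "path MTU is roughly $((size + 28))"
    break
  fi
done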
I edited /lib/systemd/system/docker.service to set the lower MTU there and it worked immediately.
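For anyone else following along, the edit was presumably something like appending the flag to the existing ExecStart line (the exact ExecStart contents vary by Docker version, so treat this as a sketch), then reloading systemd:

# /lib/systemd/system/docker.service (excerpt) -- add --mtu=1400 to the existing ExecStart line
[Service]
ExecStart=/usr/bin/dockerd -H fd:// --mtu=1400

# then reload the unit and restart Docker for the change to take effect
sudo systemctl daemon-reload
sudo systemctl restart docker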
I’m putting this here a) for my own future information b) to help anyone else Googling that error with UKCloud as their provider.
I don’t actually know; the box was set up for me by a person who has access to that UKCloud panel. I just have an IP address and log in with an SSH key.
We eventually fixed this problem by changing the MTU: we created a daemon.json file in the /etc/docker/ folder with the following in it:
{
"mtu": 1400
}
Then restart the Docker service with sudo service docker restart. It doesn't change the MTU of anything already running, but anything Docker creates from then on will use the figure in there.
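A quick way to confirm the new value has taken effect (assuming a small image such as busybox can be pulled):

# docker0 may still report 1500 while no containers are attached,
# so check from inside a fresh container instead
sudo service docker restart
docker run --rm busybox ip link   # eth0 should report mtu 1400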