Discourse Update Failed - Rebuild Now Fails


(Sisko) #1

I was updating my Discourse, but it was taking a long time, so I left it overnight. The following morning it had not progressed, so I cancelled it.

I am using Docker, so I attempted to start the container. It starts up, but the site returns a 502 Bad Gateway error.

Attempts to rebuild have not gone well: the rebuild fails while running “gem update bundler” with this error:

ERROR: While executing gem … (Gem::RemoteFetcher::UnknownHostError)
timed out (https://api.rubygems.org/specs.4.8.gz)

The main failure message has some more information


FAILED

Pups::ExecError: cd /var/www/discourse && gem update bundler failed with return #<Process::Status: pid 274 exit 1>
Location of failure: /pups/lib/pups/exec_command.rb:108:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"web", "cmd"=>["gem update bundler", "chown -R discourse $home"]}
3d9e6b7b840aef42b0b45e17f09013769ed7af4f46fffe40d2e2049b7b102616
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one


I have tried alternate gem sources and have even tried adding the China template to my app.yml, but I still receive the same error.

I am able to enter the container and manually run “gem update bundler” but it says that there is nothing to update.


root@server1:/var/discourse# ./launcher enter app
root@server1-app:/# gem update bundler
Updating installed gems
Nothing to update


I have a feeling that there are other issues causing the problem.

The full rebuild log can be found here.

Reading through the logs, I have found a few PostgreSQL errors that might be related. I am not sure, though, as I haven’t read through the log of a successful rebuild. I originally had some screenshots of the errors, but as a new user I can’t post them…

Some errors are about the database already existing, and another is about failing to bind to the port because it’s already in use, even though a few lines earlier it bound to the port successfully. This may be normal behaviour, but I thought I’d mention it just in case.

The server definitely has enough RAM (32GB) and enough storage space.

Posting here is my last resort before I decide to scrap the whole thing and install a fresh copy.

I can provide more information if required.

Any help will be much appreciated!

Regards,
Sisko

P.S. This post was formatted much more nicely to begin with, but for some reason new users aren’t allowed to post more than 2 links and 1 image…


(Matt Palmer) #2

The timeout reaching rubygems.org is the crux of the issue. Until you can fix that, nothing else really matters. It’s an issue between you and rubygems, and possibly your not-really-an-Internet-service-provider.


(Sisko) #3

Are you able to provide any suggestions?

I am able to manually run gem update bundler and it successfully runs.

If there were an issue between me and rubygems, wouldn’t the manual run also fail?


(Sisko) #4

I just performed a bunch of network tests from multiple locations.

I am located in Australia, and I am unable to ping rubygems.org from anywhere at all (including a US server). I can only assume they are configured not to respond to ping.
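A failed ping does not say much on its own, since ICMP can be dropped while HTTPS still works. What gem needs is a TCP connection on port 443, so that is worth testing directly. Here is a minimal sketch, assuming Python 3 is available on the machine being tested. The name can_connect is a made-up helper, not part of any tool mentioned in this thread.

```python
import socket


def can_connect(host, port=443, timeout=10.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration only. It tests DNS resolution plus
    the TCP handshake, not TLS or the HTTP request itself.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # OSError covers failed name lookups (socket.gaierror),
        # refused connections and timeouts.
        return False
```

Calling can_connect("api.rubygems.org") inside the container and again on the host should show whether the two see the network differently. A True result only means the handshake worked. The download itself could still be too slow.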

My ISP is different than the ISP my Discourse Server uses but the traceroutes are almost identical.

The results can be found here

I have been using the domain name (rubygems.org) to perform all my tests, and every test has resolved to the same IP address (54.186.104.15).

I just added “templates/web.china.template.yml” to my app.yml and I still get the same error:

I, [2016-08-19T04:15:03.822285 #14] INFO -- : > gem sources --add https //gems.ruby-china.org/ --remove https //rubygems.org/
I, [2016-08-19T04:16:13.362650 #14] INFO -- : Error fetching https //gems.ruby-china.org/:
timed out (https://gems.ruby-china.org/specs.4.8.gz)

I removed the colons from the links in the log output above because I can’t have more than 2 links :confused:

I can even ping ruby-china.org but the rebuild still fails there.

PING ruby-china.org (61.174.15.167) 56(84) bytes of data.
64 bytes from 61.174.15.167: icmp_seq=1 ttl=45 time=355 ms
64 bytes from 61.174.15.167: icmp_seq=2 ttl=45 time=350 ms
64 bytes from 61.174.15.167: icmp_seq=3 ttl=45 time=365 ms
64 bytes from 61.174.15.167: icmp_seq=4 ttl=45 time=349 ms
64 bytes from 61.174.15.167: icmp_seq=5 ttl=45 time=347 ms
64 bytes from 61.174.15.167: icmp_seq=6 ttl=45 time=454 ms
64 bytes from 61.174.15.167: icmp_seq=7 ttl=45 time=357 ms
64 bytes from 61.174.15.167: icmp_seq=8 ttl=45 time=360 ms
64 bytes from 61.174.15.167: icmp_seq=9 ttl=45 time=359 ms
64 bytes from 61.174.15.167: icmp_seq=10 ttl=45 time=535 ms
64 bytes from 61.174.15.167: icmp_seq=11 ttl=45 time=363 ms

This is why I said in my OP that I don’t think the issue is with my connection to rubygems.org, plus the fact that I can manually run “gem update bundler” without issue.

I may be wrong, and the issue may indeed be with my connection to rubygems, so any suggestions would be helpful.


(Matt Palmer) #5

If you’re in Australia, why are you using a Chinese rubygems mirror? That just seems like an invitation for all manner of backdoored shenanigans.

When you run gem update bundler successfully, is that inside the container, or on the host? If the latter, then there’s something mysterious about the setup on your machine that’s causing the container to not have networking – could be all sorts of things, but messed firewalls and/or Docker daemon config parameters would be the first port of call for me.
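One way to compare the two environments is to time the same download that the rebuild gave up on, once inside the container and once on the host. Here is a rough sketch, assuming Python 3 is available in both places. The name timed_fetch is a hypothetical helper, and the URL to pass it is the one from the error message.

```python
import time
import urllib.request


def timed_fetch(url, timeout=30.0):
    """Download `url` and return (ok, seconds, detail).

    Hypothetical helper for illustration only. `detail` is the body size on
    success or the exception text on failure.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            size = len(resp.read())
        return True, time.monotonic() - start, f"{size} bytes"
    except OSError as exc:
        # urllib.error.URLError and socket timeouts are OSError subclasses.
        return False, time.monotonic() - start, repr(exc)
```

For example, timed_fetch("https://api.rubygems.org/specs.4.8.gz") run in both places gives two timings to compare. If the host finishes quickly and the container fails or crawls, that points at the Docker networking layer rather than at rubygems.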


(Sisko) #6

Using the Chinese rubygems mirror was purely a test, as it was the only mirror I knew of.

I am running “gem update bundler” from inside the container. Ruby is not installed on the host itself.


(Matt Palmer) #7

I guess it’s pcaps time. Capture inside the container, on the veth interface on the host, on docker0, on the host’s physical interface, and preferably on your CPE device, if you’ve got access to it.