Unable to start discourse due to rubygems rate limiting

I ran some standard updates on the Ubuntu machine I have hosting my discourse instance. Now I cannot start discourse again.

I’m not a docker user, but the instructions I inherited from a previous staff member who set this up are to run:

./launcher stop app
./launcher rebuild data
./launcher rebuild app
./launcher start app

But when I run

./launcher rebuild app

It takes forever and finally fails. The reason for the failure seems to be below:

URGENT: could not connect to server: No route to host
         Is the server running on host "172.17.0.1" and accepting
         TCP/IP connections on port 5432

Can anyone please help with this? I’ve been messing around for hours now.

We need more information in order to help you. Can you start by providing us with the full log of the rebuild operation?

The full log? I’ve just discovered it is over 32K lines. Do you actually want it all?

There shouldn’t be 32K lines in the rebuild log. What Régis is looking for is the console output from after you run ./launcher rebuild app. Share everything that appears in the console after that command.

Thanks for clarifying, but that’s actually what I am talking about. I run the rebuild command, and after that there are 32K lines that appear in the console, right up to the failure notification.

Wow. In that case please do upload the output to something like pastebin or gist, and share the link here.

OK, I’ve pasted it here: gist:6464801c6cde865ab60e3f10dba21e05


Thanks.

Here is something concerning:

Bundler::HTTPError: Net::HTTPTooManyRequests: <html>
<head><title>429 Too Many Requests</title></head>
<body bgcolor="white">
<center><h1>429 Too Many Requests</h1></center>
<hr><center>nginx</center>
</body>
</html>

You are hitting the rate limit for rubygems.org.


Thanks for spotting that. I have no idea what I can do about that though - I’m just running the commands that are meant to work with Discourse. I can’t even find any results for rubygem rate limits.

Every time we have had to do anything that involved restarting this server, Discourse breaks and I have to spend hours or days trying to fix it. It doesn’t seem to really work.

I’m afraid there’s nothing you can do about that short of waiting a bit before re-issuing your rebuild command or contacting rubygems.
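For anyone who would rather not babysit the retries, here is a minimal sketch of "wait a bit, then try again" as a shell function. The name retry_with_backoff and the delay values are my own assumptions, not part of the Discourse launcher. Keep the attempt count low and the delays long, since hammering the rebuild is likely to make rate limiting worse rather than better.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ./launcher):
#   retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# doubling the wait after every failed attempt.
retry_with_backoff() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  while true; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; waiting ${delay}s before retrying" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Example (assumed values): up to 3 attempts, waiting 30 min, then 60 min.
# retry_with_backoff 3 1800 ./launcher rebuild app
```

Note that this retries on any failure, not only on a 429 from rubygems.org, so check the log before assuming the rate limit was the cause.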

Do you mean that the rebuild could work if I try it later on? It seems to me that the problem is that the rebuild job itself involves too many requests and is therefore impossible. I can see others have stumbled across this recently too.

Pretty sure it will unless your IP has been blacklisted. The only way to know for certain is to contact rubygems.org.

From what I’ve seen, it only happens when using a VPS (where you don’t know what the other/previous tenants are doing) and/or when you rebuild your Discourse several times in a row in a short period of time.

FYI: we’ve never been rate limited, even when we deploy all our customers.

When I’ve contacted them I’ve gotten no response.

I can’t make rhyme or reason of the cause when I’ve been rate limited. It also doesn’t make sense that you guys have never been rate limited when I have on several occasions. Stranger still, sometimes I see the rate limit errors, but the build appears to complete successfully.

But yes, @shaneoh, it should work eventually.

You have a multicontainer setup and are starting app before data. Try the other way around.

Would you mind telling me the exact series of commands you would recommend running? I’ve just tried this and get the same issue:

  ./launcher stop data
  ./launcher stop app
  ./launcher start data
  ./launcher start app
  ./launcher rebuild app

./launcher start data
# wait 10 seconds
./launcher start app
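
Instead of a fixed 10-second wait, one option is to poll until PostgreSQL actually accepts connections before starting app. This is only a sketch: wait_for_port is a hypothetical helper of mine, it relies on bash's /dev/tcp feature, and the host and port are taken from the error message earlier in this topic, so they may differ on your setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ./launcher):
#   wait_for_port HOST PORT [TIMEOUT_SECONDS]
# Returns 0 once a TCP connection to HOST:PORT succeeds,
# or 1 if it still fails after roughly TIMEOUT_SECONDS of waiting.
wait_for_port() {
  local host=$1 port=$2 timeout=${3:-60}
  local waited=0
  # The subshell opens (and on exit closes) a TCP connection via bash's /dev/tcp.
  until (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}

# Example (host and port assumed from the error message above):
# ./launcher start data
# wait_for_port 172.17.0.1 5432 60 && ./launcher start app
```

An open port only shows that something is listening. It does not prove the database has finished starting up, so treat this as a rough check.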

Yeah, general feedback online seems the same, nobody gets a response.

I’ve just restarted the server and made sure it was assigned a new IP address. On the first attempt I got the same issue. This rate limit thing doesn’t seem to be very logical. As it’s a brand new attempt, I still have to suspect something about the way discourse is attempting to do this is breaking things, but I don’t know enough about how it works to know for sure.

The result was the same as it has been every time the rebuild fails. It seems fine, but the URL shows a bad gateway.

Enter the container with ./launcher enter app, tail the logs with tail -f /var/www/discourse/log/*, and try to find out why.

How many times did you use the rebuild command? Running it multiple times in quick succession will 100% get you rate limited by RubyGems.

That’s probably a good sign. If you reload about . . . now, it’ll probably be up and running.