Update failing due to rate limiting by RubyGems


(Hosein Naseri) #16

OK, so I’ll remove this one and add all the others back to see what happens. However, I don’t know why it would be the culprit; it hasn’t been updated for a long time.


(Jay Pfaffman) #17

Plugins get broken if Discourse gets updated in a way that breaks the plugin. The other plugin you listed is maintained by a team member, so it’s somewhat more likely he’ll notice something’s wrong before other people.


(Hosein Naseri) #18

Nope. It failed again. So I’m removing backup to dropbox this time.


(Hosein Naseri) #19

But even if it is a broken plugin, why would it trigger the RubyGems rate limit? It’s not making sense to me.


(Jay Pfaffman) #20

I couldn’t figure that out either. I suspect that it’s unrelated.

Are you still getting the RubyGems error?


(Hosein Naseri) #21

I removed the backup to dropbox plugin, added all the others back, and now it rebuilt without any error.


FAILED TO BOOTSTRAP - Switch to older docker image?
(Jay Pfaffman) #22

In that case, @Falco might want to have a look at the dropbox plugin.


(Matt Palmer) #23
  1. If you get rate-limited by rubygems, you need to talk to rubygems. They explain how in their rate-limiting documentation. We can do absolutely nothing to help with rate-limiting problems. (This doesn’t just apply to the OP; everyone giving advice on this problem any time it occurs needs to refer people to rubygems, loudly, explicitly, and using words of no more than two syllables.)

  2. I can’t see any specific error messages relating to the dropbox plugin in this thread. Without a complete error message and backtrace (if displayed in the logs), @Falco isn’t going to be able to do anything to fix whatever is going wrong.


(Hosein Naseri) #24

It doesn’t make sense to me why we need to talk to RubyGems in such situations. We are installing/updating something that exceeds their thresholds, so I think their response would simply be for us to check the plugins we are trying to install. Am I right?


(Jay Pfaffman) #25

That’s what I think too, but this has been coming up lately, and I’m using a host I’ve used for a while and have not had problems previously.

Their site says:

Is it possible that the bootstrap script is doing that as a matter of course? Here’s an excerpt from a build just now:

HTTP GET https://index.rubygems.org/info/aws-sdk-cloudwatchlogs
HTTP 200 OK https://index.rubygems.org/info/aws-sdk-cloudsearchdomain
HTTP 200 OK https://index.rubygems.org/info/aws-sdk-cloudwatchevents
HTTP 429 Too Many Requests https://index.rubygems.org/info/xattr
HTTP GET https://index.rubygems.org/info/aws-sdk-codecommit
HTTP 200 OK https://index.rubygems.org/info/aws-sdk-cloudsearch
HTTP 200 OK https://index.rubygems.org/info/aws-sdk-applicationdiscoveryservice
HTTP 200 OK https://index.rubygems.org/info/aws-sdk-cloudhsm
HTTP GET https://index.rubygems.org/info/aws-sdk-codedeploy
HTTP GET https://index.rubygems.org/info/aws-sdk-codepipeline
Bundler::HTTPError: Net::HTTPTooManyRequests: <html>
<head><title>429 Too Many Requests</title></head>
<body bgcolor="white">
<center><h1>429 Too Many Requests</h1></center>
<hr><center>nginx</center>
</body>
</html>

/usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.15.1/lib/bundler/fetcher/downloader.rb:36:in `fetch'
  /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.15.1/lib/bundler/fetcher/compact_index.rb:116:in `call'
  /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.15.1/lib/bundler/compact_index_client/updater.rb:43:in `block in update'
  /usr/local/lib/ruby/2.4.0/tmpdir.rb:89:in `mktmpdir'
  /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.15.1/lib/bundler/compact_index_client/updater.rb:30:in `update'
  /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.15.1/lib/bundler/compact_index_client.rb:81:in `update'
  /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.15.1/lib/bundler/compact_index_client.rb:97:in `update_info'
  /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.15.1/lib/bundler/compact_index_client.rb:54:in `block in dependencies'
  /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.15.1/lib/bundler/fetcher/compact_index.rb:87:in `block (3 levels) in compact_index_client'
  /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.15.1/lib/bundler/worker.rb:63:in `apply_func'
  /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.15.1/lib/bundler/worker.rb:58:in `block in process_queue'
  /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.15.1/lib/bundler/worker.rb:55:in `loop'
  /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.15.1/lib/bundler/worker.rb:55:in `process_queue'
  /usr/local/lib/ruby/gems/2.4.0/gems/bundler-1.15.1/lib/bundler/worker.rb:89:in `block (2 levels) in create_threads'
HTTP GET https://index.rubygems.org/info/aws-sdk-cognitoidentity
HTTP GET https://index.rubygems.org/info/aws-sdk-cognitoidentityprovider
HTTP GET https://index.rubygems.org/info/aws-sdk-cognitosync
HTTP GET https://index.rubygems.org/info/aws-sdk-configservice
HTTP GET https://index.rubygems.org/info/aws-sdk-databasemigrationservice
HTTP 200 OK https://index.rubygems.org/info/aws-sdk-configservice
HTTP GET https://index.rubygems.org/info/aws-sdk-datapipeline
HTTP 200 OK https://index.rubygems.org/info/aws-sdk-cloudwatchlogs
HTTP GET https://index.rubygems.org/info/aws-sdk-devicefarm

And more disturbingly, at the end of that run:

Successfully bootstrapped, to startup use ./launcher start webonly-multi

So does that mean that I’m missing those gems and won’t find out until something tries to call them? I don’t use AWS, so I guess I don’t care? I’m confused.

I’ve read where you (or someone on the team) said that you aren’t using any kind of proxy and rebuild all the time, so I can’t make sense of what I’m seeing.


(Matt Palmer) #26

Because they’re the ones rate-limiting your requests.

I have no idea what their response would be. I don’t presume to speak for rubygems, which is why I suggest talking to them, to find out what they would say, and save all this guesswork.

It also says

Some endpoints may be cached by our CDN at times and therefore may allow higher request rates.

and

The RubyGems.org team may occasionally blackhole user IP addresses for extreme cases to protect the platform.

Which means that the correct description of the rate limit is “10 requests per second, unless it isn’t”. Which is less than entirely helpful.

If there were timestamps on that log, it would help identify how quickly requests are actually being made. As it is, all we can see is “requests were made, and some got 429’d”. Also, according to that excerpt as shown, you got rate-limited after two successful requests, so either you’re hitting a blacklist (it’s hard to see how making three requests could hit a 10 req/sec limit) or there are other requests you didn’t show. Either way, without timestamps nobody can tell whether you might be hitting the limit or not.
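To make the point concrete: with timestamps, settling the question is a few lines of code. This is a hypothetical helper (the name and the one-second window are my own, not anything Bundler provides) that computes the peak number of requests falling inside any sliding one-second window of a timestamped log:

```ruby
require "time"

# Hypothetical helper: given an array of timestamp strings from a request log,
# return the largest number of requests that fall within any sliding window
# of `window` seconds. Comparing the result against 10 would show whether a
# build actually exceeds a 10 req/sec limit.
def peak_requests_per_second(timestamps, window: 1.0)
  times = timestamps.map { |t| Time.parse(t).to_f }.sort
  # For each request, count how many requests (itself included) start within
  # the window that opens at its timestamp; the max over all windows is the peak.
  times.map { |start| times.count { |t| t >= start && t < start + window } }.max || 0
end
```

Three requests at 0.1 s, 0.4 s, and 2.0 s, for example, yield a peak of 2 — nowhere near a 10 req/sec limit, which is exactly the kind of evidence rubygems would need.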

Exactly. We’ve got customers with some pretty monstrous plugin lists, and their rebuilds never get rate-limited that I’ve seen. Perhaps we do builds often enough that the CDN node we hit is kept constantly fresh with all the packages we need, which is… nice, I guess, but not helpful to anyone else.

Which is why, if you are seeing this problem, you should talk to rubygems so you can make sense of what you’re seeing, because our pontificating on meta about what may or may not be going on isn’t going to get the problem solved. If rubygems comes back with some actionable advice as to how Discourse and its environs can be modified to work better with rubygems rate-limits, great. Let us know, and we’ll look into it. However, having someone from Discourse directly talk to rubygems isn’t going to help, because we’re not seeing the problem. We can’t provide any useful diagnostic data (such as source IPs, timestamps, or anything) to rubygems, nor can we describe how to reliably (or otherwise) reproduce the problem.

So… once again, if you are seeing rate-limiting problems with rubygems, talk to rubygems. If you’re not willing to do that, then please don’t complain about being rate-limited here on meta, because all you’re doing is adding to the noise, not the signal, about this problem.


(Jay Pfaffman) #27

I’ve got an idea now. I’ll see if I can figure out how to ask RubyGems about the problem. If I find out anything useful, I’ll let you know. :wink:


(Robby O'Connor) #28

@pfaffman – why not work with @hnaseri? I recall @11145 having an issue as well. Maybe the three of you could gather outputs, source IPs, and such (privately!) and see if there is a common denominator…


(Hosein Naseri) #29

Is there any update on this issue @pfaffman?


(Jay Pfaffman) #30

I’ve not heard back from them.


(Discourse.PRO) #31

Has anybody solved the problem?
I have the same issue: every attempt to run ./launcher rebuild <container> fails with the «429 Too Many Requests» from rubygems.org:

The only workaround I have found is described here: The only solution I have found to workaround «429 Too Many Requests» failure from rubygems.org


(Thomas Abraham) #32

This issue still exists. “Talking to rubygems” cannot be the answer: if an application hits a third-party rate limit, the application has to keep its requests under that limit and/or retry failed downloads after a safe interval until they complete.
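As a sketch of what such client-side behavior could look like, here is a minimal backoff policy for 429 responses. The helper name, the defaults, and the cap are my own assumptions, not anything Bundler actually ships:

```ruby
# Hypothetical backoff policy for retrying after an HTTP 429: honor the
# server's Retry-After header when present, otherwise back off exponentially
# (1s, 2s, 4s, ...) up to a cap.
def retry_delay(attempt, retry_after: nil, base: 1.0, cap: 60.0)
  return retry_after.to_f if retry_after
  [base * (2**attempt), cap].min
end

# A download loop would then call something like
#   sleep(retry_delay(attempt, retry_after: response["Retry-After"]))
# after each 429 before retrying the request.
```

Exponential backoff with a cap is the standard shape for this: it drops the request rate quickly while bounding the worst-case wait, and deferring to `Retry-After` respects whatever pause the server explicitly asks for.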


(Thomas Abraham) #33

It is probably interesting to know that rubygems.org uses geolocation-based DNS, so they can respond very quickly from every location.

Nevertheless, there’s a difference between AWS eu-central-1 and us-east-1. In both datacenters rubygems.org is reachable with ~1 ms latency, but, at least currently, I am hitting rate limits in eu-central-1 and not in us-east-1.


(Jeff Atwood) #34

This is the official rubygems response:


Hi Jeff!

I conferred with the bundler folks about this issue and they told me it was a known issue with older versions of bundler. If your users update their bundler version, the issue should go away.

The errors are actually Fastly throttling individual users, which as you might expect means that those versions of bundler are sending A LOT of requests.


(Allen - Watchman Monitoring) #35

Awesome @codinghorror! Thank you for tracking that down & providing some resolution to this issue.