How to bypass rate limiter when using an API key?


(Ryan Fox) #1

I’ve noticed that even when using an API key, I run into both the 12 requests per second limit in nginx and the per-user post limits. It makes scripting against Discourse very difficult. Is there a way that this can be disabled for requests that include a valid API key?


Whitelisting an api username
(Jeff Atwood) #2

Not really, no, since it implies nginx (the thing enforcing the limits) would be able to test the validity of the API key when it is far upstream of all that…

Also that’s by design, wouldn’t you say :wink:


(Ryan Fox) #3

So, just spit-balling here: I don’t know very much about configuring nginx, so maybe this isn’t possible.

How about a /api/ location that would require HTTP auth using the API key, and then rewrites the URL to the normal URL, and doesn’t have a rate limit?

You could perhaps write to the auth config file when API keys are generated or revoked.


(Kane York) #4

The per-user post rate limits will be disabled if the acting user is an admin.

As for requests per second, what kind of scripting are you doing?


(Ryan Fox) #5

Automated posting as users, grabbing a bunch of topics to get stats, migrating data from older systems. Stuff like that.


(Kane York) #6

Have you looked at the import scripts? They run as Ruby code, much more suited to one-shot creating a lot of posts.

For grabbing stats from a bunch of topics, I suggest either funneling the requests through a queue or grabbing “request tokens” from an emitter that fires every 1/10th of a second, depending on what programming language you’re using.
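A minimal sketch of the “request token” idea, in Python. The name `paced` and its parameters are made up for illustration and are not part of Discourse or any client library. It releases one item per `interval` seconds, so a loop over topic IDs cannot outrun the emitter:

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def paced(
    items: Iterable[T],
    interval: float = 0.1,  # one "token" every 1/10th of a second
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items no faster than one per `interval` seconds."""
    next_slot = clock()  # earliest time the next item may be released
    for item in items:
        delay = next_slot - clock()
        if delay > 0:
            sleep(delay)  # wait for the next token
        # Schedule the following slot from the moment this one was released,
        # so a slow consumer does not build up a burst of "saved" tokens.
        next_slot = max(next_slot, clock()) + interval
        yield item
```

Usage would look something like `for topic_id in paced(topic_ids): fetch_topic(topic_id)`, where `fetch_topic` stands in for whatever function makes the actual API call. The 0.1 second interval is only the figure suggested above; pick one that fits the limits on your own install.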


(Dylan Hunt) #7

This is old, but … it’s pretty relevant. I was looking for the same thing. I’m using the API to report bad players for a game. This shouldn’t be limited. However, our BOT account doesn’t need to access any admin settings, just simply post. I read above that the account needs to be an ADMIN to bypass the per-user post rate limits.

This is pretty nasty … why would I have to make the bot account an admin to bypass rate limitations just for posting? In 2017, is there a better way to do this?


(Kane York) #8

You could simply increase the problematic rate limits in the site settings.

This isn’t really a problem that has come up in the past - usually, bots that needed to bypass rate limits also were doing something else that required either admin or acting as arbitrary users.


(Dylan Hunt) #9

Actually, I did try adding my bot admin to test it – I am still running into rate limitations, every now and then. Do you happen to know anything about this? While debugging, I’m testing once every few mins and my admin account is still getting rate-limited 422. So strange …

EDIT: Strange, the rate limiters seem to default to 5~15 SECONDS (I had assumed minutes). Something is wrong.


Either way, shouldn’t an admin bypass these?


(Kane York) #10

It’s probably the nginx web server: it enforces additional short-term rate limits based on request counts. The Discourse rate limits are there to protect against bad behavior and generally reset daily. The nginx rate limits are there to protect against flooding.

As you just noticed in your edit, the nginx rate limits are on the order of seconds and minutes.


(Dylan Hunt) #11

Ahh that makes more sense.

I wonder if there’s a way to delay requests instead of just rejecting them with a 422.


(Kane York) #12

Yes, there is: have a model of the rate limits in your client and force it to sleep if you are about to violate them! :slight_smile:
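One way to sketch such a client-side model in Python is a sliding window that remembers when recent requests were sent and sleeps when the next one would exceed the limit. The class name `ClientRateLimit` and its parameters are hypothetical, and the real limits depend on how your site and nginx are configured:

```python
import time
from collections import deque
from typing import Callable, Deque


class ClientRateLimit:
    """Client-side model of an 'at most max_calls per period seconds' limit."""

    def __init__(
        self,
        max_calls: int,
        period: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_calls < 1:
            raise ValueError("max_calls must be at least 1")
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._sent: Deque[float] = deque()  # send times, oldest first

    def wait(self) -> float:
        """Block until one more request fits in the window; return seconds slept."""
        now = self._clock()
        # Forget requests that have already left the window.
        while self._sent and now - self._sent[0] >= self.period:
            self._sent.popleft()
        slept = 0.0
        if len(self._sent) >= self.max_calls:
            # Window is full: sleep until the oldest request expires.
            slept = self.period - (now - self._sent[0])
            self._sleep(slept)
            now = self._clock()
            self._sent.popleft()
        self._sent.append(now)
        return slept
```

Call `limit.wait()` right before each request. For example, `ClientRateLimit(max_calls=12, period=1.0)` would model the 12 requests per second figure from the first post, assuming that figure matches your install.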

Also, check the code for whether or not mods bypass the rate limits because I don’t actually remember.

Also, I don’t believe there is a once/x ratelimit on editing your post - maybe you could make the bot edit posts if data comes in too fast?


(Ryan Fox) #13

It has been a long while, but I think I ended up turning off nginx’s rate limiting and enabling rate limiting in haproxy instead (which I already had running in front of Discourse). In haproxy, I was able to whitelist the IP for my bot.


(Dean Taylor) #14

All HTTP clients should really use “Truncated Exponential Backoff” or something similar:

Exponential Backoff is an algorithm that retries requests to the server based on certain status codes in the server response. The retries exponentially increase the waiting time up to a certain threshold. The idea is that if the server is down temporarily, it is not overwhelmed with requests hitting at the same time when it comes back up.

Open source implementations are easy to find; Google’s API client libraries, for example, commonly include one.

Some example implementations listed here for Google Storage calls:
https://cloud.google.com/storage/docs/exponential-backoff
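
For illustration, a rough Python sketch of truncated exponential backoff. The names `RetryableError`, `backoff_delays` and `call_with_backoff` are made up here and do not come from Google’s libraries or from Discourse. The wait doubles after each failed attempt, is capped at `cap` seconds, and gets up to one second of random jitter so that several clients do not retry in lockstep:

```python
import random
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical: raised by your request function for a retryable
    response, such as a rate-limit status code."""


def backoff_delays(max_retries: int = 5, base: float = 1.0,
                   cap: float = 32.0) -> Iterator[float]:
    """Yield base, 2*base, 4*base, ... truncated at `cap` seconds."""
    for attempt in range(max_retries):
        yield min(cap, base * 2 ** attempt)


def call_with_backoff(
    request: Callable[[], T],
    max_retries: int = 5,
    base: float = 1.0,
    cap: float = 32.0,
    sleep: Callable[[float], None] = time.sleep,
    jitter: Callable[[], float] = random.random,  # adds 0 to 1 second
) -> T:
    """Call `request`, retrying on RetryableError with growing waits."""
    for delay in backoff_delays(max_retries, base, cap):
        try:
            return request()
        except RetryableError:
            sleep(delay + jitter())
    # Final attempt: if it still fails, let the error reach the caller.
    return request()
```

Here `request` is any zero-argument function that performs one API call and raises `RetryableError` when the server asks the client to slow down. The defaults (5 retries, 1 second base, 32 second cap) are arbitrary starting points, not recommended values.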