I have a general question about how crawler throttling is implemented.
According to Change Googlebot crawl rate - Search Console Help, the recommended HTTP status is 429 (Too Many Requests) or 503 (Service Unavailable).
But reading through the source code it looks like throttling is implemented by
throwing an error: https://github.com/discourse/discourse/blob/85fddf58bc1e751d0ac5b8192a630c59a34aed7d/lib/rate_limiter.rb#L129
My Ruby on Rails days are long behind me, but I am assuming that this unhandled exception ends up as a generic 500?
The Google crawler doesn't seem to understand Discourse's throttling: in Google Search Console I can see that our indexing, and therefore impressions, dropped drastically after throttling was implemented. The drop is attributed not to throttling, but to 5xx server errors.
I understand that throttling instances may sometimes be necessary if they cause too much traffic, but I was expecting Discourse to report an HTTP 429 instead of serving the crawler a 500 Internal Server Error.
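To illustrate what I mean, here is a minimal sketch (not Discourse's actual code; the class names and `available_in` attribute are assumptions) of how a rate-limit exception could be caught at the request boundary and mapped to a 429 with a `Retry-After` header, instead of bubbling up as a 500:

```ruby
# Hypothetical error type, loosely modeled on a rate limiter raising on excess requests.
class LimitExceeded < StandardError
  attr_reader :available_in

  def initialize(available_in)
    @available_in = available_in
    super("rate limit exceeded, retry in #{available_in}s")
  end
end

# Minimal request dispatcher: run the request block, and translate a
# throttling exception into a proper 429 response (RFC 6585) with
# Retry-After, rather than letting it escape as a generic 500.
def handle_request
  body = yield
  [200, {}, body]
rescue LimitExceeded => e
  [429, { "Retry-After" => e.available_in.to_s }, "Too Many Requests"]
end

status, headers, _body = handle_request { raise LimitExceeded.new(60) }
# A crawler seeing 429 + Retry-After knows to back off, not to flag a server error.
```

In Rails this would typically live in a `rescue_from` in the application controller; the point is just that the crawler-visible status should be 429 (or 503), which Google treats as "slow down" rather than "site broken".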