Because those bots are “resource sucking bot hogs” that provide very little value to sites relative to the amount of resources they consume.
Of course, you can customize the Discourse
robots.txt file and permit them if you wish; but we blocked these bots on our sites long before Discourse was released, and we keep them blocked.
I forgot to mention that many of these “resource sucking bot hogs” do not respect
robots.txt, so they must be blocked at the HTTP User-Agent level. We block these “disrespectful resource sucking bot hogs” with
mod_rewrite at the reverse proxy level, generally speaking (one of the many good reasons to run behind a reverse proxy, BTW).
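For anyone who wants to do something similar, here is a minimal mod_rewrite sketch for an Apache reverse proxy. The bot names are just illustrative examples, not a recommended blocklist; substitute whatever user agents you see hammering your own logs:

```apache
# Example only: bot names below are illustrative, not a definitive blocklist.
RewriteEngine On
# Match the User-Agent header case-insensitively against a list of crawlers
RewriteCond %{HTTP_USER_AGENT} (ExampleBot|SomeCrawler|AnotherSpider) [NC]
# Return 403 Forbidden and stop processing further rules
RewriteRule .* - [F,L]
```

Since the block happens at the proxy, the request never reaches Discourse at all, which is the whole point: the bot costs you almost nothing.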