I have a fun little story about scaling way too late in one of my communities, which had 1.2 million members.
SkyscraperCity is one of the largest communities in the world, if not the largest, dedicated to architecture, the urban environment, and metropolitan structures. Each area of the world with any significant level of urban development and associated larger structures has its own section. Naturally, it’s got a massive number of pictures shared within it, because, well, the members of that community really, really like to show and tell on the skyscrapers being discussed.
The problem didn’t emerge in the place where you’d think it would.
The problem wasn’t scaling image handling and attachments late. That was a known factor, and the platform was built to handle it.
The problem emerged as the number of distinct metropolitan areas grew, and with it the number of categories, subcategories, and sub-sub-subcategories based on continent, nation, state, and finally city, each of which could have any number of neighborhoods. The information architecture for the community had scaled beyond the platform’s ability to cope with each of those areas becoming active. It was buckling under the weight of all those categories, and performance had slowed to an utter crawl. It was downright painful to navigate from page to page, let alone make a post!
The specific platform was an older custom XenForo build that used node trees for categories, and it wasn’t designed to handle hundreds or thousands of individual categories as part of the structure. The way it was built, each node occupied a good amount of resources, so past a certain point, the more nodes there were, the less stable the platform became. Nerd that I am, I described it to the community by likening it to a particular episode of Star Trek: The Next Generation, in which the crew discovered that warp engines were poking holes in the fabric of space/time a little too frequently, weakening space in a highly trafficked area.
The zillion categories (sub-sub-sub-subcategories and so on) were causing “damage” to the platform’s stability the same way hundreds of warp drives being activated in and around a popular area of space were causing that area to collapse.
The solution was not dissimilar to the Federation’s, in that you throttle back and cap the punctures. So we ran a thorough revision of the information architecture, merged a bunch of the lower-level categories and even some of the higher ones, and the system stabilized.
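For the fellow nerds, here is roughly what that kind of merge looks like as a toy sketch in Python. To be clear, this is not XenForo’s code or data model: the `Node` class, the `merge_below` helper, the depth cap, and the thread counts are all hypothetical, made up purely to illustrate folding deep categories into their ancestors so the tree has fewer nodes while keeping all the content.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical category node; not the platform's real data model."""
    name: str
    threads: int = 0
    children: list[Node] = field(default_factory=list)


def count_nodes(node: Node) -> int:
    """Count this node plus every node beneath it."""
    return 1 + sum(count_nodes(child) for child in node.children)


def total_threads(node: Node) -> int:
    """Sum the threads held by this node and all of its descendants."""
    return node.threads + sum(total_threads(child) for child in node.children)


def merge_below(node: Node, max_depth: int, depth: int = 0) -> None:
    """Fold every category deeper than max_depth into its ancestor at max_depth."""
    if depth >= max_depth:
        # Absorb the descendants' threads first, then drop the descendants.
        node.threads = total_threads(node)
        node.children = []
        return
    for child in node.children:
        merge_below(child, max_depth, depth + 1)


# Made-up example tree: continent > nation > city > neighborhoods.
world = Node("World", children=[
    Node("Europe", children=[
        Node("Netherlands", threads=5, children=[
            Node("Rotterdam", threads=10, children=[
                Node("Centrum", threads=3),
                Node("Kop van Zuid", threads=4),
            ]),
        ]),
    ]),
])

nodes_before = count_nodes(world)
threads_before = total_threads(world)

# Cap the tree at city level (depth 3); neighborhoods fold into their city.
merge_below(world, max_depth=3)

nodes_after = count_nodes(world)
threads_after = total_threads(world)
```

In this toy tree the two neighborhood nodes disappear into their city, so the node count drops while the total thread count stays the same. The real revision was an editorial job rather than a script, but the shape of the change is the same: fewer nodes, same content.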
Of course, this issue doesn’t show up with Discourse, since Discourse handles categories leagues better, treating them as first-class containers rather than one big honkin’ structural tree. Mycelial network or transwarp conduits, if you will. But this is definitely one of the best examples I have of scaling too late, and of the hit not coming where you’d think it would.