All communities start at zero and grow until the team of administrators feels comfortable … and begins to wonder why some topics have been published for years yet receive no visits. Do we need to keep that topic? Should we archive it? Should we delete it?
If we archive it, search engine spiders can still index it, so it could keep bringing new readers to the forum. But … is it really useful? Or would it be better to delete it?
On the other hand, if we delete it, we free up resources and open the door for newcomers to start fresh discussions about content that is not already published (or archived).
In my case, my forum is approaching 500,000 pageviews over the last 30 days, and I want to better optimize the content I show the world.
How do you optimize content? How do you properly clean up old content?
Some tasks I am carrying out right now:
In the tutorial and knowledge categories, new questions are being moved out as new topics in the support category, while the posts with substantive content stay in the appropriate section, with a timer set so that new replies are automatically deleted.
I have changed the settings of some categories so that search engine bots cannot index their content, which is visible only to registered users (i.e., categories that are hidden from search engines).
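If you want to double-check that a restricted category really is kept away from crawlers, a quick sanity check is to query the forum's robots.txt programmatically. Below is a minimal Python sketch using only the standard library; the forum address and category path are hypothetical placeholders, and of course a category visible only to registered users cannot be crawled in the first place, robots.txt or not.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical forum address and category path -- replace with your own.
FORUM = "https://forum.example.com"
RESTRICTED_PATH = FORUM + "/c/private-support"

# Fetch and parse the forum's robots.txt.
parser = RobotFileParser(FORUM + "/robots.txt")
parser.read()

# See whether a couple of well-known crawlers may fetch the restricted category.
for bot in ("Googlebot", "Bingbot"):
    allowed = parser.can_fetch(bot, RESTRICTED_PATH)
    print(f"{bot}: {'allowed' if allowed else 'disallowed'} for {RESTRICTED_PATH}")
```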
Thanks for your reply, Sarah. I know you are a digital-community consultant, so your experience in the field would be a great help. It would be great to hear your opinion on both types of communities.
In my case, I have worked as an ERP consultant since 2008, specifically on the SAP system, so for many years I have been providing information and support to companies and end users who, like me, once started from scratch with the system. And over time we have developed something of an addiction to the consulting profession.
I usually manage support communities, and this project is the largest I have. I have used other systems, and I can attest that Discourse has exceeded all my expectations. That is why I want to make the most of it and keep working on ideas for pruning and cleaning the content, to offer my readers a quality space and fresh information.
Yes, of course, I forgot to clarify that I am already doing this cleanup: crawler by crawler, with an almost weekly analysis to detect which one is hitting the site the hardest, and I block them one by one (there is a small sketch of how that analysis could be scripted at the end of this post).
What I’m afraid of is that adding so many bots to the blacklist will somehow affect the performance of the site.
So I thought that maybe, instead of blocking them one by one, the better approach could be to add the essential bots to a whitelist and block all the others. But … do “essential” crawlers even exist?
I searched the forum to see whether there is a topic “dedicated” to essential crawlers, but I could not find one. If you know of a related topic, please let me know.
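To give an idea of what that weekly analysis looks like in practice, here is a minimal sketch that tallies requests per user agent straight from the web server's access log. It assumes a combined-format nginx or Apache log at a hypothetical path; adjust both to your own setup.

```python
import re
from collections import Counter

# Hypothetical log path -- adjust to your server (combined log format assumed).
LOG_PATH = "/var/log/nginx/access.log"

# In the combined format, the user agent is the last double-quoted field on the line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')

counts = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        match = UA_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1

# Show the 20 most frequent user agents; crawlers usually dominate the top of this list.
for agent, hits in counts.most_common(20):
    print(f"{hits:>8}  {agent}")
```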
If your site is public and SEO has value to you, then any bot which adds your data to a useful index is “essential”. Look at your sources of traffic and compare them to the bots: is there any correlation?
A bot whitelist may be the better solution here, right?
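As a way to experiment with that idea before touching any settings, the user agents collected from the log could be split against a short allow-list. The list of “essential” bots below is purely illustrative, and the bot-detection heuristic is deliberately crude; this is a sketch, not a definitive classification.

```python
# Substrings of user agents considered "essential"; everything else that looks
# like a bot becomes a candidate for blocking. Purely illustrative, not definitive.
ALLOWED_BOTS = ("Googlebot", "Bingbot", "DuckDuckBot", "YandexBot")

def classify(user_agent: str) -> str:
    """Return 'allow', 'block' or 'human-ish' for a given user agent string."""
    if any(bot in user_agent for bot in ALLOWED_BOTS):
        return "allow"
    # Crude heuristic: most crawlers identify themselves with one of these words.
    if any(word in user_agent.lower() for word in ("bot", "crawler", "spider")):
        return "block"
    return "human-ish"

# Example: feed in user agents collected from the access log.
seen_agents = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; SomeOtherBot/1.0)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/115.0",
]
for agent in seen_agents:
    print(f"{classify(agent):>9}  {agent}")
```

The entries classified as “block” would then be candidates for whatever crawler-blocking mechanism you use; if I recall correctly, Discourse has site settings for allowed and blocked crawler user agents where a list like this would go.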
If you use Cloudflare it’s possible to block “bad bots” with it. I’m not sure how well it works though. I tried it temporarily and it blocks anything that looks automated, even things like curl and my own scripts that use Discourse’s API.