We have a request from a customer that I’ve been expecting for a while:
We want to remove the beta and alpha categories, and all topics in those categories
Meaning, there are a whole lot of posts in those alpha and beta categories that represent the state of their software at a much older point in time, such that all the topics there are basically irrelevant, forever. (This is also video game software, where after 3 years a game is forgotten forever and never played again, so old beta/alphas of games are especially irrelevant in the world of software.)
The intent of the “archive” action on a topic is to prepare it for eventual archiving. Now that may mean deleting it. Or it could mean moving it to some kind of historical long term offline archives. At any rate, an archive is something that is
not of practical current interest
might be useful very rarely to someone digging through long term history for obscure reasons
helpful to remove from the current active instance to make room for newer, more relevant current content
I believe archiving should be triggered either by:
archiving out all topics with a state of archived
archiving out all topics in a particular category
I’m not sure we can physically remove the posts and topics from the database without extreme trauma to our codebase, so perhaps the only alternatives are to
delete (all our deletes are soft deletes) every topic in the category
mark the category as archived, and have special handling for topics in archived categories
Then, produce an export file of the archived topics.
This also implies there is a way to selectively bring back a set of archived posts, or an archived category, which is probably way too hard.
I guess the simple thing to do, for now, is
just mark every topic in the category (or all archived topics) as deleted.
make sure we have an ‘archived’ category state (perhaps a date of archive) to look at later.
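To make the simple version concrete, here is a minimal sketch of those two steps in plain Python. It is not Discourse code: the `Topic` and `Category` classes, the `deleted_at` and `archived_at` fields, and the `archive_category` function are all hypothetical names invented for illustration, and it only covers the "every topic in the category" case.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Topic:
    id: int
    category_id: int
    deleted_at: Optional[datetime] = None  # soft delete marker (assumed field)


@dataclass
class Category:
    id: int
    name: str
    archived_at: Optional[datetime] = None  # hypothetical 'archived' state


def archive_category(category: Category, topics: List[Topic],
                     now: Optional[datetime] = None) -> List[int]:
    """Soft delete every live topic in the category and stamp the category."""
    now = now or datetime.now(timezone.utc)
    archived_ids = []
    for topic in topics:
        # Skip topics in other categories and topics that are already deleted.
        if topic.category_id == category.id and topic.deleted_at is None:
            topic.deleted_at = now
            archived_ids.append(topic.id)
    # Record when the category was archived so it can be inspected later.
    category.archived_at = now
    return archived_ids
```

Returning the ids of the topics that were touched would, under these assumptions, give a later "bring it back" step something to work from, since topics deleted earlier for other reasons are left out of the list.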
Should there not be some way for normal users who are not forum admins to access the archives somehow? Or would that be part of the “special handling”?
Perhaps if we made sure that the archive file was sanitized of secret information, then it could be put up for download, even if we don’t want it to be searchable.
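A rough sketch of what a sanitized export could look like, assuming topics and posts are available as plain dictionaries. The field names in `SECRET_FIELDS` are guesses at what would count as secret, not a list taken from Discourse, and `export_archive` is a hypothetical helper.

```python
import json

# Assumed set of fields that must never appear in a public archive file.
SECRET_FIELDS = {"email", "ip_address", "password_hash"}


def sanitize(record: dict) -> dict:
    """Return a copy of the record without any secret fields."""
    return {key: value for key, value in record.items()
            if key not in SECRET_FIELDS}


def export_archive(topics: list) -> str:
    """Serialize topics and their posts to JSON, stripping secret fields."""
    cleaned = []
    for topic in topics:
        entry = sanitize(topic)
        # Posts are sanitized individually, since they carry their own metadata.
        entry["posts"] = [sanitize(post) for post in topic.get("posts", [])]
        cleaned.append(entry)
    return json.dumps(cleaned, indent=2, sort_keys=True)
```

A deny list like this is only safe if it is kept up to date; an allow list of fields known to be public would be the more cautious design.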
If the posts get actually removed from the database, we could eventually get a screen like this:
This topic has been archived.
If you want to view it, click below and give us a minute to dig it up.
If the removal of the content is the point, then a soft delete like you mention (assuming hard deletes are a problem) should be applied on all of the content, and all of these topics & posts should be excluded from the default backup. Instead, when you do the archiving, they should be put in a separate “Archives” backup of their own.
If you just don’t want your searches getting polluted & slowed down with deprecated posts, and you don’t want archived categories cluttering up your category list, then a “do not search” flag on archived posts and a special category that doesn’t show even in staff lists (but retrievable through admin panel somehow) would suffice. Essentially making all topics Unlisted, plus special treatment of the category visibility.
There may also be some mapping here for moving a category to a different Discourse instance, which at least one paying customer wants to do:
What we need:
Copy one category
Copy all of its sub-categories (10)
Copy all posts in these categories (<50)
Copy all members that belong to the custom group; keep current usernames, email and passwords of these users on the new forum
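The selection part of that request can be sketched as below. This is only an illustration under assumptions: `CategoryRow`, `PostRow`, `subtree_ids`, and `select_for_copy` are made-up names, categories are assumed to carry a `parent_id`, and nothing here touches how usernames, emails, or passwords would actually be carried over to the new forum.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class CategoryRow:
    id: int
    parent_id: Optional[int]  # None for a top-level category


@dataclass
class PostRow:
    id: int
    category_id: int


def subtree_ids(root_id: int, categories: List[CategoryRow]) -> Set[int]:
    """Collect the root category and every category nested under it."""
    ids = {root_id}
    frontier = [root_id]
    while frontier:
        current = frontier.pop()
        for category in categories:
            if category.parent_id == current and category.id not in ids:
                ids.add(category.id)
                frontier.append(category.id)
    return ids


def select_for_copy(root_id: int, categories: List[CategoryRow],
                    posts: List[PostRow], group_members: Iterable[str]) -> dict:
    """Work out which categories, posts, and users belong in the copy."""
    ids = subtree_ids(root_id, categories)
    return {
        "category_ids": sorted(ids),
        "post_ids": sorted(p.id for p in posts if p.category_id in ids),
        "usernames": sorted(group_members),
    }
```

The same selection would serve the archiving case too: the only thing that changes is where the selected rows end up.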
Same basic area of work, IMO. Splitting one Discourse into another is a logical thing to do; the only difference I can see in this case is that instead of transferring the entire category to /dev/null we are copying it to another Discourse instance…
Anyway keep that in mind as we work on this @neil!
I know this is rather old, but as the Ask Fedora Discourse site gets to be several years old itself, I’m thinking about this issue. That forum is for end-user troubleshooting, and while some topics are evergreen (and spelunking in the past is often somewhat useful to figure out if something is a new problem, or maybe even some history of why it is like it is), the earliest topics are from the Fedora Linux 29 era. We’re about to release Fedora Linux 36, and quite a bit has changed.
I think I’d like for topics which are from very old releases (in our move-fast timeframe, say, from more than three years in the past) to:
move out of site search results except when asked for
definitely move out of being suggested as similar topics!
maybe be hidden from Google search? not sure.
no more replies, but…
more prominent + New Topic link so people who do find it and have something to say can refer to it easily
(Actually, um, that last one seems like it’d be nice for Closed posts!)
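For the search part of that wish list, the behaviour I have in mind is roughly the following. It is a sketch, not how Discourse search works: `SearchHit`, `filter_results`, and the `include_old` flag are hypothetical, and the cutoff is assumed to be based on when the topic was created rather than on its last activity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List

# "More than three years in the past", approximated in days.
MAX_AGE = timedelta(days=3 * 365)


@dataclass
class SearchHit:
    topic_id: int
    created_at: datetime


def filter_results(results: List[SearchHit], now: datetime,
                   include_old: bool = False) -> List[SearchHit]:
    """Drop hits older than the cutoff unless the user asked for them."""
    if include_old:
        return list(results)
    cutoff = now - MAX_AGE
    return [hit for hit in results if hit.created_at >= cutoff]
```

The same filter, always called with `include_old=False`, would cover the "similar topics" suggestions.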
The Discourse staff is pretty ruthless about closing and deleting stuff that’s not useful. I’m sometimes sad not to find something that I look for, but not often. I think aggressive moderation is better than some automated system (which will make terrible mistakes).