How should category archiving work?

We have a request from a customer that I’ve been expecting for a while:

We want to remove the beta and alpha categories, and all topics in those categories

Meaning, there are a whole lot of posts in those alpha and beta categories that represent the state of their software at a much older point in time, such that all the topics there are basically irrelevant, forever. (This is also video game software, where after 3 years a game is forgotten forever and never played again, so old beta/alphas of games are especially irrelevant in the world of software.)

The intent of the “archive” action on a topic is to prepare it for eventual archiving. Now that may mean deleting it. Or it could mean moving it to some kind of historical long-term offline archive. At any rate, archived content is something that

  • is not of practical current interest
  • might be useful very rarely to someone digging through long-term history for obscure reasons
  • is helpful to remove from the current active instance to make room for newer, more relevant content

I believe archiving should be triggered either by:

  • archiving out all topics with a state of archived
  • archiving out all topics in a particular category

I’m not sure we can physically remove the posts and topics from the database without extreme trauma to our codebase, so perhaps the only alternatives are to

  1. delete (all our deletes are soft deletes) every topic in the category
  2. mark the category as archived, and have special handling for topics in archived categories

Then, produce an export file of the archived topics.

This also implies there is a way to selectively bring back a set of archived posts, or an archived category, which is probably way too hard.

I guess the simple thing to do, for now, is

  • just mark every topic in the category (or all archived topics) as deleted.
  • make sure we have an ‘archived’ category state (perhaps a date of archive) to look at later.
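To make the “simple thing” concrete, here is a minimal sketch in plain Python. It is not Discourse code: the `Topic` and `Category` classes, the `deleted_at` and `archived_at` fields, and both function names are assumptions made up for illustration. It soft-deletes every topic in a category, stamps the category with a date of archive, and uses that same stamp to undo the operation later.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Topic:
    id: int
    category_id: int
    deleted_at: Optional[datetime] = None  # soft-delete marker (hypothetical field)


@dataclass
class Category:
    id: int
    name: str
    archived_at: Optional[datetime] = None  # hypothetical "date of archive"


def archive_category(category: Category, topics: List[Topic],
                     now: Optional[datetime] = None) -> List[int]:
    """Soft-delete every live topic in the category and stamp the category."""
    now = now or datetime.now(timezone.utc)
    category.archived_at = now
    affected = []
    for topic in topics:
        if topic.category_id == category.id and topic.deleted_at is None:
            topic.deleted_at = now
            affected.append(topic.id)
    return affected


def restore_category(category: Category, topics: List[Topic]) -> List[int]:
    """Undo the archive: revive only topics deleted by the archive itself."""
    stamp = category.archived_at
    restored = []
    if stamp is None:
        return restored
    for topic in topics:
        if topic.category_id == category.id and topic.deleted_at == stamp:
            topic.deleted_at = None
            restored.append(topic.id)
    category.archived_at = None
    return restored
```

Reusing the archive timestamp as the delete timestamp is one possible design choice: topics that were deleted for other reasons before the archive keep their own `deleted_at` and stay deleted on restore.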

Any thoughts here?


There is another option that can work quite well: make the category a staff-only category.

This effectively “removes” all the posts in one very quick go. It’s clean, has a simple undo, and has very minimal side effects that are “staff only”.

That is the way they have it now (I actually checked for that earlier), and it is not what they want.

It is just unnecessary clutter for them at this point, a bunch of obsolete content.

Maybe you could provide an admin setting for “archived categories” or some such, where you add the names of the categories that you want to disappear forever.

A nice aspect of that would be that it really removes all the posts in one go, it is reversible, and even the admins won’t be able to see the posts.

Should there not be some way for normal users who are not forum admins to access the archives somehow? Or would that be part of the “special handling”?

Perhaps if we made sure that the archive files were sanitized of secret information, then they could be put up for download, even if we don’t want them to be searchable.

If the posts get actually removed from the database, we could eventually get a screen like this:

This topic has been archived.

If you want to view it, click below and give us a minute to dig it up.

[ View Archived Topic ] Download archive of Alpha (2013-2014) (0.5GB)

If the removal of the content is the point, then a soft delete like you mention (assuming hard deletes are a problem) should be applied on all of the content, and all of these topics & posts should be excluded from the default backup. Instead, when you do the archiving, they should be put in a separate “Archives” backup of their own.
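As a rough illustration of the separate “Archives” backup idea, the sketch below partitions topics into two JSON payloads. The dictionary shape, the `archived` key, and the `split_backup` name are all hypothetical; a real backup would work on database dumps rather than in-memory lists.

```python
import json
from typing import Dict, List, Tuple


def split_backup(topics: List[Dict]) -> Tuple[str, str]:
    """Return (default_backup_json, archives_backup_json).

    Topics flagged as archived are left out of the default backup
    and written to their own payload instead.
    """
    default = [t for t in topics if not t.get("archived", False)]
    archives = [t for t in topics if t.get("archived", False)]
    return (json.dumps(default, sort_keys=True),
            json.dumps(archives, sort_keys=True))
```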

If you just don’t want your searches getting polluted and slowed down with deprecated posts, and you don’t want archived categories cluttering up your category list, then a “do not search” flag on archived posts and a special category that doesn’t show even in staff lists (but is retrievable through the admin panel somehow) would suffice. Essentially, that makes all topics Unlisted, plus special treatment of the category visibility.
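The lighter-weight variant could look something like this sketch. Everything here is an assumption for illustration (the `Post` class, the `do_not_search` flag, the `hidden` set of category names), and the substring match stands in for whatever the real search backend does.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class Post:
    id: int
    category: str
    text: str
    do_not_search: bool = False  # hypothetical flag set on archived posts


def search(posts: Iterable[Post], term: str,
           include_archived: bool = False) -> List[int]:
    """Return ids of matching posts, skipping flagged ones by default."""
    term = term.lower()
    return [p.id for p in posts
            if term in p.text.lower()
            and (include_archived or not p.do_not_search)]


def visible_categories(categories: Iterable[str], hidden: Set[str],
                       admin_panel: bool = False) -> List[str]:
    """Hidden categories are left out of every list except the admin panel."""
    return [c for c in categories if admin_panel or c not in hidden]
```

Nothing is deleted in this version; the content is only filtered out of the default views, so undoing it is a matter of clearing the flag.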

We’ve talked about similar things before:


There may also be some mapping here for moving a category to a different Discourse instance, which we have at least one paying customer wanting to do

What we need:

  • Copy one category
  • Copy all of its sub-categories (10)
  • Copy all posts in these categories (<50)
  • Copy all members that belong to the custom group; keep current usernames, email and passwords of these users on the new forum

Same basic area of work, IMO. Splitting one Discourse into another is a logical thing to do, only difference I can see in this case is that instead of transferring the entire category to /dev/null we are copying it to another Discourse instance…
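A hedged sketch of what “copy one category” has to gather, based only on the list above: the category, its sub-categories, the posts in them, and the members of one group. The dictionary shapes, the `password_hash` field, and the `build_category_export` name are assumptions for illustration, not an existing export format.

```python
from typing import Dict, List


def build_category_export(root_id: int, categories: List[Dict],
                          posts: List[Dict], users: List[Dict],
                          group: str) -> Dict[str, List[Dict]]:
    """Collect a category subtree, its posts, and one group's members.

    categories: dicts with "id" and "parent_id"
    posts:      dicts with "id" and "category_id"
    users:      dicts with "username", "email", "password_hash", "groups"
    """
    # Walk down from the root until no new sub-categories are found.
    ids = {root_id}
    changed = True
    while changed:
        changed = False
        for cat in categories:
            if cat["parent_id"] in ids and cat["id"] not in ids:
                ids.add(cat["id"])
                changed = True

    return {
        "categories": [c for c in categories if c["id"] in ids],
        "posts": [p for p in posts if p["category_id"] in ids],
        # User records are copied as-is so usernames, emails, and stored
        # password hashes carry over unchanged to the new instance.
        "users": [u for u in users if group in u["groups"]],
    }
```

The same payload would serve the archive case: instead of importing it into another instance, it gets written to an export file and the originals are removed.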

Anyway keep that in mind as we work on this @neil!


Moving category spec is here:


I know this is rather old, but as the Ask Fedora Discourse site is getting to be several years old itself, I’m thinking about this issue. That forum is for end-user troubleshooting, and while some topics are evergreen (and spelunking in the past is often somewhat useful to figure out if something is a new problem, or maybe even some history of why it is like it is), the earliest topics are from the Fedora Linux 29 era. We’re about to release Fedora Linux 36, and quite a bit has changed.

I think I’d like for topics which are from very old releases (in our move-fast timeframe, say, from more than three years in the past) to:

  • move out of site search results except when asked for
    • definitely move out of being suggested as similar topics!
  • maybe be hidden from Google search? not sure.
  • no more replies, but…
  • a more prominent “+ New Topic” link, so people who do find it and have something to say can refer to it easily

(Actually, um, that last one seems like it’d be nice for Closed posts!)
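The wish list above boils down to an age-based policy, which could be sketched roughly like this. The three-year cutoff comes from the post; the `topic_policy` name, the keys of the returned dictionary, and the 365-day approximation of a year are assumptions. The “hidden from Google” item is left out since it was marked as unsure.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

# "More than three years in the past", approximated as 3 * 365 days.
STALE_AFTER = timedelta(days=3 * 365)


def topic_policy(created_at: datetime,
                 now: Optional[datetime] = None) -> Dict[str, bool]:
    """Decide how a topic is treated based only on its age."""
    now = now or datetime.now(timezone.utc)
    stale = (now - created_at) > STALE_AFTER
    return {
        "in_default_search": not stale,     # still findable when asked for
        "suggest_as_similar": not stale,
        "accepts_replies": not stale,
        "prominent_new_topic_link": stale,  # nudge people to start fresh
    }
```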


The Discourse staff is pretty ruthless about closing and deleting stuff that’s not useful. I’m sometimes sad not to find something that I look for, but not often. I think aggressive moderation works better than some automated system (which will make terrible mistakes).
