Clean up old topics

Hello,

Our Discourse site for high school users has been very successful. Too successful. We’ve had 2.3 million posts and 2.9 million likes in the 2 years since we began in January 2021.

We would like to clean up the place a bit, both to save on costs and to reduce long-term risks from hacking or similar. Starting fresh would be painful, though it helps that we discovered adding /print to the end of a topic URL produces 1000-comment pages that can be saved as a PDF. We can’t find any clean way to remove, say, the oldest year’s worth of topics, and I really doubt one exists.

Any thoughts? Do we need to start fresh, or are there any other options?

Gabriel Sieben

2 Likes

Hi! I’m happy to know that Discourse worked very well for your project :smiley: :+1:

Just to be sure I understand: would a topic’s age be your only criterion for making it eligible for deletion?

There’s no built-in feature in the interface that would allow that, but you could write and manually run a Rails script that deletes topics older than a certain date.

Topic.where("created_at.....").destroy_all

The string inside the parentheses is a SQL condition (a WHERE clause fragment).
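
To make the one-liner above a bit more concrete, here is a minimal sketch of how the date condition could be built and checked before anything is deleted. The helper name `older_than_condition` and the 365-day window are assumptions for illustration, not part of Discourse. The `Topic` part only runs inside a Discourse Rails console; outside of one, the script just defines the helper. Note that, as far as I know, a bare `Topic.where` would also match private messages, so the count is worth inspecting first.

```ruby
# Hypothetical helper (not part of Discourse): builds the arguments for
# Topic.where so the cutoff date is bound as a parameter instead of being
# interpolated into the SQL string.
def older_than_condition(days, now: Time.now)
  cutoff = now - days * 24 * 60 * 60 # Time minus seconds
  ["created_at < ?", cutoff]
end

# Only runs where the Topic model is loaded, e.g. in a Rails console.
if defined?(Topic)
  scope = Topic.where(*older_than_condition(365)) # assumption: keep the last 365 days
  puts "#{scope.count} topics match the cutoff"

  # Dry run first. Once the count looks right on a test site, the deletion
  # suggested above would be:
  # scope.destroy_all
end
```

Printing the count before deleting is a deliberate choice here: it gives a cheap sanity check that the condition selects roughly what you expect.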

1 Like

Correct, as long as it wouldn’t leave broken links and missing context everywhere. I would hope that any quotes wouldn’t be affected.

Would it clean up the uploads?

1 Like

The quotes’ content would remain.
As for internal links to a deleted topic, they would lead to a “not found” page.

Yes, the uploads will be deleted after 2 days by default (unless they are used somewhere else, like inside a quote in another topic), see the clean orphan uploads grace period hours setting.

But note that topics and posts are soft-deleted. They are hidden, but still stored in the database.


I’ll add that I’m fairly sure, but not 100% sure, of what I’m saying… Better to wait for a more savvy user to reply here. :slight_smile:

1 Like

One thing you might do going forward is turn on chat, since I’m guessing chat-style conversation is most of what is happening, especially if you want to delete it. That way stuff gets pruned automatically (I think the default retention is 90 days, which is likely enough).

And NONE of the stuff that’s old is important? And if some is, how are you going to keep what’s good? By category?

It looks like Topic.destroy calls the PostDestroyer, so I think that destroying those topics should do what you want.

I’d recommend that you set up a test site to do some testing before you go doing this on your production server.

I might be tempted to create a set of Discourse sites, though, perhaps one per graduating class, so you could just shut them down when it was time. You might also have a separate one for school-wide stuff and have them share an authentication server (either the school-wide Discourse or, hopefully, whatever auth server your school/district uses). Oh, or maybe this is for high school students, but not a high school. So this reorg might not make sense.

3 Likes