Cleaning up Uploads and Purging Uploads from S3

:bookmark: This is a reference guide describing how orphaned and deleted uploads are automatically purged from a Discourse site. This guide applies to both self-hosted and hosted Discourse sites.

Have you ever wondered what happens to files and images that were uploaded to a Discourse site but are no longer referenced, or how to remove uploads from a site? You’re in the right place!

Discourse doesn’t currently have a built-in way to delete uploads directly from the user interface. However, it does have a scheduled Sidekiq job, called clean up uploads, that automatically removes orphaned and deleted uploads.

Orphaned and Deleted Uploads

:information_source: Orphaned Uploads are files that have been uploaded to a Discourse site, but are no longer referenced.

An upload is considered orphaned if and only if it’s not referenced:

  • In the latest version of a post
  • In a draft
  • In a queued post
  • In a logo site setting
  • In a custom emoji
  • In a theme
  • In a user avatar/background/card image
  • In a category logo/background image
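The rule above can be sketched as a single check. This is only an illustration of the "if and only if" condition, not Discourse's actual implementation: the `Upload` class, the `references` mapping, and the source names in `REFERENCE_SOURCES` are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical names for the places this guide lists as holding references.
REFERENCE_SOURCES = (
    "latest_post_version",
    "draft",
    "queued_post",
    "logo_site_setting",
    "custom_emoji",
    "theme",
    "user_image",      # avatar, background, or card image
    "category_image",  # logo or background image
)


@dataclass
class Upload:
    id: int
    # Illustrative model: source name -> number of references found there.
    references: dict = field(default_factory=dict)


def is_orphaned(upload: Upload) -> bool:
    """Return True only when none of the listed sources references the upload."""
    return all(upload.references.get(source, 0) == 0 for source in REFERENCE_SOURCES)


still_in_draft = Upload(id=1, references={"draft": 1})
unreferenced = Upload(id=2)
print(is_orphaned(still_in_draft), is_orphaned(unreferenced))  # False True
```

A single remaining reference in any one of these places is enough to keep the upload from being treated as orphaned.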

:information_source: Uploads are considered “deleted” when the topic/post they are contained within is deleted.

Cleaning up Uploads

To fully remove an upload from Discourse, you’ll have to do one of the following:

  • Force the upload to become orphaned by removing every reference to it. This can be done by editing the upload link out of the post that it’s in, and out of any other place where the upload is referenced.

  • Delete all topics/posts containing the upload, which causes the upload to be considered “deleted”.

All orphaned and deleted uploads will then be removed from storage (after a grace period) once the clean up uploads job runs.

Site Settings

The following site settings are available at .../admin/site_settings/category/files for modifying how Discourse automatically purges uploads.

clean up uploads: default true
clean orphan uploads grace period hours: default 48
purge deleted uploads grace period days: default 30

The clean up uploads setting can be used to enable or disable the automatic deletion of orphaned uploads.

The clean orphan uploads grace period hours and purge deleted uploads grace period days settings control how long Discourse waits after an upload is detected as orphaned or deleted before it is purged and permanently removed from the site.
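As a rough sketch of how the two grace periods could be applied with their default values: the function name, the `kind` labels, and the `since` timestamp are hypothetical, and exactly which timestamp Discourse measures from is an assumption here rather than something this guide specifies.

```python
from datetime import datetime, timedelta, timezone

ORPHAN_GRACE_HOURS = 48  # clean orphan uploads grace period hours (default)
DELETED_GRACE_DAYS = 30  # purge deleted uploads grace period days (default)


def is_past_grace_period(kind: str, since: datetime, now: datetime) -> bool:
    """Return True once the grace period for this kind of upload has elapsed.

    `since` is assumed to be the moment the grace period starts counting.
    """
    if kind == "orphaned":
        grace = timedelta(hours=ORPHAN_GRACE_HOURS)
    elif kind == "deleted":
        grace = timedelta(days=DELETED_GRACE_DAYS)
    else:
        raise ValueError(f"unknown upload kind: {kind!r}")
    return now - since >= grace


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
print(is_past_grace_period("orphaned", now - timedelta(hours=47), now))  # False
print(is_past_grace_period("deleted", now - timedelta(days=30), now))    # True
```

With the defaults, an orphaned upload becomes eligible for purging after two days, while a deleted upload is kept for a month.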

Additional details about the clean up uploads job are available here:
https://github.com/discourse/discourse/blob/master/app/jobs/scheduled/clean_up_uploads.rb

Purging S3 Uploads

:warning: The following section is only applicable to self-hosted Discourse sites.

:information_source: If you are currently hosted on our Enterprise Plan, please reach out to team@discourse.org if you have any questions about deleting uploads from your S3 storage.

Cleaning up orphaned and deleted uploads works similarly for both local and S3 storage.

The only difference between local storage and S3 storage is that the cleanup of S3 uploads is handled automatically by S3 via a tombstone policy. See Managing your storage lifecycle for additional details about how this is handled on S3.
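For orientation, an S3 lifecycle rule that expires objects after a set number of days generally has the shape below. This is a sketch of the S3 lifecycle configuration format, not the exact rule Discourse creates: the rule ID, the `tombstone/` prefix, and the 30-day default are assumptions made for illustration.

```python
def tombstone_lifecycle_configuration(
    grace_period_days: int = 30,   # assumed to mirror the 30-day default above
    prefix: str = "tombstone/",    # hypothetical key prefix for removed uploads
) -> dict:
    """Build a lifecycle configuration that expires objects under `prefix`."""
    if grace_period_days < 1:
        raise ValueError("grace_period_days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "purge-tombstone",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # S3 deletes matching objects once they are this many days old.
                "Expiration": {"Days": grace_period_days},
            }
        ]
    }


config = tombstone_lifecycle_configuration()
print(config["Rules"][0]["Expiration"])  # {'Days': 30}
```

Because the expiry is enforced by S3 itself, no Discourse job needs to delete these objects once such a rule is in place.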

By default, the clean up uploads job includes S3 uploads. If you would like to disable this feature, you can uncheck the s3 configure tombstone policy site setting.
