Best Practices for Disk Space Management in Community w/ lots of Images

I’m running a very visually oriented community (www.realtimevfx.com) and have watched my disk space creep up over the past year (obviously). I’m about to do some cleanup (per this discussion: “https://meta.discourse.org/t/low-on-disk-space-cleaning-up-old-docker-containers/15792/83”), but I wanted to start a general conversation to get people’s thoughts on the best way to approach this. If I can avoid it, I’d rather not have disk space grow exponentially as the community grows.

Are there best practices / standards that I should implement regarding user uploads? Should I disable uploads entirely and have people only host through imgur, etc?

4 Likes

Serving static resources via AWS is the best you can do, as it offloads a significant amount of less important data from the server.

Won’t the local copies of all uploaded images still reside on your server?

Rebuild your container once the images are offloaded to the S3 bucket; after that, the local copies can be safely removed from your server. The system is supposed to clean up the local content automatically once it has been properly uploaded to AWS.

I see, didn’t know about it.
What about making a backup of a Discourse instance? It won’t contain those images, will it?

It will contain the necessary references to those images. When you restore such a backup, it will link them to your instance’s AWS S3 configuration and look for the resources in the relative directories, AFAIK. Maybe someone from the Discourse team can provide better details about how backups handle files.

@HAWK @sam can you provide some extra details about how backups handle the media files uploaded by users?

When images are stored on S3, they won’t be backed up by Discourse anymore. This would be pointless since most of the time, you’re using S3 because you don’t have enough space locally.

If you want to back up your images, I’m sure there are plenty of tools available to do that by talking to S3 directly :wink:
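
As one possible illustration of "talking to S3 directly", here is a minimal sketch using the AWS CLI's `aws s3 sync` command. It assumes the AWS CLI is installed and configured with credentials that can read the bucket; the function name, bucket name and destination path are placeholders, not anything Discourse provides.

```shell
# Sketch only: mirror an S3 uploads bucket to local or external storage.
# Assumptions: AWS CLI installed and configured; the bucket name and the
# destination directory are placeholders supplied by the caller.

# AWS_CLI can be overridden (for example with "echo") to preview the command.
AWS_CLI="${AWS_CLI:-aws}"

backup_uploads_bucket() {
  local bucket="$1"   # hypothetical bucket name, e.g. my-forum-uploads
  local dest="$2"     # directory that receives the copy
  if [ -z "$bucket" ] || [ -z "$dest" ]; then
    echo "usage: backup_uploads_bucket <bucket> <dest-dir>" >&2
    return 1
  fi
  # "aws s3 sync" copies only objects that are new or changed since the last run.
  "$AWS_CLI" s3 sync "s3://$bucket" "$dest"
}

# Example (not run automatically):
#   backup_uploads_bucket my-forum-uploads /mnt/backup/forum-uploads
```

Running it on a schedule (cron, for instance) would keep an incremental copy of the uploads alongside the regular Discourse backups.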

4 Likes

@zogstrip one question though!
What if that backup is restored when the site is moved to a new host? Will that break all the images, or will it be as simple as configuring the same AWS details again to restore all the media?

1 Like

The backup will contain all your site settings, so the newly restored instance will also have the same S3 configuration.

6 Likes

I have several questions about image storage/processing:

  1. I have two folders:
    /var/discourse/shared/standalone/uploads/default/original (9 GB in my case)
    …/3X
    …/2X
    …/1X
    /var/discourse/shared/standalone/uploads/default/optimized (4 GB in my case)
    …/3X
    …/2X
    …/1X

    Is there a way to keep only the optimized images?

  2. Does the folder
    /var/discourse/shared/standalone/postgres_data/base
    store the same images as the folders above?

  3. Are there any Discourse settings for storing images at lower quality?

  4. Is there any client-side image optimization to reduce their size?
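
For anyone wanting to compare the same two folders on their own install, a rough sketch with `du` follows. It assumes the standalone directory layout quoted in this post; the function name is made up for illustration, and a different base directory can be passed as the first argument.

```shell
# Sketch only: report how much space the original and optimized upload trees use.
# Assumption: the standalone layout quoted in this post; pass another base
# directory as the first argument if yours differs.

upload_usage() {
  local base="${1:-/var/discourse/shared/standalone/uploads/default}"
  local dir
  for dir in original optimized; do
    if [ -d "$base/$dir" ]; then
      # du -sk prints the total size in kilobytes, followed by the path.
      du -sk "$base/$dir"
    else
      echo "missing: $base/$dir" >&2
    fi
  done
}

# Example (not run automatically):
#   upload_usage
```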

1 Like

Original images are shown when their optimised counterparts within topics/replies are clicked.

6 Likes

Any other comments on this, @sam? I know we have a rake task that optimizes images. Is that available to self-hosters, @jomaxro?

1 Like

Not a rake task, but the script is indeed available to self-hosters. https://github.com/discourse/discourse/blob/master/script/downsize_uploads.rb

To use, run:

./launcher enter app
cd /var/www/discourse
RAILS_ENV=production bundle exec ruby script/downsize_uploads.rb

9 Likes

This is also no longer the full story about S3 backups.

3 Likes

I would like to see us make this available directly from the admin panel. Should I queue it for the next release or the release after next?

6 Likes