Is there any reason why daily backups would see a reduction in size?


(AstonJ) #1

Last few days have been 3.4GB and today’s backup is reporting 3.3GB - is this normal or does it sound like something’s up? I don’t recall any large (or image) threads being deleted…


(Sam Saffron) #2

Yes, this is very normal.

Recently @tgxworld did a massive data refactor that removed a large amount of bloat around the email logs table. This data saving is expected.

One interesting thing, though, is how image-heavy your forum is (if it were not, the savings would have been even more noticeable). Images really do eat up the vast majority of backups in so many cases :)


(AstonJ) #3

Thanks Sam.

I keep meaning to ask about images - is there any way we can set Discourse to resize images to something like 1024px wide?

Also… what about storing images outside of the DB? (Though would we then need to use something like rsync for backups?)

I remember there was some discussion around these topics a few years ago, not sure if they’re possible yet?


(Sam Saffron) #4

Images are stored outside the DB already!

Going forward you can lower max image size and megapixels on your forum to avoid the large images. But fixing the history here… I am not sure if we have a rake task for that quite yet.


(AstonJ) #5

Ah nice!

Will that mean they get resized automatically, or will users be told the images are too big and they need to resize themselves first?

That would be awesome! I’m not sure how we got to using that much, since I don’t think we have ‘that’ many images… I wonder if we could have something in the ACP that lets us see all uploads? (And maybe resize them individually or by batch or something?)


(Sam Saffron) #6

Try it out, we attempt to downsize on upload automatically to match the megapixel value you are allowing.
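
For illustration only, here is a rough sketch of what "downsize to match a megapixel value" can mean in terms of dimensions. This is not Discourse's actual code; the function name `fit_within_megapixels` and its parameters are hypothetical, and it only computes target dimensions rather than touching any image data.

```python
import math

def fit_within_megapixels(width, height, max_megapixels):
    """Return (new_width, new_height) that fit the pixel budget, keeping aspect ratio.

    Hypothetical helper for illustration; not part of Discourse.
    """
    budget = int(max_megapixels * 1_000_000)  # total pixels allowed
    if width * height <= budget:
        return width, height  # already within the budget, leave untouched
    # Scaling both sides by sqrt(budget / pixels) brings the area down to the budget.
    scale = math.sqrt(budget / (width * height))
    new_width = max(1, math.floor(width * scale))
    new_height = max(1, math.floor(height * scale))
    return new_width, new_height

print(fit_within_megapixels(8000, 6000, 12))  # (4000, 3000)
```

An 8000 x 6000 photo is 48 megapixels, so with a 12 megapixel allowance each side is halved.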


(Stephen Chung) #7

The images are resized for display but the original is always stored for retrieval, so it is not going to help your storage.

Your best solution is to store the images in online bulk storage such as S3.


(Sam Saffron) #8

Automatic downsizing of originals AND history is definitely an opt-in feature we want to fully support.


(Jeff Atwood) #9

What do you mean? Set a max image size in your site settings and that is indeed the max image size.


(Stephen Chung) #10

That’s true. That setting will prevent a large image from being uploaded at all.

However, I think @AstonJ may be asking whether there is a feature in the system to:

  1. Accept an image upload from the user (up to some reasonable upper limit of course)
  2. Auto-resize that image to some max size (e.g. 1024)
  3. Not keep that original large image (in order to save storage space)
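
As a minimal sketch of those three steps, assuming a width cap of 1024px and an arbitrary 20 MB acceptance limit (both made-up values), something like the following describes the intended policy. The name `plan_upload` is hypothetical and this is not how Discourse is implemented; it only decides what dimensions would end up stored.

```python
def plan_upload(width, height, size_bytes, max_upload_bytes=20_000_000, max_width=1024):
    """Return the (width, height) of the single copy that would be stored.

    Hypothetical policy sketch; all names and limits are assumptions.
    """
    # Step 1: accept the upload only up to a reasonable upper limit.
    if size_bytes > max_upload_bytes:
        raise ValueError("upload exceeds the accepted size limit")
    # Step 2: scale down to max_width, preserving the aspect ratio.
    if width > max_width:
        height = max(1, round(height * max_width / width))
        width = max_width
    # Step 3: only these dimensions are stored; the larger original is not kept.
    return width, height

print(plan_upload(4096, 3072, 5_000_000))  # (1024, 768)
```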

(Jeff Atwood) #11

That’s already the case… if you specify a max allowed, the images can’t be larger than that. I think it is in megapixels, so it’s about dimensions rather than absolute filesize, but the effect is the same. Rule 3 is met.

There’s only so big a 128 x 128 image file can be, in bytes :wink:
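
To put a rough number on that, here is a back-of-the-envelope sketch assuming uncompressed pixels at 4 bytes each (RGBA). The helper `max_raw_bytes` is hypothetical, and real files carry some header and metadata overhead on top, while compressed formats such as JPEG are usually far smaller, so treat this as an approximate ceiling rather than an exact limit.

```python
def max_raw_bytes(width, height, bytes_per_pixel=4):
    """Approximate ceiling for raw pixel data; ignores headers and metadata."""
    return width * height * bytes_per_pixel

print(max_raw_bytes(128, 128))  # 65536 bytes, i.e. 64 KiB
```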


(Sam Saffron) #12

I can definitely see a real use case for:

“Hell no… we are not an image storage service. The maximum size of every image on the forum is 500K, and we will drop fidelity till we get there on all of history… sorry everyone.”

I get that many forums that tire of giant old backups may want to go down this path.


(Jeff Atwood) #13

Right, but we have exactly that feature; it is just gated on pixel dimensions, not filesize per se.

Remember, we have to deal with pasted-in images too.


(AstonJ) #14

Sam, I’ve been thinking about why our backup is so big when we don’t really have that many images uploaded… I think it might be because of the welcome DiscourseBot.

From what I remember, it asks users to upload photos to help familiarise them with the upload system. I’m wondering, would it be a good idea to have Discourse automatically delete these PMs (and all uploads inside them) 2 or 3 months after registration?

It seems a waste hosting these uploads when they are not going to be seen by anyone and are effectively for testing purposes only… what do you think?


(John Sweeney) #15

I like this idea. :slight_smile:

It might be worth updating one of the welcome bot’s text strings to let people know that, as they’re only testing the feature, their images won’t be stored long-term unless they re-upload them.

It could be spun in an encouraging manner: “feel free to pick any image - this is just a test run!” or similar would convey the information while focusing on the positive - in this case, the user not needing to worry about finding the best image, and instead just using whatever file is convenient. This is one of the things I really like about the Discourse team - their messaging always feels friendly and welcoming.


(Rafael dos Santos Silva) #16

Can you run a data-explorer query to give some real-world data about how much space DiscourseBot uploads really take?


(AstonJ) #17

I don’t have that plugin installed but can add it :smiley: Could you give me the query you would like me to execute please?