Migrate from gz compression to zstd for backups

zstd is widely available now, compresses better than gzip, and is faster. It may be worth switching backups over to it.

Considering all the Discourse instances around the world, this could save a lot of disk space and transfer bandwidth.

Choosing Between gzip, Brotli and zStandard Compression | Paul Calvano.

4 Likes

I think this is actually a good idea. Putting a pr-welcome tag on this.

2 Likes

I’d be curious to know the average size ratio between compressible and already-compressed data in a Discourse backup[1], and what percentage of the data would be saved by using zstd.

It’s not the same feature request, but it’s also about backup compression so I’m crossposting this:

I wouldn’t be surprised if the percentage was about the same on all my Discourse forums.


  1. Of course, some forums rely heavily on image uploads, and some won’t even allow file uploads.

Occasionally the backup process causes us some availability issues due to the additional load. So I did a quick experiment with zstd today.

These were my results from compressing the same 73GiB dump.sql file with gzip (level 4, as in the Discourse backup) and zstd (default level 3, of 19):

Compressed size: 15.8% smaller (.zst was 84% of .gz size)
Compression time (-T1): 71% faster (29% of gzip time)
Compression time (-T0): 89% faster (11% of gzip time)
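
For anyone reading these numbers: each "faster" figure is the reduction in wall-clock time, not a throughput multiple. A small sketch of the arithmetic, using only the rounded ratios quoted above (the function names are made up for illustration):

```python
def percent_reduction(ratio_of_baseline: float) -> float:
    """Reduction in percent when the new value is `ratio_of_baseline` of the old one."""
    return (1 - ratio_of_baseline) * 100


def speedup_factor(ratio_of_baseline: float) -> float:
    """How many times faster the job finishes when its time drops to this ratio."""
    return 1 / ratio_of_baseline


# Ratios are the rounded "X% of gzip time" figures from the post.
for label, ratio in [("zstd -T1", 0.29), ("zstd -T0", 0.11)]:
    print(f"{label}: {percent_reduction(ratio):.0f}% less time, "
          f"about {speedup_factor(ratio):.1f}x as fast as gzip")
```

So 29% of the gzip time works out to roughly 3.4x as fast, and 11% to roughly 9x.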

YMMV, didn’t run multiple times, my own machine (6 cores), it was doing other things too, etc, etc — didn’t aim for precision. Still, I think the benefits are clear.

I’m not sure -T0 would be a good choice for everyone, since leaving some room for Discourse itself seems like a good idea. Hence the sample with -T1, for a more apples-to-apples comparison.
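
If anyone wants to repeat the comparison on their own dump, something like the following might do as a starting point. It is only a rough sketch: the `benchmark` helper and the output file names are hypothetical, it assumes `gzip` and `zstd` are on the PATH (missing tools are skipped), and a single run per tool is no more precise than the numbers above.

```python
import shutil
import subprocess
import time
from pathlib import Path

# Settings mirror the ones quoted in this thread; labels are for display only.
COMMANDS = {
    "gzip -4": ["gzip", "-4", "-c"],
    "zstd -3 -T1": ["zstd", "-3", "-T1", "-c"],
    "zstd -3 -T0": ["zstd", "-3", "-T0", "-c"],
}


def benchmark(src: Path, workdir: Path) -> dict:
    """Compress src with each installed tool; return {label: (seconds, bytes)}."""
    results = {}
    for label, cmd in COMMANDS.items():
        if shutil.which(cmd[0]) is None:
            continue  # tool not installed, skip it
        out = workdir / (label.replace(" ", "_") + ".out")
        start = time.perf_counter()
        # Stream the file through the compressor's stdin/stdout.
        with src.open("rb") as fin, out.open("wb") as fout:
            subprocess.run(cmd, stdin=fin, stdout=fout, check=True)
        results[label] = (time.perf_counter() - start, out.stat().st_size)
    return results
```

Calling `benchmark(Path("dump.sql"), Path("/tmp"))` returns one timing and size per tool, which can then be compared against the gzip entry.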

Feels like a win-win and would likely also have a significant impact on Discourse’s hosting infra too. That said, I don’t have the chops for a PR, so just sharing what I found.

2 Likes

Can confirm that the tar inside the Docker container supports --zstd compression.

Edit: oops, no. The tar does support it, but the ‘zstd’ utility is missing. It can be installed with apt-get.
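
Since tar hands the actual compression off to an external zstd program, a backup script would probably want to check for it first. A minimal sketch of that check, with a fallback to gzip; the `tar_command` helper and the archive naming are assumptions for illustration, not how Discourse's backup code works:

```python
import shutil


def tar_command(archive_stem: str, source: str) -> list[str]:
    """Build a tar invocation, preferring zstd when the binary is on PATH."""
    if shutil.which("zstd") is not None:
        # tar --zstd only works when the zstd executable is installed.
        return ["tar", "--zstd", "-cf", archive_stem + ".tar.zst", source]
    # Fall back to gzip, which tar reaches through -z.
    return ["tar", "-czf", archive_stem + ".tar.gz", source]


print(tar_command("backup", "dump.sql"))
```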

2 Likes