Add an option to disable backup compression

Note to self:

I recently hit space issues again and backups are failing, so this post needs revisiting soon.

Note that to create a backup that results in a 12GB gzip file, I need ~36GB+ of space dedicated to backups, of which ~24GB+ must be free:

  • 12GB for the backup from the day before
  • 12GB for the new backup file
  • ~12GB+ for the tar archive that gets compressed into the gzip file (original DB file backup + original image files)

So as backup sizes increase by 1GB, the backup space requirements actually increase by ~3GB.

This assumes that you are only keeping a single previous backup - where the Discourse setting maximum backups is set to 1.

The Discourse default is 5, so using defaults I would need ~84GB dedicated to backups to allow them to work.
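The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the estimate in this post, not anything Discourse computes itself; the function name is made up, and it assumes the temporary tar archive is about the same size as the finished gzip file.

```python
def backup_space_gb(backup_gb, max_backups):
    """Estimate the disk space (in GB) to dedicate to backups.

    Assumption from the post above: the temporary tar archive is roughly
    as large as the finished gzip file.
    """
    previous = max_backups * backup_gb  # older backups that are retained
    new_file = backup_gb                # the gzip file being written
    temp_tar = backup_gb                # tar archive awaiting compression
    return previous + new_file + temp_tar


print(backup_space_gb(12, 1))  # 36
print(backup_space_gb(12, 5))  # 84
```

Each extra 1GB of backup size adds 3GB with a single retained backup, and 7GB with the default of 5.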

2 Likes

What is the lion’s share of the backup? I assume uploaded images and so on? Wouldn’t it be easier to specify the backup is database-only, and thus make it a tiny fraction of the overall size?

(Yes, we’d still need some way to back up the images independently, but at least then the urgent need for 100GB+ of space would not be present.)

Yes, the lion’s share of the backup content is “uploads”.

However a backup is not complete without the “uploads”.

My personal target is to move the images / “uploads” to Amazon S3 to avoid this issue for this specific instance. However, there is still some testing to be done on an instance with a high topic / post count before I can trust the migration to S3. Some issues are already highlighted in that thread (more specifically, avoiding a rebake of all posts).

I have other Discourse instances that would benefit from backups being created in a more streamlined way.

2 Likes

I have the same problem as this thread. I have many GB of images, and while I want to migrate them to S3, from what I have read the migration script still seems a bit buggy. So I still have images locally, but I am running out of disk space given the high ceiling needed to allow a backup. Even being able to delete the old backup before creating the new one would be OK for me. In fact, I have been doing that manually.

Note that the backup system also seems to be failing me on the free disk space calculation: it will fill up the whole disk before giving up, and it does not even delete the partial files. Then the whole computer gets unhappy. There should be a check that skips the backup if there is not enough disk space for it, taking into account the space needed for the compression etc.
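A pre-flight check along these lines could look like the sketch below. It is not how Discourse works today; the function name is hypothetical, and the overhead factor of 2 is an assumption taken from the estimate earlier in this thread (room for the temporary tar archive plus the new gzip file).

```python
import shutil


def has_room_for_backup(backup_dir, expected_backup_bytes, overhead_factor=2.0):
    """Return True if backup_dir has enough free space for a new backup.

    overhead_factor is an assumed multiplier: 2.0 leaves room for the
    temporary tar archive and the compressed file at the same time.
    """
    free_bytes = shutil.disk_usage(backup_dir).free
    return free_bytes >= expected_backup_bytes * overhead_factor


# Example: check the current directory for a 12GB backup.
if not has_room_for_backup(".", 12 * 1024**3):
    print("Not enough free space, skipping backup")
```

The expected size could be taken from the previous backup file, which is usually a reasonable lower bound for the next one.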

Edit: I am going to run a cron job which will delete the (sole) local backup every day. That should solve my immediate problem, but I think it would also be nice to have an option to immediately delete any (local) backup that was already successfully copied to S3.
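For reference, the cleanup such a cron job performs could be sketched like this in Python. The function name and the `.tar.gz` suffix are assumptions for illustration, and the demo runs in a throwaway directory rather than a real backup folder.

```python
import tempfile
from pathlib import Path


def delete_local_backups(backup_dir, suffix=".tar.gz"):
    """Delete every file in backup_dir whose name ends with suffix."""
    removed = []
    for path in sorted(Path(backup_dir).iterdir()):
        if path.is_file() and path.name.endswith(suffix):
            path.unlink()
            removed.append(path.name)
    return removed


# Demo in a temporary directory; a real cron job would pass the actual
# backup directory of the install instead.
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / "site-backup.tar.gz").write_bytes(b"fake backup")
    (Path(demo_dir) / "notes.txt").write_text("keep me")
    removed = delete_local_backups(demo_dir)
    remaining = sorted(p.name for p in Path(demo_dir).iterdir())

print(removed)    # ['site-backup.tar.gz']
print(remaining)  # ['notes.txt']
```

Deleting the only local copy is only safe once the backup has been copied somewhere else, such as S3.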

What are the current gzip options used to compress the backups?

Contrary to the topic, I was interested in saving space by using a more efficient compression method. I ran some quick and dirty tests against an SQL file we had exported, with different levels of gzip and also brotli.

2630702226 level1.sql.gz
2276305530 level1.sql.br
2216602536 level5.sql.gz
2147212204 level9.sql.gz
2036157791 level2.br
1851831279 level4.br

As we can see, Brotli level 4 beats gzip level 5 in efficiency, while the compression time was in roughly the same range. The Brotli level 1 result was not bad either, especially since it is a very fast tool.

Anyway, I find gains of 10% or more interesting.
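To put the gain in numbers, the relative savings can be worked out from the file sizes listed above. This is only arithmetic on those figures; the helper name is made up for illustration.

```python
# File sizes in bytes, copied from the listing above.
sizes = {
    "level1.sql.gz": 2630702226,
    "level1.sql.br": 2276305530,
    "level5.sql.gz": 2216602536,
    "level9.sql.gz": 2147212204,
    "level2.br": 2036157791,
    "level4.br": 1851831279,
}


def saving(candidate, baseline):
    """Fraction of space saved by candidate relative to baseline."""
    return 1 - sizes[candidate] / sizes[baseline]


for name in ("level2.br", "level4.br"):
    print(f"{name} vs level9.sql.gz: {saving(name, 'level9.sql.gz'):.1%}")
```

Against the best gzip result (level 9), Brotli level 4 comes out roughly 14% smaller, and roughly 16% smaller against gzip level 5.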

1 Like

Interesting, what are the actual compression times? :thinking:
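One way to get comparable timings is to measure each level on the same input. The sketch below uses only Python's standard `gzip` module on a small synthetic stand-in for an SQL dump, so the numbers it prints say nothing about the real export; brotli is not in the standard library and is left out here.

```python
import gzip
import time

# Synthetic stand-in for an SQL dump (an assumption, not the real export).
sample = b"".join(
    b"INSERT INTO posts (id, raw) VALUES (%d, 'hello world %d');\n" % (i, i * 7)
    for i in range(20000)
)


def time_gzip(data, level):
    """Compress data at the given gzip level; return (size, seconds)."""
    start = time.perf_counter()
    compressed = gzip.compress(data, compresslevel=level)
    elapsed = time.perf_counter() - start
    return len(compressed), elapsed


results = {level: time_gzip(sample, level) for level in (1, 5, 9)}
for level, (size, seconds) in results.items():
    print(f"gzip -{level}: {size} bytes in {seconds:.3f}s")
```

Running the same loop over the real dump file would give size and time side by side for each level.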

Zopfli would be more interesting here, since it is compatible with regular gzip decompressors. Brotli needs a different decompressor.

2 Likes

zopfli is frighteningly slow, and I wonder whether we would want to use it for anything like a huge backup. At least brotli is somewhat optimized for speed.

3 Likes

Backup gzip compression level for uploads

Is there a way to disable gzip compression? Since my uploads are mostly images that are already compressed, trying to compress them again is a waste of resources and time.
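The point about already-compressed uploads can be illustrated with Python's `tarfile` module. In this sketch random bytes stand in for a compressed image (an assumption, since both are essentially incompressible), and the file name is a placeholder; it compares a plain tar archive with a gzipped one.

```python
import io
import os
import tarfile


def tar_size(payload, mode):
    """Build an in-memory tar archive holding payload; return its size."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode=mode) as archive:
        info = tarfile.TarInfo(name="uploads/image.jpg")  # placeholder name
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return len(buffer.getvalue())


# Random bytes behave like already-compressed image data.
payload = os.urandom(1024 * 1024)
plain_size = tar_size(payload, "w")    # tar only, no compression
gzip_size = tar_size(payload, "w:gz")  # tar + gzip

print(f"plain tar: {plain_size} bytes")
print(f"tar.gz:    {gzip_size} bytes")
```

The gzipped archive ends up about the same size as the payload itself, so the CPU time spent on compression buys almost nothing for this kind of content.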

1 Like

From a related topic:

1 Like