So, I have been struggling for days trying to move my locally stored uploads to DigitalOcean Spaces. After much pain I managed to get the `rake uploads:migrate_to_s3` task running, although none of my attempts so far has succeeded.
Some background:
- I am running Discourse 2.2 beta 7 +117
- I have successfully configured Discourse to store uploads and backups on Spaces, and it has been working reliably for days
- I defined the environment variables in app.yml as described here, but I had to leave out DISCOURSE_S3_CDN_URL (while DISCOURSE_S3_REGION, DISCOURSE_S3_ACCESS_KEY_ID, DISCOURSE_S3_SECRET_ACCESS_KEY and DISCOURSE_S3_BUCKET are all set); otherwise, after a rebuild, that change makes Discourse look for all assets (including the JavaScript files) on Spaces, which is not what I want or need

- I ran the rake task several times for hours, but every time, after a while, it ends with the error below and the script crashes.
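For reference, the relevant `env` section of my app.yml looked roughly like this (region, keys and bucket name below are placeholders, not my real values), with DISCOURSE_S3_CDN_URL deliberately left out:

```yaml
env:
  # ... other variables ...
  DISCOURSE_S3_REGION: "<your-spaces-region>"
  DISCOURSE_S3_ACCESS_KEY_ID: "<your-access-key>"
  DISCOURSE_S3_SECRET_ACCESS_KEY: "<your-secret-key>"
  DISCOURSE_S3_BUCKET: "my-discourse-data-bucket"
  # DISCOURSE_S3_CDN_URL deliberately NOT set: with it, after a rebuild,
  # Discourse looked for all assets (including JavaScript) on Spaces
```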
Questions:
- How can I throttle the rate at which this script operates?
- Why do we rely on setting environment variables that can break the UI? Shouldn't the data we already entered in the settings via the web UI be enough to run the migrate_to_s3 task?
Back here just to report how I finally moved forward on this issue, hoping it can be useful to some other users while our developers work on a Discourse solution for this:
- There was no way to complete the migration of the uploads using the `rake uploads:migrate_to_s3` task. I tried at least 10 times but it always failed, sooner or later. It might just be something related to DigitalOcean Spaces and not to the Amazon S3 service, but the script kept failing with the `Aws::S3::Errors::SlowDown: Please reduce your request rate.` exception. This suggests to me (I might be wrong, in which case I apologise) that this exception is not properly handled, and that it could be dealt with via exponential backoff.
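To illustrate the kind of handling I mean, here is a minimal retry-with-exponential-backoff sketch. This is not actual Discourse code; `RateLimited` is a stand-in for `Aws::S3::Errors::SlowDown` so the example is self-contained:

```ruby
# Minimal exponential-backoff sketch (an illustration, not Discourse's code).
# RateLimited stands in for Aws::S3::Errors::SlowDown.
class RateLimited < StandardError; end

def with_backoff(max_retries: 5, base_delay: 1)
  attempts = 0
  begin
    yield
  rescue RateLimited
    attempts += 1
    raise if attempts > max_retries # give up after max_retries retries
    # wait 1s, 2s, 4s, 8s, ... before retrying
    sleep(base_delay * (2**(attempts - 1)))
    retry
  end
end
```

Wrapping each S3 call in something like `with_backoff { s3_upload(file) }` would let the task slow itself down instead of crashing on the first rate-limit response.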
- I manually copied all the uploads to Spaces using the s3cmd utility, with something like `s3cmd sync --skip-existing /var/discourse/shared/standalone/uploads/default/ s3://my-discourse-data-bucket --acl-public` (please note that this command must be understood and adapted to your Discourse set-up. Do not just throw commands around expecting magic to happen…)
- After a few hours the copy completed and I just had to update the database. I then launched the `rake uploads:migrate_to_s3` task once again, expecting it to "see" all the copied files and just proceed with the DB updates. Unfortunately that was not the case: the number of files detected on S3/Spaces was once again just 1000. Which brings me to another thing I could not understand: even though some runs of the script lasted longer and therefore copied more files over to Spaces, after a crash and re-launch the number of files detected on S3/Spaces was fixed at 1000 (see picture above), as if newly copied files were simply ignored. Why?
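One plausible explanation (an assumption on my part, not something I verified in the Discourse source) is that S3's ListObjects API returns at most 1000 keys per call, and a caller that never follows the continuation token will always "see" exactly 1000 files no matter how many exist. A toy model in plain Ruby (the real API is `Aws::S3::Client#list_objects_v2`):

```ruby
# Toy model of S3 list pagination: each call returns at most 1000 keys plus
# a continuation token; reading only the first page always yields 1000 keys.
PAGE_SIZE = 1000

def list_page(all_keys, token)
  start = token || 0
  page = all_keys[start, PAGE_SIZE] || []
  next_token = start + page.size < all_keys.size ? start + page.size : nil
  [page, next_token]
end

def list_all(all_keys)
  keys = []
  token = nil
  loop do
    page, token = list_page(all_keys, token)
    keys.concat(page)
    break if token.nil? # no continuation token => last page
  end
  keys
end

uploads = (1..37_000).map { |i| "original/#{i}.png" }
first_page, _next = list_page(uploads, nil)
# first_page.size is 1000; list_all(uploads).size is 37_000
```

If the task only looks at the first page, that would match exactly the fixed count of 1000 I was seeing.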
- Then I decided to proceed manually with the DB update as well: I carefully read the migrate_to_s3 task source code and cherry-picked from its final part, executing instructions directly from the rails console. Specifically, I ran this (once again, if you are reading this and thinking of repeating what I did, please consider that it can lead to disaster if you don't know what you are doing and don't have a very recent backup of your DB, just in case):
```ruby
db = RailsMultisite::ConnectionManagement.current_db
bucket, folder = GlobalSetting.s3_bucket, "" # I have no folder on my bucket
excluded_tables = %w{
  email_logs
  incoming_emails
  notifications
  post_search_data
  search_logs
  stylesheet_cache
  user_auth_token_logs
  user_auth_tokens
  web_hooks_events
}

from = "/uploads/#{db}/original/(\\dX/(?:[a-f0-9]/)*[a-f0-9]{40}[a-z0-9\\.]*)"
# NOTE: `prefix` is not defined in this snippet -- it comes from earlier in
# the rake task; set it to match your bucket layout before running this line
to = "#{SiteSetting.Upload.s3_base_url}/#{folder}#{prefix}\\1"
DbHelper.regexp_replace(from, to, excluded_tables: excluded_tables)

from = "#{Discourse.base_url}#{SiteSetting.Upload.s3_base_url}"
to = SiteSetting.Upload.s3_cdn_url
DbHelper.remap(from, to, excluded_tables: excluded_tables)

if Discourse.asset_host.present?
  # Uploads that were on local CDN will now be on S3 CDN
  from = "#{Discourse.asset_host}#{SiteSetting.Upload.s3_base_url}"
  to = SiteSetting.Upload.s3_cdn_url
  DbHelper.remap(from, to, excluded_tables: excluded_tables)
end
```
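For anyone unsure what those `DbHelper.remap` calls do conceptually: a remap is essentially a global search-and-replace of one URL prefix with another across the database's text columns. A toy illustration in plain Ruby (this is NOT the real `DbHelper` implementation, which runs the replacement in SQL, and the URLs are hypothetical placeholders):

```ruby
# Toy illustration of a "remap": rewrite every occurrence of one URL prefix
# with another inside a collection of text values.
def remap(rows, from, to)
  rows.map { |text| text.gsub(from, to) }
end

# Hypothetical post content with a placeholder local URL:
posts = ["<img src='http://forum.example.com/uploads/default/original/1X/abc.png'>"]
remap(posts,
      "http://forum.example.com/uploads",
      "https://my-discourse-data-bucket.example.digitaloceanspaces.com/uploads")
```

The real task does this once for the base URL and once for the asset host, excluding log-like tables where rewriting URLs would be pointless churn.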
- The database update worked like a charm, and I then started rebaking my 297000 posts with `rake posts:rebake`, which is going to take another few hours.
That's all, folks. The rebake is still ongoing as I type this, but all the spot checks I have made show good results. The migration of my locally stored uploads to DigitalOcean Spaces is complete, but it was quite an ordeal.
Discourse remains a superb product, and my criticism of the migrate_to_s3 script is not meant as bashing; rather, it is meant as encouragement to solve some annoying issues and make the script more robust.
Final note: even though the manual copy of my uploads to Spaces completed successfully, I have not removed the locally hosted copies, as a precaution. I'll keep them around for some time just in case, although in the long term I'll delete them.
Weird, I never hit that rate limit error… Do you have lots of small files?
Also, the s3cmd sync will “break” any attachments (they will lose their file name).
In theory, it should be. But it's not convenient to set the arguments in one place (the settings UI) and use them in another (the rake task).
Yes. About 37000 "originals" and, of course, about three times as many in the "optimized" directory. In total about 120000.
This I do not understand: what do you mean? The uploads I copied manually to Spaces were identical in name and path placement to those on my local disk. I made several tests before the final rebake, which has just finished and was fully successful.
I see your point, but please consider that setting environment variables can have unwanted consequences: at least in my experience, it required rebuilding the container and broke the UI, as I explained here.