I’m working on an import to Discourse using a bulk importer. This works very well for topics and posts, but right now the slow part is files. We have about 50,000 users with avatars, and while the user data imports to the DB in just a few seconds, the avatars are taking hours to import. It is only processing about one upload per second.
Is there any way to speed this up? I’m not sure what part of this process is slowest. If there is no avatar file found (photo_filename didn’t exist) then it executes very quickly, but I’m getting a bit lost trying to dig into the UploadCreator class that is ultimately invoked by this importer code.
We have over 600,000 attachments, so I’m very concerned about how long that will take to import using the same create_upload call. At about one upload per second, that would be roughly 167 hours, or about a week.
upload = create_upload(u.id, photo_filename, File.basename(photo_filename))
if upload.persisted?
  u.import_mode = false
  u.create_user_avatar
  u.import_mode = true
  u.user_avatar.update(custom_upload_id: upload.id)
  u.update(uploaded_avatar_id: upload.id)
else
  puts "Error: Upload did not persist for #{u.username} #{photo_real_filename}!"
end
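
To narrow down which step is slow, one option would be to time each stage of the snippet above separately. This is only a minimal sketch: `StageTimer` is a hypothetical helper of my own, not part of Discourse or the importer, and the `sleep` call is a stand-in for the real work so the example runs on its own.

```ruby
# Hypothetical helper (not part of Discourse): accumulates wall-clock
# seconds per named stage so the slowest step stands out.
class StageTimer
  attr_reader :totals

  def initialize
    @totals = Hash.new(0.0)
  end

  # Runs the block, adds its elapsed time to the named stage,
  # and returns the block's own return value.
  def measure(stage)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    @totals[stage] += Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    result
  end

  # One line per stage, slowest first.
  def report
    @totals.sort_by { |_, secs| -secs }
           .map { |stage, secs| format("%-20s %.3fs", stage, secs) }
  end
end

timer = StageTimer.new

# In the importer, the real calls would be wrapped like this:
#   upload = timer.measure(:create_upload) do
#     create_upload(u.id, photo_filename, File.basename(photo_filename))
#   end
#   timer.measure(:avatar_update) { u.user_avatar.update(custom_upload_id: upload.id) }

# Stand-in work so the sketch is runnable without Discourse loaded:
timer.measure(:create_upload) { sleep 0.01; :fake_upload }
timer.measure(:avatar_update) { :ok }

puts timer.report
```

Printing the report every few hundred users should show whether the time goes into create_upload itself or into the avatar updates that follow it.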