Occasional backup to S3 failure: "multipart upload failed: no tomes available"

Hi there, I’ve been running automatic nightly backups of my Discourse installation to S3 for close to a year with no issues. However, during the past month or so the backups have been randomly failing to upload a few times per week. This is the relevant part of the log, which always has the same error whenever this happens:

...
[2023-12-05 06:13:27] pg_dump: creating CONSTRAINT "public.topic_link_clicks topic_link_clicks_pkey"
[2023-12-05 06:13:27] pg_dump: creating CONSTRAINT "public.topic_links topic_links_pkey"
[2023-12-05 06:13:27] pg_dump: creating CONSTRAINT "public.topic_search_data topic_search_data_pkey"
[2023-12-05 06:13:27] pg_dump: creating CONSTRAINT "public.topic_tags topic_tags_pkey"
[2023-12-05 06:13:27] Finalizing backup...
[2023-12-05 06:13:27] pg_dump: creating CONSTRAINT "public.topic_thumbnails topic_thumbnails_pkey"
[2023-12-05 06:13:27] Creating archive: myforum-com-2023-12-05-061001-v20231107055903.tar.gz
[2023-12-05 06:13:27] Making sure archive does not already exist...
[2023-12-05 06:13:27] Creating empty archive...
[2023-12-05 06:13:27] Archiving data dump...
[2023-12-05 06:13:34] Archiving uploads...
[2023-12-05 06:13:53] Removing tmp '/var/www/discourse/tmp/backups/default/2023-12-05-061001' directory...
[2023-12-05 06:13:53] Gzipping archive, this may take a while...
[2023-12-05 06:16:15] Uploading archive...
[2023-12-05 06:16:25] EXCEPTION: multipart upload failed: no tomes available
[2023-12-05 06:16:25] /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/multipart_file_uploader.rb:93:in `abort_upload'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/multipart_file_uploader.rb:82:in `upload_parts'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/multipart_file_uploader.rb:55:in `upload'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/file_uploader.rb:41:in `upload'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/customizations/object.rb:440:in `upload_file'
/var/www/discourse/lib/backup_restore/s3_backup_store.rb:48:in `upload_file'
/var/www/discourse/lib/backup_restore/backuper.rb:344:in `upload_archive'
/var/www/discourse/lib/backup_restore/backuper.rb:41:in `run'
/var/www/discourse/lib/backup_restore.rb:13:in `backup!'
/var/www/discourse/app/jobs/regular/create_backup.rb:10:in `execute'
/var/www/discourse/app/jobs/base.rb:292:in `block (2 levels) in perform'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/rails_multisite-5.0.0/lib/rails_multisite/connection_management.rb:82:in `with_connection'
/var/www/discourse/app/jobs/base.rb:279:in `block in perform'
/var/www/discourse/app/jobs/base.rb:275:in `each'
/var/www/discourse/app/jobs/base.rb:275:in `perform'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:202:in `execute_job'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:170:in `block (2 levels) in process'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:177:in `block in invoke'
/var/www/discourse/lib/sidekiq/pausable.rb:134:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:179:in `block in invoke'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/middleware/chain.rb:182:in `invoke'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:169:in `block in process'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:136:in `block (6 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/job_retry.rb:113:in `local'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:135:in `block (5 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq.rb:44:in `block in <module:Sidekiq>'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:131:in `block (4 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:263:in `stats'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:126:in `block (3 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/job_logger.rb:13:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:125:in `block (2 levels) in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/job_retry.rb:80:in `global'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:124:in `block in dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/job_logger.rb:39:in `prepare'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:123:in `dispatch'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:168:in `process'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:78:in `process_one'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/processor.rb:68:in `run'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/component.rb:8:in `watchdog'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/sidekiq-6.5.12/lib/sidekiq/component.rb:17:in `block in safe_thread'
[2023-12-05 06:16:25] Deleting old backups...
[2023-12-05 06:16:26] Cleaning stuff up...
[2023-12-05 06:16:26] Removing archive from local storage...
[2023-12-05 06:16:26] Removing '.tar' leftovers...
[2023-12-05 06:16:26] Marking backup as finished...
[2023-12-05 06:16:26] Notifying 'system' of the end of the backup...

When I manually re-run the backup in the morning it works fine.

Does this look like a Discourse issue, an issue with my server / network, or an S3 glitch? Thanks a lot for the help.

1 Like