Intended path to migrate S3 to local

Hello,

After having multiple outages and issues with our S3 provider (vultr) I thought it best to approach site uploads differently. Their VPS has been reliable, but the object storage fails much more often. I am also considering migration to new hardware of my own to save $$$, but the first stumbling block appears to be a lack of way to migrate from S3 to local. I see the rake task that was intended to serve this purpose is now gone, and the recommended backup/restore operation seems to have issues on my end.

I have a server currently operating at forum.tld and the new target hardware is running on forum2.tld. I enabled the hidden site settings to backup uploads and ran the backup task.

When I run the backup task with S3 enabled I do indeed seem to capture all the relevant uploads, but restoring this backup fails (and appears to use S3 anyway).

When I disable S3 before running the backup and restore, it succeeds restoring, but obviously the S3 uploads are broken. I can merge uploads from the other backup but simply doing a rebake doesnt seem to properly update the posts.

I’m fortunate enough to have the second server to try all the dumb stuff on so if anyone has a dumb idea I’m more than willing to entertain it.

I just dont see a good path for getting away from S3 once you have implemented it. I have older posts from when we migrated to AWS and back to vultr that broke and gave up on trying to make sense of that. It seems like all uploads are sha1 hashed and renamed, then added to some mapping separate from the DB. This process appears to just be a 1 way deal. Perhaps a dev can chime in with a bread crumb trail to generating this mapping?

Is there a way to do this without a backup/restore operation?

What is the expected/intended path for someone to migrate from s3 currently?

2 Likes

We are facing the same issue, we moved out S3 because our hosting offers data storage that’s really good.

1 Like

To add more context to this, I tried restoring a backup containing s3 uploads again.

These appear to be the relevant errors. It doesnt mean anything to me.

.............................................................................................................................................#<Thread:0x00007f5426c22fc8 /var/www/discourse/lib/file_store                                                                                /to_s3_migration.rb:218 run> terminated with exception (report_on_exception is true):
/usr/local/lib/ruby/3.2.0/net/http/response.rb:160:in `read_status_line': wrong status line: "0" (Net::HTTPBadResponse)
        from /usr/local/lib/ruby/3.2.0/net/http/response.rb:147:in `read_new'
        from /usr/local/lib/ruby/3.2.0/net/http.rb:1862:in `block in transport_request'
        from /usr/local/lib/ruby/3.2.0/net/http.rb:1853:in `catch'
        from /usr/local/lib/ruby/3.2.0/net/http.rb:1853:in `transport_request'
        from /usr/local/lib/ruby/3.2.0/net/http.rb:1826:in `request'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/rack-mini-profiler-3.1.0/lib/patches/net_patches.rb:19:in `block in request_with_mini_profiler'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/rack-mini-profiler-3.1.0/lib/mini_profiler/profiling_methods.rb:46:in `step'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/rack-mini-profiler-3.1.0/lib/patches/net_patches.rb:18:in `request_with_mini_profiler'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/net_http/connection_pool.rb:348:in `request'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/net_http/handler.rb:79:in `block in transmit'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/net_http/handler.rb:133:in `block in session'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/net_http/connection_pool.rb:104:in `session_for'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/net_http/handler.rb:128:in `session'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/net_http/handler.rb:76:in `transmit'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/net_http/handler.rb:50:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/plugins/content_length.rb:24:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/plugins/request_callback.rb:85:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/s3_signer.rb:132:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/s3_signer.rb:63:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/s3_host_id.rb:17:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/xml/error_handler.rb:10:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/transfer_encoding.rb:26:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/helpful_socket_errors.rb:12:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/s3_signer.rb:110:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/redirects.rb:20:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/retry_errors.rb:360:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/md5s.rb:31:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/http_checksum.rb:19:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/endpoint_pattern.rb:30:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/accelerate.rb:67:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/checksum_algorithm.rb:136:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/bucket_dns.rb:35:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/dualstack.rb:41:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/expect_100_continue.rb:22:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/bucket_name_restrictions.rb:26:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/arn.rb:62:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/rest/handler.rb:10:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/recursion_detection.rb:18:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/user_agent.rb:13:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/plugins/endpoint.rb:47:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/param_validator.rb:26:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/arn.rb:88:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/plugins/raise_response_errors.rb:16:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/sse_cpk.rb:24:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/dualstack.rb:27:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/accelerate.rb:56:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/checksum_algorithm.rb:111:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/jsonvalue_converter.rb:22:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/idempotency_token.rb:19:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/param_converter.rb:26:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/plugins/request_callback.rb:71:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/response_paging.rb:12:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/plugins/response_target.rb:24:in `call'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/request.rb:72:in `send_request'
        from /var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/client.rb:12369:in `put_object'
        from /var/www/discourse/lib/file_store/to_s3_migration.rb:220:in `block (2 levels) in migrate_to_s3'
................................................EXCEPTION: wrong status line: "0"
/usr/local/lib/ruby/3.2.0/net/http/response.rb:160:in `read_status_line'
/usr/local/lib/ruby/3.2.0/net/http/response.rb:147:in `read_new'
/usr/local/lib/ruby/3.2.0/net/http.rb:1862:in `block in transport_request'
/usr/local/lib/ruby/3.2.0/net/http.rb:1853:in `catch'
/usr/local/lib/ruby/3.2.0/net/http.rb:1853:in `transport_request'
/usr/local/lib/ruby/3.2.0/net/http.rb:1826:in `request'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/rack-mini-profiler-3.1.0/lib/patches/net_patches.rb:19:in `block in request_with_mini_profiler'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/rack-mini-profiler-3.1.0/lib/mini_profiler/profiling_methods.rb:46:in `step'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/rack-mini-profiler-3.1.0/lib/patches/net_patches.rb:18:in `request_with_mini_profiler'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/net_http/connection_pool.rb:348:in `request'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/net_http/handler.rb:79:in `block in transmit'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/net_http/handler.rb:133:in `block in session'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/net_http/connection_pool.rb:104:in `session_for'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/net_http/handler.rb:128:in `session'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/net_http/handler.rb:76:in `transmit'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/net_http/handler.rb:50:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/plugins/content_length.rb:24:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/plugins/request_callback.rb:85:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/s3_signer.rb:132:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/s3_signer.rb:63:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/s3_host_id.rb:17:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/xml/error_handler.rb:10:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/transfer_encoding.rb:26:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/helpful_socket_errors.rb:12:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/s3_signer.rb:110:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/redirects.rb:20:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/retry_errors.rb:360:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/md5s.rb:31:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/http_checksum.rb:19:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/endpoint_pattern.rb:30:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/accelerate.rb:67:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/checksum_algorithm.rb:136:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/bucket_dns.rb:35:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/dualstack.rb:41:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/expect_100_continue.rb:22:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/bucket_name_restrictions.rb:26:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/arn.rb:62:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/rest/handler.rb:10:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/recursion_detection.rb:18:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/user_agent.rb:13:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/plugins/endpoint.rb:47:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/param_validator.rb:26:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/arn.rb:88:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/plugins/raise_response_errors.rb:16:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/sse_cpk.rb:24:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/dualstack.rb:27:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/plugins/accelerate.rb:56:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/checksum_algorithm.rb:111:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/jsonvalue_converter.rb:22:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/idempotency_token.rb:19:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/param_converter.rb:26:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/plugins/request_callback.rb:71:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/aws-sdk-core/plugins/response_paging.rb:12:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/plugins/response_target.rb:24:in `call'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-core-3.130.2/lib/seahorse/client/request.rb:72:in `send_request'
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/aws-sdk-s3-1.114.0/lib/aws-sdk-s3/client.rb:12369:in `put_object'
/var/www/discourse/lib/file_store/to_s3_migration.rb:220:in `block (2 levels) in migrate_to_s3'
Trying to rollback...
Rolling back...
Cleaning stuff up...
Dropping functions from the discourse_functions schema...
Removing tmp '/var/www/discourse/tmp/restores/default/2023-05-10-221431' directory...
Unpausing sidekiq...
Marking restore as finished...
Notifying 'system' of the end of the restore...
Finished!
[FAILED]
Restore done.

the whole output is here.

restore failed.txt (201.7 KB)

The error seems to be related to moving the uploads out of the backup and into S3. These files already exist anyway so that may be part of the issue? or perhaps its just my S3 provider continuing to have issues, which is why I am trying to move away from them.

At any rate, digging around yielded this:

It seems like maybe this still applies.

Extracting the db dump from the backup and rezipping it, only to then have the software re-unzip it :firstworldproblem:, is somewhat time consuming. Perhaps it could be useful to specify this either using a discourse command in the container or via app.yml. Maybe a section for ‘db settings override’ that the restore process looks at first? Mainly just spit-balling because I only pretend to look like I know what I’m doing here.

after modifying the requisite dump.sql the restore completes successfully but avatars seem broken and some thumbnails dont seem to work but when you click the image it loads properly.

I think maybe a rebake might cure the latter issue. I’m not super upset if all I lose is avatars.

So to recap for anyone else that runs into this issue, heres what I was able to get working to both migrate from S3 and move to different hardware.

  1. Put your server in read only, and enable the hidden site setting to backup S3 (and local) uploads, detailed here

  2. Do a backup with S3 uploads enabled in your site settings. You will need enough local storage to download them all or else the backup task will fail.

  3. Pull the latest version of discourse from github, and copy over your app.yml.

  4. Rebuild with your app.yml and verify you get the discourse setup page.

  5. Extract the dump.sql from the backup you made, and modify it similar to what is said here.

  6. Recompress the dump.sql db into the backup and put the backup in /var/discourse/shared/standalone/backups/default with the same name it had when you made the backup. (this name is important, so do not truncate)

  7. Run the restore process as shown here


if you are simply trying to migrate from S3 without changing hardware I believe the process is mostly the same but you would skip steps 3&4.

1 Like

I would like to point out that if you are running a manual backup task, and a scheduled one tries to run, both will fail. Seems like it should at least continue the manual backup, though my iisue is probably considered and edge case.