Changing the S3 bucket for uploads

Hey @Jite !

See if this works for you. If it does, I’ll go about creating a proper #howto

Old buckets

This assumes that you can install and configure a tool to copy your data from the old bucket to a local machine, and then from the local machine to the new bucket. See aws s3 sync (the AWS CLI can be configured for non-AWS buckets) and gsutil rsync for details. If you have a huge amount of data or are moving between buckets on the same provider, you might want to investigate methods that copy the data directly between buckets.

Change to a directory suitable as a holding space (e.g., mkdir temp-bucket; cd temp-bucket) before running something like the following. These examples include the -n (gsutil) and --dryrun (aws) switches to show you what will happen. If that looks like what you want, run the command again without that switch.

Copy data from the old bucket to local

    gsutil rsync -r -n gs://=OLD= .

or

    aws s3 sync s3://=OLD= . --dryrun

Copy data from local to the new bucket

    gsutil rsync -r -n . gs://=NEW=

or

    aws s3 sync . s3://=NEW= --dryrun

Updating the database to use the new bucket

You’ll run these commands at the Rails console. To get there, run:

cd /var/discourse
./launcher enter app
rails c

To find the new bucket’s hostname, upload an image with the new configuration in place and then run:

Upload.last.url

You should see something like

=> "//discourse-bucket.s3.dualstack.us-east-2.amazonaws.com/original/2X/7/12345fbea574afc4e02db80107e6682430aede2c.png"

In this example, the hostname for the new bucket is discourse-bucket.s3.dualstack.us-east-2.amazonaws.com. Get the old bucket’s hostname the same way, from the URL of an upload made before the change.
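If you'd rather not pick the hostname out by eye, here is a minimal sketch in plain Ruby. Note that bucket_host is a hypothetical helper written for this post, not part of Discourse; it only parses a URL string like the one shown above.

```ruby
require "uri"

# Hypothetical helper (not part of Discourse): return the hostname from an
# upload URL such as the one printed by Upload.last.url.
def bucket_host(upload_url)
  # Protocol-relative URLs start with "//", so add a scheme before parsing.
  full_url = upload_url.start_with?("//") ? "https:#{upload_url}" : upload_url
  URI.parse(full_url).host
end

sample = "//discourse-bucket.s3.dualstack.us-east-2.amazonaws.com/original/2X/7/12345fbea574afc4e02db80107e6682430aede2c.png"
puts bucket_host(sample)
```

You can paste the method into the Rails console and call bucket_host(Upload.last.url) for the new host, or pass it the URL of an older upload for the old host.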

Use this to check that your uploads are where you think they are:

Upload.order(Arel.sql('RANDOM()')).limit(10).pluck(:id, :url)

Now, you’ll update the database to use the new bucket rather than the old one. DbHelper.remap will replace occurrences in all tables.

DbHelper.remap("//=OLDHOST=/","//=NEWHOST=/")
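To see why the leading // and trailing / are part of both arguments, here is a small plain-Ruby illustration. It does not touch the database or call DbHelper; remap_preview is a hypothetical stand-in that applies the same kind of text substitution to a single string, and the hostnames are made-up examples.

```ruby
# Hypothetical stand-in for illustration only: applies a literal text
# substitution to one string, the way a remap would to stored text.
def remap_preview(text, from, to)
  text.gsub(from, to) # a String pattern is matched literally
end

old_host = "//old-bucket.example.com/" # assumed example value for =OLDHOST=
new_host = "//new-bucket.example.com/" # assumed example value for =NEWHOST=

hit  = '<img src="//old-bucket.example.com/original/1X/abc.png">'
miss = '<img src="//not-old-bucket.example.com/original/1X/abc.png">'

puts remap_preview(hit, old_host, new_host)
puts remap_preview(miss, old_host, new_host)
```

Anchoring on the slashes means a hostname that merely ends with the old hostname, like the second sample, is left alone.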

Moving to AWS might require clearing your s3_endpoint.

NOTE: If you have an s3_endpoint defined in your SiteSettings in the database and switch to AWS (where no endpoint is needed), then you’ll need to clear that site setting after you build the new container with the updated settings (or after you restore a database that has it set).

Rebake posts that refer to the bucket rather than the S3 CDN

If you have posts that link directly to the new S3 bucket (perhaps you didn’t have an s3_cdn_url defined before), then here’s how to rebake only the posts that need it.

Get the posts:

  posts=Post.where("cooked like '%=NEWHOST=%'")

See how many:

  posts.count

Rebake those posts:

  posts.find_each { |p| p.rebake! }

Or, just replace the bucket with the cdn:

posts.each do |p|
  # A string pattern matches the hostname literally (dots are not wildcards).
  p.cooked.gsub!("=NEWHOST=", "=CDN=")
  p.save!
end
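One detail worth knowing when substituting hostnames: in a regex, each dot matches any character, while a string pattern (or Regexp.escape) matches the hostname literally. Here's a small plain-Ruby sketch of the difference. The hostnames are made-up examples and replace_host is a hypothetical helper, not a Discourse method.

```ruby
# Hypothetical helper: swap a bucket hostname for a CDN hostname in a chunk
# of cooked HTML, matching the hostname literally.
def replace_host(cooked, host, cdn)
  cooked.gsub(Regexp.new(Regexp.escape(host)), cdn)
end

host = "bucket.example.com" # assumed example value for =NEWHOST=
cdn  = "cdn.example.com"    # assumed example value for =CDN=

lookalike = "bucketXexample.com/a.png"

# Unescaped regex: the dot matches "X", so the lookalike is rewritten.
puts lookalike.gsub(/bucket.example.com/, cdn)

# Escaped pattern: only the real hostname matches.
puts replace_host(lookalike, host, cdn)
puts replace_host("//bucket.example.com/a.png", host, cdn)
```

For typical bucket hostnames a false match is unlikely, so treat this as a precaution rather than a required change.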
