Migrate to S3 failed - but only the check fails

Over the years we have seen many, many issues with migration to S3, including implicit migrations when restoring a backup. The failure looks like this:

EXCEPTION: 8 posts are not remapped to new S3 upload URL. S3 migration failed for db 'default'.

Some of the many examples:

Today I encountered another one of these, and since it was Friday I decided to dive into it instead of just commenting out the check.

So we have this situation:

  • we’re on multisite
  • let’s assume dbname as the database name for this example
  • we have S3_CDN_URL set
  • we do not have CDN_URL set

This is what happens in /lib/file_store/to_s3_migration.rb

First the prefix is decided

prefix = @migrate_to_multisite ? "uploads/#{@current_db}/original/" : "original/"

Then the files are uploaded to S3 and the remap is done, which is basically this plus some variations:

        from = "/uploads/#{@current_db}/original/"
        to = "#{SiteSetting.Upload.s3_base_url}/#{prefix}"

So in multisite this will remap

  • from /uploads/dbname/original/
  • to https://bucket-location-url.com/uploads/dbname/original/
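
To make the remap concrete, here is a minimal plain-Ruby sketch of what it does to a single post. It is only a stand-in: the real migration rewrites database columns, while this just runs a string substitution. The `cooked` HTML and the file name are made up for illustration, and `s3_base_url` is a placeholder for whatever `SiteSetting.Upload.s3_base_url` returns.

```ruby
# Plain-Ruby stand-in for the remap step (the real code works on the database).
current_db = "dbname"                             # example database name from this post
s3_base_url = "https://bucket-location-url.com"   # placeholder for SiteSetting.Upload.s3_base_url
migrate_to_multisite = true

prefix = migrate_to_multisite ? "uploads/#{current_db}/original/" : "original/"
from = "/uploads/#{current_db}/original/"
to = "#{s3_base_url}/#{prefix}"

# Hypothetical cooked HTML of one post before the migration.
cooked = %(<img src="/uploads/dbname/original/1X/abc123.png">)

# Replace every occurrence of the old local path with the S3 URL.
remapped = cooked.gsub(from, to)
puts remapped
```

Note that the remapped URL still contains the old `/uploads/dbname/original/` path as its tail end.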

and then finally the check is done

      cdn_path = SiteSetting.cdn_path("/uploads/#{@current_db}/original").sub(/https?:/, "")
      count = Post.where("cooked LIKE '%#{cdn_path}%'").count
      if count > 0
        error_message = "#{count} posts are not remapped to new S3 upload URL. #{failure_message}"
        raise_or_log(error_message, should_raise)
        success = false
      end

Now SiteSetting.cdn_path comes from lib/global_path.rb and looks like this

  def cdn_path(p)
    GlobalSetting.cdn_url.blank? ? p : "#{GlobalSetting.cdn_url}#{path(p)}"
  end
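
To see what that returns in each case, here is a simplified sketch that runs outside Rails. `cdn_path_sketch` is a hypothetical helper, not the real method: it takes the CDN URL as an argument instead of reading `GlobalSetting.cdn_url`, swaps `blank?` for a nil/empty test, and skips whatever `path(p)` does to the path. The `cdn.example.com` host is invented.

```ruby
# Simplified stand-in for cdn_path: no CDN configured means the path comes back unchanged.
def cdn_path_sketch(cdn_url, p)
  (cdn_url.nil? || cdn_url.empty?) ? p : "#{cdn_url}#{p}"
end

upload_path = "/uploads/dbname/original"

without_cdn = cdn_path_sketch(nil, upload_path)
with_cdn = cdn_path_sketch("https://cdn.example.com", upload_path)

# The check strips the scheme before searching the posts.
needle_without_cdn = without_cdn.sub(/https?:/, "")
needle_with_cdn = with_cdn.sub(/https?:/, "")

puts needle_without_cdn
puts needle_with_cdn
```

With a CDN configured the search string carries the CDN host, so it is specific. Without one it is just the bare upload path.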

Sooooooo if we do have an S3 CDN but not a regular CDN then SiteSetting.cdn_path("/uploads/#{@current_db}/original") will be simply /uploads/dbname/original

and, per our remap, the new paths are https://bucket-location-url.com/uploads/dbname/original/

That means

  1. cdn_path is a substring of the new destination path
  2. the Post.where("cooked LIKE '%#{cdn_path}%'").count will thus match every post that was remapped correctly
  3. it will cry wolf and bail out :scream: :scream: :scream:
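
The false positive can be reproduced without a database. In this rough sketch `include?` stands in for the SQL `LIKE '%...%'` match, and the two cooked strings are invented sample posts: one that was remapped correctly and one with no uploads at all.

```ruby
# What cdn_path returns when no CDN_URL is set (scheme stripping changes nothing here).
needle = "/uploads/dbname/original"

# Hypothetical cooked HTML after a successful remap.
cooked_posts = [
  %(<img src="https://bucket-location-url.com/uploads/dbname/original/1X/a.png">),
  %(<p>no uploads here</p>),
]

# include? plays the role of LIKE '%needle%'.
count = cooked_posts.count { |cooked| cooked.include?(needle) }

puts "#{count} posts are not remapped to new S3 upload URL." if count > 0
```

The correctly remapped post is counted as "not remapped", which is exactly the exception shown at the top.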