Re-adding missing uploads to the database

Okay, I solved it with Claude and a lot of praise. I’m sharing what I did to help anyone else with the same or a similar issue.

I’m not sure this is the cleverest or most optimal method, just the one that worked for me.

Please be careful and keep in mind that I’m not an expert but a novice always learning.

The issue (S3 → local filesystem)

After migrating from AWS S3 to the local filesystem, a lot of images displayed as transparent.png. The files were on disk the whole time, but Discourse couldn’t resolve them.

The root cause was a broken chain:

  1. Posts with upload:// short URLs (base62-encoded SHA1).
  2. Database uploads records mapping SHA1 → local file path.
  3. Filesystem storing files named by their SHA1 hash.
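
To make the first link of that chain concrete, here is a minimal sketch in plain Ruby of a base62 round trip between a SHA1 and an upload:// short URL. The helper names are made up for this example, and the alphabet order is an assumption on my part, so the encoded strings may not match what your Discourse version produces. Inside the Rails console, Upload.sha1_from_short_url is the real thing to use.

```ruby
# Illustrative only. ASSUMPTION: this alphabet order (digits, lowercase,
# uppercase) may differ from the Base62 implementation Discourse ships with.
ALPHABET = [*("0".."9"), *("a".."z"), *("A".."Z")].join.freeze

# Encode a non-negative integer as a base62 string.
def base62_encode(number)
  return ALPHABET[0] if number.zero?
  out = +""
  while number > 0
    number, rem = number.divmod(62)
    out.prepend(ALPHABET[rem])
  end
  out
end

# Decode a base62 string back into an integer.
def base62_decode(text)
  text.each_char.reduce(0) { |acc, ch| acc * 62 + ALPHABET.index(ch) }
end

# Hypothetical helper: build a short URL from a 40-character hex SHA1.
def short_url_for(sha1, extension)
  "upload://#{base62_encode(sha1.hex)}.#{extension}"
end

# Hypothetical helper: recover the SHA1, restoring any leading zeros.
def sha1_from_short_url(url)
  encoded = url[%r{\Aupload://([a-zA-Z0-9]+)}, 1]
  return nil unless encoded
  base62_decode(encoded).to_s(16).rjust(40, "0")
end
```

The point is only that the short URL carries nothing but the hash, so a post can be resolved to a file only if an uploads record with that SHA1 exists.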

The migration moved files to disk correctly, but no uploads DB records existed. Without a matching record, Discourse falls back to transparent.png.

The solution (create records and rebake)

# Enter container
./launcher enter app
rails c

Create missing upload records from orphan files:

dir = Rails.root.join("public", "uploads", "default", "original")
created = 0

Dir.glob(dir.join("**", "*")).select { |f| File.file?(f) }.each do |path|
  sha = File.basename(path, File.extname(path))
  next if Upload.find_by(sha1: sha)

  ext = File.extname(path).delete(".")
  relative = path.sub("#{Rails.root}/public", "")

  u = Upload.new
  u.sha1 = sha
  u.url = relative
  u.original_filename = File.basename(path)
  u.filesize = File.size(path)
  u.extension = ext
  u.user_id = -1
  u.save!(validate: false)

  created += 1
  puts "Created upload #{u.id}: #{sha}"
end

puts "Total created: #{created}"
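
Before creating any records, it may be worth doing a dry run to see which files actually look like SHA1-named uploads. This is a minimal sketch in plain Ruby, not part of what I ran; sha1_candidates is a made-up helper name and the 40-character lowercase hex check is my assumption about how the files are named.

```ruby
# Matches a basename that is exactly 40 lowercase hex characters.
SHA1_NAME = /\A[0-9a-f]{40}\z/

# Returns two arrays: files whose basename (without extension) looks like
# a SHA1, and every other file found under dir.
def sha1_candidates(dir)
  Dir.glob(File.join(dir, "**", "*"))
     .select { |path| File.file?(path) }
     .partition { |path| File.basename(path, File.extname(path)).match?(SHA1_NAME) }
end
```

Called as matching, skipped = sha1_candidates(dir.to_s), anything that ends up in skipped is a file the creation loop would otherwise turn into an upload record with a bogus sha1.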

Rebake posts that reference restored uploads:

# Collect the SHA1s of the restored uploads once, then scan the posts in a
# single pass so each matching post is rebaked only once.
restored = Upload.where(user_id: -1).pluck(:sha1).to_set
fixed_posts = 0

Post.where("raw LIKE '%upload://%'").find_each do |p|
  urls = p.raw.scan(/upload:\/\/[^\s\]\)]+/)
  next unless urls.any? { |url| restored.include?(Upload.sha1_from_short_url(url)) }

  p.rebake!
  fixed_posts += 1
  puts "Rebaked post #{p.id}"
end

puts "Total rebaked: #{fixed_posts}"
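
To see what the scan in the rebake step picks up, here is the same regular expression run against a sample string in plain Ruby. The sample text and the upload_urls helper name are invented for illustration.

```ruby
# Same pattern as in the rebake step: "upload://" followed by anything up
# to the next whitespace, "]" or ")".
UPLOAD_URL = %r{upload://[^\s\]\)]+}

# Returns the distinct short URLs found in a post's raw text.
def upload_urls(raw)
  raw.scan(UPLOAD_URL).uniq
end

sample = "Before ![shot|690x388](upload://abc123.png) and " \
         "[doc|attachment](upload://xyz789.pdf) after"
puts upload_urls(sample)
```

The closing parenthesis of the Markdown link is excluded by the character class, so the extension stays attached to the URL and nothing trailing leaks in.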

Regenerate missing optimized images:

After fixing the original files, we need to regenerate the optimized versions (1X, 2X, etc).

Run the rake task from the container shell (after ./launcher enter app), not from inside the Rails console:

rake uploads:regenerate_missing_optimized

[OPTIONAL] Still missing optimized images

If rake uploads:regenerate_missing_optimized did not solve all the file issues and there are still missing files, delete the optimized image records whose files no longer exist:

# Enter container
./launcher enter app
rails c

missing = 0
OptimizedImage.find_each do |oi|
  path = "#{Rails.root}/public#{oi.url}"
  unless File.exist?(path)
    missing += 1
    oi.delete
  end
end
puts "Deleted #{missing} broken optimized records"
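
Since that loop deletes records, a dry run that only reports what is missing can be a safer first step. A minimal sketch in plain Ruby, assuming every URL is a site-relative path such as /uploads/default/...; missing_files is a made-up helper name.

```ruby
# Dry run: returns the URL paths that have no file under public_root.
# ASSUMPTION: each url is a site-relative path, not a CDN or S3 URL.
def missing_files(public_root, urls)
  urls.reject { |url| File.exist?(File.join(public_root, url)) }
end
```

In the Rails console this could be fed with something like missing_files(Rails.root.join("public").to_s, OptimizedImage.pluck(:url)) to review the list before deleting anything.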

Then exit the Rails console and run the task again:

rake uploads:regenerate_missing_optimized

Safe rollback (just in case)

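All created records use user_id: -1, and delete_all skips callbacks, so the files on disk are untouched. Be aware that -1 is also the system user, so this matches any uploads the system user already owned before the fix, not only the records created here. To undo: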

Upload.where(user_id: -1).delete_all

I previously used destroy_all by mistake, and it triggered callbacks that moved files to the tombstone directory.

I recovered the single file I had used for testing and changed my approach.
