Recover_from_tombstone insanely slow

I’ve got a big import that’s got a bunch of tombstoned images. This happens from time to time and I’m not sure why, but that’s not why I’m here.

I’m running rake uploads:recover_from_tombstone and it’s taking well over a minute per upload (I think that’s what those 45471 things that it needs to process are). So, I’m looking at a month for it to finish. :-1:
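For anyone who wants to sanity-check that estimate, here is a quick back-of-the-envelope calculation in Ruby. It assumes a flat 60 seconds per item (the task is actually taking somewhat longer than that) and that all 45471 items need processing. `eta_days` is just a helper name made up for this sketch, not something from Discourse.

```ruby
# Rough ETA for the rake task, using the numbers quoted in this post.
SECONDS_PER_DAY = 24 * 60 * 60

# Estimated run time in days for `items` items at
# `seconds_per_item` seconds each (assumes a constant rate).
def eta_days(items, seconds_per_item)
  (items * seconds_per_item).to_f / SECONDS_PER_DAY
end

puts format("%.1f days", eta_days(45_471, 60)) # prints "31.6 days"
```

So "a month" is, if anything, on the optimistic side at the current rate.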

And the 25 that it processed appear not to have done anything (although the most recent one did report that it was restoring an image to its proper place).

It’s a standard install on an 8GB Digital Ocean “optimized” droplet.

I’ve set

  db_shared_buffers: "2048MB"
  db_work_mem: "80MB"
  UNICORN_WORKERS: 4

At one point I tried increasing num_connections up to 200 (a new trick I learned recently), but that didn’t seem to help either.

              total        used        free      shared  buff/cache   available
Mem:           7.8G        1.7G        133M        1.0G        6.0G        4.8G
Swap:          2.0G        691M        1.3G

I’m pretty stumped.

EDIT: I think it’s just that the database is not on SSD but on whatever DO calls their extra space, and it’s a huge community with a couple million posts.

3 Likes

Possible, that job is pretty “dumb”. I’ll have a look to see if I can find some quick wins.

3 Likes

Thanks, @zogstrip!

My quick solution is to rsync the images back to default out of tombstone and then rebake (I think they may need to rebake anyway). This will un-break the images in all posts immediately and the rebake will see that they don’t get tombstoned again, right?

1 Like

Not sure actually. This will temporarily show the images but the rebake process won’t create the upload records, so they’ll be back in tombstone when the cleanup job runs.

1 Like

OK, so rsync them over so that the images aren’t broken, and THEN run the rake task to update the database.
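For illustration, the copy-back step could look roughly like this in Ruby. This is a stand-in for the rsync command, not what the rake task does, and it only puts files back on disk; it does not touch the database. The helper name `copy_back_from_tombstone` and the example paths in the comment are assumptions for a standard install, so check them against your own setup first.

```ruby
require "fileutils"

# Copies every file under tombstone_dir to the same relative path under
# uploads_dir, skipping files that already exist at the target.
# Files are copied, not moved, so the tombstone is left untouched.
# Returns the number of files copied.
def copy_back_from_tombstone(tombstone_dir, uploads_dir)
  copied = 0

  Dir.glob("**/*", base: tombstone_dir).each do |relative|
    source = File.join(tombstone_dir, relative)
    next unless File.file?(source)

    target = File.join(uploads_dir, relative)
    next if File.exist?(target) # never overwrite a live upload

    FileUtils.mkdir_p(File.dirname(target))
    FileUtils.cp(source, target, preserve: true) # keep mtime and mode
    copied += 1
  end

  copied
end

# Example call (assumed paths, adjust before running):
# copy_back_from_tombstone(
#   "/var/discourse/shared/standalone/uploads/tombstone/default",
#   "/var/discourse/shared/standalone/uploads/default"
# )
```

Copying rather than moving means the tombstone copies stay in place as a fallback until the rake task has caught up.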

It’s running much faster on the production server, but it still looks like 20 days to complete 430K (posts?). The site is mostly about images, so broken images are something of a problem. :slight_smile:

This happens to me on imports a lot. I’m thinking that there must be something to do to update the database that’s not happening in the importer. I should look at what that rake task is doing…

1 Like

@zogstrip how safe would the process be that Jay proposes here? Discourse is still moving images from the import into the tombstone. I seem to be at 440k now, and I’m concerned that processing will take longer than 30 days, meaning images could get deleted permanently.

Also, any speedups in the task would be fantastic!

I have a related question: A little over 1600 entries have been processed now, but not a single file has been moved back from the tombstone into the uploads directory (I verified with a file count). Does this mean it’s not working?

Edit: apparently almost all of them are in the optimized directory, not in original. Digging a little deeper.
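In case it helps anyone doing the same check, here is a small Ruby sketch of the kind of per-directory file count I mean. `file_counts_by_subdir` is a made-up helper name, and the path in the comment is an assumption for a standard install.

```ruby
# Counts regular files (recursively) under each immediate subdirectory
# of root, e.g. "original" and "optimized" inside the tombstone.
# Returns a hash of { subdirectory name => file count }.
def file_counts_by_subdir(root)
  Dir.children(root).sort.each_with_object({}) do |name, counts|
    path = File.join(root, name)
    next unless File.directory?(path)

    counts[name] = Dir.glob("**/*", base: path).count do |relative|
      File.file?(File.join(path, relative))
    end
  end
end

# Example call (assumed path, adjust before running):
# p file_counts_by_subdir("/var/discourse/shared/standalone/uploads/tombstone/default")
```

Running it against both the tombstone and the live uploads directory before and after a batch shows whether anything actually moved.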

1 Like

The script prints the names of the files it moves. Not all of the posts (?) it’s processing need fixing.

1 Like

We’ve made some improvements to the uploads:recover_from_tombstone rake task.

It even has its own tested class now :wink: thanks to @tgxworld and @vinothkannans

https://github.com/discourse/discourse/blob/7eea55d564e8f1a7202a078363628c7ba973dbd3/lib/upload_recovery.rb

3 Likes