Can't restore because sidekiq won't let go

I’ve got a site running in ECS with RDS. I’m trying to restore the database (as described here: Restore failed `entity2char already exists' RDS Postgres 13.7 - #7 by RGJ).

The problem at this point is that the restore won’t start because it pauses to wait for sidekiq. It claims it’s waiting for 60 seconds, but I’ve waited 60 minutes and it never stopped waiting.

Then I decided that I would drop, create, and migrate the database before the restore (I’m pretty sure that’s what worked last time), but I still can’t drop the db because 6 tasks have it open.

SELECT application_name,client_addr FROM pg_stat_activity;

Shows this (and there are 6 sidekiqs with active connections)

 psql                                  | 10.3.2.155
 PostgreSQL JDBC Driver                | 127.0.0.1
                                       | 127.0.0.1
 sidekiq 6.5.8 discourse [0 of 5 busy] | 10.3.2.155
 sidekiq 6.5.8 discourse [0 of 5 busy] | 10.x.2.155
 sidekiq 6.5.8 discourse [0 of 5 busy] | 10.x.2.34
 sidekiq 6.5.8 discourse [0 of 5 busy] | 10.x.2.155
 sidekiq 6.5.8 discourse [0 of 5 busy] | 10.x.2.155
 sidekiq 6.5.8 discourse [0 of 5 busy] | 10.x.2.34
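
A slightly wider version of that query can make it easier to tell which sessions are actually doing work and which are just sitting idle. This is only a sketch: it assumes you are connected with psql to the database in question and that your role is allowed to see other sessions' details in pg_stat_activity.

```sql
-- Sketch: list every session on the current database with its state.
-- "state" is typically 'active', 'idle', or 'idle in transaction'.
SELECT pid,
       application_name,
       client_addr,
       state,
       state_change
FROM pg_stat_activity
WHERE datname = current_database()
ORDER BY application_name, client_addr;
```

The pid column is what you would need if you later want to end individual sessions.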

One of those addresses is the ECS task; the other is a container I started on an EC2 instance. In both containers I ran sv stop unicorn, which in other contexts has been enough to let me drop the database.

Oh. And now I have stopped the container on the EC2, so those connections must be just hanging since it shut down without closing them? Maybe I need to reboot the database again. (Rebooting the database did stop those connections, and now it shows only idle connections from the shut-down unicorn).

What do I do to kill sidekiq? Do I go into redis and clear all the stuff? (I’ve done that before and can Google to figure it out again).

You just need to stop all instances (containers, unicorns) that have a sidekiq running inside.

That’s what I’d have thought, but even after I shut down the container running sidekiq, postgres was still showing active connections. Similarly, I had run sv stop unicorn on the ECS instance, but postgres was still showing active connections.
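
For reference, lingering sessions like these can usually be ended from psql without rebooting the database server. The following is a minimal sketch, not a tested recipe for this setup: it assumes the database is named discourse (a guess, adjust to yours), that you run it while connected to a different database such as postgres, and that your RDS user has permission to terminate other roles' backends.

```sql
-- Sketch: end every sidekiq session on the target database except our own.
-- 'discourse' is an assumed database name; the application_name pattern
-- comes from the pg_stat_activity output earlier in the thread.
SELECT pid,
       pg_terminate_backend(pid) AS terminated
FROM pg_stat_activity
WHERE datname = 'discourse'
  AND application_name LIKE 'sidekiq%'
  AND pid <> pg_backend_pid();
```

If the sidekiq processes are still running somewhere, they will simply reconnect, so this only helps once the containers are really stopped. PostgreSQL 13 and later also accept DROP DATABASE ... WITH (FORCE), which terminates existing connections as part of the drop, though whether your RDS user is allowed to use it is something to check.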

I rebooted the RDS server, cranked up the container on the EC2 (the instances in ECS don’t have enough spare disk for a restore), ran sv stop unicorn, then drop, create, restore the database, and now I’m attempting a restore.

Thanks very much for your help. I do appreciate it!
