Bootstrap Container Hangs on su discourse -c 'bundle exec rake db:migrate'

We are currently running this recent release:

https://github.com/discourse/discourse/commits/f4db1675f3b0cc79fa87430c229918ad7bce8a44

It’s still running in our production container (no issues).

I’m trying to bootstrap based on this release:

https://github.com/discourse/discourse/compare/f4db1675...f5b18e2a

I’ve tried a number of times today, and each time the bootstrap hangs at the same point.

I checked the logs in the data container and in the standby app container (the one we are bootstrapping), and there are no errors.

Each time, it hangs on this line:

I, [2021-02-11T10:37:25.133098 #1]  INFO -- : > cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate'

Any clues where to look to help debug this?

I’ve looked in all the logs I can think of (in all containers) and cannot find problems.
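One place to look when a migration hangs without logging anything is PostgreSQL's `pg_stat_activity` view, which lists the statement each session is currently running. The following is only a minimal sketch, not something taken from this thread: the SQL string uses standard `pg_stat_activity` columns, while the helper `long_running_sessions`, its parameters, and the 300-second threshold are hypothetical names chosen for illustration. It assumes the rows have already been fetched as hashes with string keys.

```ruby
require "time"

# SQL that could be run in psql inside the data container to list sessions
# that are not idle. pid, state, wait_event_type, query_start and query are
# standard columns of the pg_stat_activity view in recent PostgreSQL versions.
ACTIVE_SESSIONS_SQL = <<~SQL
  SELECT pid, state, wait_event_type, query_start, query
  FROM pg_stat_activity
  WHERE state <> 'idle'
  ORDER BY query_start;
SQL

# Hypothetical helper (an assumption, not part of Discourse or the pg gem):
# given rows shaped like the result of the query above, return the sessions
# that have been running longer than `threshold_seconds` as of `now`,
# longest-running first.
def long_running_sessions(rows, now:, threshold_seconds: 300)
  rows
    .reject { |row| row["state"] == "idle" || row["query_start"].nil? }
    .map { |row| row.merge("runtime_seconds" => now - Time.parse(row["query_start"])) }
    .select { |row| row["runtime_seconds"] > threshold_seconds }
    .sort_by { |row| -row["runtime_seconds"] }
end
```

If a single statement from the migration shows up with a runtime of many minutes, that would suggest the migration is slow or blocked rather than crashed, which would be consistent with an empty error log.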

Thanks.

Additional Info:

Each time it hangs and I press Ctrl-C to exit the bootstrap process, it seems to leave a postmaster process behind, which I can then kill.

Update:

Bootstrap rebuilds fine if I comment out the “db:migrate” line in the web template.

For some reason “rails db:migrate” hangs with zero errors, which I have never experienced before.

I wonder if there is an error (or some underlying issue) in this recent migration?

https://github.com/discourse/discourse/commit/f5b18e2a311abf503b21d996031d38286e71f74a#diff-961cae2154c303bd80bb3c10faf54ea6037c9ad0dc72411fb21551418c7f1d68

See also:

https://github.com/discourse/discourse/blob/f5b18e2a311abf503b21d996031d38286e71f74a/db/migrate/20210208022738_move_new_since_to_new_table.rb

Hi @neounix, we’re aware of an issue with the migration you linked and we’re working on a fix right now.

Hi @david

Thanks for the reply!

I was looking at the migration just now, and my first thought, as someone who is not a Discourse DB migration expert, was that this line might be the issue:

 add_index :dismissed_topic_users, %i(user_id topic_id), unique: true

Anyway, I’m not qualified to go much deeper into Discourse migration issues, so I’ll stand down and stand by.
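For context on why a line like that can matter: creating a unique index requires that the existing rows contain no duplicate key values, and building it has to process the whole table. The thread does not establish that this line was the cause of the hang, so the following is only an illustrative sketch of the uniqueness requirement. The helper name `duplicate_pairs` and the in-memory array of `[user_id, topic_id]` pairs are assumptions made for the example, not part of the Discourse migration.

```ruby
# Hypothetical helper: `rows` is an array of [user_id, topic_id] pairs, as
# they might be read from the table before the unique index is created.
# Returns each pair that occurs more than once; any such pair would make a
# unique index on (user_id, topic_id) impossible to build.
def duplicate_pairs(rows)
  rows
    .group_by(&:itself)
    .select { |_pair, occurrences| occurrences.size > 1 }
    .keys
end

# Made-up sample data, purely for illustration.
sample = [[1, 10], [2, 10], [1, 10], [3, 7]]
puts duplicate_pairs(sample).inspect
```

An empty result means the pairs are already unique, in which case the uniqueness constraint itself would not be the obstacle.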

Thanks again!

Can we safely abort the migration without any data loss and go back a few commits?

It worked.

Hey @neounix,

Sorry for including that heavy and broken migration in the codebase. Those changes were reverted, so a new deploy should be fine:

https://github.com/discourse/discourse/pull/12058

Hi @kris.kotlarek

Thank you for the update:

I tend to agree that this “heavy migration” more than likely requires additional review and stringent testing on a test configuration with a large DB before being merged into core. Luckily for our site, I always bootstrap in parallel to production, so we did not suffer any downtime during the migration failure(s). No worries on our end. Thanks for reverting so quickly.

FYI only (just being technically precise, sorry about that…): I checked the DB after rebuilding just now, and the “revert” process did not drop the table from the DB. No big deal:

discourse=> \d dismissed_topic_users

                                        Table "public.dismissed_topic_users"
   Column   |            Type             | Collation | Nullable |                      Default                      
------------+-----------------------------+-----------+----------+---------------------------------------------------
 id         | bigint                      |           | not null | nextval('dismissed_topic_users_id_seq'::regclass)
 user_id    | integer                     |           |          | 
 topic_id   | integer                     |           |          | 
 created_at | timestamp without time zone |           |          | 
Indexes:
    "dismissed_topic_users_pkey" PRIMARY KEY, btree (id)
    "index_dismissed_topic_users_on_user_id_and_topic_id" UNIQUE, btree (user_id, topic_id)

Thank you again for your hard work on this and for quickly reverting the migration.

This migration is working smoothly now, based on the latest commit:

https://github.com/discourse/discourse/pull/12062
