Upgrade failed due to process terminating

Today, after several months of not upgrading, I upgraded docker manager successfully and then tried to upgrade docker, which failed during bundling. I tried to rebuild the container using ./launcher rebuild app, but it failed at exactly the same place.

Now when I start the app, I get this error on the web console and a blank page:

Uncaught Error: There is no route named user

First thing to check is your plugins – do they need updating too? Perhaps you are using an old plugin that is no longer compatible with Discourse.

I just tried the upgrade again, here’s where it blows up:

I, [2015-09-23T15:49:08.096530 #285]  INFO -- : Writing /var/www/discourse/public/assets/spinner_96-8091be87c9cf1abef73e3899ec7645c1.gif
I, [2015-09-23T15:49:48.583254 #285]  INFO -- : Writing /var/www/discourse/public/assets/admin-65c6808be982448a35912489faa998d1.js
[169] 23 Sep 15:52:41.064 * 10 changes in 300 seconds. Saving...
[169] 23 Sep 15:52:41.065 * Background saving started by pid 959
[959] 23 Sep 15:52:41.350 * DB saved on disk
[959] 23 Sep 15:52:41.351 * RDB: 38 MB of memory used by copy-on-write
[169] 23 Sep 15:52:41.367 * Background saving terminated with success
I, [2015-09-23T15:54:29.630194 #285]  INFO -- : Writing /var/www/discourse/public/assets/application-985117f83083528097778adfdbb2ee68.js
rake aborted!
SignalException: SIGTERM
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/asset.rb:151:in `write'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/asset.rb:151:in `block in write_to'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/asset.rb:146:in `open'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/asset.rb:146:in `write_to'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/manifest.rb:135:in `block in compile'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/manifest.rb:118:in `each'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/manifest.rb:118:in `compile'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-rails-2.0.1/lib/sprockets/rails/task.rb:60:in `block (3 levels) in define'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/rake/sprocketstask.rb:146:in `with_logger'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-rails-2.0.1/lib/sprockets/rails/task.rb:59:in `block (2 levels) in define'
Tasks: TOP => assets:precompile
(See full trace by running task with --trace)
zlib(finalizer): Zlib::GzipWriter object must be closed explicitly.
zlib(finalizer): the stream was freed prematurely.
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/message_bus-1.0.16/lib/message_bus.rb:437:in `kill': No such process (Errno::ESRCH)
        from /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/message_bus-1.0.16/lib/message_bus.rb:437:in `block (2 levels) in new_subscriber_thread'
        from /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/message_bus-1.0.16/lib/message_bus.rb:434:in `fork'
        from /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/message_bus-1.0.16/lib/message_bus.rb:434:in `block in new_subscriber_thread'
        from /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/message_bus-1.0.16/lib/message_bus/timer_thread.rb:98:in `call'
        from /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/message_bus-1.0.16/lib/message_bus/timer_thread.rb:98:in `do_work'
        from /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/message_bus-1.0.16/lib/message_bus/timer_thread.rb:29:in `block in initialize'
I, [2015-09-23T15:54:39.790839 #38]  INFO -- : Purging temp files
Bundling assets

I, [2015-09-23T15:54:39.791353 #38]  INFO -- : Terminating async processes

I don’t believe I am using any 3rd party plugins.

Weird. That implies something terminated the upgrade, like a Ctrl+C?

I also haven’t seen the Zlib::GzipWriter error before. Looks like something is preventing the upgrade from finishing.

I didn’t abort the upgrade. Have seen this same error 3 times now.

Same thing again just now, with slightly more info:

[176] 23 Sep 16:33:25.034 * Background saving started by pid 967
[967] 23 Sep 16:33:25.326 * DB saved on disk
[967] 23 Sep 16:33:25.327 * RDB: 4 MB of memory used by copy-on-write
[176] 23 Sep 16:33:25.335 * Background saving terminated with success
rake aborted!
SignalException: SIGTERM
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/index.rb:70:in `block in find_asset'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/index.rb:69:in `instance_eval'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/index.rb:69:in `find_asset'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/manifest.rb:211:in `block in find_asset'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/manifest.rb:257:in `benchmark'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/manifest.rb:210:in `find_asset'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/manifest.rb:119:in `block in compile'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/manifest.rb:118:in `each'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/sprockets/manifest.rb:118:in `compile'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-rails-2.0.1/lib/sprockets/rails/task.rb:60:in `block (3 levels) in define'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-2.11.0/lib/rake/sprocketstask.rb:146:in `with_logger'
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/sprockets-rails-2.0.1/lib/sprockets/rails/task.rb:59:in `block (2 levels) in define'
Tasks: TOP => assets:precompile
(See full trace by running task with --trace)
/var/www/discourse/vendor/bundle/ruby/2.0.0/gems/message_bus-1.0.16/lib/message_bus.rb:437:in `kill': No such process (Errno::ESRCH)
        from /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/message_bus-1.0.16/lib/message_bus.rb:437:in `block (2 levels) in new_subscriber_thread'
        from /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/message_bus-1.0.16/lib/message_bus.rb:434:in `fork'
        from /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/message_bus-1.0.16/lib/message_bus.rb:434:in `block in new_subscriber_thread'
        from /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/message_bus-1.0.16/lib/message_bus/timer_thread.rb:98:in `call'
        from /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/message_bus-1.0.16/lib/message_bus/timer_thread.rb:98:in `do_work'
        from /var/www/discourse/vendor/bundle/ruby/2.0.0/gems/message_bus-1.0.16/lib/message_bus/timer_thread.rb:29:in `block in initialize'
I, [2015-09-23T16:34:56.867208 #45]  INFO -- : Purging temp files
Bundling assets

I, [2015-09-23T16:34:56.867787 #45]  INFO -- : Terminating async processes
I, [2015-09-23T16:34:56.867971 #45]  INFO -- : Sending TERM to sudo -u postgres /usr/lib/postgresql/9.3/bin/postmaster -D /etc/postgresql/9.3/main pid: 85
I, [2015-09-23T16:34:56.868152 #45]  INFO -- : Sending TERM to sudo -u redis /usr/bin/redis-server /etc/redis/redis.conf pid: 174
2015-09-23 16:34:56 UTC LOG:  received smart shutdown request
[176 | signal handler] (1443026096) Received SIGTERM, scheduling shutdown...2015-09-23 16:34:56 UTC LOG:  autovacuum launcher shutting down

2015-09-23 16:34:56 UTC LOG:  shutting down
2015-09-23 16:34:56 UTC LOG:  database system is shut down
[176] 23 Sep 16:34:56.960 # User requested shutdown...
[176] 23 Sep 16:34:56.960 * Saving the final RDB snapshot before exiting.
[176] 23 Sep 16:34:57.240 * DB saved on disk
[176] 23 Sep 16:34:57.240 # Redis is now ready to exit, bye bye...


FAILED
--------------------
RuntimeError: cd /var/www/discourse && sudo -E -u discourse bundle exec rake assets:precompile failed with return #<Process::Status: pid 291 exit 1>
Location of failure: /pups/lib/pups/exec_command.rb:105:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"web", "cmd"=>["gem update bundler", "mkdir -p /shared/vendor_bundle", "cp -fr /shared/vendor_bundle/* vendor/bundle || echo \"can not copy\"", "chown -R discourse $home", "sudo -E -u discourse bundle install --deployment --verbose --without test --without development", "cp -fr vendor/bundle/* /shared/vendor_bundle", "sudo -E -u discourse bundle exec rake db:migrate", "sudo -E -u discourse bundle exec rake assets:precompile"]}
95c4e6c3dc8d1b6cb5efca65dfa15847f71cfc61701cf05c47adc0fc91cb8039
FAILED TO BOOTSTRAP

The error seems to be related to message bus killing another process. I’m not so familiar with that gem but maybe @sam can jump in and give some advice?

What’s your server environment? This looks like somehow some strange supervisor tool kills processes that run too long or cause too high CPU loads…

$ cat /etc/issue
Ubuntu 14.04.1 LTS

Running on Rackspace Cloud.

They don’t seem to enforce anything like that by default. Can you send me the output of ps auxf (via PM, if you like)?

Are you running out of memory? Do you have at least 1GB RAM + 1GB swap, or 2GB RAM?
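
For anyone wanting to check this quickly, here is a rough Ruby sketch that adds up RAM and swap from /proc/meminfo on a Linux host. The helper names (`meminfo_kb`, `enough_memory?`) and the 2 GiB default threshold are assumptions made for illustration; this is not the check that launcher itself performs.

```ruby
# Sketch only: thresholds and method names are assumptions for illustration.
GIB_KB = 1024 * 1024 # /proc/meminfo reports sizes in kB

# Pull one "Key:   12345 kB" entry out of /proc/meminfo-style text.
def meminfo_kb(text, key)
  match = text.match(/^#{Regexp.escape(key)}:\s+(\d+)\s+kB/)
  match ? match[1].to_i : 0
end

# True when RAM plus swap reaches the suggested total (about 2 GB by default).
def enough_memory?(text, min_total_kb: 2 * GIB_KB)
  meminfo_kb(text, "MemTotal") + meminfo_kb(text, "SwapTotal") >= min_total_kb
end

# Only runs on hosts that expose /proc/meminfo (Linux).
if File.readable?("/proc/meminfo")
  text = File.read("/proc/meminfo")
  puts "RAM + swap looks #{enough_memory?(text) ? 'sufficient' : 'low'} for a rebuild"
end
```

A machine sold as "1GB" usually reports a little less than 1 GiB in MemTotal, so the threshold may need to be lowered slightly via `min_total_kb`.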

Getting SIGTERM is a real curiosity. Ctrl-C sends SIGINT, and the OOM killer sends SIGKILL. SIGTERM is the default signal sent by the kill command, if you don’t specify a different signal.
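
To make the distinction concrete, here is a small Ruby sketch that turns a SignalException (like the "SignalException: SIGTERM" in the log above) into a hint about its usual sender. The `USUAL_SENDER` table and `describe_signal` are hypothetical names for illustration, and the descriptions only paraphrase the paragraph above.

```ruby
# Illustrative lookup; the "usual sender" strings paraphrase the post above.
USUAL_SENDER = {
  "INT"  => "Ctrl-C in the terminal",
  "KILL" => "the OOM killer",
  "TERM" => "the kill command with no signal specified"
}.freeze

# Describe a SignalException such as the one rake reported in the log.
def describe_signal(exception)
  name = Signal.signame(exception.signo) # signal number -> name without "SIG"
  "SIG#{name} (#{exception.signo}): usually #{USUAL_SENDER.fetch(name, 'unknown sender')}"
end

puts describe_signal(SignalException.new("SIGTERM"))
```

Unlike SIGKILL, both SIGINT and SIGTERM can be caught by the process, which is why rake was able to print a backtrace before exiting.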

Looking at the message_bus code, I’m about 99% sure the problem is something to do with the communication being interrupted and the message bus deciding to take its bat and ball and go home. The exception coming out of message_bus shouldn’t happen, either (I’ll do a PR on that), but it’s a side-effect, not the cause. @sam is obviously the go-to guy on what’s going on in the innards, it being his gem.

Yeah, message bus has an optional (on by default) keepalive test. When enabled, it ships messages to itself on a regular basis. If it stops hearing echoes, it knows it is messed up internally and cannot recover, so it kills itself, which is the only reasonable thing it can do.

You cannot kill threads in Ruby safely.

When it kills itself, it tries a TERM followed by a KILL.

Sounds to me like Redis has gone belly up. My guess: not enough memory.

I think that bit of code needs a rescue in there, since if the TERM goes well, the KILL is guaranteed to raise ESRCH
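
A minimal sketch of that idea, assuming a standalone helper rather than the actual message_bus code: send TERM, wait briefly, send KILL, and treat "No such process" as the expected outcome once the process has exited and been reaped. The name `term_then_kill` and the grace period are hypothetical.

```ruby
# Hypothetical helper, not the message_bus source: TERM, wait, then KILL,
# treating "No such process" (Errno::ESRCH) as success.
def term_then_kill(pid, grace: 1.0)
  Process.kill("TERM", pid)
  sleep grace                 # give the process time to exit cleanly
  Process.kill("KILL", pid)   # raises Errno::ESRCH if it is already gone
  :killed
rescue Errno::ESRCH
  :already_gone
end

# Demo on a throwaway child process.
child = spawn("sleep", "30")
Process.detach(child)         # reap the child so its pid really disappears
puts term_then_kill(child)
```

Without the rescue, the second `Process.kill` raises the same Errno::ESRCH that appears at message_bus.rb:437 in the logs above.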

Well, we have a check for that now…

@shadowhand, are you running the latest version of discourse_docker (the repo that contains launcher)? If you’re on the latest code and try a rebuild without enough memory, it should warn you of potential DOOOOOOM.

We’re running on a server with 4GB of RAM and plenty of disk.

After upgrading to the latest discourse_docker and updating OS packages, I was able to run rebuild app without issues.

Consider the issue resolved.
