Crash due to V8 isolate forked when time skips forward

I have recently observed crashes like the following on my discourse installation:

Dec 26 10:10:40 espresso bundle[367654]: D, [2021-12-26T10:10:40.517050 #367654] DEBUG -- : waiting 16.0s after suspend/hibernation
Dec 26 10:10:40 espresso bundle[367654]: I, [2021-12-26T10:10:40.738229 #367654]  INFO -- : reaped #<Process::Status: pid 367734 exit 0> worker=0
Dec 26 10:10:40 espresso bundle[367654]: I, [2021-12-26T10:10:40.738599 #367654]  INFO -- : reaped #<Process::Status: pid 367741 exit 0> worker=1
Dec 26 10:10:40 espresso bundle[367654]: I, [2021-12-26T10:10:40.738839 #367654]  INFO -- : reaped #<Process::Status: pid 367752 exit 0> worker=2
Dec 26 10:10:40 espresso bundle[367654]: I, [2021-12-26T10:10:40.738942 #367654]  INFO -- : master complete
Dec 26 10:10:41 espresso bundle[367653]: config/unicorn_launcher: line 71: kill: (367654) - No such process
Dec 26 10:10:41 espresso bundle[367653]: config/unicorn_launcher: line 15: kill: (367654) - No such process
Dec 26 10:10:41 espresso bundle[367653]: (367653) exiting
Dec 26 10:10:50 espresso bundle[481559]: WARNING: V8 isolate was forked, it can not be disposed and memory will not be reclaimed till the Ruby process exits.
Dec 26 10:10:50 espresso bundle[481559]: It is VERY likely your process will hang.
Dec 26 10:10:50 espresso bundle[481559]: If you wish to use v8 in forked environment please ensure the platform is initialized with:
Dec 26 10:10:50 espresso bundle[481559]: MiniRacer::Platform.set_flags! :single_threaded
Dec 26 10:10:50 espresso bundle[481559]: #
Dec 26 10:10:50 espresso bundle[481559]: # Fatal error in ../../src/base/platform/mutex.cc, line 51
Dec 26 10:10:50 espresso bundle[481559]: # Debug check failed: 0 == result (0 vs. 16).
Dec 26 10:10:50 espresso bundle[481559]: #
Dec 26 10:10:50 espresso bundle[481559]: #
Dec 26 10:10:50 espresso bundle[481559]: #
Dec 26 10:10:50 espresso bundle[481559]: #FailureMessage Object: 0x7f3e7a058470
Dec 26 10:10:50 espresso bundle[481559]: ==== C stack trace ===============================
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/59qyfxrrf2l6icyfvq450ghp86k4nssb-discourse-ruby-env-2.7.9/lib/ruby/gems/2.7.0/extensions/x86_64-linux/2.7.0/mini_racer-0.4.0/mini_racer_extension.so(v8::base::debug::StackTrace::StackTrace()+0x16) [0x7f3e7bd11ab6]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/59qyfxrrf2l6icyfvq450ghp86k4nssb-discourse-ruby-env-2.7.9/lib/ruby/gems/2.7.0/extensions/x86_64-linux/2.7.0/mini_racer-0.4.0/mini_racer_extension.so(+0x62c92c) [0x7f3e7ae0492c]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/59qyfxrrf2l6icyfvq450ghp86k4nssb-discourse-ruby-env-2.7.9/lib/ruby/gems/2.7.0/extensions/x86_64-linux/2.7.0/mini_racer-0.4.0/mini_racer_extension.so(V8_Fatal(char const*, int, char const*, ...)+0x173) [0x7f3e7bd089d3]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/59qyfxrrf2l6icyfvq450ghp86k4nssb-discourse-ruby-env-2.7.9/lib/ruby/gems/2.7.0/extensions/x86_64-linux/2.7.0/mini_racer-0.4.0/mini_racer_extension.so(+0x1530a05) [0x7f3e7bd08a05]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/59qyfxrrf2l6icyfvq450ghp86k4nssb-discourse-ruby-env-2.7.9/lib/ruby/gems/2.7.0/extensions/x86_64-linux/2.7.0/mini_racer-0.4.0/mini_racer_extension.so(v8::base::Mutex::~Mutex()+0x4c) [0x7f3e7bd0a30c]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/59qyfxrrf2l6icyfvq450ghp86k4nssb-discourse-ruby-env-2.7.9/lib/ruby/gems/2.7.0/extensions/x86_64-linux/2.7.0/mini_racer-0.4.0/mini_racer_extension.so(v8::platform::DelayedTaskQueue::~DelayedTaskQueue()+0x1b8) [0x7f3e7ae081d8]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/59qyfxrrf2l6icyfvq450ghp86k4nssb-discourse-ruby-env-2.7.9/lib/ruby/gems/2.7.0/extensions/x86_64-linux/2.7.0/mini_racer-0.4.0/mini_racer_extension.so(v8::platform::DefaultWorkerThreadsTaskRunner::~DefaultWorkerThreadsTaskRunner()+0xa7) [0x7f3e7ae07797]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/59qyfxrrf2l6icyfvq450ghp86k4nssb-discourse-ruby-env-2.7.9/lib/ruby/gems/2.7.0/extensions/x86_64-linux/2.7.0/mini_racer-0.4.0/mini_racer_extension.so(v8::platform::DefaultPlatform::~DefaultPlatform()+0x1ce) [0x7f3e7ae0527e]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/59qyfxrrf2l6icyfvq450ghp86k4nssb-discourse-ruby-env-2.7.9/lib/ruby/gems/2.7.0/extensions/x86_64-linux/2.7.0/mini_racer-0.4.0/mini_racer_extension.so(v8::platform::DefaultPlatform::~DefaultPlatform()+0x12) [0x7f3e7ae05332]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/z56jcx3j1gfyk4sv7g8iaan0ssbdkhz1-glibc-2.33-56/lib/libc.so.6(+0x3de57) [0x7f3e8e1e9e57]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/z56jcx3j1gfyk4sv7g8iaan0ssbdkhz1-glibc-2.33-56/lib/libc.so.6(+0x3e00a) [0x7f3e8e1ea00a]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/22hgrp938c7sfj1s4klhigddix2sqc12-ruby-2.7.4/lib/libruby-2.7.4.so.2.7(ruby_run_node+0) [0x7f3e8e5ecea0]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/22hgrp938c7sfj1s4klhigddix2sqc12-ruby-2.7.4/lib/libruby-2.7.4.so.2.7(+0x1567a8) [0x7f3e8e6917a8]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/22hgrp938c7sfj1s4klhigddix2sqc12-ruby-2.7.4/lib/libruby-2.7.4.so.2.7(+0x1f39c2) [0x7f3e8e72e9c2]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/22hgrp938c7sfj1s4klhigddix2sqc12-ruby-2.7.4/lib/libruby-2.7.4.so.2.7(+0x200073) [0x7f3e8e73b073]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/22hgrp938c7sfj1s4klhigddix2sqc12-ruby-2.7.4/lib/libruby-2.7.4.so.2.7(rb_vm_exec+0x9c) [0x7f3e8e73eb2c]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/22hgrp938c7sfj1s4klhigddix2sqc12-ruby-2.7.4/lib/libruby-2.7.4.so.2.7(+0x1c7a54) [0x7f3e8e702a54]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/22hgrp938c7sfj1s4klhigddix2sqc12-ruby-2.7.4/lib/libruby-2.7.4.so.2.7(+0x1c9f61) [0x7f3e8e704f61]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/22hgrp938c7sfj1s4klhigddix2sqc12-ruby-2.7.4/lib/libruby-2.7.4.so.2.7(+0x1ca50a) [0x7f3e8e70550a]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/z56jcx3j1gfyk4sv7g8iaan0ssbdkhz1-glibc-2.33-56/lib/libpthread.so.0(+0x8d3e) [0x7f3e8e506d3e]
Dec 26 10:10:50 espresso bundle[481559]:     /nix/store/z56jcx3j1gfyk4sv7g8iaan0ssbdkhz1-glibc-2.33-56/lib/libc.so.6(clone+0x3f) [0x7f3e8e2a643f]
Dec 26 10:10:50 espresso bundle[481559]: /nix/store/hcy90qhhi1cp82b6935wdacq2dlskpk0-ruby2.7.4-activesupport-6.1.3.2/lib/ruby/gems/2.7.0/gems/activesupport-6.1.3.2/lib/active_support/fork_tracker.rb:8: [BUG] Illegal instruction at 0x00007f3e7adf8ebf
Dec 26 10:10:50 espresso bundle[481559]: ruby 2.7.4p191 (2021-07-07) [x86_64-linux]
Dec 26 10:10:50 espresso bundle[481559]: -- Machine register context ------------------------------------------------
Dec 26 10:10:50 espresso bundle[481559]:  RIP: 0x00007f3e7adf8ebf RBP: 0x00007f3e7a058770 RSP: 0x00007f3e7a058448
Dec 26 10:10:50 espresso bundle[481559]:  RAX: 0x0000000000000000 RBX: 0x00007f3e7a058470 RCX: 0x0000000000000001
Dec 26 10:10:50 espresso bundle[481559]:  RDX: 0x0000000000000001 RDI: 0x00007f3e8e368310 RSI: 0x0000000000000000
Dec 26 10:10:50 espresso bundle[481559]:   R8: 0x0000000000000000  R9: 0x00007f3e7450b77a R10: 0x0000000000000006
Dec 26 10:10:50 espresso bundle[481559]:  R11: 0x0000000000000293 R12: 0x00007f3e7c523f0a R13: 0x00007f3e7c523fa8
Dec 26 10:10:50 espresso bundle[481559]:  R14: 0x0000000000000033 R15: 0x00007f3e7a058458 EFL: 0x0000000000010202

My theory for why this happens is the following:

  • I saw that the machine stalled for some time directly before that event. So there probably was a jump in time. This also explains the waiting 16.0s after suspend message from unicorn.
  • Analysing discourse logs, I found logs from discourse’s message bus that Global messages timed out. So this code here was triggered: message_bus/message_bus.rb at f85036bc437985b34c54d894cfde7fa008130733 · discourse/message_bus · GitHub. This lead to a fork of the Unicorn master process
  • That code does a direct fork, which probably won’t run unicorn’s before fork handlers to clean up the V8 isolates from mini_racer contexts
6 Likes

We are discussing this internally, but my feeling here is that it is time to retire this “suicide” code from message_bus. Long term it only causes issues.

3 Likes

This was fixed in

and merged into Discourse in

Thanks for the detailed report @bennofs!

2 Likes

This topic was automatically closed after 3 days. New replies are no longer allowed.