Failed to bootstrap due to out of memory killer

devnull · April 26, 2021, 1:16pm

Hey guys,

I want to update my Discourse with ./launcher rebuild app. It has been working fine for about a year. I update every 2-4 weeks when necessary.

I’m on Ubuntu 18.04.5 LTS

***@***:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.5 LTS
Release:        18.04
Codename:       bionic

Today it stops with this error:

FAILED
--------------------
Pups::ExecError: cd /var/www/discourse && su discourse -c 'bundle exec rake themes:update assets:precompile' failed with return #<Process::Status: pid 726 exit 1>
Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"assets_precompile", "cmd"=>["su discourse -c 'bundle exec rake themes:update assets:precompile'"]}
db6d1b1dd685de69942a3df05c9cbd622860faaa286b042635878519d5b69b7b
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.
./discourse-doctor may help diagnose the problem.

The first error above that message was:

<--- JS stacktrace --->

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: 0xa04200 node::Abort() [node]
 2: 0x94e4e9 node::FatalError(char const*, char const*) [node]
 3: 0xb7978e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 4: 0xb79b07 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 5: 0xd34395  [node]
 6: 0xd46c01 v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
 7: 0xd0c472 v8::internal::Factory::AllocateRaw(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [node]
 8: 0xd086c2 v8::internal::FactoryBase<v8::internal::Factory>::AllocateRawArray(int, v8::internal::AllocationType) [node]
 9: 0xd08774 v8::internal::FactoryBase<v8::internal::Factory>::NewFixedArrayWithFiller(v8::internal::Handle<v8::internal::Map>, int, v8::internal::Handle<v8::internal::Oddball>, v8::internal::AllocationType) [node]
10: 0xf4ef4b v8::internal::OrderedHashTable<v8::internal::OrderedHashSet, 1>::Allocate(v8::internal::Isolate*, int, v8::internal::AllocationType) [node]
11: 0xf4f0df v8::internal::OrderedHashTable<v8::internal::OrderedHashSet, 1>::Rehash(v8::internal::Isolate*, v8::internal::Handle<v8::internal::OrderedHashSet>, int) [node]
12: 0x103eb98 v8::internal::Runtime_SetGrow(int, unsigned long*, v8::internal::Isolate*) [node]
13: 0x1401219  [node]
Aborted (core dumped)
rake aborted!
Errno::ENOENT: No such file or directory @ rb_file_s_size - /var/www/discourse/public/assets/discourse/tests/test_helper-a9cbc4e1abdd1f2e9afced86d051cbd63c2e224dafe782533646a01592cc1f42.js
/var/www/discourse/lib/tasks/assets.rake:290:in `size'
/var/www/discourse/lib/tasks/assets.rake:290:in `block (4 levels) in <main>'
/var/www/discourse/lib/tasks/assets.rake:181:in `block in concurrent?'
/var/www/discourse/lib/tasks/assets.rake:281:in `block (3 levels) in <main>'
/var/www/discourse/lib/tasks/assets.rake:272:in `each'
/var/www/discourse/lib/tasks/assets.rake:272:in `block (2 levels) in <main>'
/var/www/discourse/lib/tasks/assets.rake:181:in `concurrent?'
/var/www/discourse/lib/tasks/assets.rake:269:in `block in <main>'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rake-13.0.3/exe/rake:27:in `<top (required)>'
/usr/local/bin/bundle:23:in `load'
/usr/local/bin/bundle:23:in `<main>'
Tasks: TOP => assets:precompile
(See full trace by running task with --trace)
I, [2021-04-26T13:10:13.996101 #1]  INFO -- : Downloading MaxMindDB...
Compressing Javascript and Generating Source Maps

I, [2021-04-26T13:10:14.018697 #1]  INFO -- : Terminating async processes
I, [2021-04-26T13:10:14.020721 #1]  INFO -- : Sending INT to HOME=/var/lib/postgresql USER=postgres exec chpst -u postgres:postgres:ssl-cert -U postgres:postgres:ssl-cert /usr/lib/postgresql/13/bin/postmaster -D /etc/postgresql/13/main pid: 55
I, [2021-04-26T13:10:14.022854 #1]  INFO -- : Sending TERM to exec chpst -u redis -U redis /usr/bin/redis-server /etc/redis/redis.conf pid: 172
172:signal-handler (1619442614) Received SIGTERM scheduling shutdown...
2021-04-26 13:10:14.023 UTC [55] LOG:  received fast shutdown request
2021-04-26 13:10:14.030 UTC [55] LOG:  aborting any active transactions
2021-04-26 13:10:14.043 UTC [55] LOG:  background worker "logical replication launcher" (PID 64) exited with exit code 1
2021-04-26 13:10:14.045 UTC [59] LOG:  shutting down
2021-04-26 13:10:14.073 UTC [55] LOG:  database system is shut down
172:M 26 Apr 2021 13:10:14.122 # User requested shutdown...
172:M 26 Apr 2021 13:10:14.123 * Saving the final RDB snapshot before exiting.
172:M 26 Apr 2021 13:10:14.270 * DB saved on disk
172:M 26 Apr 2021 13:10:14.271 # Redis is now ready to exit, bye bye...

Discourse is not running after that. It only comes back up if I reboot the server, but then I still can’t run rebuild app. Same error.

Could you figure that out with me?

Best regards

Falco · April 26, 2021, 1:57pm

Your server is running out of memory while bootstrapping. Can you share the output of free -m?

devnull · April 26, 2021, 2:00pm

**@**:~# free -m
              total        used        free      shared  buff/cache   available
Mem:            985         777          70          49         136          44
Swap:          2047         228        1819

Hm, I’m surprised that it runs out of memory even right after a reboot. I also haven’t installed anything new since the last update.

Andrey_Rechkov · April 26, 2021, 3:18pm

I have the same problem. Not enough memory for nodejs.
I used export NODE_OPTIONS="--max_old_space_size=4096 --some_other_option", but it didn’t help.

weallwegot · April 26, 2021, 3:38pm

Also had the same stacktrace when attempting to rebuild. Relatively new install I’m just using for development and testing so still should have good amount of space i think?

OS: Ubuntu 20.04.1 LTS
free -m output:

root@discourse-test-environment:/var/discourse# free -m
              total        used        free      shared  buff/cache   available
Mem:            981         136         581           0         263         698
Swap:          2047         113        1934

pfaffman · April 26, 2021, 5:22pm

That’s a bit under the 1GB minimum. You can try increasing swap a bit, but I’d recommend more RAM.

If you’re getting that error you need more ram (or maybe swap, but ram is better).

devnull · April 26, 2021, 5:57pm

Okay, but can you tell us the reason for this new behavior? It has been working for more than a year with this setup. What makes Discourse different now from how it was in the past?

weallwegot · April 26, 2021, 6:27pm

thanks for the recommendation. yea slightly strange i’ve usually gotten away with default 2GB swap file on a $5 Digital ocean droplet. will keep an eye out on if this becomes more common with latest updates or something.

anyways i went ahead and added a lot more swap (4 G) in a separate file

but the upgrade still failed. maybe more RAM is non-negotiable. just unexpected because the instance has 1 topic and 1 user at the moment. also curious if i need to do anything to make sure discourse knows to use the swap or if it is just accessible by default?

my new free -m output…

              total        used        free      shared  buff/cache   available
Mem:            981         138         576           0         266         703
Swap:          6143         109        6034
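
On whether Discourse has to be told about the swap: swap is handled by the Linux kernel, so once a swap file has been enabled with swapon it is normally available to every process without extra configuration. A minimal sketch for listing the active swap areas by parsing /proc/swaps is below. The sample text and the file names in it are made up for illustration and are not taken from this server.

```python
from pathlib import Path

# Hypothetical /proc/swaps content (sizes are in KiB); not from the server above.
SAMPLE = """\
Filename                                Type            Size            Used            Priority
/swapfile                               file            2097148         116736          -2
/swapfile2                              file            4194300         0               -3
"""


def parse_proc_swaps(text):
    """Return one dict per active swap area, with sizes converted to MiB."""
    entries = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        name, kind, size_kib, used_kib, priority = line.split()
        entries.append(
            {
                "name": name,
                "type": kind,
                "size_mib": int(size_kib) // 1024,
                "used_mib": int(used_kib) // 1024,
                "priority": int(priority),
            }
        )
    return entries


if __name__ == "__main__":
    proc = Path("/proc/swaps")
    # Fall back to the sample when not running on Linux.
    text = proc.read_text() if proc.exists() else SAMPLE
    for entry in parse_proc_swaps(text):
        print(entry)
```

If a newly added swap file does not show up in this list, it has probably not been activated yet.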

devnull · April 26, 2021, 6:28pm

Yep, I’m on a Digital Ocean Droplet with 1 vCPU and 1GB vRAM as well

devnull · April 26, 2021, 6:39pm

I have increased the swap to 3GB. It still fails with the same error.

supermathie · April 26, 2021, 6:49pm

I’m able to rebuild my test instance with only 2.5GiB of RAM+swap. It’s possible that your instance is requiring more than that, however.

Do you have any plugins installed? I suspect that one of them is causing massive memory use during compilation.

Can you rebuild without plugins and see if that fixes the problem?
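
For comparing a host against the 2.5GiB RAM+swap figure above, here is a rough sketch that parses free -m output and adds up the totals. The sample is the first free -m output posted in this topic. The function names and the 2560 MiB threshold constant are assumptions for illustration, and the sketch does not settle whether total or currently free memory is the right number to compare.

```python
# First `free -m` output posted in this topic (values in MiB).
SAMPLE = """
              total        used        free      shared  buff/cache   available
Mem:            985         777          70          49         136          44
Swap:          2047         228        1819
"""

# 2.5 GiB expressed in MiB; taken from the post above, not an official minimum.
REFERENCE_MIB = 2560


def parse_free_m(text):
    """Return {'Mem': {...}, 'Swap': {...}} keyed by the column headers."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    headers = lines[0].split()
    rows = {}
    for ln in lines[1:]:
        label, *values = ln.split()
        # The Swap row has only three columns; zip stops at the shorter side.
        rows[label.rstrip(":")] = dict(zip(headers, (int(v) for v in values)))
    return rows


def total_mib(rows):
    """Installed RAM plus configured swap."""
    return rows["Mem"]["total"] + rows["Swap"]["total"]


def headroom_mib(rows):
    """Memory the kernel reports as available plus unused swap."""
    return rows["Mem"]["available"] + rows["Swap"]["free"]


if __name__ == "__main__":
    rows = parse_free_m(SAMPLE)
    print("RAM+swap total:", total_mib(rows), "MiB")
    print("available RAM + free swap:", headroom_mib(rows), "MiB")
    print("total at or above reference:", total_mib(rows) >= REFERENCE_MIB)
```

The two sums can differ a lot on a busy host, which is why both are printed.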

weallwegot · April 26, 2021, 7:19pm

Thanks for chiming in-

Out of curiosity what is the break down of RAM vs swap? and are you only counting what is “free” space on both of those or the total swap file size + total RAM of instance?

Oh right of course - i forgot to mention i was hoping to install the Discourse OpenID Connect Authentication plugin

Currently also have the Data Explorer plugin already installed.

Tried again with only Data Explorer + Docker Manager and no luck, same stacktrace as shared before.
Tried again with no plugins (just Docker Manager) and the rebuild still didn’t work.

will keep looking around since outside of trying to add the OpenID Connect plugin i haven’t changed anything since original install

pfaffman · April 26, 2021, 7:20pm

I am having a problem that might be related over at Trouble with `tests/test_helper`? - #2 by Falco.

devnull · April 26, 2021, 7:21pm

I tried to rebuild app without any plugins. No change. Same error.

pfaffman · April 26, 2021, 6:43pm

I don’t understand this, but it looks like a bug. I’m trying to do a bootstrap on this site. No non-standard plugins. I just moved assets from one bucket to another and it’s all working. I was doing one more rebuild to add DISCOURSE_S3_UPLOAD_BUCKET to the ENV so that it won’t show up in the UX. When this failed the first time, I commented out that line and tried again with the same config that worked 3 days ago.


Done compressing embed-application-9cef8308c816fc1d83137e63d6c556c6cc2b68fe2b6e5ce16cca6766ba2c0ae4.js : 0.17 secs

844614.350963717 Compressing: discourse/tests/test_helper-8590b31b8e73c4172aeea4a4a6bd1930ccbce2547a20d831a30d457ba092a631.js
terser '/var/www/discourse/public/assets/discourse/tests/_test_helper-8590b31b8e73c4172aeea4a4a6bd1930ccbce2547a20d831a30d457ba092a631.js' -m -c -o '/var/www/discourse/public/assets/discourse/tests/test_helper-8590b31b8e73c4172aeea4a4a6bd1930ccbce2547a20d831a30d457ba092a631.js' --source-map "base='/var/www/discourse/public/assets/discourse/tests',root='/assets/discourse/tests',url='https://CORRECT_CDN_ADDRESS.b-cdn.net/assets/discourse/tests/test_helper-8590b31b8e73c4172aeea4a4a6bd1930ccbce2547a20d831a30d457ba092a631.js.map'"
Killed
rake aborted!
Errno::ENOENT: No such file or directory @ rb_file_s_size - /var/www/discourse/public/assets/discourse/tests/test_helper-8590b31b8e73c4172aeea4a4a6bd1930ccbce2547a20d831a30d457ba092a631.js
/var/www/discourse/lib/tasks/assets.rake:290:in `size'
/var/www/discourse/lib/tasks/assets.rake:290:in `block (4 levels) in <main>'
/var/www/discourse/lib/tasks/assets.rake:181:in `block in concurrent?'
/var/www/discourse/lib/tasks/assets.rake:281:in `block (3 levels) in <main>'
/var/www/discourse/lib/tasks/assets.rake:272:in `each'
/var/www/discourse/lib/tasks/assets.rake:272:in `block (2 levels) in <main>'
/var/www/discourse/lib/tasks/assets.rake:181:in `concurrent?'
/var/www/discourse/lib/tasks/assets.rake:269:in `block in <main>'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rake-13.0.3/exe/rake:27:in `<top (required)>'
/usr/local/bin/bundle:23:in `load'
/usr/local/bin/bundle:23:in `<main>'
Tasks: TOP => assets:precompile
(See full trace by running task with --trace)
I, [2021-04-26T18:38:36.072881 #1]  INFO -- : Updating Discourse Loading Slider...
Downloading MaxMindDB...
Compressing Javascript and Generating Source Maps

I wondered if there was some problem with the CDN url, but all the lines above included it and worked fine.

Does this mean there’s faulty CSS in their theme? If so, ouch. There’s only some CSS in their master theme. Also these components: https://github.com/discourse/DiscoTOC.git and https://github.com/davidtaylorhq/discourse-loading-slider.git

Falco · April 26, 2021, 7:02pm

Is this a minimum size droplet? Looks like that file is kinda hard for terser to compress as it causes a lot of memory pressure.

pfaffman · April 26, 2021, 7:05pm

Oh. It is surprisingly small. It’s a site that I think you guys set up some years ago (when you’d still run a site not on your infrastructure).

root@community:/var/discourse# free -h
              total        used        free      shared  buff/cache   available
Mem:           1.9G        1.2G        101M        259M        655M        354M
Swap:          2.0G        1.2G        773M

Ah. OK. It’s a 2GB DO droplet, and I have access to their control panel. I’ll tell them that we need to upgrade to 4GB and move them to an AMD.

EDIT: But if this is just to compress that one file, shouldn’t a 2GB droplet be enough?

Falco · April 26, 2021, 7:16pm

This test-helper file is hard on the compression.

UglifyJS uses 1.5GB RAM to compress it.
Terser uses a little over 1GB. Takes 40s. For comparison the same server takes 8s on Ember+jQuery

@eviltrout should we even have this file in production?

Ohh looks like it comes from this change from @Osama

https://github.com/discourse/discourse/pull/12815

-rw-r--r-- 1 discourse discourse  14M Apr 26 19:13 _test_helper-f4c4b5bf0657eab910d85b9a65b4bddbbbe2ce2ba603b17fe11b3d633d324e34.js
-rw-r--r-- 1 discourse discourse 6.6M Apr 26 19:14 test_helper-f4c4b5bf0657eab910d85b9a65b4bddbbbe2ce2ba603b17fe11b3d633d324e34.js
-rw-r--r-- 1 discourse discourse 1.1M Apr 26 19:14 test_helper-f4c4b5bf0657eab910d85b9a65b4bddbbbe2ce2ba603b17fe11b3d633d324e34.js.br
-rw-r--r-- 1 discourse discourse 1.5M Apr 26 19:14 test_helper-f4c4b5bf0657eab910d85b9a65b4bddbbbe2ce2ba603b17fe11b3d633d324e34.js.gz
-rw-r--r-- 1 discourse discourse 5.7M Apr 26 19:14 test_helper-f4c4b5bf0657eab910d85b9a65b4bddbbbe2ce2ba603b17fe11b3d633d324e34.js.map
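
Memory figures like the ones quoted above for UglifyJS and Terser can be approximated by running a command as a child process and reading its peak resident set size afterwards. The sketch below is one way to do that on a Unix host and is not how the numbers in this post were obtained. The helper name is hypothetical, and the demo measures a small Python child rather than terser.

```python
import resource
import subprocess
import sys


def peak_child_rss_mib(cmd):
    """Run cmd to completion and return the peak RSS of child processes in MiB.

    RUSAGE_CHILDREN reports the largest value across all children that have
    been waited for so far, so call this once per fresh interpreter for a
    clean reading.
    """
    subprocess.run(cmd, check=True)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


if __name__ == "__main__":
    # Stand-in workload: a child that allocates and fills about 64 MiB.
    demo = [sys.executable, "-c", "buf = b'x' * (64 * 1024 * 1024)"]
    print(f"peak child RSS: {peak_child_rss_mib(demo):.0f} MiB")
```

To measure the real thing, the demo command would be replaced by the terser invocation printed in the build log, run inside the container where the asset files exist.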

weallwegot · April 26, 2021, 7:48pm

Just to add another data point - still experiencing the rebuild failures after removing the two theme components. So just using the default Light theme.

Falco:

-rw-r--r-- 1 discourse discourse  14M Apr 26 19:13 _test_helper-f4c4b5bf0657eab910d85b9a65b4bddbbbe2ce2ba603b17fe11b3d633d324e34.js
-rw-r--r-- 1 discourse discourse 6.6M Apr 26 19:14 test_helper-f4c4b5bf0657eab910d85b9a65b4bddbbbe2ce2ba603b17fe11b3d633d324e34.js
-rw-r--r-- 1 discourse discourse 1.1M Apr 26 19:14 test_helper-f4c4b5bf0657eab910d85b9a65b4bddbbbe2ce2ba603b17fe11b3d633d324e34.js.br
-rw-r--r-- 1 discourse discourse 1.5M Apr 26 19:14 test_helper-f4c4b5bf0657eab910d85b9a65b4bddbbbe2ce2ba603b17fe11b3d633d324e34.js.gz
-rw-r--r-- 1 discourse discourse 5.7M Apr 26 19:14 test_helper-f4c4b5bf0657eab910d85b9a65b4bddbbbe2ce2ba603b17fe11b3d633d324e34.js.map

also where is this output from - looking to verify/debug a bit on my end as well. is this like a verbose option on the ./launcher rebuild app?

Osama · April 26, 2021, 7:49pm

Fixing this properly is going to take a little while, so I’m going to revert that change for now.
