This is on a test machine. I was previously running Discourse there, but I messed up the installation and wasn’t able to update to the latest version, which I assumed was my own mistake. After removing the whole discourse directory and cleaning up Docker, I tried a completely fresh install before importing a backup from the live db.
Weirdly, I’m still seeing the same issues and haven’t been able to resolve them.
Here’s the failure output. I already tried discourse-doctor, but it didn’t bring up anything helpful.
...
I, [2022-06-04T18:42:29.087446 #1] INFO -- : Terminating async processes
I, [2022-06-04T18:42:29.087672 #1] INFO -- : Sending INT to HOME=/var/lib/postgresql USER=postgres exec chpst -u postgres:postgres:ssl-cert -U postgres:postgres:ssl-cert /usr/lib/postgresql/13/bin/postmaster -D /etc/postgresql/13/main pid: 42
I, [2022-06-04T18:42:29.087881 #1] INFO -- : Sending TERM to exec chpst -u redis -U redis /usr/bin/redis-server /etc/redis/redis.conf pid: 103
2022-06-04 18:42:29.088 UTC [42] LOG: received fast shutdown request
103:signal-handler (1654368149) Received SIGTERM scheduling shutdown...
2022-06-04 18:42:29.118 UTC [42] LOG: aborting any active transactions
2022-06-04 18:42:29.123 UTC [42] LOG: background worker "logical replication launcher" (PID 51) exited with exit code 1
2022-06-04 18:42:29.123 UTC [46] LOG: shutting down
103:M 04 Jun 2022 18:42:29.154 # User requested shutdown...
103:M 04 Jun 2022 18:42:29.154 * Saving the final RDB snapshot before exiting.
103:M 04 Jun 2022 18:42:29.159 * DB saved on disk
103:M 04 Jun 2022 18:42:29.159 # Redis is now ready to exit, bye bye...
2022-06-04 18:42:29.201 UTC [42] LOG: database system is shut down
FAILED
--------------------
Pups::ExecError: cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate' failed with return #<Process::Status: pid 1102 exit 1>
Location of failure: /usr/local/lib/ruby/gems/2.7.0/gems/pups-1.1.1/lib/pups/exec_command.rb:117:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"db_migrate", "cmd"=>["su discourse -c 'bundle exec rake db:migrate'"]}
bootstrap failed with exit code 1
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.
./discourse-doctor may help diagnose the problem.
69cb25658efb6f16e4479bb98a2d0278d72e56028865730841ac1efacc5b8d9d
==================== END REBUILD LOG ====================
The server itself should be fine - plenty of disk space, enough resources otherwise. Any idea?
103:M 04 Jun 2022 18:40:07.369 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
Hmm… 16 GB of RAM is quite a lot, so you might think you don’t need swap. But adding some probably wouldn’t do any harm. Without seeing your log I can’t say the problem is memory scarcity. But if it is, setting the overcommit mode might help, whether or not you have swap.
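If you want to see which overcommit mode the machine is currently in before changing anything, here is a minimal Python sketch. It only reads the standard Linux file /proc/sys/vm/overcommit_memory and changes nothing. The function name and the wording of the descriptions are my own, so treat it as an illustration rather than an official tool.

```python
from pathlib import Path

# Meanings of vm.overcommit_memory as documented for the Linux kernel.
# The descriptive strings are my own wording, not kernel output.
MODES = {
    "0": "heuristic overcommit (the value the Redis warning complains about)",
    "1": "always overcommit (the value the Redis warning asks for)",
    "2": "strict accounting, no overcommit",
}


def describe_overcommit(raw: str) -> str:
    """Turn the raw file content (e.g. '0\\n') into a readable description."""
    value = raw.strip()
    return MODES.get(value, f"unrecognised value {value!r}")


if __name__ == "__main__":
    path = Path("/proc/sys/vm/overcommit_memory")
    if path.exists():
        print(describe_overcommit(path.read_text()))
    else:
        print("no /proc/sys/vm/overcommit_memory on this system")
```

If it reports heuristic overcommit, the sysctl command quoted in the Redis warning is what switches it to 1.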
Good find, Ed, thanks. It looks like s3_bucket was renamed to s3_upload_bucket at some point, and I still had the old name in containers/app.yml, which seems to have caused the issue. At least the build went through fine once I changed DISCOURSE_S3_BUCKET there to DISCOURSE_S3_UPLOAD_BUCKET.
I wish such changes also came with a check in the build process so that people don’t run into this. Luckily, we always test our updates on a test machine.
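In case it helps anyone else, the kind of check I have in mind could look roughly like the Python sketch below, run against containers/app.yml before a rebuild. It is not part of Discourse or the launcher. The function name is made up, and the rename table contains only the one rename from this thread, so any other renamed settings would have to be added by hand.

```python
import re
from pathlib import Path

# Assumption: only the rename reported in this thread is listed here.
RENAMED = {"DISCOURSE_S3_BUCKET": "DISCOURSE_S3_UPLOAD_BUCKET"}


def find_renamed_keys(yml_text):
    """Return (line number, old key, new key) for each outdated key found.

    This is a plain text scan, not a YAML parser: it looks at lines of the
    form KEY: value and skips lines that are commented out.
    """
    hits = []
    for lineno, line in enumerate(yml_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue
        match = re.match(r"([A-Z0-9_]+)\s*:", stripped)
        if match and match.group(1) in RENAMED:
            hits.append((lineno, match.group(1), RENAMED[match.group(1)]))
    return hits


if __name__ == "__main__":
    config = Path("containers/app.yml")
    if not config.exists():
        print(f"{config} not found, run this from the discourse directory")
    else:
        for lineno, old, new in find_renamed_keys(config.read_text()):
            print(f"{config}:{lineno}: {old} should probably be {new}")
```

It matches whole key names only, so an app.yml that already uses DISCOURSE_S3_UPLOAD_BUCKET is not flagged.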