Problems rebuilding app

I’ve got a problem rebuilding a test domain (self-hosted; it has been running for ~7 years with infrequent updates, but it was on the latest release up until this week).

I had problems with a non-supported plugin which I’ve now removed and I think that may have scrogged something in the database or configuration. The error I get at completion is:

2024-04-25 01:07:42.098 UTC [34] LOG:  received fast shutdown request
I, [2024-04-25T01:07:42.099067 #1]  INFO -- : Sending TERM to exec chpst -u redis -U redis /usr/bin/redis-server /etc/redis/redis.conf pid: 96
96:signal-handler (1714007262) Received SIGTERM scheduling shutdown...
2024-04-25 01:07:42.105 UTC [34] LOG:  aborting any active transactions
2024-04-25 01:07:42.121 UTC [34] LOG:  background worker "logical replication launcher" (PID 49) exited with exit code 1
96:M 25 Apr 2024 01:07:42.121 # User requested shutdown...
96:M 25 Apr 2024 01:07:42.122 * Saving the final RDB snapshot before exiting.
2024-04-25 01:07:42.133 UTC [44] LOG:  shutting down
96:M 25 Apr 2024 01:07:42.177 * DB saved on disk
96:M 25 Apr 2024 01:07:42.178 # Redis is now ready to exit, bye bye...
2024-04-25 01:07:42.195 UTC [34] LOG:  database system is shut down
Error response from daemon: invalid JSON: got EOF while reading request body

FAILED TO COMMIT cbaab1290466a63d0a77f5f1e0894b0da632204e63472416674b7fab9ae53b41

I’ve scanned the rest of the log and the only additional errors I see are accounted for as “not important” on other posts here.

Any suggestions as to what to do next?

I think I might be left at this point to do a fresh install and then attempt a restore from backup but would appreciate any hints as to what might actually be going on…

Thanks!

There’s no way to tell without the full log.

My best guess is that you’re out of RAM. I would try adding swap.

How much ram and swap do you have?
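For anyone answering that question on a Linux host, here is a minimal sketch that reads the figures from `/proc/meminfo`. The helper names (`parse_meminfo`, `summarize`, `read_meminfo`) are made up for illustration, and it assumes the usual `Name:   value kB` layout of that file.

```python
def parse_meminfo(text):
    """Return {field: value} for lines shaped like 'MemTotal:  2030000 kB'."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip anything that is not 'Name: <integer> [unit]'.
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def summarize(values):
    """Format total/available RAM and total/free swap in MiB (values are in kB)."""
    def mib(key):
        return values.get(key, 0) // 1024

    return (
        f"RAM {mib('MemTotal')} MiB total, {mib('MemAvailable')} MiB available; "
        f"swap {mib('SwapTotal')} MiB total, {mib('SwapFree')} MiB free"
    )


def read_meminfo(path="/proc/meminfo"):
    """Read and summarize the live figures (Linux only)."""
    with open(path) as handle:
        return summarize(parse_meminfo(handle.read()))
```

Calling `print(read_meminfo())` in a second ssh session while the rebuild runs gives a snapshot at that moment, which is more useful than checking before or after.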

2G. Based on top, it looked like I was ok but it’s easy to add more and try again.

If there are still problems, I’ll upload the log.

I won’t get to it until tomorrow…

You’d need to be watching top while the rebuild was running.

2 GB RAM and 2 GB swap? You can check the log for error 137 (out of memory).
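A rough sketch of that log check, in case it helps. The patterns are assumptions about how an out-of-memory kill might be worded (an exit code or status of 137, or kernel-style "out of memory" / "oom-kill" text), not a confirmed list of what the rebuild prints, and `find_oom_lines` is a made-up helper name.

```python
import re

# Assumed wordings only; adjust to whatever the log actually contains.
OOM_PATTERN = re.compile(
    r"exit(?:ed)?\s+(?:with\s+)?(?:code|status)\s+137\b"
    r"|out of memory"
    r"|oom[-_ ]?kill",
    re.IGNORECASE,
)


def find_oom_lines(lines):
    """Return (line_number, text) for every line that looks like an OOM kill."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if OOM_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

Usage would be something like `find_oom_lines(open("rebuild.out.240425.txt"))`. An empty result only means none of these wordings appeared, not that memory was definitely fine.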

I was - I’d forgotten we’d been fiddling with the WordPress instance that’s also running on that droplet, so we’re definitely using some swap space. Probably need to grow that VPS anyway…

Yes.

I grep’d the log and didn’t see that error.

I had the bright idea of doing a reboot of the VPS before trying again. Presuming that fails, I’ll grow the droplet and try again.

Still failed in the same way with 4G memory/swap, so here’s the log from the build.

rebuild.out.240425.txt (202.4 KB)

I hope you can see something and thanks for your help thus far…

SIGTERM looks like you did a control-c.

Did you get bored of waiting and kill the job?

No - I presume there’s something in one of the scripts in the build process. It’s the same way I’ve built it for years (ssh into a couple of sessions, one watching the other…). Every build since it started failing has a SIGTERM in (I presume) that same place in the script, which seems to close the app that something is reading from…

No. I think it all went OK. Maybe the error is the “failed to commit” at the very end, but I don’t have an explanation for that.

Is there something in the launcher script that does something back to GitHub? That would explain the error if there’s some sort of metric they track by a commit. If that’s in a shell pipeline (e.g. curl or such), it would also explain the closed pipe error.

Rather than my attempting to debug what’s going on with launcher, I think the easiest thing for me would be to try to do a new install and restore.

Happy to accept suggestions if you have any ideas…

Is your OS out of date?

There are a bunch of strange errors about not being able to write some git file.

A new VM is probably a good idea. Restoring a backup is easiest, but you can also follow “Move a Discourse site to another VPS with rsync”.

Probably overkill, but I wound up spinning up a new droplet, doing a fresh install and then restoring an old backup from there.

Working now…

Someone else had a similar error recently that I think was due to an expired key chain for the HTTPS certs. I suspect this was your issue.

The other person did an OS upgrade, which solved the problem, but I prefer a fresh start.