Update = ☠

Hi all

I keep up with updates pretty well. I saw one was due and ran it; the result was no UI except in admin.

I SSH'd in and ran server updates to be sure everything was current; no help.

Tried to roll back and got this

Entered Safe Mode and got booted, but got a partial UI

image

Can’t log in, says my username is available and that my email does not exist

I could use some help please and thank you


edit, rebooted server and got

not swayed I went to register, figured a back up was due now

no joy there either


edit, refreshing a different window I seemed to be able to register

then

image

still no joy


SSH and things still appear as they were


edit

DISCOURSE DOCTOR Sun Nov 12 01:54:06 UTC 2023
OS: Linux ip-10-0-159-37 6.2.0-1015-aws #15~22.04.1-Ubuntu SMP Fri Oct  6 21:37:24 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux


Found containers/app.yml

==================== YML SETTINGS ====================
DISCOURSE_HOSTNAME=forum.full30.com
SMTP_ADDRESS=smtp.sendgrid.net
DEVELOPER_EMAILS=REDACTED 
SMTP_PASSWORD=REDACTED 
SMTP_PORT=587
SMTP_USER_NAME=apikey
LETSENCRYPT_ACCOUNT_EMAIL=REDACTED  LETSENCRYPT_ACCOUNT_EMAIL

==================== DOCKER INFO ====================
DOCKER VERSION: Docker version 24.0.7, build afdd53b

DOCKER PROCESSES (docker ps -a)

CONTAINER ID   IMAGE                 COMMAND        CREATED        STATUS                      PORTS     NAMES
0be0150fecde   local_discourse/app   "/sbin/boot"   5 months ago   Exited (5) 11 minutes ago             app

==================== SERIOUS PROBLEM!!!! ====================
app not running!
Attempting to rebuild
==================== REBUILD LOG ====================
x86_64 arch detected.
Ensuring launcher is up to date
Fetching origin
Launcher is up-to-date
Stopping old container
+ /usr/bin/docker stop -t 600 app
app
2.0.20231023-1945: Pulling from discourse/base
Digest: sha256:2b0eb484d20888cc2daadb690dcfa73522105650c1420212e99345a36a424d77
Status: Image is up to date for discourse/base:2.0.20231023-1945
docker.io/discourse/base:2.0.20231023-1945
/usr/local/lib/ruby/gems/3.2.0/gems/pups-1.2.1/lib/pups.rb
/usr/local/bin/pups --stdin
I, [2023-11-12T01:54:13.829288 #1]  INFO -- : Reading from stdin
I, [2023-11-12T01:54:13.834181 #1]  INFO -- : > locale-gen $LANG && update-locale
I, [2023-11-12T01:54:13.862453 #1]  INFO -- : Generating locales (this might take a while)...
Generation complete.

I, [2023-11-12T01:54:13.862638 #1]  INFO -- : > mkdir -p /shared/postgres_run
I, [2023-11-12T01:54:13.865023 #1]  INFO -- : 
I, [2023-11-12T01:54:13.865390 #1]  INFO -- : > chown postgres:postgres /shared/postgres_run
I, [2023-11-12T01:54:13.867489 #1]  INFO -- : 
I, [2023-11-12T01:54:13.867791 #1]  INFO -- : > chmod 775 /shared/postgres_run
I, [2023-11-12T01:54:13.869643 #1]  INFO -- : 
I, [2023-11-12T01:54:13.869925 #1]  INFO -- : > rm -fr /var/run/postgresql
I, [2023-11-12T01:54:13.871930 #1]  INFO -- : 
I, [2023-11-12T01:54:13.872203 #1]  INFO -- : > ln -s /shared/postgres_run /var/run/postgresql
I, [2023-11-12T01:54:13.874058 #1]  INFO -- : 
/tmp/discourse-debug.txt

This seems significant:

Warning: Could not create server TCP listening socket *:6379: bind: Address already in use

I think Redis is expected to run on port 6379.

Could failing to start the Redis server cause issues with migrations?

Error: relation "summary_sections" already exists

For some reason the migration to create the summary_sections table is getting run again. Are there any other duplicate table errors in your logs?

I guess the other question is, do you have a recent backup file?


lol, yep, I’d pretty much guess the same. The answer, sadly: looking on my server, I see Nov 5th as the most recent backup. As I cannot log into admin, I can’t see whether Discourse has created a newer one.

But… this is this morning,

forum

If I can not log into the admin dash to do a back-up then I simply don’t know what to do?

Thanks for the reply, I hope I can get going again quickly even if on a week old back up.

Robert


I’m so unfamiliar with logs. I’ve been doing this for five years with so few and such easily repairable issues that there was no reason to be, and I don’t have a tech team to ask. Is it something you can walk me through getting and posting, and will it be redacted like the Doctor file if I do upload it?

I also experienced a problem with this most recent update on a new site. I may have accidentally interrupted the install, and then the site reported both that it had been updated and that it urgently needed to be updated, but it could not be updated because it had just completed the most recent update.

Rebuilding the app from the console solved this; I did not have to recover from backup.

It would be good to know how to relaunch the site from a backup too; I don’t know how to do that.
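
For anyone else wondering the same thing, here is a rough sketch of a command-line restore. It assumes a standard standalone install, where backups normally live under /var/discourse/shared/standalone/backups/default. The latest_backup helper is a made-up name for illustration, and the restore commands in the comments are the ones from the official command-line restore guide as I remember them, so please check that guide before running anything.

```shell
# latest_backup DIR: print the newest *.tar.gz in DIR (hypothetical helper).
latest_backup() {
  # ls -t sorts by modification time, newest first.
  ls -t "$1"/*.tar.gz 2>/dev/null | head -n 1
}

# Typical use on the host (path assumes a standard standalone install):
#   latest_backup /var/discourse/shared/standalone/backups/default
#
# Restore steps, typed by hand on an instance whose container starts
# (for example a fresh install), with the backup file in that directory:
#   cd /var/discourse
#   ./launcher enter app
#   discourse enable_restore
#   discourse restore <backup-file-name.tar.gz>
#   exit
#   ./launcher rebuild app
```

The helper only tells you which file is newest on disk, which is handy when the admin dashboard is unreachable.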


just to try again I did

cd /var/discourse
git pull
./launcher rebuild app

and again got

errors/warnings as follow

Doctor did not help again, I kept its report but nothing new in it


Doing another rebuild now, I got similar errors:

137:M 12 Nov 2023 13:09:14.143 # Warning: Could not create server TCP listening socket *:6379: bind: Address already in use
137:M 12 Nov 2023 13:09:14.143 # Failed listening on port 6379 (TCP), aborting.

However, it is still going. I remember seeing these before, and the install still completed.

Are you logged in to the server console with ssh as root?

A couple of times the install took a while to complete, seemingly over 10 minutes. I had thought it was crashing at first, but after waiting long enough it completed.

There were a lot of errors about incorrect/unmet peer dependencies.

Then it stops after the line “background saving terminated with success,” which sounds like a conclusion but apparently is not; this is where it can take ten minutes to move on to the next step.


To view the log details, you can do something like this:

Log in as root (or run sudo su),
then execute these commands:

cd /var/discourse
./launcher enter app
tail -f log/production.log

After running the last command, you need to make an HTTP request to your Discourse site (for example, by loading it in a browser). A new error message will then be displayed in your shell. You can stop the tail command with Ctrl+C and view or copy the message.

To exit container shell, you can use exit command.


yes

Error response from daemon: Container 0be0150fecde6af5e98c0f12b97d24ccc1333fee2e96f02174ac63b79df8efbc is not running
tail: cannot open ‘log/production.log’ for reading: No such file or directory
tail: no files remaining

I tried the HTTP request as curl “https://forum.full30.com"

and got

curl: (3) URL using bad/illegal format or missing URL

so… may be I simply did not understand what you meant by http request :man_shrugging:

With that being your most recent backup, and me not being an expert in this kind of thing, I’m not sure I should be giving advice here.

As far as I can tell, the issue with Warning: Could not create server TCP listening socket *:6379 is probably unrelated to the errors you are getting for the migrations. For example:

INFO -- cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate'

ERROR: database "discourse" already exists

Error: role "discourse" already exists

Error: relation "summary_sections" already exists

Those errors seem to suggest that your database is somehow corrupted. The bundle exec rake db:migrate command (that’s in the INFO section of the snippet of your logs that I posted above) should trigger Discourse to check the database’s schema_migrations table to see which migrations have previously been run. That prevents the same migration from being run multiple times. So my guess is that either your database’s schema_migrations table is corrupt, or duplicate entries have somehow made their way into the db/migrate folder, or files in the db/migrate folder have somehow been renamed. I’m not sure what could trigger any of these things to happen.

I’d be tempted to enter the app with ./launcher enter app and run the migration manually from there to see if that makes any difference. Hold off on doing that though. Hopefully someone with more knowledge of the launcher scripts will see this post and correct anything I’ve written that’s wrong.
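
To make the schema_migrations idea concrete, here is a toy sketch in plain shell. It is not Discourse's or Rails' actual code: pending_migrations is a hypothetical helper, a text file of version numbers stands in for the schema_migrations table, and a directory of files named like 20230101000000_create_things.rb stands in for the migrations folder. The only point is that a migration gets selected to run when its version is missing from the recorded list.

```shell
# pending_migrations APPLIED_FILE MIGRATE_DIR
# Prints the versions that have a migration file but are not recorded as applied.
# Illustration only; both arguments are stand-ins, not real Discourse paths.
pending_migrations() {
  applied_file=$1
  migrate_dir=$2
  tmp_files=$(mktemp)
  tmp_applied=$(mktemp)
  # The version is the run of digits before the first underscore in the file name.
  ls "$migrate_dir" | sed -n 's/^\([0-9][0-9]*\)_.*\.rb$/\1/p' | sort -u > "$tmp_files"
  sort -u "$applied_file" > "$tmp_applied"
  # comm -23 keeps lines that appear only in the first sorted input.
  comm -23 "$tmp_files" "$tmp_applied"
  rm -f "$tmp_files" "$tmp_applied"
}
```

If the recorded list were lost or incomplete, files that had already been applied would show up as pending again, and a migration that creates a table would then hit an "already exists" error like the ones above. That is my guess at the mechanism, not a confirmed diagnosis.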


Curious on your opinion, or anyone’s

We’re planning another server move, would this be a good time as these problems seem to be centered around

PG::DuplicateTable: ERROR: relation "summary_sections" already exists

etc already exists items?

Might a move help?

I suspect that would solve the issue. It couldn’t hurt to try creating a new installation and importing your most recent backup file into it.


The site already seems to think it is a new installation, note some of my screencaps

That occurred simply when I used safe mode. It’s a pretty substantial crash just from an update, and all the more odd if it’s only my instance of Discourse that toppled.

I’ve done git pulls etc., which I’d say is no different from building new; if I’m wrong, please tell me how.

tried

./launcher start-cmd app

./launcher cleanup

deleted 18 MB of images, did a new pull/rebuild/fail/doctor, and still no joy

all my info is still present when I access my APP so there’s that at least

I’m 95% sure that the duplicate table errors are actually safe to ignore. (I remember seeing them every upgrade on our forum)


I ignored them but am still unable to get back online,

what do you think of those errors?

Could you post a full log from a rebuild?
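
In case it helps, one way to capture it is roughly like this. The log path and the rebuild_errors helper name are placeholders I picked; the launcher command is the same rebuild you have already been running.

```shell
# Capture everything the rebuild prints (run on the host, as root):
#   cd /var/discourse
#   ./launcher rebuild app 2>&1 | tee /tmp/rebuild.log

# rebuild_errors LOGFILE: list lines that mention an error or failure,
# each prefixed with its line number (hypothetical helper).
rebuild_errors() {
  grep -niE 'error|fail' "$1"
}
```

Running rebuild_errors /tmp/rebuild.log afterwards gives a short list of places to look. The earliest hits are often more telling than the final FAILED TO BOOTSTRAP line.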


I had the same problem yesterday. I thought it was a server problem because previous updates had shown similar behaviour, so as a temporary fix I decided to run

./launcher rebuild app

I then downloaded the latest backup and deployed a new server. Restoring the backup was easy, and mysteriously everything was already updated.

FAILED
--------------------
Pups::ExecError: cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate' failed with return #<Process::Status: pid 645 exit 1>
Location of failure: /usr/local/lib/ruby/gems/3.2.0/gems/pups-1.2.1/lib/pups/exec_command.rb:132:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"db_migrate", "cmd"=>["su discourse -c 'bundle exec rake db:migrate'"]}
bootstrap failed with exit code 1
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.
./discourse-doctor may help diagnose the problem.
adb2c505fd2f1289f44586496fea24ff31264f73c26eb524baf16602a189f
root@ip-10-0-159-37:/var/discourse#

meaning do what exactly?

Yes, what should I look to redact first?
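
As a starting point, the Doctor report above blanks out the email addresses and the SMTP password, so those seem like reasonable things to strip from a rebuild log too. Here is a rough sketch, assuming the log was saved to a file. redact_log is a name I made up, and the patterns are guesses that will not catch every secret, so read the result before posting.

```shell
# redact_log: read a log on stdin, write a redacted copy to stdout.
# Hypothetical helper; the patterns are a best-effort starting point only.
redact_log() {
  # 1st expression: anything that looks like an email address.
  # 2nd expression: the value after a key containing PASSWORD, SECRET,
  #                 API_KEY or TOKEN, written as KEY=value or KEY: value.
  sed -E \
    -e 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/REDACTED_EMAIL/g' \
    -e 's/((PASSWORD|SECRET|API_KEY|TOKEN)[A-Za-z_]*[=:][[:space:]]*)[^[:space:]]+/\1REDACTED/g'
}

# Example use (file names are placeholders):
#   redact_log < /tmp/rebuild.log > /tmp/rebuild-redacted.log
```

Hostnames and IP addresses are left alone here; whether to hide those is a judgment call for the site owner.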