Site crashed - what are my options

I have an install of 2.5.0.beta4.
It was being upgraded, but it broke in the process and my site no longer loads.
I have all the files, the db, etc. It was installed via Docker on Ubuntu 16.
What are my options to fix this?
I have no idea when it comes to the shell, so I will have to get someone to do it for me, but I need to know what I should be looking to do so we are not shooting in the dark.
Any help appreciated.

Can you provide some additional details? Is the container running (docker ps)? Are there any error messages or logs that might point the way? Are you self-hosting or on a third-party host?
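
For whoever ends up at the shell, those checks might look roughly like the sketch below. It assumes the container is named app, as in a standard install; check_container and the DOCKER override are hypothetical helpers for this example, not part of Discourse.

```shell
#!/bin/sh
# Assumption: the container is named "app" (adjust if yours differs).
# DOCKER can be overridden (e.g. DOCKER=echo) to preview the commands
# without actually running them.
DOCKER="${DOCKER:-docker}"

check_container() {
  name="${1:-app}"
  # List every container, including stopped ones, to see whether it exists at all.
  "$DOCKER" ps -a
  # Show the last 100 log lines of the named container.
  "$DOCKER" logs --tail 100 "$name"
}

# Usage: check_container app
```

If the container does not show up in the first listing at all, the second command will simply report that there is no such container, which is useful information in itself.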

Hi Mike, thank you for replying. Here’s what I get.

When I run ./discourse-doctor I get this error:

+ /usr/bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e 
RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e 
RUBY_GC_HEAP_INIT_SLOTS=400000 -e 
RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e 
DISCOURSE_DB_SOCKET=/var/run/postgresql -e 
DISCOURSE_DB_HOST= -e 
DISCOURSE_DB_PORT= -e 
LETSENCRYPT_DIR=/shared/letsencrypt -e 
DISCOURSE_FORCE_HTTPS=true -e 
DISCOURSE_HOSTNAME=discuss.domain -e 
DISCOURSE_DEVELOPER_EMAILS=info@domain -e 
DISCOURSE_SMTP_ADDRESS=mail3.domain -e 
DISCOURSE_SMTP_PORT=587 -e 
DISCOURSE_SMTP_USER_NAME=xxx@domain -e 
DISCOURSE_SMTP_PASSWORD=xxx -e 
LETSENCRYPT_ACCOUNT_EMAIL=info@domain -h 
Ubuntu-1804-bionic-64-minimal-app -e 
DOCKER_HOST_IP=172.17.0.1 --name app -t -p 80:80 -p 443:443 -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:8d:5a:f6:a3:11 local_discourse/app /sbin/boot

Unable to find image 'local_discourse/app:latest' locally
docker: Error response from daemon: pull access denied for local_discourse/app, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.
Failed to restart the container.

I get the same error when I try to run ./launcher rebuild app.
Docker is the newest version.

I also see this error in the discourse-doctor log:

FAILED
--------------------
Pups::ExecError: cd /var/www/discourse && find /var/www/discourse ! -user discourse -exec chown discourse {} \+ failed with return #<Process::Status: pid 1929 exit 1>
Location of failure: /usr/local/lib/ruby/gems/3.2.0/gems/pups-1.1.1/lib/pups/exec_command.rb:117:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"web", "cmd"=>["gem install bundler --conservative -v $(awk '/BUNDLED WITH/ { getline; gsub(/ /,\"\"); print $0 }' Gemfile.lock)", "find $home ! -user discourse -exec chown discourse {} \\+"]}
bootstrap failed with exit code 1
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.
./discourse-doctor may help diagnose the problem.
f4d0b782e3d1c3deccb5e3d6c186a08ebbaaea22ed37d19a2ff07b7688c83926
 

What kind of upgrade were you doing? I’d check the app.yml file for typos and then run a full rebuild again.

Your docker version is too old, and that is happening because your Ubuntu is too old.
Do you have a backup? Just spin up a new VPS on Ubuntu 22 and restore it.

I don’t have a backup, only a snapshot of the VPS from when it was last working, but that doesn’t load after restoring it.

Docker version 23.0.3, build 3e7cbfd

app.yml is OK, validated on https://www.yamllint.com

The error is about these lines in launcher:

cidbootstrap=cids/"$config"_bootstrap.cid
local_discourse=local_discourse
image="discourse/base:2.0.20230409-0052"
docker_path=`which docker.io 2> /dev/null || which docker`
git_path=`which git`
local_discourse=local_discourse

Is there any way to get the data or a backup in this situation?

The database files should be accessible. On my production server they are at /var/discourse/shared/standalone/postgres_data, yours may be in a similar place.

I’d suggest doing a FULL backup of the system to another server before going further, especially all the directories that are under a ‘discourse’ directory.

In the absence of a current (enough) backup, I’d create a new server using a more current version of ubuntu (and thus docker) and then copy all the directories under the discourse trees to it, including all the postgres files. (I haven’t had to do this on a discourse server yet, but I’ve administered postgres databases for 20 years and if the database files are intact that aids in recreating/rebuilding things.)
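
A rough sketch of that copy step, using rsync over SSH. The destination host is a placeholder, /var/discourse is assumed to be the install path on both servers, and copy_discourse_tree and the RSYNC override are hypothetical helpers for this example. I have not run this against a Discourse server, so treat it as a starting point only.

```shell
#!/bin/sh
# RSYNC can be overridden (e.g. RSYNC=echo) to preview the command.
RSYNC="${RSYNC:-rsync}"

copy_discourse_tree() {
  dest="$1"                    # placeholder, e.g. root@new-server.example
  src="${2:-/var/discourse}"   # assumed install path
  # Make sure the container is stopped first so the Postgres files
  # are not changing while they are copied.
  # -a (archive mode) keeps permissions and timestamps, and ownership
  # when run as root. The trailing slashes copy the directory contents
  # into the same path on the destination.
  "$RSYNC" -a "$src/" "$dest:$src/"
}

# Usage: copy_discourse_tree root@new-server.example
```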

Has it been a while since this system was updated? If the answer to that question is ‘yes’ there could be complications with trying to copy existing files under the discourse directories to a new server, because that’s really not the right way to do it; a discourse backup has a lot more than just the postgresql database files in it. (This is why I try to do a backup before even a small upgrade, although I’ll admit I don’t always do it.)

Your latest discourse backup is probably in a directory similar to this one:
/var/discourse/shared/standalone/backups/default
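
To check quickly whether there is anything in there, a small sketch like this prints the newest archive. latest_backup is a hypothetical helper, and it assumes the backups are .tar.gz files (a database-only backup may end in .sql.gz instead).

```shell
#!/bin/sh
# Print the most recently modified backup archive in a directory.
# Assumption: archives end in .tar.gz; the default path is the one above.
latest_backup() {
  dir="${1:-/var/discourse/shared/standalone/backups/default}"
  # ls -t sorts newest first; errors (such as no matching files) are silenced.
  ls -t "$dir"/*.tar.gz 2>/dev/null | head -n 1
}

# Usage: latest_backup
#        latest_backup /some/other/backups/dir
```

An empty result means no archive was found in that directory.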

Hi Mike, thank you for your replies it’s highly appreciated.
The upgrade failed, and at some point the SSL certificate stopped auto-renewing, so while trying to sort all this out I ended up in this situation.
I will give this a go. I am trying to rebuild the app, but I am running out of disk space; Discourse is using 15GB for some reason.
Other than rebuilding, is there a way to start Discourse, like a start command?

Thanks again.

You probably have multiple container images hanging around.

Running ‘docker images’ will list them. On my production system, for example, that lists 7 images, each around 3.5 GB, so that adds up to a fair bit of space. My sandbox system shows over 20 of them. You could delete some of those images to free up space, or copy them to a separate server if you have space available on it.
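
To see where the space is going before deleting anything, something along these lines can help. This is only a sketch: show_docker_usage is a hypothetical helper, and the DOCKER override exists so the commands can be previewed without running them.

```shell
#!/bin/sh
# DOCKER can be overridden (e.g. DOCKER=echo) to preview the commands.
DOCKER="${DOCKER:-docker}"

show_docker_usage() {
  # Summary of the space used by images, containers, volumes and build cache.
  "$DOCKER" system df
  # Every image with its size.
  "$DOCKER" images
}

# Removing images is destructive, so it is left as a separate manual step:
#   docker image prune    # removes dangling images only, asks for confirmation
# Usage: show_docker_usage
```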

I assume ‘docker ps’ does not show a running container.

Have you tried starting up an older container to see if that brings the site up? Then you can take a backup.

Launcher should work for that. I think it largely serves as a front end for docker commands, but it may also do other tasks necessary to get discourse going.

Hey Mike, funnily enough there’s only 1 container in there. I managed to free up space by clearing the logs etc., but had no luck. There’s another issue which I cannot find. I will just follow your other instructions to see if I can get it up using manual methods, e.g. copying the files and db over to a later version.
When I run docker run I get:

docker run discourse
Unable to find image 'discourse:latest' locally
docker: Error response from daemon: pull access denied for discourse, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.

What happens if you try launcher?

./launcher start app
Gives the following:

.....truncated
Unable to find image 'local_discourse/app:latest' locally
/usr/bin/docker: Error response from daemon: 
pull access denied for local_discourse/app, 
repository does not exist or may require 'docker login': denied:
 requested access to the resource is denied.

What do you get from ‘docker images’?

Just the one:
discourse/base 2.0.20230409-0052 08afe7103ce8

I can more or less reproduce the error messages you’re getting by playing with an older image, but so far I haven’t found a way to force that image to load.

Not sure what more I can suggest at this time.

Hi Mike, your earlier reply, which I have marked as the solution, more or less sorted it out. I copied all the files to a new VPS and it worked after a few changes.
Thanks all for the help.

Glad you found a fix, happy to have helped out. Be sure to take backups on a regular basis. :slight_smile:
