docker exec -it app chown discourse.www-data /var/www/discourse/tmp/pids
docker exec -it app chmod g+w /var/www/discourse/tmp/pids
./launcher rebuild app
I’m installing from scratch yet again and trying to pay more attention to the log messages from the app build this time.
I see these:
153:C 16 Aug 2023 20:24:11.676 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
153:C 16 Aug 2023 20:24:11.676 # Redis version=7.0.7, bits=64, commit=00000000, modified=0, pid=153, just started
153:C 16 Aug 2023 20:24:11.676 # Configuration loaded
153:M 16 Aug 2023 20:24:11.677 * monotonic clock: POSIX clock_gettime
153:M 16 Aug 2023 20:24:11.677 # Warning: Could not create server TCP listening socket *:6379: bind: Address already in use
153:M 16 Aug 2023 20:24:11.678 # Failed listening on port 6379 (TCP), aborting.
Also this:
I, [2023-08-16T20:24:26.172936 #1] INFO -- : > cd /var/www/discourse && su discourse -c 'yarn install --frozen-lockfile && yarn cache clean'
warning " > @glint/environment-ember-loose@1.0.2" has unmet peer dependency "@glimmer/component@^1.1.2".
warning " > @glint/environment-ember-template-imports@1.0.2" has unmet peer dependency "ember-template-imports@^3.0.0".
warning " > @mixer/parallel-prettier@2.0.3" has unmet peer dependency "prettier@^2.0.0".
warning Resolution field "babel-plugin-ember-template-compilation@2.0.0" is incompatible with requested version "babel-plugin-ember-template-compilation@^2.0.1"
warning Resolution field "unset-value@2.0.1" is incompatible with requested version "unset-value@^1.0.0"
warning " > babel-plugin-debug-macros@0.4.0-pre1" has unmet peer dependency "@babel/core@^7.0.0".
warning "workspace-aggregator-d7aa52aa-3a92-43f5-97ca-2c6c21fe43f0 > discourse > @uppy/aws-s3@3.0.6" has incorrect peer dependency "@uppy/core@^3.1.2".
warning "workspace-aggregator-d7aa52aa-3a92-43f5-97ca-2c6c21fe43f0 > discourse > @uppy/aws-s3-multipart@3.1.3" has incorrect peer dependency "@uppy/core@^3.1.2".
warning "workspace-aggregator-d7aa52aa-3a92-43f5-97ca-2c6c21fe43f0 > discourse > @uppy/xhr-upload@3.1.1" has incorrect peer dependency "@uppy/core@^3.1.2".
warning "workspace-aggregator-d7aa52aa-3a92-43f5-97ca-2c6c21fe43f0 > discourse > @uppy/aws-s3 > @uppy/xhr-upload@3.3.0" has incorrect peer dependency "@uppy/core@^3.2.1".
(I’ll try the manual permission changes when it finishes building.)
Before chowning + chmodding:
root@ubuntu-app:/var/www/discourse# ls -lah /var/www/discourse/tmp/pids
total 8.0K
drwxr-xr-x 2 root root 4.0K Aug 16 20:47 .
drwxr-xr-x 6 root root 4.0K Aug 16 20:53 ..
After:
root@ubuntu-app:/var/www/discourse# ls -lah /var/www/discourse/tmp/pids
total 16K
drwxrwxr-x 1 discourse www-data 4.0K Aug 16 20:58 .
drwxr-xr-x 1 root root 4.0K Aug 16 20:53 ..
-rw-r--r-- 1 discourse www-data 5 Aug 16 20:58 unicorn.pid
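For anyone comparing their own before/after state, a one-line summary can be easier to diff than full `ls` output. This is only a sketch: `dir_perms` is a made-up helper name, and it assumes GNU `stat` (the `-c` format option) is available where it runs.

```shell
#!/bin/sh
# Hypothetical helper: print "owner:group mode" for a path.
# Assumes GNU stat; BSD/macOS stat uses different flags.
dir_perms() {
  stat -c '%U:%G %a' "$1"
}

# Inside the container this would be called as:
#   dir_perms /var/www/discourse/tmp/pids
# Going by the listings above, one would expect "root:root 755" before the
# change and "discourse:www-data 775" after it.
```

The octal mode makes the missing group-write bit (755 versus 775) easy to spot.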
Now the tail of ./launcher logs app looks like this:
ok: run: redis: (pid 5790) 184s
ok: run: postgres: (pid 6154) 0s
supervisor pid: 6155 unicorn pid: 6159
config/unicorn_launcher: line 71: kill: (6159) - No such process
config/unicorn_launcher: line 15: kill: (6159) - No such process
(6155) exiting
ok: run: redis: (pid 5790) 188s
ok: run: postgres: (pid 6176) 0s
supervisor pid: 6177 unicorn pid: 6181
config/unicorn_launcher: line 71: kill: (6181) - No such process
config/unicorn_launcher: line 15: kill: (6181) - No such process
(6177) exiting
ok: run: redis: (pid 5790) 192s
timeout: down: postgres: 1s, normally up, want up
ok: run: redis: (pid 5790) 200s
timeout: down: postgres: 0s, normally up, want up
ok: run: redis: (pid 5790) 208s
timeout: down: postgres: 1s, normally up, want up
ok: run: redis: (pid 5790) 215s
timeout: down: postgres: 0s, normally up, want up
ok: run: redis: (pid 5790) 223s
timeout: down: postgres: 1s, normally up, want up
ok: run: redis: (pid 5790) 230s
timeout: down: postgres: 1s, normally up, want up
ok: run: redis: (pid 5790) 238s
ok: run: postgres: (pid 6264) 0s
supervisor pid: 6260 unicorn pid: 6266
config/unicorn_launcher: line 71: kill: (6266) - No such process
config/unicorn_launcher: line 15: kill: (6266) - No such process
(6260) exiting
ok: run: redis: (pid 5790) 244s
ok: run: postgres: (pid 6283) 0s
supervisor pid: 6284 unicorn pid: 6288
My browser reports an HTTPS issue with the site, and when I choose the “dangerous” option to bypass, I get the nginx “bad gateway” page.
I’m having this same issue with the current git.
Rebuilding the container after the chown and chmod will undo their effects.
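Since a rebuild discards the manual change, one workaround (for installs where the manual fix helps at all) is to re-apply it after every rebuild. The sketch below just chains the commands already quoted in this topic; `rebuild_and_fix` and the `RUN` switch are hypothetical names, and `discourse:www-data` is the colon form of the `discourse.www-data` used earlier.

```shell
#!/bin/sh
# Sketch: rebuild, re-apply the manual permission fix, then restart.
# Run from the directory that contains ./launcher (e.g. /var/discourse).
# RUN is a hypothetical switch: set RUN=echo to print the commands
# instead of executing them.
RUN=${RUN:-}

rebuild_and_fix() {
  container=${1:-app}
  pids=/var/www/discourse/tmp/pids
  $RUN ./launcher rebuild "$container" &&
  $RUN docker exec "$container" chown discourse:www-data "$pids" &&
  $RUN docker exec "$container" chmod g+w "$pids" &&
  $RUN ./launcher restart "$container"
}
```

Calling it with `RUN=echo` set first gives a dry run that prints the four commands, which is a cheap way to check the container name before touching anything.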
I’m having the same issue. The last time Discourse broke was because of an undetected issue with plugins. I’m using these:
- git clone https://github.com/discourse/docker_manager.git
- git clone https://github.com/tfpk/discourse-reveal-anonymous.git
- git clone https://github.com/discourse/discourse-push-notifications.git
- git clone https://github.com/discourse/discourse-data-explorer.git
- git clone https://github.com/discourse/discourse-solved.git
- git clone https://github.com/discourse/discourse-math.git
Can we compare notes? Are there any known issues right now with these plugins?
Here’s more info. In the unicorn stderr, I see:
/var/www/discourse/vendor/bundle/ruby/3.2.0/gems/activerecord-7.0.7/lib/active_record/connection_adapters/postgresql_adapter.rb:87:in `rescue in new_client': connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: No such file or directory (ActiveRecord::ConnectionNotEstablished)
Is the server running locally and accepting connections on that socket?
In the PG log, I see:
2023-08-21 19:24:00.721 UTC [1681] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-08-21 19:24:00.728 UTC [1681] LOG: could not open configuration file "/etc/postgresql/13/main/pg_hba.conf": Permission denied
2023-08-21 19:24:00.728 UTC [1681] FATAL: could not load pg_hba.conf
2023-08-21 19:24:00.741 UTC [1681] LOG: database system is shut down
Further:
# ls -l /etc/postgresql/13/main/pg_hba.conf
-rw-r----- 1 root root 4846 Aug 21 19:05 /etc/postgresql/13/main/pg_hba.conf
What user is postgres running under inside the container? With the above permissions, it has to be root or someone in group root.
Ok, so I did chmod o+r /etc/postgresql/13/main/pg_hba.conf and now the container is up again.
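On the question of which user is involved: PostgreSQL refuses to start as root, so the server is presumably running under a dedicated account. The sketch below assumes that account is named `postgres` (not confirmed anywhere in this topic) and asks the container whether that account can read the file. `pg_hba_readable` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: can the given account read pg_hba.conf inside the container?
# Assumption: the server account is called "postgres".
pg_hba_readable() {
  container=${1:-app}
  user=${2:-postgres}
  conf=/etc/postgresql/13/main/pg_hba.conf
  # "docker exec -u" runs the command as the named user inside the container.
  if docker exec -u "$user" "$container" test -r "$conf"; then
    echo "$conf is readable by $user"
  else
    echo "$conf is NOT readable by $user"
    return 1
  fi
}

# Example: pg_hba_readable app postgres
```

If the check fails, handing the file back to that account (for example `chown postgres:postgres` plus `chmod 640`, which I believe matches the usual Debian packaging default) might be a narrower fix than `chmod o+r`, but treat that as an untested suggestion.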
This is all a bit concerning. Why isn’t the recommended installation method working out of the box? My plugin list currently includes the ones listed above except data explorer, which I disabled since it had caused the failure the last time.
Crosslinking to
which reports similar symptoms.
Update: I changed the git command in the cmd section of the app.yml file to use sudo as described in the linked post.
I am declaring this failure to be intermittent. In 3 tries (between each I completely wiped the shared directory) it succeeded once and failed twice. When it fails, manually fixing the three permissions in question and then restarting the container results in what appears to be a working system. Better logging and better self-tests would be nice in order to detect failing container bootstraps.
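As a rough idea of what such a self-test could look like, the sketch below scans a log excerpt for the "(NNNN) exiting" lines that appear once per failed unicorn start in the ./launcher logs app output quoted in this topic. `check_bootstrap` is a hypothetical name, and the pattern is based only on those excerpts, so other failure modes would slip through.

```shell
#!/bin/sh
# Sketch: flag a unicorn restart loop in launcher log output read from stdin.
check_bootstrap() {
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  exits=$(grep -c '^([0-9][0-9]*) exiting$' || true)
  if [ "$exits" -gt 0 ]; then
    echo "unicorn supervisor exited $exits time(s) in this log excerpt"
    return 1
  fi
  echo "no unicorn supervisor exits in this log excerpt"
}

# Example, run from the directory that contains ./launcher:
#   ./launcher logs app 2>&1 | tail -n 50 | check_bootstrap
```

A non-zero exit status makes it usable from a provisioning script, for instance to stop before pointing DNS at a container that only returns 502.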
Changing the permissions on /var/www/discourse/tmp/pids and /etc/postgresql/13/main/pg_hba.conf does not work for me.
I had changed my plugin list before the rebuild, but even after restoring the plugin list I get the same ArgumentError. Are we sure it is the plugin list that is causing it?
After starting the container I see files written to the pids directory. My launcher log stops after this:
ok: down: unicorn: 0s, normally up
run-parts: executing /etc/runit/3.d/10-redis
ok: down: redis: 0s, normally up
run-parts: executing /etc/runit/3.d/99-postgres
run-parts: executing /etc/runit/1.d/00-ensure-links
run-parts: executing /etc/runit/1.d/00-fix-var-logs
run-parts: executing /etc/runit/1.d/01-cleanup-web-pids
run-parts: executing /etc/runit/1.d/anacron
run-parts: executing /etc/runit/1.d/cleanup-pids
Cleaning stale PID files
run-parts: executing /etc/runit/1.d/copy-env
Started runsvdir, PID is 35
ok: run: redis: (pid 52) 0s
ok: run: postgres: (pid 55) 0s
supervisor pid: 42 unicorn pid: 75
Anything else I can look at?
I have the same issue described in this thread. Tried to install discourse today on a new server, failed because of missing write permission.
The “fix” with chown/chmod does not work for me; the permission is reset again on ./launcher rebuild app.
So you managed to reproduce this on a brand new install? That’s very useful information, as I can’t reproduce by updating an existing one.
Yes, I tried to install Discourse by just following the guide and thought I’d gone mad, since everything “seemed” to work but I only get a “502” error.
I tried changing the permissions within the container without rebuilding. Now I get this:
ok: run: redis: (pid 50) 4677s
ok: run: postgres: (pid 12224) 0s
supervisor pid: 12215 unicorn pid: 12226
config/unicorn_launcher: line 71: kill: (12226) - No such process
config/unicorn_launcher: line 15: kill: (12226) - No such process
(12215) exiting
ok: run: redis: (pid 50) 4686s
ok: run: postgres: (pid 12249) 0s
supervisor pid: 12240 unicorn pid: 12251
config/unicorn_launcher: line 71: kill: (12251) - No such process
config/unicorn_launcher: line 15: kill: (12251) - No such process
(12240) exiting
ok: run: redis: (pid 50) 4695s
timeout: down: postgres: 1s, normally up, want up
ok: run: redis: (pid 50) 4703s
ok: run: postgres: (pid 12279) 0s
supervisor pid: 12275 unicorn pid: 12281
I have a second server, set up exactly like this one, that has been running for over a year. That one just works: stopping the container, doing a git pull, rebuilding and starting it again all went fine.
Adding this as a data point since I don’t see it in this topic.
Do you see a fatal: detected dubious ownership... message in the unicorn log? I am getting this message.
Adding /var/www/discourse as safe.directory to the gitconfig has no effect for my install.
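Some background that may help here: Git prints the dubious ownership error when the repository directory belongs to a different user than the one running git, and a `--global` safe.directory entry only applies to the account whose gitconfig it was written to. So an entry added as root would not necessarily cover git calls made by whichever account runs unicorn. The sketch below only checks the first half, the owner mismatch; `repo_owner_mismatch` is a hypothetical helper, it assumes GNU `stat`, and it ignores finer points of Git's actual check.

```shell
#!/bin/sh
# Sketch: compare a directory's owner with the account running this shell.
# Exit status 0 means "mismatch found", 1 means "same owner", 2 means stat failed.
repo_owner_mismatch() {
  owner=$(stat -c '%u' "$1") || return 2
  me=$(id -u)
  if [ "$owner" != "$me" ]; then
    echo "$1 is owned by uid $owner, but git would run as uid $me"
    return 0
  fi
  echo "$1 is owned by the current uid $me"
  return 1
}

# Example (inside the container, as the account that runs unicorn):
#   repo_owner_mismatch /var/www/discourse
```

If it reports a mismatch for the unicorn account, that would at least be consistent with the message in the log.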
So I just got a brand new droplet running Ubuntu 22.04, ran the install guide and it came up just fine. Can’t reproduce this issue in either an old or a new install.
What distro were you running?
Nothing like this in a brand new install:
root@test-install:/var/discourse# cat /var/discourse/shared/standalone/log/rails/unicorn.std*
I, [2023-08-22T17:16:33.594602 #2982] INFO -- : Refreshing Gem list
I, [2023-08-22T17:16:38.624384 #2982] INFO -- : listening on addr=127.0.0.1:3000 fd=10
I, [2023-08-22T17:16:43.003213 #2982] INFO -- : starting 1 supervised sidekiqs
I, [2023-08-22T17:16:47.070059 #2982] INFO -- : master process ready
I, [2023-08-22T17:16:50.490722 #3068] INFO -- : worker=0 ready
I, [2023-08-22T17:16:52.394685 #3077] INFO -- : worker=1 ready
I, [2023-08-22T17:16:53.139229 #3085] INFO -- : worker=2 ready
I, [2023-08-22T17:16:53.518292 #3097] INFO -- : worker=3 ready
Loading Sidekiq in process id 3059
Here is the tail of my unicorn.stderr.log, for what it’s worth:
I, [2023-08-22T04:18:52.795267 #81] INFO -- : Refreshing Gem list
fatal: detected dubious ownership in repository at '/var/www/discourse'
To add an exception for this directory, call:
git config --global --add safe.directory /var/www/discourse
I, [2023-08-22T04:18:57.742262 #81] INFO -- : listening on addr=127.0.0.1:3000 fd=10
fatal: detected dubious ownership in repository at '/var/www/discourse'
To add an exception for this directory, call:
git config --global --add safe.directory /var/www/discourse
I, [2023-08-22T04:19:04.916798 #81] INFO -- : starting 1 supervised sidekiqs
I, [2023-08-22T04:19:04.927971 #81] INFO -- : starting up EmailSync demon
I, [2023-08-22T04:19:07.993280 #81] INFO -- : master process ready
I, [2023-08-22T04:19:11.010040 #174] INFO -- : worker=0 ready
I, [2023-08-22T04:19:11.994849 #188] INFO -- : worker=1 ready
I, [2023-08-22T04:19:12.524936 #203] INFO -- : worker=2 ready
Ubuntu 22.04 (Server)
Freshly set up from a cloud hoster. I did nothing but update the system, install docker, zsh and nginx, and run the setup for Discourse, which failed.
Edit:
I rebuilt the cloud server with Fedora, installed only docker, and then tried the fresh install again.
A different error this time:
[Tue 22 Aug 2023 05:51:02 PM UTC] Run reload cmd: sv reload nginx
warning: nginx: unable to open supervise/ok: file does not exist
[Tue 22 Aug 2023 05:51:02 PM UTC] Reload error for :
Started runsvdir, PID is 2941
ok: run: redis: (pid 2953) 0s
ok: run: postgres: (pid 2954) 0s
supervisor pid: 2949 unicorn pid: 2981
The internal nginx doesn’t want to run.
Wait, why did you install nginx? Default install will fail if there is something on the host using port 80 already.
On a new Digital Ocean droplet I did nothing that wasn’t in the official install guide, and it worked fine. I let the Discourse installer install docker for me and everything.
Trying to reproduce this is proving tricky.
Yeah, it seems so. It definitely seems to be a thing since I’ve seen multiple people report it (@DarthLasciel, @Godmar_Back and maybe @kdambekalns).
Let me know if there is anything you want to see from my install.
Does the issue still happen if you provision under a new subdomain on a brand new server, strictly following our official install guide and refraining from installing any extra packages on the host?
Also, can you please share your [redacted] app.yml?
Specifically, please be sure to redact the DISCOURSE_SMTP_PASSWORD environment variable and any other sensitive bits.
Also:
Can you try adding this to the URL? ?safe_mode=no_plugins
If it succeeds, it might point to plugin bugs.
Except for the subdomain part I did exactly that.
New Fedora server, install docker, install Discourse, setup without nginx being installed on the server. This results in the error posted above.
## this is the all-in-one, standalone Discourse Docker container template
##
## After making changes to this file, you MUST rebuild
## /var/discourse/launcher rebuild app
##
## BE *VERY* CAREFUL WHEN EDITING!
## YAML FILES ARE SUPER SUPER SENSITIVE TO MISTAKES IN WHITESPACE OR ALIGNMENT!
## visit http://www.yamllint.com/ to validate this file as needed
templates:
  - "templates/postgres.template.yml"
  - "templates/redis.template.yml"
  - "templates/web.template.yml"
  ## Uncomment the next line to enable the IPv6 listener
  #- "templates/web.ipv6.template.yml"
  - "templates/web.ratelimited.template.yml"
  ## Uncomment these two lines if you wish to add Lets Encrypt (https)
  #- "templates/web.ssl.template.yml"
  #- "templates/web.letsencrypt.ssl.template.yml"
## which TCP/IP ports should this container expose?
## If you want Discourse to share a port with another webserver like Apache or nginx,
## see https://meta.discourse.org/t/17247 for details
expose:
  - "9980:80" # http
  # - "443:443" # https
params:
  db_default_text_search_config: "pg_catalog.english"
  ## Set db_shared_buffers to a max of 25% of the total memory.
  ## will be set automatically by bootstrap based on detected RAM, or you can override
  #db_shared_buffers: "256MB"
  ## can improve sorting performance, but adds memory usage per-connection
  #db_work_mem: "40MB"
  ## Which Git revision should this container use? (default: tests-passed)
  #version: tests-passed
env:
  LC_ALL: de_DE.UTF-8
  LANG: de_DE.UTF-8
  LANGUAGE: de_DE.UTF-8
  # DISCOURSE_DEFAULT_LOCALE: en
  ## How many concurrent web requests are supported? Depends on memory and CPU cores.
  ## will be set automatically by bootstrap based on detected CPUs, or you can override
  #UNICORN_WORKERS: 3
  ## TODO: The domain name this Discourse instance will respond to
  ## Required. Discourse will not work with a bare IP number.
  DISCOURSE_HOSTNAME: 'redacted.de'
  ## Uncomment if you want the container to be started with the same
  ## hostname (-h option) as specified above (default "$hostname-$config")
  #DOCKER_USE_HOSTNAME: true
  ## TODO: List of comma delimited emails that will be made admin and developer
  ## on initial signup example 'user1@example.com,user2@example.com'
  DISCOURSE_DEVELOPER_EMAILS: 'me@example.com,you@example.com'
  ## TODO: The SMTP mail server used to validate new accounts and send notifications
  # SMTP ADDRESS, username, and password are required
  # WARNING the char '#' in SMTP password can cause problems!
  DISCOURSE_SMTP_ADDRESS: none.com
  #DISCOURSE_SMTP_PORT: 587
  DISCOURSE_SMTP_USER_NAME: user@none.com
  DISCOURSE_SMTP_PASSWORD: none
  #DISCOURSE_SMTP_ENABLE_START_TLS: true # (optional, default true)
  #DISCOURSE_SMTP_DOMAIN: discourse.example.com # (required by some providers)
  #DISCOURSE_NOTIFICATION_EMAIL: noreply@discourse.example.com # (address to send notifications from)
  ## If you added the Lets Encrypt template, uncomment below to get a free SSL certificate
  #LETSENCRYPT_ACCOUNT_EMAIL: me@example.com
  ## The http or https CDN address for this Discourse instance (configured to pull)
  ## see https://meta.discourse.org/t/14857 for details
  #DISCOURSE_CDN_URL: https://discourse-cdn.example.com
  ## The maxmind geolocation IP address key for IP address lookup
  ## see https://meta.discourse.org/t/-/137387/23 for details
  #DISCOURSE_MAXMIND_LICENSE_KEY: 1234567890123456
## The Docker container is stateless; all data is stored in /shared
volumes:
  - volume:
      host: /var/discourse/shared/standalone
      guest: /shared
  - volume:
      host: /var/discourse/shared/standalone/log/var-log
      guest: /var/log
## Plugins go here
## see https://meta.discourse.org/t/19157 for details
hooks:
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - git clone https://github.com/discourse/docker_manager.git
## Any custom commands to run after building
run:
  - exec: echo "Beginning of custom commands"
  ## If you want to set the 'From' email address for your first registration, uncomment and change:
  ## After getting the first signup email, re-comment the line. It only needs to run once.
  #- exec: rails r "SiteSetting.notification_email='info@unconfigured.discourse.org'"
  - exec: echo "End of custom commands"
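Worth noting when reading this file: the active mapping under expose sends host port 9980 to the container's port 80, and the 443 line is commented out. Since port conflicts on the host came up earlier in the topic, a quick way to list the host-side ports a given app.yml actually claims might be useful. This is a sketch only: `exposed_host_ports` is a hypothetical helper, and it matches any active list item of the form "N:M", not strictly those under expose.

```shell
#!/bin/sh
# Sketch: print the host-side port of every active (uncommented)
# "host:container" list item in an app.yml read from stdin.
exposed_host_ports() {
  sed -n 's/^[[:space:]]*-[[:space:]]*"\([0-9][0-9]*\):[0-9][0-9]*".*/\1/p'
}

# Example, assuming the file is in the usual containers/ directory:
#   exposed_host_ports < /var/discourse/containers/app.yml
```

For the file above this should print only 9980, because the commented-out 443 line does not start with a dash.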
I’ve got the same permission problem where unicorn.pid isn’t writable. I wanted to get you some more info by seeing if it reproduces with the official install guide, but that doesn’t fit my system. First it complained because ports 80/443 are already in use, and then it did some e-mail checks that failed. Do you have an app.yml that just makes a docker image that doesn’t require anything external? Presumably this same issue should reproduce with that just as well.