Problems while updating from 3.0.3 to 3.0.4: Error 523

Hey there!

I ran into an issue while trying to upgrade my install with the launcher utility.

I get an error 523 when the build container tries to change ownership of the uploaded images…
Any thoughts?

Here is the log:

$ sudo ./launcher rebuild app
x86_64 arch detected.
WARNING: containers/app.yml file is world-readable. You can secure this file by running: chmod o-rwx containers/app.yml
Ensuring launcher is up to date
Fetching origin
Launcher is up-to-date
Stopping old container
+ /usr/bin/docker stop -t 600 app
app
2.0.20230502-0058: Pulling from discourse/base
Digest: sha256:fa95da36c3d3a582d644b139ec678f5778d745697454bc86f598c689031b30aa
Status: Image is up to date for discourse/base:2.0.20230502-0058
docker.io/discourse/base:2.0.20230502-0058
/usr/local/lib/ruby/gems/3.2.0/gems/pups-1.1.1/lib/pups.rb
/usr/local/bin/pups --stdin

.....

Switched to a new branch 'stable'
I, [2023-06-18T16:43:24.458070 #1]  INFO -- : Branch 'stable' set up to track remote branch 'stable' from 'origin'.

I, [2023-06-18T16:43:24.458386 #1]  INFO -- : > cd /var/www/discourse && sudo -H -E -u discourse git config user.discourse-version stable
I, [2023-06-18T16:43:24.469320 #1]  INFO -- : 
I, [2023-06-18T16:43:24.469386 #1]  INFO -- : > cd /var/www/discourse && mkdir -p tmp
I, [2023-06-18T16:43:24.472481 #1]  INFO -- : 
I, [2023-06-18T16:43:24.472660 #1]  INFO -- : > cd /var/www/discourse && chown discourse:www-data tmp
I, [2023-06-18T16:43:24.476232 #1]  INFO -- : 
I, [2023-06-18T16:43:24.476303 #1]  INFO -- : > cd /var/www/discourse && mkdir -p tmp/pids
I, [2023-06-18T16:43:24.479386 #1]  INFO -- : 
I, [2023-06-18T16:43:24.479449 #1]  INFO -- : > cd /var/www/discourse && mkdir -p tmp/sockets
I, [2023-06-18T16:43:24.482943 #1]  INFO -- : 
I, [2023-06-18T16:43:24.483012 #1]  INFO -- : > cd /var/www/discourse && touch tmp/.gitkeep
I, [2023-06-18T16:43:24.486152 #1]  INFO -- : 
I, [2023-06-18T16:43:24.486220 #1]  INFO -- : > cd /var/www/discourse && mkdir -p                    /shared/log/rails
I, [2023-06-18T16:43:24.489788 #1]  INFO -- : 
I, [2023-06-18T16:43:24.489954 #1]  INFO -- : > cd /var/www/discourse && bash -c "touch -a           /shared/log/rails/{production,production_errors,unicorn.stdout,unicorn.stderr,sidekiq}.log"
I, [2023-06-18T16:43:24.495214 #1]  INFO -- : 
I, [2023-06-18T16:43:24.495285 #1]  INFO -- : > cd /var/www/discourse && bash -c "ln    -s           /shared/log/rails/{production,production_errors,unicorn.stdout,unicorn.stderr,sidekiq}.log /var/www/discourse/log"
I, [2023-06-18T16:43:24.500211 #1]  INFO -- : 
I, [2023-06-18T16:43:24.500283 #1]  INFO -- : > cd /var/www/discourse && bash -c "mkdir -p           /shared/{uploads,backups}"
I, [2023-06-18T16:43:24.504652 #1]  INFO -- : 
I, [2023-06-18T16:43:24.504738 #1]  INFO -- : > cd /var/www/discourse && bash -c "ln    -s           /shared/{uploads,backups} /var/www/discourse/public"
I, [2023-06-18T16:43:24.512836 #1]  INFO -- : 
I, [2023-06-18T16:43:24.512942 #1]  INFO -- : > cd /var/www/discourse && bash -c "mkdir -p           /shared/tmp/{backups,restores}"
I, [2023-06-18T16:43:24.518383 #1]  INFO -- : 
I, [2023-06-18T16:43:24.518453 #1]  INFO -- : > cd /var/www/discourse && bash -c "ln    -s           /shared/tmp/{backups,restores} /var/www/discourse/tmp"
I, [2023-06-18T16:43:24.523090 #1]  INFO -- : 
I, [2023-06-18T16:43:24.523195 #1]  INFO -- : > cd /var/www/discourse && chown -R discourse:www-data /shared/log/rails /shared/uploads /shared/backups /shared/tmp
chown: /shared/uploads/default/optimized/1X: Unknown error 523
chown: /shared/uploads/default/original/1X: Unknown error 523
I, [2023-06-18T16:43:41.385629 #1]  INFO -- : 


FAILED
--------------------
Pups::ExecError: cd /var/www/discourse && chown -R discourse:www-data /shared/log/rails /shared/uploads /shared/backups /shared/tmp failed with return #<Process::Status: pid 135 exit 1>
Location of failure: /usr/local/lib/ruby/gems/3.2.0/gems/pups-1.1.1/lib/pups/exec_command.rb:117:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"code", "cmd"=>["sudo -H -E -u discourse git reset --hard", "sudo -H -E -u discourse git clean -f", "sudo -H -E -u discourse bash -c '\n  set -o errexit\n  if [ $(git rev-parse --is-shallow-repository) == \"true\" ]; then\n      git remote set-branches --add origin main\n      git remote set-branches origin $version\n      git fetch --depth 1 origin $version\n  else\n      git fetch --tags --prune-tags --prune --force origin\n  fi\n'", "sudo -H -E -u discourse bash -c '\n  set -o errexit\n  if [[ $(git symbolic-ref --short HEAD) == $version ]] ; then\n      git pull\n  else\n      git -c advice.detachedHead=false checkout $version\n  fi\n'", "sudo -H -E -u discourse git config user.discourse-version $version", "mkdir -p tmp", "chown discourse:www-data tmp", "mkdir -p tmp/pids", "mkdir -p tmp/sockets", "touch tmp/.gitkeep", "mkdir -p                    /shared/log/rails", "bash -c \"touch -a           /shared/log/rails/{production,production_errors,unicorn.stdout,unicorn.stderr,sidekiq}.log\"", "bash -c \"ln    -s           /shared/log/rails/{production,production_errors,unicorn.stdout,unicorn.stderr,sidekiq}.log $home/log\"", "bash -c \"mkdir -p           /shared/{uploads,backups}\"", "bash -c \"ln    -s           /shared/{uploads,backups} $home/public\"", "bash -c \"mkdir -p           /shared/tmp/{backups,restores}\"", "bash -c \"ln    -s           /shared/tmp/{backups,restores} $home/tmp\"", "chown -R discourse:www-data /shared/log/rails /shared/uploads /shared/backups /shared/tmp", "[ ! -d public/plugins ] || find public/plugins/ -maxdepth 1 -xtype l -delete"]}
bootstrap failed with exit code 1
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.
./discourse-doctor may help diagnose the problem.

I can’t find anything useful about this error. What is your OS distribution and version? What kind of storage is in use for uploads?
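Something like this, run on the Docker host, should answer both questions (the path in the df command is a placeholder for whatever directory you bind-mount into /shared/uploads):

$ cat /etc/os-release
$ sudo docker inspect app --format '{{json .Mounts}}'
$ df -T /path/to/your/uploads-volume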

I hope you already know that using the ‘stable’ track is an unusual choice - almost everyone runs the ‘tests-passed’ track. See:
Why does Discourse always install “beta” versions by default?

Are you using S3 and if so, is the name of your bucket configured correctly?

But it should not fail any more than ‘tests-passed’ does, so this is irrelevant IMO?

It might be seen to fail less often, simply because very few installations use stable. Also, I wanted to be sure @gmoirod was aware of the situation - I'm guessing that some people who run stable do so without knowing what it is.

I get that. But if his install is failing right now, then my estimate is that jumping to 3.1.0beta5 will make it much worse. So let’s focus on the issue first.


I run Discourse as Docker containers on top of a Debian 11 server.
My uploads are on a shared NFS mount. This has always been the case, and I have never had this issue before.
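(For reference, the mount can be checked on the host with something like the following; nfsstat -m comes from the nfs-common package.)

$ findmnt -T /nfsdata/discourse-data-shared/uploads
$ nfsstat -m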

Yeah, I saw several things about this… I will have to go for it one day…
I am kind of a Debian guy, you know: keep “stable” for production.

There you have it. And is that mount currently accessible from within the container?
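Something along these lines should show whether it is mounted and readable from inside the container (assuming the standard /shared/uploads layout):

$ sudo docker exec app sh -c "grep /shared /proc/mounts; ls /shared/uploads | head"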

My currently running Discourse 3.0.3 container:

            {
                "Type": "bind",
                "Source": "/nfsdata/discourse-data-shared/uploads",
                "Destination": "/shared/uploads",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            },

Within the container, ownership and permissions look fine:

$ sudo docker exec app sh -c "ls -al /shared/uploads /shared/uploads/default/optimized/1X /shared/uploads/default/original/1X"
/shared/uploads:
total 4
drwxr-xr-x  2 discourse www-data    0 Jun 20 20:07 .
drwxr-xr-x 10 root      root     4096 Mar  8 16:29 ..
drwxr-xr-x  2 discourse www-data    0 Jun  8  2022 default
drwxr-xr-x  2 discourse www-data    0 Mar  8 17:34 tombstone

/shared/uploads/default/optimized/1X:
total 17094
drwxr-xr-x 2 discourse www-data      0 Mar 22 11:30 .
drwxr-xr-x 2 discourse www-data      0 Mar  8 16:18 ..
-rw-r--r-- 1 discourse www-data  54700 Mar  8 16:52 00964701d199ec0d6d3dd5269c842e1f0bb7e7a1_2_1035x456.png
-rw-r--r-- 1 discourse www-data    205 Mar  8 16:52 00964701d199ec0d6d3dd5269c842e1f0bb7e7a1_2_10x10.png
.....

/shared/uploads/default/original/1X:
total 17932
drwxr-xr-x 2 discourse www-data       0 Apr 23 11:42 .
drwxr-xr-x 2 discourse www-data       0 Jun  8  2022 ..
-rw-r--r-- 1 discourse www-data   35706 Nov 18  2022 00964701d199ec0d6d3dd5269c842e1f0bb7e7a1.png
-rwxr-xr-x 1 discourse www-data   17112 Jul  4  2022 00a82b03ffbcdf56e34f86adbec263e12573f49b.png

Moreover, I am able to upload new images in the running Discourse 3.0.3 instance.

Note: I do not have SELinux or AppArmor enabled.
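For what it's worth, a quick way to double-check that on Debian (both tools are optional packages, so the fallbacks just report that they are absent):

$ sudo aa-status 2>/dev/null || echo "AppArmor not loaded or tools not installed"
$ getenforce 2>/dev/null || echo "SELinux tools not installed"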

I hope this is not related :sob: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=102728

Actually, this isn’t a problem with the repository on NFS. It’s a problem with the NFS-mounted CLIENT dir.

The problem turned out to be an NFS bug in the Linux kernel, so you can probably close this bug.

Just because something is the top result on Google does not mean it’s the best result.

Date: Thu, 28 Jun 2001 20:03:01 UTC

Are you able to chown from within the container, i.e. run the failing command manually?

sudo docker exec app sh -c "chown -R discourse:discourse /shared/uploads/default/optimized/1X"

I am afraid I can (I corrected the command to match the one that was executed originally)…

$ docker exec -it app bash
app:/$ cd /var/www/discourse
app:/var/www/discourse$ chown -R discourse:www-data /shared/uploads
app:/var/www/discourse$ echo $?
0

It took a long time, but there was no error.

Is there any difference between the Docker image running 3.0.3 and the one used to build 3.0.4?

Docker image versions are not tied to Discourse versions. Additionally, it depends on how you rebuilt before, so it’s hard to say.

My current 3.0.3 image gives me:

# docker image inspect local_discourse/app | grep 'discourse/base'
            "Image": "discourse/base:2.0.20230409-0052",

And the one that failed used:

Status: Image is up to date for discourse/base:2.0.20230502-0058

I’m gonna check the diff :mag:

Damn!
I can’t see anything related: Comparing 3d317b7f58e8201912972afa3910b6c4b9ad8c75...main · discourse/discourse_docker · GitHub
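The base images themselves could also be compared directly; something like this (both tags taken from the build output above) at least shows whether the layer history changed:

$ docker pull discourse/base:2.0.20230409-0052
$ docker pull discourse/base:2.0.20230502-0058
$ diff <(docker history --no-trunc discourse/base:2.0.20230409-0052) \
       <(docker history --no-trunc discourse/base:2.0.20230502-0058)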

Okay, there is definitely an issue with the base image.
I forced the use of discourse/base:2.0.20230409-0052 in the launcher, and it worked like a charm.

# git diff launcher
diff --git a/launcher b/launcher
index 3e1a1c4..8a989b8 100755
--- a/launcher
+++ b/launcher
@@ -92,7 +92,7 @@ kernel_min_version='4.4.0'
 config_file=containers/"$config".yml
 cidbootstrap=cids/"$config"_bootstrap.cid
 local_discourse=local_discourse
-image="discourse/base:2.0.20230502-0058"
+image="discourse/base:2.0.20230409-0052"
 docker_path=`which docker.io 2> /dev/null || which docker`
 git_path=`which git`

Can anybody see the change causing this?

Believe it or not, I just rebuilt again with discourse/base:2.0.20230502-0058 and it passed… :man_shrugging:


No trouble believing in computer shenanigans :upside_down_face:

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.