Rebuild app fails with puzzling error message


(Quim Gil) #1

Hi, I wanted to install a plugin, but now rebuild app fails and our site is down. :frowning: (The plugin is Chatroom Integration, but I don’t think it has anything to do with the problem.)

The final error message:

FAILED
--------------------
Pups::ExecError: cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate' failed with return #<Process::Status: pid 331 exit 1>
Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"bundle_exec", "cmd"=>["su discourse -c 'bundle install --deployment --verbose --without test --without development --retry 3 --jobs 4'", "su discourse -c 'bundle exec rake db:migrate'", "su discourse -c 'bundle exec rake assets:precompile'"]}

Scrolling up, there are some repetitions/variations of this problem.

Bundled gems are installed into `./vendor/bundle`

I, [2018-11-01T19:44:47.921217 #13]  INFO -- : > cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate'
This monkey patch is no longer required.
2018-11-01 19:44:55.992 UTC [377] discourse@discourse ERROR:  syntax error at or near "T" at character 135
2018-11-01 19:44:55.992 UTC [377] discourse@discourse STATEMENT:  INSERT INTO site_settings(name, data_type, value, created_at, updated_at)
	             VALUES ('sso_provider_secrets', 8, '**********T*****', now(), now())
rake aborted!
StandardError: An error has occurred, this and all later migrations canceled:

PG::SyntaxError: ERROR:  syntax error at or near "T"
LINE 2: ...   VALUES ('sso_provider_secrets', 8, '**********T*****', n...
                                                             ^
: INSERT INTO site_settings(name, data_type, value, created_at, updated_at)
             VALUES ('sso_provider_secrets', 8, '**********T*****', now(), now())
/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/rack-mini-profiler-1.0.0/lib/patches/db/pg.rb:92:in `async_exec'

'**********T*****' looks like an actual password, edited here just in case.

I have removed the new repository, but the problem persists. I have searched the forum for similar logs, without success. Any ideas?


SSO login error + how to access WordPress now?
(Quim Gil) #2

(After the initial panic)

start app worked. Phew!

Now I can report that our current version is v2.2.0.beta3 +4.


(Quim Gil) #7

OK, now we are in serious trouble.

We started the upgrade to the latest beta. The upgrade stopped, asking for Ruby 2.5.2. OK, we went to the command line and upgraded Ruby. We thought that rebuild app might work with the new Ruby version. But it didn’t: it threw the same error message complaining about a character in the password of sso_provider_secrets, which still doesn’t make any sense there.

The problem is that now we cannot bring the site up again.

502 Bad Gateway

See https://la.confederac.io

We have tried stop app, start app, cleanup… Nothing. I wonder how to disable SSO from the command line, if that helps at all.


(Stephen) #8

The ruby version was inside Docker, not outside. Launcher actually updates this but doesn’t restart itself, so a second run takes care of such things.

Had you just re-run ./launcher rebuild app it would have continued a second time without the earlier error.

Which version were you upgrading from? How long since your last update?

Do you have a recent backup? Probably worth taking a copy from the server before you go much further. Grab a copy of your app.yml for safekeeping too.


(Quim Gil) #9

We were up to date. The earlier upgrade was two weeks ago or so.

As explained in the first post, we hadn’t been able to rebuild the app even before this Ruby message appeared in the upgrade.

OK, I’m going to download the backups and app.yml to a safe location. Thank you for the advice.


(Stephen) #10

How old is the most recent backup?

Your site isn’t showing a 502 now, so I’m assuming you’re doing something else with the server at the moment.

I’m not fully aware of the bug referenced above. Unless another member of the team is able to chip in over the weekend, the quickest way to get back up and running is probably to swap DNS over to a fresh droplet, do a fresh install there, migrate any app.yml changes, and then restore your most recent backup.


(Quim Gil) #11

Since the log seems to complain about a password during a database migration (?), I wonder whether any character in that password causes the problem. For instance, next to the “T” the log complains about there is a \ backslash. It sounds silly, but then again a syntax error pointing to a password is kind of silly. :slight_smile:
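To illustrate how a character inside a stored value can surface as a PostgreSQL syntax error, here is a minimal Ruby sketch. It assumes the INSERT is built by plain string interpolation, which is a guess and not confirmed anywhere in this thread; the helper names and the sample secret are made up. With PostgreSQL's default `standard_conforming_strings = on`, a backslash is an ordinary character inside a `'...'` literal, so a single quote in the value may be a likelier trigger than the backslash, but that is also only a hypothesis.

```ruby
# Hypothetical helper: builds the statement by naive string interpolation.
# This mimics the shape of the statement in the log; it is NOT Discourse's code.
def naive_insert(value)
  "INSERT INTO site_settings(name, data_type, value, created_at, updated_at) " \
    "VALUES ('sso_provider_secrets', 8, '#{value}', now(), now())"
end

# Standard SQL escaping for a string literal: double every single quote.
def quote_literal(value)
  "'" + value.gsub("'", "''") + "'"
end

# Same statement, but the value goes through quote_literal first.
def quoted_insert(value)
  "INSERT INTO site_settings(name, data_type, value, created_at, updated_at) " \
    "VALUES ('sso_provider_secrets', 8, #{quote_literal(value)}, now(), now())"
end

# Made-up secret containing a single quote right before a "T".
secret = "abc'Tdef"

# The literal ends after "abc", so the parser meets a bare T next.
puts naive_insert(secret)

# The doubled quote keeps the whole secret inside one literal.
puts quoted_insert(secret)
```

In the first printed statement the string literal closes right before the `T`, which is the kind of position a "syntax error at or near" message would point to. Bound parameters would avoid the problem entirely; manual quote doubling is shown only because it needs no database connection.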

It’s late at night in Europe. The next step for me is to go to sleep hoping that magically everything works when I wake up :innocent: or that someone has left great advice here.


(Quim Gil) #12

Good morning. Unsurprisingly, the site is still broken.

I tried accessing the database through the Rails console to remove that SSO password, but rails c goes nowhere because

Discourse requires Ruby 2.5.2 or up

If that Ruby is the one in the container, then I don’t know how to update that Ruby. Because rebuild app breaks, that won’t do it.

So I’m stuck.

Here is the entire log produced by ./launcher rebuild app: [Ruby] root@confederac:/var/discourse# ./launcher rebuild app Ensuring launcher is up - Pastebin.com

This Discourse installation is one year old. We have been upgrading regularly (for every beta) via the web interface, and also from the console when it was required. We never had a problem, until this one reported 9 days ago.


(Stephen) #13

Ok, can you post a sanitised copy of your app.yml? just censor out any passwords and API keys.


(Quim Gil) #14

Here it is. One plugin repo is commented out, just to reduce potential causes of problems. Usually it is enabled.


## this is the all-in-one, standalone Discourse Docker container template
##
## After making changes to this file, you MUST rebuild
## /var/discourse/launcher rebuild app
##
## BE *VERY* CAREFUL WHEN EDITING!
## YAML FILES ARE SUPER SUPER SENSITIVE TO MISTAKES IN WHITESPACE OR ALIGNMENT!
## visit http://www.yamllint.com/ to validate this file as needed

templates:
  - "templates/postgres.template.yml"
  - "templates/redis.template.yml"
  - "templates/web.template.yml"
  - "templates/web.ratelimited.template.yml"
## Uncomment these two lines if you wish to add Lets Encrypt (https)
  - "templates/web.ssl.template.yml"
  - "templates/web.letsencrypt.ssl.template.yml"

## which TCP/IP ports should this container expose?
## If you want Discourse to share a port with another webserver like Apache or nginx,
## see https://meta.discourse.org/t/17247 for details
expose:
  - "80:80"   # http
  - "443:443" # https

params:
  db_default_text_search_config: "pg_catalog.english"

  ## Set db_shared_buffers to a max of 25% of the total memory.
  ## will be set automatically by bootstrap based on detected RAM, or you can override
  db_shared_buffers: "128MB"

  ## can improve sorting performance, but adds memory usage per-connection
  #db_work_mem: "40MB"

  ## Which Git revision should this container use? (default: tests-passed)
  #version: tests-passed

env:
  LANG: en_US.UTF-8
  # DISCOURSE_DEFAULT_LOCALE: en

  ## How many concurrent web requests are supported? Depends on memory and CPU cores.
  ## will be set automatically by bootstrap based on detected CPUs, or you can override
  UNICORN_WORKERS: 2

  ## TODO: The domain name this Discourse instance will respond to
  DISCOURSE_HOSTNAME: la.confederac.io

  ## Uncomment if you want the container to be started with the same
  ## hostname (-h option) as specified above (default "$hostname-$config")
  #DOCKER_USE_HOSTNAME: true

  ## TODO: List of comma delimited emails that will be made admin and developer
  ## on initial signup example 'user1@example.com,user2@example.com'
  DISCOURSE_DEVELOPER_EMAILS: '*****@confederac.io'

  ## TODO: The SMTP mail server used to validate new accounts and send notifications
  DISCOURSE_SMTP_ADDRESS: ****************
  DISCOURSE_SMTP_PORT: 587
  DISCOURSE_SMTP_USER_NAME: ******************
  DISCOURSE_SMTP_PASSWORD: "********************"
  #DISCOURSE_SMTP_ENABLE_START_TLS: true           # (optional, default true)

  ## If you added the Lets Encrypt template, uncomment below to get a free SSL certificate
  LETSENCRYPT_ACCOUNT_EMAIL: *******@confederac.io

  ## The CDN address for this Discourse instance (configured to pull)
  ## see https://meta.discourse.org/t/14857 for details
  #DISCOURSE_CDN_URL: //discourse-cdn.example.com

## The Docker container is stateless; all data is stored in /shared
volumes:
  - volume:
      host: /var/discourse/shared/standalone
      guest: /shared
  - volume:
      host: /var/discourse/shared/standalone/log/var-log
      guest: /var/log

## Plugins go here
## see https://meta.discourse.org/t/19157 for details
hooks:
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - git clone https://github.com/discourse/docker_manager.git
##          - git clone https://github.com/xrav3nz/discourse-wellfed.git

## Any custom commands to run after building
run:
  - exec: echo "Beginning of custom commands"
  ## If you want to set the 'From' email address for your first registration, uncomment and change:
  ## After getting the first signup email, re-comment the line. It only needs to run once.
  #- exec: rails r "SiteSetting.notification_email='info@unconfigured.discourse.org'"
  - exec: echo "End of custom commands"
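Since the template warns that YAML is sensitive to whitespace, a quick local syntax check can rule out indentation slips before a rebuild. The following is a minimal sketch using Ruby's standard `yaml` library; `yaml_error` is a hypothetical helper and both fragments are made-up examples, not the real app.yml. It only checks syntax and says nothing about whether the launcher accepts the settings.

```ruby
require "yaml"

# Hypothetical helper: returns nil when the text parses as YAML,
# otherwise the parser's error message.
def yaml_error(text)
  YAML.safe_load(text)
  nil
rescue Psych::SyntaxError => e
  e.message
end

# Made-up fragment with consistent two-space indentation under env.
good = <<~YAML
  env:
    LANG: en_US.UTF-8
    DISCOURSE_HOSTNAME: discourse.example.com
YAML

# Same fragment, but the second key is indented one space too far.
bad = <<~YAML
  env:
    LANG: en_US.UTF-8
     DISCOURSE_HOSTNAME: discourse.example.com
YAML

puts "good fragment: #{yaml_error(good).inspect}"
puts "bad fragment:  #{yaml_error(bad).inspect}"
```

To check a real file, the same helper could be called as `yaml_error(File.read("containers/app.yml"))`; that path is an assumption based on the default /var/discourse layout.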

(Stephen) #15

It’s a long shot, because launcher technically does it, but can you:

cd /var/discourse
git pull
./launcher rebuild app  

Please let me know if git pull results in anything other than ‘already up to date’.


(Quim Gil) #16

/var/discourse# git pull
Already up-to-date.

No changes.


(Quim Gil) #17

I hope someone can advise us today about the errors seen in the rebuild app process. Otherwise we will indeed have to set up another server, install a fresh Discourse and restore the latest backup.

Not the end of the world, but it feels like overkill for what could be a bug caused by a process getting stuck on a backslash in a password…


(Quim Gil) #18

https://la.confederac.io now works on a new server with the latest version. The backup worked perfectly. I still don’t know what caused this db:migrate problem or whether we will trigger this bug again. However, lesson learned: from now on we will take a snapshot of the server before upgrading, just in case.

If for testing purposes the Discourse team is interested in the SSO password that I edited in the logs published here, I can send it to you.

Thank you @Stephen for putting some time into this during the weekend. Very much appreciated.


(Stephen) #19

Glad to hear you’re back up and running. Yep, it’s an odd one. I imagine the team will be interested if there are any more reports; more data is always useful.