Error when installing Discourse on CentOS


(Flaviu) #1

Hello,

I am trying to install a new Discourse on a clean server, but I get this error:

I, [2018-10-11T04:28:17.965799 #14]  INFO -- : > cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate'
2018-10-11 04:28:24.160 UTC [365] discourse@discourse ERROR:  relation "users" does not exist at character 566
2018-10-11 04:28:24.160 UTC [365] discourse@discourse STATEMENT:                SELECT a.attname, format_type(a.atttypid, a.atttypmod),
                             pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod,
                             c.collname, col_description(a.attrelid, a.attnum) AS comment
                        FROM pg_attribute a
                        LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum
                        LEFT JOIN pg_type t ON a.atttypid = t.oid
                        LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation
                       WHERE a.attrelid = '"users"'::regclass
                         AND a.attnum > 0 AND NOT a.attisdropped
                       ORDER BY a.attnum

2018-10-11 04:28:35.564 UTC [374] discourse@discourse LOG:  duration: 100.559 ms  statement: ALTER TABLE "users" ADD "staged" boolean DEFAULT FALSE NOT NULL

This is definitely a bug: I tried on multiple servers with different OS versions.

It is probably something in the current version of Discourse.

It always ends with this error message:

I, [2018-10-11T07:15:34.658740 #14]  INFO -- : Terminating async processes
I, [2018-10-11T07:15:34.658882 #14]  INFO -- : Sending INT to HOME=/var/lib/postgresql USER=postgres exec chpst -u postgres:postgres:ssl-cert -U postgres:postgres:ssl-cert /usr/lib/postgresql/10/bin/postmaster -D /etc/postgresql/10/main pid: 84
I, [2018-10-11T07:15:34.659052 #14]  INFO -- : Sending TERM to exec chpst -u redis -U redis /usr/bin/redis-server /etc/redis/redis.conf pid: 200
200:signal-handler (1539242134) Received SIGTERM scheduling shutdown...
2018-10-11 07:15:34.659 UTC [84] LOG:  received fast shutdown request
2018-10-11 07:15:34.680 UTC [84] LOG:  aborting any active transactions
2018-10-11 07:15:34.685 UTC [84] LOG:  worker process: logical replication launcher (PID 93) exited with exit code 1
2018-10-11 07:15:34.688 UTC [88] LOG:  shutting down
200:M 11 Oct 07:15:34.758 # User requested shutdown...
200:M 11 Oct 07:15:34.758 * Saving the final RDB snapshot before exiting.
200:M 11 Oct 07:15:34.787 * DB saved on disk
200:M 11 Oct 07:15:34.787 # Redis is now ready to exit, bye bye...
2018-10-11 07:15:34.873 UTC [84] LOG:  database system is shut down


FAILED
--------------------
Pups::ExecError: cd /var/www/discourse && su discourse -c 'bundle exec rake assets:precompile' failed with return #<Process::Status: pid 1429 exit 1>
Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"bundle_exec", "cmd"=>["su discourse -c 'bundle install --deployment --verbose --without test --without development --retry 3 --jobs 4'", "su discourse -c 'bundle exec rake db:migrate'", "su discourse -c 'bundle exec rake assets:precompile'"]}
6e330632df0aef58e54069e5041a77d3624c8f870d1a7939cb738df4b2c6adba
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one

(Jeff Atwood) #2

Possible, @tgxworld?


(Flaviu) #3

@codinghorror, is there any option to install an older version of Discourse? My community has been down for more than 12 hours. Maybe there is an alternative that I can try.


(Stephen) #4

I’m just testing a fresh install of the currently available build now.

@UnivacTwo to confirm, your title says that this is a fresh install, but your most recent reply says you have a community down for more than 12 hours. Can you please clarify if this is a fresh install, or an existing community?


(Flaviu) #5

It is a fresh install because the old server died, but the community is a few months old. The problem is that I cannot install Discourse on a new server to restore our backup.


(Stephen) #6

To confirm as of 4:47PM, the error:

I, [2018-10-11T15:37:48.991417 #14]  INFO -- : > cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate'
2018-10-11 15:37:59.592 UTC [374] discourse@discourse ERROR:  relation "users" does not exist at character 566
2018-10-11 15:37:59.592 UTC [374] discourse@discourse STATEMENT:                SELECT a.attname, format_type(a.atttypid, a.atttypmod),
	                     pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod,
	                     c.collname, col_description(a.attrelid, a.attnum) AS comment
	                FROM pg_attribute a
	                LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum
	                LEFT JOIN pg_type t ON a.atttypid = t.oid
	                LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation
	               WHERE a.attrelid = '"users"'::regclass
	                 AND a.attnum > 0 AND NOT a.attisdropped
	               ORDER BY a.attnum

Does appear during an install, but the build succeeds.

That’s the full recommended install, with domain name specified (IP address based install not supported), SMTP details provided, and Let’s Encrypt enabled.

  • Did you add any files to your install between git pull and ./discourse-setup?
  • Did you add any plugins before completing the build at least once?

(Flaviu) #7

Definitely there is something strange.

I used this guide, https://github.com/discourse/discourse/blob/master/docs/INSTALL-cloud.md, on a CentOS 7.5 minimal install with kernel 4.4.

1) The install script doesn’t see the correct IP of the server:
Checking your domain name . . .
WARNING:: This server does not appear to be accessible at community.lumminary.com:443.

        A connection to http://community.lumminary.com (port 80) also fails.

        This suggests that community.lumminary.com resolves to the wrong IP address
        or that traffic is not being routed to your server.

        Google: "open ports YOUR CLOUD SERVICE" for information for resolving this problem.

        You should probably answer "n" at the next prompt and disable Let's Encrypt.

        This test might not work for all situations,
        so if you can access Discourse at http://community.lumminary.com, you might try anyway.

But the DNS is correctly configured:

 [root@cm-wb01 discourse]# ifconfig
        docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
                inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
                inet6 fe80::42:3aff:fe97:ad0e  prefixlen 64  scopeid 0x20<link>
                ether 02:42:3a:97:ad:0e  txqueuelen 0  (Ethernet)
                RX packets 535  bytes 33082 (32.3 KiB)
                RX errors 0  dropped 0  overruns 0  frame 0
                TX packets 711  bytes 7901227 (7.5 MiB)
                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

        eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
                inet 159.69.184.75  netmask 255.255.255.255  broadcast 159.69.184.75
                inet6 2a01:4f8:c010:183e::1  prefixlen 64  scopeid 0x0<global>
                inet6 fe80::9400:ff:fe11:dd55  prefixlen 64  scopeid 0x20<link>
                ether 96:00:00:11:dd:55  txqueuelen 1000  (Ethernet)
                RX packets 628121  bytes 934834163 (891.5 MiB)
                RX errors 0  dropped 0  overruns 0  frame 0
                TX packets 24331  bytes 2170517 (2.0 MiB)
                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

        lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
                inet 127.0.0.1  netmask 255.0.0.0
                inet6 ::1  prefixlen 128  scopeid 0x10<host>
                loop  txqueuelen 1  (Local Loopback)
                RX packets 46  bytes 3016 (2.9 KiB)
                RX errors 0  dropped 0  overruns 0  frame 0
                TX packets 46  bytes 3016 (2.9 KiB)
                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

        [root@cm-wb01 discourse]# ping community.lumminary.com
        PING community.lumminary.com (159.69.184.75) 56(84) bytes of data.
        64 bytes from static.75.184.69.159.clients.your-server.de (159.69.184.75): icmp_seq=1 ttl=64 time=0.089 ms
        64 bytes from static.75.184.69.159.clients.your-server.de (159.69.184.75): icmp_seq=2 ttl=64 time=0.074 ms

2) After I configure Discourse, I am asked again to reconfigure Discourse. Please see below.

 WARNING: We are about to start downloading the Discourse base image
    This process may take anywhere between a few minutes to an hour, depending on your network speed

    Please be patient

    Unable to find image 'discourse/base:2.0.20181010' locally
    2.0.20181010: Pulling from discourse/base
    3b37166ec614: Pulling fs layer
    504facff238f: Pulling fs layer
    ebbcacd28e10: Pulling fs layer
    c7fb3351ecad: Pulling fs layer
    2e3debadcbf7: Pulling fs layer
    821ef8792d96: Pulling fs layer
    c7fb3351ecad: Waiting
    2e3debadcbf7: Waiting
    821ef8792d96: Waiting
    ebbcacd28e10: Verifying Checksum
    ebbcacd28e10: Download complete
    504facff238f: Download complete
    3b37166ec614: Verifying Checksum
    3b37166ec614: Download complete
    c7fb3351ecad: Verifying Checksum
    c7fb3351ecad: Download complete
    2e3debadcbf7: Verifying Checksum
    2e3debadcbf7: Download complete
    3b37166ec614: Pull complete
    504facff238f: Pull complete
    ebbcacd28e10: Pull complete
    c7fb3351ecad: Pull complete
    2e3debadcbf7: Pull complete
    821ef8792d96: Verifying Checksum
    821ef8792d96: Download complete
    821ef8792d96: Pull complete
    Digest: sha256:98eb4ece77b665cd0f29f64d8d0281e365efdedfc8946e6f85c4cf80187e3734
    Status: Downloaded newer image for discourse/base:2.0.20181010
    app was not started !

    Found 3GB of memory and 2 physical CPU cores
    setting db_shared_buffers = 768MB
    setting UNICORN_WORKERS = 4
    containers/app.yml memory parameters updated.

    Hostname for your Discourse? [community.lumminary.com]: community.lumminary.com
    Email address for admin account(s)? [email@lumminary.com]:
    SMTP server address? [smtp.sendgrid.net]:
    SMTP port? [587]:
    SMTP user name? [apikey]:
    SMTP password? password
    Optional email address for setting up Let's Encrypt? (Enter 'OFF' to disable.) [email@lumminary.com]:

    Checking your domain name . . .
    WARNING:: This server does not appear to be accessible at community.lumminary.com:443.

    A connection to http://community.lumminary.com (port 80) also fails.

    This suggests that community.lumminary.com resolves to the wrong IP address
    or that traffic is not being routed to your server.

    Google: "open ports YOUR CLOUD SERVICE" for information for resolving this problem.

    You should probably answer "n" at the next prompt and disable Let's Encrypt.

    This test might not work for all situations,
    so if you can access Discourse at http://community.lumminary.com, you might try anyway.

    Does this look right?

    Hostname      : community.lumminary.com
    Email         : email@lumminary.com
    SMTP address  : smtp.sendgrid.net
    SMTP port     : 587
    SMTP username : apikey
    SMTP password : password
    Let's Encrypt : email@lumminary.com

    ENTER to continue, 'n' to try again, Ctrl+C to exit:
    Let's Encrypt will be enabled for flaviu.radulescu@lumminary.com
    web.ssl.template.yml enabled
    letsencrypt.ssl.template.yml enabled

    Configuration file at  updated successfully!

3) Right after this, I get some strange error messages:

Updates successful. Rebuilding in 5 seconds.
Building app
which: no docker.io in (/sbin:/bin:/usr/sbin:/usr/bin)
Ensuring launcher is up to date
Fetching origin
fatal: ambiguous argument '@': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git […] -- […]'
fatal: Not a valid object name @
./launcher: line 701: [: @: unary operator expected
./launcher: line 711: [: e0bbec1fb97f62c55c0036d68b16bdfefbadc5f1: unary operator expected
Launcher has diverged source, this is only expected in Dev mode

4) After a few minutes of installation, everything ends with a fatal failure:

I, [2018-10-11T16:05:10.262839 #14]  INFO -- : Terminating async processes
I, [2018-10-11T16:05:10.262918 #14]  INFO -- : Sending INT to HOME=/var/lib/postgresql USER=postgres exec chpst -u postgres:postgres:ssl-cert -U postgres:postgres:ssl-cert /usr/lib/postgresql/10/bin/postmaster -D /etc/postgresql/10/main pid: 84
I, [2018-10-11T16:05:10.263019 #14]  INFO -- : Sending TERM to exec chpst -u redis -U redis /usr/bin/redis-server /etc/redis/redis.conf pid: 200
2018-10-11 16:05:10.263 UTC [84] LOG:  received fast shutdown request
200:signal-handler (1539273910) Received SIGTERM scheduling shutdown...
200:M 11 Oct 16:05:10.274 # User requested shutdown...
200:M 11 Oct 16:05:10.274 * Saving the final RDB snapshot before exiting.
2018-10-11 16:05:10.275 UTC [84] LOG:  aborting any active transactions
2018-10-11 16:05:10.283 UTC [84] LOG:  worker process: logical replication launcher (PID 93) exited with exit code 1
200:M 11 Oct 16:05:10.288 * DB saved on disk
200:M 11 Oct 16:05:10.288 # Redis is now ready to exit, bye bye...
2018-10-11 16:05:10.292 UTC [88] LOG:  shutting down
2018-10-11 16:05:10.491 UTC [84] LOG:  database system is shut down


FAILED
--------------------
Pups::ExecError: cd /var/www/discourse && su discourse -c 'bundle exec rake assets:precompile' failed with return #<Process::Status: pid 1429 exit 1>
Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"bundle_exec", "cmd"=>["su discourse -c 'bundle install --deployment --verbose --without test --without development --retry 3 --jobs 4'", "su discourse -c 'bundle exec rake db:migrate'", "su discourse -c 'bundle exec rake assets:precompile'"]}
9b1e759da9418b45031ddb63fbaec5832e631c4f8af50f721e3aaf9f3c2e0424
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one

If you think it may help your investigation, I can give you access to a machine to play with. Please just let me know.
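
On point 3: the two `unary operator expected` lines look like the usual symptom of an unquoted shell variable that expanded to nothing inside `[ ... ]`, which would fit `git` failing just before (possibly an older git that does not understand `@` as shorthand for `HEAD`). I have not checked the launcher source, so the sketch below only reproduces the symptom. `LOCAL` and `REMOTE` are hypothetical names, not taken from `./launcher`.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for values a script might capture from git.
# An empty REMOTE mimics a git command that failed and printed nothing.
LOCAL="e0bbec1fb97f62c55c0036d68b16bdfefbadc5f1"
REMOTE=""

# Unquoted: the empty variable disappears during word splitting, so the test
# becomes `[ <hash> = ]`. The shell prints "unary operator expected" and the
# test exits with status 2.
unquoted_compare() {
  [ $LOCAL = $REMOTE ]
}

# Quoted: both operands are always present, so the comparison is well-formed
# and simply returns 1 (not equal) when one side is empty.
quoted_compare() {
  [ "$LOCAL" = "$REMOTE" ]
}

status=0; unquoted_compare || status=$?
echo "unquoted exit status: $status"

status=0; quoted_compare || status=$?
echo "quoted exit status: $status"
```

If that is what is happening, the messages are a side effect of the git failure rather than a separate problem.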


(Stephen) #8

You’re using Cloudflare. Is the orange cloud turned on for the community A record? If so, turn that off: it will prevent Let’s Encrypt from enrolling a certificate.

Also, your server is at Hetzner - have you opened up :80 and :443?
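
A ping run on the server itself only shows that the name resolves, not that the ports are reachable from outside. A minimal sketch for checking the two ports the setup script complained about, assuming bash and coreutils `timeout` are available on the machine you test from; `port_open` and `report` are names made up for this example.

```shell
#!/usr/bin/env bash
# port_open HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection to HOST:PORT succeeds within the timeout.
# Relies on bash's /dev/tcp pseudo-device, so it needs no extra tools.
port_open() {
  local host=$1 port=$2 secs=${3:-5}
  timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# report HOST
# Prints one line for each of ports 80 and 443.
report() {
  local host=$1 port
  for port in 80 443; do
    if port_open "$host" "$port"; then
      echo "$host:$port reachable"
    else
      echo "$host:$port NOT reachable"
    fi
  done
}

# Example, run from a machine outside the server's network:
#   report community.lumminary.com
```

Run it from somewhere other than the server, otherwise a firewall in front of the machine will not show up in the result.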

There are lots of threads here which document the pain of installing on CentOS. You said you’ve followed the cloud install guide, but its first bullet point recommends using Ubuntu. You’re making your life unnecessarily difficult here.

If you want to pay someone to troubleshoot this I would recommend posting in #marketplace. That’s really the only option if you want someone to do it for you.


(Flaviu) #9

It is off.


(Flaviu) #10

@Stephen I have Ansible scripts which I used in the past to deploy Discourse. I have installed it more than 50 times in the last 3 months and have never had this type of issue. I had it installed on CentOS before. Something changed, either in the source code or in some other Linux packages.


(Stephen) #11

I’m not sure what your point here is.

I’ve built a fresh server this afternoon and verified that the build process as it is outlined still works without incident.

Your assertion that you’ve used Ansible to do this previously is irrelevant.


(Flaviu) #12

@Stephen, I didn’t mean to be impolite or anything.

I was just trying to find a solution to this problem. I am sure I am not the only one using CentOS with Discourse. I have offered to give you access to a server on which Discourse is not working.

If there is a problem, I am sure other users will benefit from a fix without ever knowing there was a problem.

Regarding Ansible, I just wanted to point out that human error was not the cause, as I use Ansible scripts to deploy Discourse.

I could try with an older version of Discourse if somebody can point out how I can do this.


(Stephen) #13

The installation category is littered with install issues on various versions of CentOS.

You need to get to the bottom of why your server can’t be reached on those ports - it’s well outside the scope of what we support here.


(Flaviu) #14

@Stephen Just to let you know, I downgraded the kernel to 3.10 and now it is working. So there may be an incompatibility between the software used by Discourse and the new kernel. I will let you know if I figure out what is happening.

BTW, it helped me a lot to know that the code base is working, as I didn’t know where to start. I forgot to thank you for the testing. THANK YOU!


(Flaviu) #15

It looks like this is the cause of the problem

Errno::EACCES: Permission denied @ dir_s_mkdir - /var/www/discourse/tmp/cache/assets

It looks like this folder should be inside the container, but I don’t know what should create the folder, what permissions it should have, or how I can get inside the container when the container has not been started.
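
In case it helps narrow this down, here is a minimal sketch for finding which existing parent directory the `mkdir` would land in and whether the current user can write to it. It assumes a shell running as the same user as the rake task (`discourse` in the log); the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# first_existing_parent PATH
# Walks up from PATH until it reaches a directory that already exists.
first_existing_parent() {
  local p=$1
  while [ ! -d "$p" ] && [ "$p" != "/" ] && [ "$p" != "." ]; do
    p=$(dirname "$p")
  done
  echo "$p"
}

# explain_mkdir PATH
# Shows owner and mode of the nearest existing parent, then reports whether
# the current user is allowed to write to it.
explain_mkdir() {
  local parent
  parent=$(first_existing_parent "$1")
  ls -ld "$parent"
  if [ -w "$parent" ]; then
    echo "writable by $(id -un): $parent"
  else
    echo "NOT writable by $(id -un): $parent"
  fi
}

# Example:
#   explain_mkdir /var/www/discourse/tmp/cache/assets
```

A "NOT writable" result would at least say which directory has the wrong owner or mode, though not why it differs between the two kernels.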

Do you have some hints?


(Stephen) #16

From my earlier post:

Something about your install has to be different for this to fail, these are the obvious culprits.


(Flaviu) #17

No, absolutely nothing. Discourse is clean and based on your git repository: no plugins, nothing. The only difference between the server where it works and the one where it does not is the kernel. One has kernel 3.10, the other 4.4.

But I can’t see how this could affect it.


(Jeff Atwood) #18

I know @mpalmer has looked at newer kernels, any thoughts on possible issues Matt?


(Alan Tan) #19

The error is actually harmless but I’ll see if I can hunt down what is trying to connect to the users table even before the table has been created.


(Matt Palmer) #21

I know CentHat do a lot of patching to their kernels. Beyond that, I really couldn’t say. You’ve got a kernel that works now, stick with that if you can, otherwise you’ve got a fun* debugging session ahead of you.