Cannot access or rebuild Discourse container; Redis refuses connections


(Logomancer) #1

Hi. I was trying to restart a Discourse container to renew a Let’s Encrypt certificate – this was a manual process, before Discourse had support for it – and after I restarted the container, the site was down. I looked in the logs, and I found that Redis was refusing connections. Restarting the container did nothing. I tried rebuilding the container, and this is what I got:

I, [2016-07-27T02:36:46.769834 #15]  INFO -- : > cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate'
Failed to report error: Error connecting to Redis on localhost:6379 (Errno::ECONNREFUSED) 2 Error connecting to Redis on localhost:6379 (Errno::ECONNREFUSED) subscribe failed, reconnecting in 1 second. Call stack ["/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:345:in `rescue in establish_connection'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:331:in `establish_connection'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:101:in `block in connect'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:293:in `with_reconnect'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:100:in `connect'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:276:in `with_socket_timeout'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:133:in `call_loop'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/subscribe.rb:43:in `subscription'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/subscribe.rb:12:in `subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis.rb:2760:in `_subscription'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis.rb:2138:in `block in subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis.rb:58:in `block in synchronize'", "/usr/local/lib/ruby/2.3.0/monitor.rb:214:in `mon_synchronize'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis.rb:58:in `synchronize'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis.rb:2137:in `subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.1/lib/message_bus/backends/redis.rb:304:in `global_subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.1/lib/message_bus.rb:508:in 
`global_subscribe_thread'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.1/lib/message_bus.rb:456:in `block in new_subscriber_thread'"] 

Following advice from another thread, I tried rebooting the server, then rebuilding. Still nothing. When I start and enter the container, I can see the Redis server running, but it’s not accepting connections. I have no clue what to do here. Help?


(Jeff Atwood) #2

What is the output of docker version, and how much free memory and free disk space does the server have?
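
For anyone gathering the memory and disk figures on a Linux host without free or df at hand, here is a minimal Python sketch. It assumes /proc/meminfo is available (Linux only) and that the root filesystem is the one of interest; the function names parse_meminfo and report are made up for this example and are not part of Discourse or Docker.

```python
import shutil


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def report(meminfo_path="/proc/meminfo", disk_path="/"):
    """Return memory, swap and disk figures in MiB (Linux only; paths are assumptions)."""
    with open(meminfo_path) as fh:
        mem = parse_meminfo(fh.read())
    disk = shutil.disk_usage(disk_path)
    return {
        "mem_total_mib": mem.get("MemTotal", 0) // 1024,
        "mem_free_mib": mem.get("MemFree", 0) // 1024,
        "swap_total_mib": mem.get("SwapTotal", 0) // 1024,
        "swap_free_mib": mem.get("SwapFree", 0) // 1024,
        "disk_free_mib": disk.free // 2**20,
    }
```

Calling print(report()) on the host should give roughly the same numbers as free -m and df, though the two tools round and group buffers/cache differently.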


(Logomancer) #3

docker version:

Client:
 Version:      1.9.1
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   a34a1d5
 Built:        Fri Nov 20 13:16:54 UTC 2015
 OS/Arch:      linux/amd64

free -m:

             total       used       free     shared    buffers     cached
Mem:           993        439        553          0         63        197
-/+ buffers/cache:        178        814
Swap:         1023          0       1023

df -h:

Filesystem      Size  Used Avail Use% Mounted on
udev            486M     0  486M   0% /dev
tmpfs           100M  512K   99M   1% /run
/dev/vda1        30G  9.0G   20G  33% /
tmpfs           497M     0  497M   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           497M     0  497M   0% /sys/fs/cgroup
tmpfs           100M     0  100M   0% /run/user/1000

(Jeff Atwood) #4

You should increase swap to 2 GB just to be safe. What happens when you do this?

cd /var/discourse
git pull
./launcher rebuild app

(Matt Palmer) #5

If Redis is running inside the container, what does netstat -ltnp look like? Specifically, the line for port 6379.


(Logomancer) #6

It’s running in a VM, so I’m not sure I can increase swap. Doing a git pull before rebuilding just tells me that the repo is already up to date, and rebuilding reproduces the same error.

netstat -ltnp (outside container, ssh ports removed):

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp6       0      0 :::80                   :::*                    LISTEN      12202/docker-proxy
tcp6       0      0 :::443                  :::*                    LISTEN      12193/docker-proxy   
tcp6       0      0 :::2222                 :::*                    LISTEN      12211/docker-proxy

netstat -ltnp (inside container):

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      -               
tcp6       0      0 :::6379                 :::*                    LISTEN      -

(Matt Palmer) #7

Well, I’m out of ideas then. Redis is running inside the container, and listening on the port, so whatever’s getting in the way is something weird and unusual, like firewall rules, that you haven’t mentioned.
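
A listening socket in netstat does not by itself prove that a client can connect, so one extra check is an actual TCP connection plus a PING, run from inside the container. The sketch below is only that: it assumes Python 3 is available where you run it, that Redis is on localhost:6379 as in the logs above, and that no Redis password is set. The function name redis_ping is made up for this example.

```python
import socket


def redis_ping(host="localhost", port=6379, timeout=2.0):
    """Open a TCP connection, send an inline PING, and return (ok, detail).

    ok is True only when the server answers +PONG. A refused or timed-out
    connection is reported in detail instead of raising.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"PING\r\n")  # Redis accepts inline commands
            reply = sock.recv(64)
    except OSError as exc:  # covers ConnectionRefusedError and timeouts
        return False, f"{type(exc).__name__}: {exc}"
    return reply.startswith(b"+PONG"), reply.decode("ascii", "replace").strip()


if __name__ == "__main__":
    ok, detail = redis_ping()
    print("reachable" if ok else "not reachable", "-", detail)
```

If this reports ConnectionRefusedError while netstat shows the port in LISTEN state, the check is probably being run in a different network namespace (for example on the host rather than in the container) than the one Redis is listening in.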


(Sam Saffron) #8

What does your app.yml file look like? What is the full log during rebuild?


(Logomancer) #9

app.yml:

##
## After making changes to this file, you MUST rebuild for any changes
## to take effect in your live Discourse instance:
##
## /var/discourse/launcher rebuild app
##
## Make sure to obey YAML syntax! You can use this site to help check:
## http://www.yamllint.com/

## this is the all-in-one, standalone Discourse Docker container template

# You may add rate limiting by uncommenting the web.ratelimited template.
# Out of the box it allows 12 reqs a second per ip, and 100 per minute per ip
# This is configurable by amending the params in this file

templates:
  - "templates/postgres.template.yml"
  - "templates/web.template.yml"
  - "templates/web.ssl.template.yml"
  - "templates/web.letsencrypt.ssl.template.yml"
  - "templates/redis.template.yml"
  - "templates/web.ratelimited.template.yml"

## which TCP/IP ports should this container expose?
expose:
  - "80:80"   # fwd host port 80   to container port 80 (http)
  - "2222:22" # fwd host port 2222 to container port 22 (ssh)
  - "443:443" # forward ssl traffic

# any extra arguments for Docker?
# docker_args:

params:
  db_default_text_search_config: "pg_catalog.english"

  ## Set db_shared_buffers to a max of 25% of the total memory.
  ##
  ## On 1GB installs set to 128MB (to leave room for other processes)
  ## on a 4GB instance you may raise to 1GB
  #db_shared_buffers: "128MB"
  #
  ## Set higher on large instances it defaults to 10MB, for a 3GB install 40MB is a good default
  ## this improves sorting performance, but adds memory usage per-connection
  #db_work_mem: "40MB"
  #
  ## Which Git revision should this container use? (default: tests-passed)
  #version: tests-passed

env:
  LANG: en_US.UTF-8
  # DISCOURSE_DEFAULT_LOCALE: en

  ## TODO: How many concurrent web requests are supported?
  ## With 2GB we recommend 3-4 workers, with 1GB only 2
  ## If you have lots of memory, use one or two workers per logical CPU core
  #UNICORN_WORKERS: 2

  ## TODO: List of comma delimited emails that will be made admin and developer
  ## on initial signup example 'user1@example.com,user2@example.com'
  DISCOURSE_DEVELOPER_EMAILS: [redacted]

  ## TODO: The domain name this Discourse instance will respond to
  DISCOURSE_HOSTNAME: [redacted]

  ## TODO: The mailserver this Discourse instance will use
  DISCOURSE_SMTP_ADDRESS: [redacted]
  DISCOURSE_SMTP_PORT: 587
  DISCOURSE_SMTP_USER_NAME: [redacted]
  DISCOURSE_SMTP_PASSWORD: [redacted]
  DISCOURSE_SMTP_ENABLE_START_TLS: true

  ## The CDN address for this Discourse instance (configured to pull)
  #DISCOURSE_CDN_URL: //discourse-cdn.example.com

  LETSENCRYPT_ACCOUNT_EMAIL: [redacted]

## These containers are stateless, all data is stored in /shared
volumes:
  - volume:
      host: /var/discourse/shared/standalone
      guest: /shared
  - volume:
      host: /var/discourse/shared/standalone/log/var-log
      guest: /var/log

## The docker manager plugin allows you to one-click upgrade Discourse
## http://discourse.example.com/admin/docker
hooks:
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - git clone https://github.com/discourse/docker_manager.git
          - git clone https://github.com/discourse/discourse-tagging.git


## Remember, this is YAML syntax - you can only have one block with a name
run:
  - exec: echo "Beginning of custom commands"

  ## If you want to set the 'From' email address for your first registration, uncomment and change:
  #- exec: rails r "SiteSetting.notification_email='info@unconfigured.discourse.org'"
  ## After getting the first signup email, re-comment the line. It only needs to run once.

  ## If you want to configure password login for root, uncomment and change:
  ## Use only one of the following lines:
  #- exec: /usr/sbin/usermod -p 'PASSWORD_HASH' root
  #- exec: /usr/sbin/usermod -p "$(mkpasswd -m sha-256 'RAW_PASSWORD')" root

  ## If you want to authorize additional users, uncomment and change:
  #- exec: ssh-import-id username
  #- exec: ssh-import-id anotherusername

  - exec: echo "End of custom commands"
  - exec: awk -F\# '{print $1;}' ~/.ssh/authorized_keys | awk 'BEGIN { print "Authorized SSH keys for this container:"; } NF>=2 {print $NF;}'

(Logomancer) #10

And here is the rebuild log.


(Sam Saffron) #11

Yeah, the redis and postgres templates must come before the web templates in the templates section of that yml file.

See the original sample in the samples folder.
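
The ordering rule can be checked mechanically. This is a rough Python sketch of the rule exactly as stated in this post and nothing more: it takes the templates list as plain strings (so no YAML library is needed), and it assumes that any template whose file name starts with "web." counts as a web template. The function name template_order_problems is made up for this example and is not part of the launcher.

```python
def template_order_problems(templates):
    """Return messages for postgres/redis templates listed after a web template."""

    def kind(path):
        # "templates/web.ssl.template.yml" -> "web"
        return path.rsplit("/", 1)[-1].split(".", 1)[0]

    kinds = [kind(t) for t in templates]
    web_positions = [i for i, k in enumerate(kinds) if k == "web"]
    if not web_positions:
        return []
    first_web = min(web_positions)

    problems = []
    for i, k in enumerate(kinds):
        if k in ("postgres", "redis") and i > first_web:
            problems.append(f"{templates[i]} is listed after {templates[first_web]}")
    return problems


if __name__ == "__main__":
    # The list from the app.yml posted above.
    posted = [
        "templates/postgres.template.yml",
        "templates/web.template.yml",
        "templates/web.ssl.template.yml",
        "templates/web.letsencrypt.ssl.template.yml",
        "templates/redis.template.yml",
        "templates/web.ratelimited.template.yml",
    ]
    for problem in template_order_problems(posted):
        print(problem)
```

Run against the list posted above, it flags the redis template, which sits after templates/web.template.yml.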


(Logomancer) #12

Well, don’t I feel like an idiot. Rebuild succeeded, but Redis is still refusing connections. This appears in production.log twice:

Error connecting to Redis on localhost:6379 (Errno::ECONNREFUSED) subscribe failed, reconnecting in 1 second. Call stack ["/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:345:in `rescue in establish_connection'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:331:in `establish_connection'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:101:in `block in connect'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:293:in `with_reconnect'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:100:in `connect'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:364:in `ensure_connected'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:221:in `block in process'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:306:in `logging'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:220:in `process'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:134:in `block in call_loop'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:280:in `with_socket_timeout'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:133:in `call_loop'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/subscribe.rb:43:in `subscription'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/subscribe.rb:12:in `subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis.rb:2760:in `_subscription'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis.rb:2138:in `block in subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis.rb:58:in `block in synchronize'", "/usr/local/lib/ruby/2.3.0/monitor.rb:214:in 
`mon_synchronize'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis.rb:58:in `synchronize'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis.rb:2137:in `subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.1/lib/message_bus/backends/redis.rb:304:in `global_subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.1/lib/message_bus.rb:508:in `global_subscribe_thread'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.1/lib/message_bus.rb:456:in `block in new_subscriber_thread'"]
Job exception: Error connecting to Redis on localhost:6379 (Errno::ECONNREFUSED)

production_errors.log is empty.


(Logomancer) #13

Any more ideas? I’m really scratching my head here. Nothing makes sense.


(Sam Saffron) #14

Look in the logs. Did Redis actually start?

Maybe you are out of disk space.


(Logomancer) #15

No, I fixed it. It was an issue with the SSL certificate renewal. It choked, which in turn choked nginx. Fun times. Thanks for the help!


(Lobaczewski) #17

We have the same problem.

We followed the installation procedures here:
discourse/INSTALL-cloud.md at master · discourse/discourse · GitHub

We did the installation on an Amazon EC2 instance via Docker.

In /var/discourse/shared/standalone/log/rails/production.log (opened with vim):

Error connecting to Redis on localhost:6379 (Errno::ECONNREFUSED) subscribe failed, reconnecting in 1 second.
Call stack [
"/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:345:in `rescue in establish_connection'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:331:in `establish_connection'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:101:in `block in connect'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:293:in `with_reconnect'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:100:in `connect'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:364:in `ensure_connected'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:221:in `block in process'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:306:in `logging'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:220:in `process'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:120:in `call'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis.rb:862:in `block in get'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis.rb:58:in `block in synchronize'", "/usr/local/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis.rb:58:in `synchronize'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis.rb:861:in `get'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.0.2/lib/message_bus/backends/redis.rb:252:in `process_global_backlog'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.0.2/lib/message_bus/backends/redis.rb:287:in `block in global_subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.0.2/lib/message_bus/backends/redis.rb:301:in `global_subscribe'", 
"/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.0.2/lib/message_bus.rb:513:in `global_subscribe_thread'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.0.2/lib/message_bus.rb:461:in `block in new_subscriber_thread'"]

This is very strange; we have been racking our brains without finding a solution.

Any help would be greatly appreciated.


(Matt Palmer) #18

Have you gone through the suggestions made previously? What were the results?


(Lobaczewski) #19

In our case, we discovered that the firewall was blocking outgoing email, so new accounts could not confirm their registration. We solved it by adjusting the firewall rules. Thank you for the help.


(Kai Middleton) #20

I’m having the same issue. Interestingly, we’ve been able to set Discourse up on two other servers, but not production. (All three of these are on AWS.) I do notice the following on production:

root@ip-172-31-30-132-app:/var/www/discourse# sudo netstat -ltnp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      -               
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      44/nginx        
tcp        0      0 0.0.0.0:3000            0.0.0.0:*               LISTEN      -               
tcp        0      0 0.0.0.0:5432            0.0.0.0:*               LISTEN      -               
tcp6       0      0 :::6379                 :::*                    LISTEN      -               
tcp6       0      0 :::5432                 :::*                    LISTEN      -  

Why is there an internal nginx listening on port 80? The other two instances don’t have that.

In app.yml I’m not exposing any ports:
expose:
#- "80:80" # http
#- "443:443" # https