Redis Error after upgrade

carlb · June 3, 2020, 7:12am

Hi,

After upgrading via the web front end I can no longer access my site. Havent changed anything, just clicked the upgrade button! The errors suggest issues connecting to redis. Ive done a lot of searching but found nothing to help so far. production.log is empty. Its running ubuntu on digital ocean. has worked fine for 18months , no errors apart from when i ran out of disk space 6 months ago where i successfully upped that

disk space is fine :-

Filesystem      Size  Used Avail Use% Mounted on
overlay          49G   25G   24G  52% /
tmpfs            64M     0   64M   0% /dev
tmpfs          1001M     0 1001M   0% /sys/fs/cgroup
shm             512M  8.0K  512M   1% /dev/shm
/dev/vda1        49G   25G   24G  52% /shared
tmpfs          1001M     0 1001M   0% /proc/acpi
tmpfs          1001M     0 1001M   0% /proc/scsi
tmpfs          1001M     0 1001M   0% /sys/firmware

the unicorn.stdout.log shows

> 2020-06-03T06:29:28.352Z pid=715 tid=osk2fuo0n ERROR: Error fetching job: Error connecting to Redis on localhost:6379 (Errno::EADDRNOTAVAIL)
> 2020-06-03T06:29:28.353Z pid=715 tid=osk2fszrb ERROR: Error fetching job: Error connecting to Redis on localhost:6379 (Errno::EADDRNOTAVAIL)
> 2020-06-03T06:29:28.354Z pid=715 tid=osk2fsjw3 ERROR: Error fetching job: Error connecting to Redis on localhost:6379 (Errno::EADDRNOTAVAIL)
> 2020-06-03T06:29:28.354Z pid=715 tid=osk2ftlhz ERROR: Error fetching job: Error connecting to Redis on localhost:6379 (Errno::EADDRNOTAVAIL)
> 2020-06-03T06:29:28.355Z pid=715 tid=osk2ftr43 ERROR: Error fetching job: Error connecting to Redis on localhost:6379 (Errno::EADDRNOTAVAIL)
> Starting up 1 supervised sidekiqs
> Loading Sidekiq in process id 725

first i tried manually rebuilding app

Have then tried apt upgrade docker rebooted Server via reboot , rebuilt using ./launcher rebuild app

redis-cli ping gets a PONG response

ss -t
State      Recv-Q Send-Q Local Address:Port                 Peer Address:Port   
ESTAB      0      0      104.248.166.162:ssh                  5.81.114.19:56270 
ESTAB      0      0      104.248.166.162:ssh                  5.81.114.19:56211

ps -axf shows its running

  PID TTY      STAT   TIME COMMAND
 2378 pts/1    Ss     0:00 /bin/bash --login
 2849 pts/1    R+     0:00  \_ ps -axf
    1 pts/0    Ss+    0:00 /bin/bash /sbin/boot
  627 pts/0    S+     0:00 /usr/bin/runsvdir -P /etc/service
  628 ?        Ss     0:00  \_ runsv rsyslog
  641 ?        Sl     0:00  |   \_ rsyslogd -n
  629 ?        Ss     0:00  \_ runsv cron
  640 ?        S      0:00  |   \_ cron -f
  630 ?        Ss     0:00  \_ runsv unicorn
  639 ?        S      0:00  |   \_ /bin/bash config/unicorn_launcher -E producti
  665 ?        Sl     0:09  |       \_ unicorn master -E production -c config/un
  725 ?        SNl    0:12  |       |   \_ sidekiq 6.0.7 discourse [0 of 5 busy]
  750 ?        Sl     0:20  |       |   \_ unicorn worker[0] -E production -c co
  758 ?        Sl     0:17  |       |   \_ unicorn worker[1] -E production -c co
 2848 ?        S      0:00  |       \_ sleep 1
  631 ?        Ss     0:00  \_ runsv postgres
  635 ?        S      0:00  |   \_ svlogd /var/log/postgres
  636 ?        S      0:00  |   \_ /usr/lib/postgresql/12/bin/postmaster -D /etc
  659 ?        Ss     0:00  |       \_ postgres: 12/main: checkpointer
  660 ?        Ss     0:00  |       \_ postgres: 12/main: background writer
  661 ?        Ss     0:00  |       \_ postgres: 12/main: walwriter
  662 ?        Ss     0:00  |       \_ postgres: 12/main: autovacuum launcher
  663 ?        Ss     0:00  |       \_ postgres: 12/main: stats collector
  664 ?        Ss     0:00  |       \_ postgres: 12/main: logical replication la
  691 ?        Ss     0:00  |       \_ postgres: 12/main: discourse discourse [l
 1848 ?        Ss     0:00  |       \_ postgres: 12/main: discourse discourse [l
 2633 ?        Ss     0:00  |       \_ postgres: 12/main: discourse discourse [l
 2675 ?        Ss     0:00  |       \_ postgres: 12/main: discourse discourse [l
 2840 ?        Ss     0:00  |       \_ postgres: 12/main: discourse discourse [l
  632 ?        Ss     0:00  \_ runsv nginx
  634 ?        S      0:00  |   \_ nginx: master process /usr/sbin/nginx
  654 ?        S      0:02  |       \_ nginx: worker process
  655 ?        S      0:00  |       \_ nginx: cache manager process
  633 ?        Ss     0:00  \_ runsv redis
  637 ?        S      0:00      \_ svlogd /var/log/redis
  638 ?        Sl     0:05      \_ /usr/bin/redis-server *:6379

Any ideas, Im no expert and really struggling to find a way to get this back running. Is there anythign simple Im missing also? Is there any other place to check that may help me find the issue?

thanks

sam · June 3, 2020, 7:20am

Oh … I see you rebooted and did all the obvious stuff…

Can you paste your container config (sans passwords) ? Is the redis template mixed in?

Anything of interest in docker logs app ?

Is localhost resolving inside your container?

carlb · June 3, 2020, 7:32am

Hi Sam, excuse my lack of knowledge,

do you mean the app.yml file?

which logs are the docker logs? I have done this ./launcher logs app

run-parts: executing /etc/runit/1.d/00-ensure-links
run-parts: executing /etc/runit/1.d/00-fix-var-logs
run-parts: executing /etc/runit/1.d/anacron
run-parts: executing /etc/runit/1.d/cleanup-pids
Cleaning stale PID files
run-parts: executing /etc/runit/1.d/copy-env
run-parts: executing /etc/runit/1.d/letsencrypt
[Wed 03 Jun 2020 06:34:47 AM UTC] Domains not changed.
[Wed 03 Jun 2020 06:34:47 AM UTC] Skip, Next renewal time is: Wed Jul  1 00:35:12 UTC 2020
[Wed 03 Jun 2020 06:34:47 AM UTC] Add '--force' to force to renew.
[Wed 03 Jun 2020 06:34:47 AM UTC] Installing key to:/shared/ssl/forum.tritalk.co.uk.key
[Wed 03 Jun 2020 06:34:47 AM UTC] Installing full chain to:/shared/ssl/forum.tritalk.co.uk.cer
[Wed 03 Jun 2020 06:34:47 AM UTC] Run reload cmd: sv reload nginx
warning: nginx: unable to open supervise/ok: file does not exist
[Wed 03 Jun 2020 06:34:47 AM UTC] Reload error for :
[Wed 03 Jun 2020 06:34:48 AM UTC] Domains not changed.
[Wed 03 Jun 2020 06:34:48 AM UTC] Skip, Next renewal time is: Thu Jul  9 00:35:12 UTC 2020
[Wed 03 Jun 2020 06:34:48 AM UTC] Add '--force' to force to renew.
[Wed 03 Jun 2020 06:34:48 AM UTC] Installing key to:/shared/ssl/forum.tritalk.co.uk_ecc.key
[Wed 03 Jun 2020 06:34:48 AM UTC] Installing full chain to:/shared/ssl/forum.tritalk.co.uk_ecc.cer
[Wed 03 Jun 2020 06:34:48 AM UTC] Run reload cmd: sv reload nginx
warning: nginx: unable to open supervise/ok: file does not exist
[Wed 03 Jun 2020 06:34:48 AM UTC] Reload error for :
Started runsvdir, PID is 627
ok: run: redis: (pid 638) 0s
ok: run: postgres: (pid 636) 0s
chgrp: invalid group: ‘syslog’
rsyslogd: imklog: cannot open kernel log (/proc/kmsg): Operation not permitted.
rsyslogd: activation of module imklog failed [v8.1901.0 try https://www.rsyslog.com/e/2145 ]
supervisor pid: 639 unicorn pid: 665

inside container i tried command , not sure if this was what you mean

tritalk@TriTalk-Discourse:/var/discourse$ sudo ./launcher enter app
root@TriTalk-Discourse-app:/var/www/discourse# curl http://localhost:8080
curl: (7) Failed to connect to localhost port 8080: Connection refused

itsbhanusharma · June 3, 2020, 8:02am

Yes, app.yml but please remove any passwords or sensitive information.

carlb · June 3, 2020, 8:06am

## this is the all-in-one, standalone Discourse Docker container template
##
## After making changes to this file, you MUST rebuild
## /var/discourse/launcher rebuild app
##
## BE *VERY* CAREFUL WHEN EDITING!
## YAML FILES ARE SUPER SUPER SENSITIVE TO MISTAKES IN WHITESPACE OR ALIGNMENT!
## visit http://www.yamllint.com/ to validate this file as needed

templates:
  - "templates/postgres.template.yml"
  - "templates/redis.template.yml"
  - "templates/web.template.yml"
  - "templates/web.ratelimited.template.yml"
## Uncomment these two lines if you wish to add Lets Encrypt (https)
  - "templates/web.ssl.template.yml"
  - "templates/web.letsencrypt.ssl.template.yml"

## which TCP/IP ports should this container expose?
## If you want Discourse to share a port with another webserver like Apache or nginx,
## see https://meta.discourse.org/t/17247 for details
expose:
  - "80:80"   # http
  - "443:443" # https

params:
  db_default_text_search_config: "pg_catalog.english"

  ## Set db_shared_buffers to a max of 25% of the total memory.
  ## will be set automatically by bootstrap based on detected RAM, or you can override
  db_shared_buffers: "128MB"

  ## can improve sorting performance, but adds memory usage per-connection
  #db_work_mem: "40MB"

  ## Which Git revision should this container use? (default: tests-passed)
  #version: tests-passed

env:
  LANG: en_US.UTF-8
  # DISCOURSE_DEFAULT_LOCALE: en

  ## How many concurrent web requests are supported? Depends on memory and CPU cores.
  ## will be set automatically by bootstrap based on detected CPUs, or you can override
  UNICORN_WORKERS: 2

  ## TODO: The domain name this Discourse instance will respond to
  ## Required. Discourse will not work with a bare IP number.
  DISCOURSE_HOSTNAME: forum.xxxx.co.uk

  ## Uncomment if you want the container to be started with the same
  ## hostname (-h option) as specified above (default "$hostname-$config")
  #DOCKER_USE_HOSTNAME: true

  ## TODO: List of comma delimited emails that will be made admin and developer
  ## on initial signup example 'user1@example.com,user2@example.com'
  DISCOURSE_DEVELOPER_EMAILS: 'admin@xxxx.co.uk'

  ## TODO: The SMTP mail server used to validate new accounts and send notifications
  # SMTP ADDRESS, username, and password are required
  # WARNING the char '#' in SMTP password can cause problems!
  DISCOURSE_SMTP_ADDRESS: in-v3.mailjet.com
  DISCOURSE_SMTP_PORT: 587
  DISCOURSE_SMTP_USER_NAME: xxxx
  DISCOURSE_SMTP_PASSWORD: "xxxx"
  #DISCOURSE_SMTP_ENABLE_START_TLS: true           # (optional, default true)

  ## If you added the Lets Encrypt template, uncomment below to get a free SSL certificate
  LETSENCRYPT_ACCOUNT_EMAIL: admin@xxxx.co.uk

  ## The CDN address for this Discourse instance (configured to pull)
  ## see https://meta.discourse.org/t/14857 for details
  #DISCOURSE_CDN_URL: //discourse-cdn.example.com

## The Docker container is stateless; all data is stored in /shared
volumes:
  - volume:
      host: /var/discourse/shared/standalone
      guest: /shared
  - volume:
      host: /var/discourse/shared/standalone/log/var-log
      guest: /var/log

## Plugins go here
## see https://meta.discourse.org/t/19157 for details
hooks:
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - git clone https://github.com/discourse/docker_manager.git
          - mkdir -p plugins
          - git clone https://github.com/discourse/discourse-adplugin.git
          - git clone https://github.com/discourse/discourse-affiliate.git

## Any custom commands to run after building
run:
  - exec: echo "Beginning of custom commands"
  ## If you want to set the 'From' email address for your first registration, uncomment and change:
  ## After getting the first signup email, re-comment the line. It only needs to run once.
  - exec: rails r "SiteSetting.notification_email='admin@xxxx.co.uk'"
  - exec: echo "End of custom commands"

sam · June 3, 2020, 8:11am

This is a pickle, what version of docker are you running, what does docker info return?

carlb · June 3, 2020, 8:15am

you’re not wrong! Its as if the database cannot be found, when you go into the site its just a screen with headings

docker info :-

Client:
 Debug Mode: false

Server:
 Containers: 2
  Running: 1
  Paused: 0
  Stopped: 1
 Images: 6
 Server Version: 19.03.11
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 4.4.0-179-generic
 Operating System: Ubuntu 16.04.6 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 1.953GiB
 Name: TriTalk-Discourse
 ID: SYIS:XPWU:W2SP:NYNA:GFP7:DNVK:E7JF:553N:EGWF:OR7M:TV2E:A6ZX
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support

sam · June 3, 2020, 8:19am

Can you try safe mode?

carlb · June 3, 2020, 8:26am

I used safe mode and if i go in and don’t disable plugins it fails. So seems one of the plug ins is playing up since the upgrade. I comment them out from the yml file, rebuild app and it all works. It is either discourse-affiliate or discourse-adplugin. I will investigate further later but at least the site is up and runnign again. Thansk for all your help. note to self use safe mode to do prelim checks in future!

itsbhanusharma · June 3, 2020, 8:45am

carlb:

        cmd:
          - git clone https://github.com/discourse/docker_manager.git
          - mkdir -p plugins
          - git clone https://github.com/discourse/discourse-adplugin.git
          - git clone https://github.com/discourse/discourse-affiliate.git

Your plugins area doesn’t seem to be formatted correctly. maybe that is causing the issue?

it should actually look like:

        cmd:
          - git clone https://github.com/discourse/docker_manager.git
          - git clone https://github.com/discourse/discourse-adplugin.git
          - git clone https://github.com/discourse/discourse-affiliate.git

(remove the -mkdir -p plugins)
Hopefully that will solve the issue with plugins.

carlb · June 3, 2020, 9:06am

Thanks Bhanu. I’ll give that a go, strange it’s worked until today though. Maybe something has been tightened up in the latest docker release

itsbhanusharma · June 3, 2020, 9:10am

I could be wrong here. I’ve never used that parameter in my yml files and it works just fine so far.

neounix · June 3, 2020, 9:59am

Hi @carlb,

FYI, Redis is used primarily for background jobs (along with it’s companion job scheduler sidekiq); and, FWIW, when I am in the development environment, I often run Discourse without Redis or Sidekiq running because I don’t want these “background jobs” running.

Hence, your Discourse forum should still display the core forums, the topics, the users, the posts, etc. even if Redis is not running. I only reply with this FWIW, because you title your topic:

Redis Error after upgrade

… and then you helpfully provided a screen capture of your forum with a lot of basic problems which are not directly related to Redis per se.

carlb · June 3, 2020, 10:16am

thanks for the explanation, I’m not very knowledgeable on how any of this fits together as it always just works so never needed to delve too deeply so that helps me in the future. I just assumed as the only errors I found all mention failing to connect to redis that this may be the root cause. None of the usual things were working that always solve a basic issue either.

system · July 3, 2020, 10:30am

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Error connecting to Redis Installation	17	6911	March 21, 2023
Issue with new update: debugging options Installation	16	155	August 30, 2024
Redis Problems? (Forum broken after upgrade) Installation	34	3077	December 24, 2021
Discourse web interface becomes unresponsive a few minutes after starting Installation	35	6703	January 9, 2018
Error connecting to Redis on localhost - Rebuilt Installation	14	1765	January 29, 2020

Redis Error after upgrade

Related topics