I’ve been running a rather small Discourse installation for some years now. It’s fairly low traffic, hence it took a while to notice that sending of emails (notifications, digests) obviously began to fail some months ago. Forensics point to the upgrade to 2.8.0.beta7 around 2022-10-22; previously we were at 2.8.0.beta4. At least I haven’t received any email from that installation about posts or messages since then.
The emails pile up in Sidekiq with a message I can’t relate to, and I can’t find anything fitting when searching for it. There were reports of “undefined method” messages, but none of the conditions match mine. (It’s not TLS, it’s not a timeout to the mail server, it’s not the events plugin, and the secure media fix should be in already; besides, the exact error messages were different.)
Error in Sidekiq:
Wrapped NoMethodError: undefined method `value' for #<Array:0x00007f7fd5277d68> Did you mean? values_at
The part after #<Array: is different for each held email. I reinstalled Discourse on a new VM last night and restored from a fresh backup, but seemingly the email issue was restored along with the data.
As the error rate started to climb at the end of October, I’m rather sure this was introduced by 2.8.0.beta7.
## Plugins go here
## see https://meta.discourse.org/t/19157 for details
- git clone https://github.com/discourse/docker_manager.git
## Any custom commands to run after building
Yes, it is a standard install; AFAICS there is only one container running. (Bring up a new VM, repoint DNS, apt update, apt dist-upgrade, reboot, git pull, ./discourse-setup, Web-based setup, upload backup, restore backup, re-enable mail, see the errors again.) Note that the old install was able to email the link to the backup, and sending test emails still works in the new install — it’s seemingly only posting-related mails that fail.
root@discourse:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ac408a70305d local_discourse/app "/sbin/boot" 12 hours ago Up 12 hours 0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp app
root@discourse:~# apt update
Hit:1 https://download.docker.com/linux/debian buster InRelease
Hit:2 http://deb.debian.org/debian buster InRelease
Get:3 http://deb.debian.org/debian buster-updates InRelease [51,9 kB]
Get:4 http://security.debian.org/debian-security buster/updates InRelease [65,4 kB]
Fetched 117 kB in 2s (60,5 kB/s)
Reading package lists... Done
Building dependency tree
Reading state information... Done
All packages are up to date.
The first restores actually failed, though. database_restorer.rb failed in restore_dump due to duplicated links in post_id 3841. I ended up replacing these links in a post from 2017 with a screenshot of them on the old install and doing another backup; only then was I able to restore the backup on the new install. As you mentioned postgres, this happened during CREATE INDEX with ERROR: could not create unique index "unique_post_links". Further info: EXCEPTION: psql failed: DETAIL: Key (topic_id, post_id, url)=(1300, 3841, [redacted]) is duplicated.
While I don’t think this is directly related, I thought I should mention it.
So, if nobody knows an easy way to fix it, let’s debug this properly. But I need your help, as Discourse is a rather complex application using a bunch of technologies I’m not familiar with. So: how are mails “sent to” Sidekiq?
Which component of Discourse is giving Sidekiq a method “value”?
With email notifications no longer working after an upgrade, Discourse is unfortunately rather useless now. Instead of getting immediate attention, topics now take days to get noticed, if at all, as people don’t do active polling in 2022. No notification ⇒ nothing happened ⇒ no need to check the site.
As I said, the old site is running; I did a backup and put that onto a fresh install, and the restore failed. I modified the post noted as the culprit until the restore on the fresh install worked, only to see the issue with Sidekiq again/still.
The old site runs postgres 13 as well (but goes back several years, so it most likely didn’t start with that version).
OK, so if the indexes are repaired then this suggests to me that something is getting called with an array rather than a model that has value. The problem isn’t with Sidekiq, per se, but with the function that Sidekiq is causing to get called.
So it sounds like something is getting called that’s returning an array rather than a single item, but I can’t guess just what. I think you’ll need to look in /var/discourse/shared/standalone/logs/rails/production.log (or something very much like that if my fingers or memory are failing me). Then you can look in those logs for that error (or cause it to happen again so it’ll be at the end of the file). You should get more information about what is failing there.
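A rough sketch of how that log search could be scripted, in case it helps. The helper name `error_context` is made up for illustration, and the path is the one recalled above, so it may need adjusting on your host. It reads the whole file into memory, which is fine for a small site.

```ruby
# Returns each line matching `pattern` together with up to `before`
# lines that precede it, so the surrounding request is visible.
def error_context(lines, pattern, before: 5)
  hits = lines.each_index.select { |i| lines[i].match?(pattern) }
  hits.map { |i| lines[[i - before, 0].max..i] }
end

# Path as recalled in the post above (an assumption; adjust if it differs).
log_path = "/var/discourse/shared/standalone/logs/rails/production.log"

if File.exist?(log_path)
  # scrub replaces invalid bytes so the regexp match cannot raise on them
  lines = File.readlines(log_path, chomp: true).map(&:scrub)
  error_context(lines, /undefined method `value'/).each do |chunk|
    puts chunk
    puts "-" * 40
  end
end
```

Each printed chunk ends with the error line, so whatever was rendered or requested just before the failure shows up right above it.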
Started POST "/sidekiq/retries" for 220.127.116.11 at 2022-04-11 16:31:35 +0000
Started GET "/sidekiq/retries" for 18.104.22.168 at 2022-04-11 16:31:35 +0000
Rendered email/notification.html.erb (Duration: 42.8ms | Allocations: 4323)
Rendered layouts/email_template.html.erb (Duration: 0.3ms | Allocations: 29)
Job exception: undefined method `value' for #<Array:0x00007ff393af6c78>
Did you mean? values_at
shared/standalone/log/rails/production_errors.log is empty.
is – in the Discourse code? – where Sidekiq bails out?
This is called from …
MessageBuilder.custom_headers(SiteSetting.email_custom_headers).each do |key, _|
  value = header_value(key) # <- line 229, the marked line
  # Remove Auto-Submitted header for group private message emails, it does
  # not make sense there and may hurt deliverability.
As per RFC, many fields can appear more than once, we will return a string of the value if there is only one header, or if there is more than one matching header, will return an array of values in order that they appear in the header ordered from top to bottom.
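To make that behaviour concrete, here is a minimal plain-Ruby sketch that does not use the Mail gem at all. `Field` and `lookup` are hypothetical stand-ins, not the gem's API. As an assumption chosen to match the `.value` call in the error, the lookup here returns field objects rather than bare strings.

```ruby
# Hypothetical stand-in for a parsed header field; not the Mail gem's class.
Field = Struct.new(:name, :value)

# Mimics the documented lookup: a single match comes back as one field,
# several matches come back as an array of fields in header order.
def lookup(fields, name)
  matches = fields.select { |f| f.name.casecmp?(name) }
  return nil if matches.empty?
  matches.length == 1 ? matches.first : matches
end

single  = [Field.new("Precedence", "list")]
doubled = single + [Field.new("Precedence", "bulk")]

puts lookup(single, "Precedence").value   # one header: .value works
puts lookup(doubled, "Precedence").class  # two headers: an Array comes back

begin
  lookup(doubled, "Precedence").value     # Array has no #value
rescue NoMethodError => e
  puts e.message
end
```

The last call fails the same way the Sidekiq job does: the caller expects a single field and calls `value` on what is now an Array.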
Discourse automatically sets a Precedence header, so because you’re adding one as well via the email_custom_headers setting, there are now two Precedence headers, and @message.header["Precedence"] is returning an array instead of a string.
I think this bug will be triggered any time email_custom_headers contains a header that already exists on the message object.
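If you want to check a setting value for this kind of collision, something like the following sketch might do. It assumes the setting is a list of "Name: value" pairs separated by "|"; that format, the helper names, and the `existing` list of headers are all assumptions for illustration, not taken from the Discourse source.

```ruby
# Assumption: the setting holds "Name: value" pairs separated by "|".
def custom_header_names(setting)
  names = setting.to_s.split("|").map { |pair| pair.split(":", 2).first.to_s.strip }
  names.reject(&:empty?)
end

# Names from the setting that are already present on the message,
# compared case-insensitively because header names are case-insensitive.
def colliding_headers(setting, existing_names)
  present = existing_names.map(&:downcase)
  custom_header_names(setting).select { |name| present.include?(name.downcase) }
end

# Hypothetical values: a setting that adds Precedence, and a message
# that already carries a Precedence header.
setting  = "Auto-Submitted: auto-generated|Precedence: bulk"
existing = ["Precedence", "Message-ID", "List-ID"]

p colliding_headers(setting, existing) # prints ["Precedence"]
```

Any name this returns is a candidate for removal from email_custom_headers until the underlying bug is fixed.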