SMTP failure not reported and too many login attempts on backlog clearing

I noticed that for the past few days no emails were being sent for posts on the forum. I checked the dashboard and found 0 emails for the past 3 days. No idea why. I had to reboot the server and then it started sending emails, but here are a few issues I am noticing:

  1. There was no notification on the admin account that it was having issues sending emails. Even the logs didn’t show any errors.
  2. After the reboot it started sending the backlog of emails, but after sending the initial batch of 100 emails I’m now getting the error “Job exception: 454 4.7.0 Too many login attempts, please try again later”. It’s stuck in a loop, continuously trying to log in to send the remaining emails while the server rejects it.
    It looks like it’s trying to log in to the SMTP server for EACH email. IMHO this is a bug: it should not log in to the server for each email when there is more than one email to send, it should reuse the existing connection.
  3. Now it’s stuck in this login loop, retrying every few seconds. How do I stop it and ask it to back off for 10-15 minutes and try again?

My guess is that your login credentials are wrong and that’s why it’s continuing to try to log in.

No, it’s right. It was working for the past few months until 3 days ago, and it started working again after I restarted the server (just rebooted Linux). I haven’t made a single change.

The bugs, however, remain; see my points 2 and 3 above. The problem is that it’s trying to log in before sending each email (and there are now hundreds of backlogged emails, because for some reason emails stopped being sent a few days ago). So after sending about 30 emails or so, the SMTP server blocks it because it logs in again before sending each email. Now I have to manually stop the Discourse server, wait for 10 minutes and then restart it. Then it sends another 30 emails, logs in 30 times, and the SMTP server blocks it again for too many logins.
This isn’t correct. It should reuse an existing login to send emails, and it should back off if the SMTP server responds with too many logins and report it to the administrator.

Plus there is the other issue of why it didn’t inform the admin that emails were not being sent. I think the email component/engine may have just stopped after a web upgrade, causing all the emails to queue up, and when I rebooted the machine it started sending them. There are no errors in the log files at all until after the reboot, when the SMTP server pushed back after too many logins.

@tgxworld @eviltrout @codinghorror - any thoughts on why Discourse is trying to authenticate for every email in the backlog queue and how to have it back off when the server errors out?

I don’t see this as a bug. It’s either a configuration issue or a feature request for better logging.

We are only doing what you tell us per your SMTP settings, if you set the password we try to authenticate.

How is this a logging issue?

If there are 200 emails in the backlog (a whole different issue as to why the email module stopped working creating a backlog), it shouldn’t authenticate 200 times. It should authenticate once and then send the 200 emails in a single authenticated session. It would be inefficient to authenticate, send one email, disconnect and do this over 200 times.
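
To make the request concrete, here is a minimal Ruby sketch of the “authenticate once, send many” pattern. It is not how Discourse sends mail; `Message`, `FakeSession`, `deliver_backlog` and `open_session` are hypothetical names, and the fake session only stands in for a real mail server so the sketch runs offline. With Ruby’s net/smtp library, a block passed to `Net::SMTP.start` plays a similar role, since the session stays open for the duration of the block.

```ruby
# Hypothetical sketch, not Discourse code: authenticate once, then send the
# whole backlog through that one session.
Message = Struct.new(:from, :to, :body)

# Stand-in for a real SMTP session so the sketch runs without a mail server.
class FakeSession
  attr_reader :logins, :sent

  def initialize
    @logins = 0
    @sent = []
  end

  # Counts one authentication, then yields the open session to the caller.
  def open
    @logins += 1
    yield self
  end

  def send_message(body, from, to)
    @sent << [from, to, body]
  end
end

# open_session: any callable that authenticates once and yields a session
# responding to send_message(body, from, to).
def deliver_backlog(messages, open_session)
  open_session.call do |session|
    messages.each { |m| session.send_message(m.body, m.from, m.to) }
  end
  messages.size
end

fake = FakeSession.new
backlog = Array.new(200) do |i|
  Message.new("forum@example.com", "user#{i}@example.com", "post #{i}")
end
deliver_backlog(backlog, fake.method(:open))
puts "logins=#{fake.logins} sent=#{fake.sent.size}"
```

The point of the sketch is only the shape: the login count stays at one no matter how long the backlog is.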

The other issue is that if the SMTP server is asking it to back off, it doesn’t, and keeps hammering away at it. There should be a backoff algorithm to wait and then retry. I fail to see how these are logging issues.

Have you tried sending a test message from the admin /email page since this problem started?

Yes, that’s how I started to debug the issue: it didn’t send the email. Then I rebooted the server and email started working again (again, no changes to any configuration). But then it started sending the 266 backlogged emails, and after the initial batch the SMTP server started throwing an error about too many logins. That’s where I figured out that Discourse was authenticating with the SMTP server for each individual email separately, causing it to push back. I had to manually stop the server for 10 minutes and then start it; it would send the next batch of 30-50 emails and then the SMTP server would push back again. I’d stop the server, wait 10 minutes, and start it again, until the entire backlog was cleared.

I still fail to see how this has anything to do with logging. It’s an inefficient and possibly incorrect way to implement sending multiple emails.

That is indeed very strange. What mail server is it?

GSuite - Google Business

I tried to take a look at the code to see how emails are built and delivered. I didn’t find any specific place where the emails are “queued” up into a backlog and then delivered. It looks like each service/module sends emails independently.

However I couldn’t see any “queue”, so I’m just left wondering why discourse decided not to send any emails for 3 days and then to send 266 emails after rebooting the server. It’s almost like the notification system just went offline and then came back online after a reboot and iterated through all pending notifications from each module. Again I couldn’t find any “single” piece of code that does this.

Since it appears that each notification is sent independently of the others, I guess there’s no way to “reuse” an authenticated SMTP connection. The crux of the code appears to be:

begin
  @message.deliver_now
rescue *SMTP_CLIENT_ERRORS => e
  return skip(e.message)
end

Given my rudimentary understanding here, there doesn’t appear to be any control over the SMTP connection.

If so, as there’s no way to control the SMTP connection,

  1. instead can there be a rate limit option provided in the eMail settings to not deliver more than X messages per minute or second? This would help where SMTP servers have rate limits and would help in situations like this when messages get backlogged (for whatever reason) and then suddenly cleared.
  2. alternatively, can the email sender module process the SMTP error when the server pushes back saying that it has logged in too many times, and back off for X minutes before sending again? (Again, this could be a configurable parameter in the settings.)
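
As an illustration of the first option, here is a minimal Ruby sketch of a per-minute cap that spaces deliveries evenly. It is not an existing Discourse setting; the `Throttle` class and its `clock` and `sleeper` parameters are hypothetical names, made injectable only so the sketch can be exercised without real waiting.

```ruby
# Hypothetical sketch, not a Discourse setting: cap deliveries at a fixed
# number of messages per minute by spacing them evenly.
class Throttle
  # clock and sleeper are injectable so the sketch can run without waiting.
  def initialize(per_minute,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @interval = 60.0 / per_minute
    @clock = clock
    @sleeper = sleeper
    @next_slot = nil
  end

  # Waits until the next delivery slot is due, then runs the block.
  def run
    now = @clock.call
    if @next_slot && now < @next_slot
      @sleeper.call(@next_slot - now)
      now = @next_slot
    end
    @next_slot = now + @interval
    yield
  end
end

# Usage idea (assumed, not Discourse code): at most 30 messages per minute.
#   throttle = Throttle.new(30)
#   backlog.each { |message| throttle.run { message.deliver_now } }
```

A cap like this would only smooth out a burst after a backlog; it would not by itself react to an error from the server.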

@eviltrout it looks like you have done a fair amount of work on the email notifications. Any thoughts on the above?

We have no support for email rate limiting and no plans to add it.

Okay so how do you handle this situation when discourse is sending hundreds of emails in a short span and the SMTP server is rejecting the multiple logins for each email?

Get a better SMTP server?

Google business is a perfectly legitimate service. It isn’t unreasonable to ask a client not to login 100 times in a second to send 100 emails.

Sounds like you’re blaming google for implementing reasonable DDOS protection mechanisms.

Isn’t there a way for the email sender to process the SMTP error through the skip method and handle it more gracefully?

10,000 sites are using this software and not having the trouble that you describe. My best guess is that you’re using a Google SMTP server that’s intended for a single user and not for delivering hundreds of messages per hour.

Fair enough. I guess when more folks start having the issue it can be looked into.

That only brings me back to why the messages weren’t sent for 3 days and I had to reboot the server to get it started again. I’ll keep an eye out if it happens again.

Even GSuite for business allows only 2000 emails per day (Email sending limits - G Suite Admin Help), which is probably not suitable for sending digest emails to a forum with many users.

The alternatives:

  • getting a paid SMTP server
  • disabling digests

Are there any others?

Just wanted to chime in here. I have the same problem, but it’s more annoying than the original poster’s experience. I also use GSuite and have a separate “app password” set up for Discourse. However, I’ve noticed that my problem isn’t caused by the number of emails per day or by SMTP rate limits. The forum where this happened had barely any activity in terms of posts, digest emails, or anything like that. What happens is that every once in a while SMTP authentication fails; this comes down to an error on Google’s side. Every so often (usually every 3-5 months), SMTP authentication fails for a few minutes. Normally, when I see this in my mail client, I just wait five minutes and try again, and it works like magic. Well, with Discourse, one failure makes it go crazy. Even with just a few messages in the queue, it goes into overdrive, trying to log in over and over for each message in the queue. The result is a barrage of SMTP authentication attempts that triggers a different brute-force protection mechanism within Gmail, which very quickly just stops processing login attempts and returns errors. So at this point it doesn’t really matter whether the password is correct and the Gmail authentication problem has been resolved; by then you’re already locked out because Discourse went crazy.

Forget about rate limiting; focus on the following:

  • Improve efficiency: send multiple emails over a single TCP connection, for starters. This is standard and reduces the number of active sockets and the network overhead.

  • Implement a sensible incremental retry delay: if SMTP authentication fails, there’s no reason to try again so soon. When SMTP authentication fails, either disable email sending and show the admin an on-screen alert that SMTP authentication has failed, or implement a serious delay mechanism, so that the second attempt happens 60 seconds later, the third after 5 minutes, the fourth after 30 minutes, etc.
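
The retry schedule suggested above (60 seconds, then 5 minutes, then 30 minutes) could be sketched in Ruby roughly as follows. This is an illustration only, not Discourse code: `TemporaryFailure`, `RETRY_DELAYS` and `with_backoff` are hypothetical names, and `TemporaryFailure` stands in for whatever error a mail library raises on a response such as “454 4.7.0 Too many login attempts”.

```ruby
# Hypothetical sketch, not Discourse code: retry with escalating delays.
class TemporaryFailure < StandardError; end

RETRY_DELAYS = [60, 300, 1800].freeze # 60 seconds, 5 minutes, 30 minutes

# Runs the block; on a temporary failure, waits longer before each new
# attempt. Returns the block's value, or re-raises once the schedule is
# exhausted so the caller can pause sending and alert an administrator.
def with_backoff(delays: RETRY_DELAYS, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    yield
  rescue TemporaryFailure
    raise if attempt >= delays.size
    sleeper.call(delays[attempt])
    attempt += 1
    retry
  end
end

# Usage idea (assumed): wrap one login-and-send attempt.
#   with_backoff { send_queued_mail }   # send_queued_mail is hypothetical
```

Re-raising after the last delay is a deliberate choice in this sketch: it leaves room for the other suggestion, disabling sending and showing an alert, instead of retrying forever.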

So now I basically have to change my SMTP settings to disable email, then rebuild the app, wait 24 hours, revert my settings, and rebuild again. Even a simple checkbox in the admin settings to disable email sending would be 1000% better than this.

There is such a setting. Search for ‘disable email’.

See also Troubleshoot email on a new Discourse install - #362