Users reporting lots of 502 errors when attempting to post due to "max consecutive replies" check

(Sam Saffron) #19

Look for the 502s in the nginx logs. What are some example errors? Is there anything in our Discourse logs that correlates?

My guess is that something is locking up and timing out. Can you confirm you are on latest?
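For anyone following along, here is a minimal sketch of pulling the 502s out of an nginx access log and grouping them by request path. It assumes nginx's stock "combined" log format, which a given install may well have customized, and the helper names (`find_status`, `count_by_path`) and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: nginx's stock "combined" log format. A customized log_format
# directive would need a different pattern.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def find_status(lines, status="502"):
    """Return (time, method, path) for every log line with the given status."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("status") == status:
            hits.append((m.group("time"), m.group("method"), m.group("path")))
    return hits


def count_by_path(hits):
    """Count hits per request path to show which endpoint fails most often."""
    return Counter(path for _, _, path in hits)


# Invented sample lines, not taken from any real log.
sample = [
    '203.0.113.5 - - [10/Jul/2018:12:00:01 +0000] "POST /posts HTTP/1.1" 502 166 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Jul/2018:12:00:02 +0000] "GET /latest HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Jul/2018:12:00:30 +0000] "POST /posts HTTP/1.1" 502 166 "-" "Mozilla/5.0"',
]
hits = find_status(sample)
print(count_by_path(hits))
```

In practice you would pass the open log file instead of `sample`, then compare the timestamps of the hits against the application logs to see what correlates.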

(Clay Heaton) #20

I’ll take a look tonight or tomorrow - just back from vacation and trying to catch up on some work! Thanks for the help.


I think I may have narrowed it down a bit. I was unable to replicate the 502 errors on my mod/admin account, so I impersonated a user who had the problem and encountered it. When the user hits the submit reply button on a longer thread (4700 posts), the page sits on “saving” for a good 20 seconds before eventually failing to post with a 502 error.

Granting that user moderator status immediately fixes the problem. He was TL2, and granting TL3 did not fix it.

(Jeff Atwood) #22

That is very interesting, cc @sam. Good sleuthing!

(Sam Saffron) #23

Since this is new, I am guessing it could be due to the consecutive reply check. I will have a look later today.

(Sam Saffron) #24

Given we have an easy bypass per:

Can you set max consecutive replies in your site settings to 0 and let me know what happens?


I think that MAY have fixed it. It isn’t 100% reproducible, and I closed the thread I was testing in earlier, but after re-opening it I was not able to reproduce the error. Will ask the users if they’re seeing any more 502s.

(Sam Saffron) #26

Hmm, because I just tested this query in Data Explorer on a 20k-post topic and it is lightning fast.

SELECT user_id
  FROM posts
 WHERE deleted_at IS NULL
   AND NOT hidden
   AND topic_id = SOME_TOPIC_ID
 ORDER BY post_number DESC

I would love to get to the bottom of this.
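To make the idea concrete, here is a rough sketch of how a "max consecutive replies" check could be built on the query above. It is not Discourse's actual implementation: the `LIMIT`, the function name `consecutive_reply_limit_hit`, the `is_staff` flag, and the SQLite schema are all assumptions for illustration. The staff exemption and the "0 disables the check" behavior are inferred only from what was reported earlier in this thread.

```python
import sqlite3


def consecutive_reply_limit_hit(conn, topic_id, user_id, max_replies, is_staff=False):
    """True if the last `max_replies` visible posts in the topic are all by user_id."""
    # Assumption: 0 disables the check and staff are exempt, mirroring what
    # the thread observed. The real rules may differ.
    if max_replies <= 0 or is_staff:
        return False
    rows = conn.execute(
        """
        SELECT user_id
          FROM posts
         WHERE deleted_at IS NULL
           AND NOT hidden
           AND topic_id = ?
         ORDER BY post_number DESC
         LIMIT ?
        """,
        (topic_id, max_replies),
    ).fetchall()
    # Only block when there are enough posts and every one is by this user.
    return len(rows) == max_replies and all(r[0] == user_id for r in rows)


# Hypothetical in-memory table with just the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (topic_id INTEGER, post_number INTEGER, "
    "user_id INTEGER, hidden INTEGER NOT NULL DEFAULT 0, deleted_at TEXT)"
)
conn.executemany(
    "INSERT INTO posts (topic_id, post_number, user_id) VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 20), (1, 3, 20), (1, 4, 20)],
)

print(consecutive_reply_limit_hit(conn, 1, 20, 3))  # last three posts are all by user 20
print(consecutive_reply_limit_hit(conn, 1, 20, 4))  # the fourth-from-last post is by user 10
```

With a limit in place the query only ever reads the newest few rows of the topic, which is why it should stay cheap however long the topic gets, provided there is an index covering `topic_id` and `post_number`.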


Happy to run whatever you want to help debug, but I’m not familiar with postgres or ruby so need some guidance.

(Sam Saffron) #28

Awesome, can you install the Data Explorer plugin and then run the query above, substituting SOME_TOPIC_ID with the topic id of the problem topic?


All set. The topic I tested earlier:

3 results. Query completed in 0.5 ms.

and here is the output from the largest topic on the forum that’s not locked, 15k posts:

3 results. Query completed in 0.6 ms.

So seems really fast.

Also so far nobody has seen a 502 error, since we set that parameter to 0.

(Sam Saffron) #30

Thank you so much for :bear:ing with me.

I believe I just fixed the culprit here:

Any chance you can update to latest and re-enable the setting? Let me know if the issue is still gone.


Not sure how to update: the dashboard says I’m already up to date, and running git pull in /var/discourse says the same thing.

(Jeff Atwood) #32

Just issue the rebuild from the command line and you should be good. You can time it this way too :wink:

(Jay Pfaffman) #33

Did you try visiting /admin/upgrade?


OK, I updated (went to admin/upgrade manually, there was no link in the dashboard) and reverted the consecutive replies setting to 5. So far no errors, asked users to verify.

Thanks again for your help with this! Much appreciated!

(Joshua Rosenfeld) #35

There will only be a link from the dashboard when we release a new beta. We add code daily (hourly, even) - if we notified you for every single commit, your site would always say it’s out of date :wink:.


Makes sense, that’s why I tried a git pull earlier also. Anyway it’s upgraded now.

(Jeff Atwood) #37

Thanks for your diligence in staying on top of this, we found a subtle but important issue as a result.

(Jeff Atwood) closed #38