Wingtip
January 17, 2019, 14:16
21
I think I may have narrowed it down a bit. I was unable to replicate the 502 errors on my mod/admin account, so I impersonated a user who had the problem and encountered it. When the user hits the submit reply button on a longer thread (4700 posts), the page sits on “saving” for a good 20 seconds before eventually failing to post with a 502 error.
Granting that user moderator status immediately fixes the problem. He was TL2, and granting TL3 did not fix it.
8 Likes
That is very interesting, cc @sam. Good sleuthing!
4 Likes
sam
(Sam Saffron)
January 17, 2019, 19:49
23
Since this is new, I am guessing it could be due to the consecutive reply check. I will have a look later today.
sam
(Sam Saffron)
January 17, 2019, 20:44
24
Given we have an easy bypass per:
Can you set max consecutive replies in your site settings to 0 and let me know what happens?
3 Likes
Wingtip
January 17, 2019, 20:51
25
I think that MAY have fixed it. The problem isn’t 100% reproducible, and I had closed the thread I was testing in earlier, but after re-opening it I was not able to reproduce the error. Will ask the users if they’re seeing any more 502s.
2 Likes
sam
(Sam Saffron)
January 17, 2019, 20:56
26
Hmm, because I just tested this query in Data Explorer on a 20k post topic and it is lightning fast.
SELECT user_id
FROM posts
WHERE deleted_at IS NULL
AND NOT hidden
AND topic_id = SOME_TOPIC_ID
ORDER BY post_number DESC
LIMIT 3
I would love to get to the bottom of this.
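To make the role of that query concrete, here is a minimal sketch of how a consecutive-reply check could be built on top of it. It is not Discourse's actual implementation: the function names, the simplified `posts` table, and the use of SQLite in place of PostgreSQL are all assumptions made for illustration. The only behavior taken from the thread is that a setting of 0 bypasses the check.

```python
import sqlite3

def last_posters(conn, topic_id, limit=3):
    """Return user_ids of the newest visible posts in a topic, newest first."""
    rows = conn.execute(
        """
        SELECT user_id
        FROM posts
        WHERE deleted_at IS NULL
          AND NOT hidden
          AND topic_id = ?
        ORDER BY post_number DESC
        LIMIT ?
        """,
        (topic_id, limit),
    ).fetchall()
    return [row[0] for row in rows]

def exceeds_consecutive_replies(conn, topic_id, user_id, max_consecutive_replies):
    """Hypothetical check: True if the last N visible posts are all by user_id."""
    # A setting of 0 disables the check (the bypass mentioned in the thread).
    if max_consecutive_replies == 0:
        return False
    recent = last_posters(conn, topic_id, max_consecutive_replies)
    return (len(recent) == max_consecutive_replies
            and all(poster == user_id for poster in recent))

# Simplified, assumed schema: only the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (topic_id INTEGER, post_number INTEGER, "
    "user_id INTEGER, hidden INTEGER DEFAULT 0, deleted_at TEXT)"
)
conn.executemany(
    "INSERT INTO posts (topic_id, post_number, user_id) VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 20), (1, 3, 20), (1, 4, 20)],
)
```

Because the query is ordered and limited, it only ever reads the last few rows of the topic, which matches the sub-millisecond timings reported for it in this thread.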
2 Likes
Wingtip
January 17, 2019, 20:59
27
Happy to run whatever you want to help debug, but I’m not familiar with postgres or ruby so need some guidance.
3 Likes
sam
(Sam Saffron)
January 17, 2019, 23:26
28
Awesome, can you install Data Explorer Plugin and then run the query above substituting SOME_TOPIC_ID with the topic id of the problem topic?
2 Likes
Wingtip
January 18, 2019, 00:49
29
All set. The topic I tested earlier:
3 results. Query completed in 0.5 ms.
and here is the output from the largest topic on the forum that’s not locked, 15k posts:
3 results. Query completed in 0.6 ms.
So seems really fast.
Also so far nobody has seen a 502 error, since we set that parameter to 0.
3 Likes
sam
(Sam Saffron)
January 18, 2019, 02:30
30
Thank you so much for bearing with me.
I believe I just fixed the culprit here:
committed 02:18AM - 18 Jan 19 UTC
The `posts` relation on `Topic` is not ordered. Using `Topic.posts.first`
is basically the same as asking for a random post, it will depend on DB
order. This breaks on Topic merge and split for example.
Additionally, a huge problem with that is that it forces active record down
a slow path. `Topic.posts.first` is extremely slow on giant topics, since
it has no default ordering it appears AR materializes the entire set prior
to doing `first`.
This commit also illustrates the importance of testing, initially I only
fixed the second instance of the problem in `post_validator.rb` but testing
revealed that the problem was repeated at the top of the file.
Longer term we should consider a larger change of default ordering the posts
relations so people do not fall down this trap anymore.
Any chance you can update to latest and re-enable the setting? Let me know if the issue is still gone.
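The ordering point in the commit message can be illustrated at the SQL level. The sketch below uses SQLite and a made-up three-row `posts` table, so it says nothing about how ActiveRecord builds or materializes its queries. It only shows that a "first" without an explicit order is whatever row the database happens to return, while an explicit `ORDER BY` pins the result. The function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, topic_id INTEGER, post_number INTEGER)"
)
# Simulate a merge or split: the row holding post_number 1 is inserted last,
# so insertion order no longer matches post order.
conn.executemany(
    "INSERT INTO posts (topic_id, post_number) VALUES (?, ?)",
    [(1, 2), (1, 3), (1, 1)],
)

def first_post_unordered(conn, topic_id):
    """No ORDER BY: the database is free to return any matching row."""
    row = conn.execute(
        "SELECT post_number FROM posts WHERE topic_id = ? LIMIT 1",
        (topic_id,),
    ).fetchone()
    return row[0]

def first_post_ordered(conn, topic_id):
    """Explicit ordering: always the lowest post_number in the topic."""
    row = conn.execute(
        "SELECT post_number FROM posts WHERE topic_id = ? "
        "ORDER BY post_number ASC LIMIT 1",
        (topic_id,),
    ).fetchone()
    return row[0]

print("unordered:", first_post_unordered(conn, 1))
print("ordered:  ", first_post_ordered(conn, 1))
```

Only the ordered variant is guaranteed to return post number 1. The unordered result depends on the storage engine and should not be relied on.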
6 Likes
Wingtip
January 18, 2019, 04:22
31
Not sure how to update, dashboard says I’m already up to date and running git pull in /var/discourse says the same thing.
Just issue the rebuild from the command line and you should be good. You can time it this way too
2 Likes
pfaffman
(Jay Pfaffman)
January 18, 2019, 12:48
33
Did you try visiting /admin/upgrade?
1 Like
Wingtip
January 18, 2019, 15:35
34
OK, I updated (went to admin/upgrade manually, there was no link in the dashboard) and reverted the consecutive replies setting to 5. So far no errors, asked users to verify.
Thanks again for your help with this! Much appreciated!
3 Likes
jomaxro
(Joshua Rosenfeld)
January 18, 2019, 15:49
35
There will only be a link from the dashboard when we release a new beta. We add code daily (hourly, even). If we notified you for every single commit, your site would always say it’s out of date.
2 Likes
Wingtip
January 18, 2019, 15:52
36
Makes sense, that’s why I tried a git pull earlier also. Anyway it’s upgraded now.
4 Likes
Thanks for your diligence in staying on top of this; we found a subtle but important issue as a result.
5 Likes