I just caught another occurrence of this.
We have a number of queries being blocked looking like
UPDATE "posts" SET reads = reads + 1 WHERE ("posts"."deleted_at" IS NULL) AND (topic_id = 47 and post_number = 587)
They’re blocking each other, and eventually they are blocked by
SET avg_time = (x.gmean / 1000)
round(exp(avg(ln(msecs)))) AS gmean
INNER JOIN posts AS p2
ON p2.post_number = post_timings.post_number
AND p2.topic_id = post_timings.topic_id
AND p2.user_id <> post_timings.user_id
GROUP BY post_timings.topic_id, post_timings.post_number
) AS x
WHERE (x.topic_id = posts.topic_id
AND x.post_number = posts.post_number
AND (posts.avg_time <> (x.gmean / 1000)::int OR posts.avg_time IS NULL))
When there are enough posts being read, too many UPDATE querys are being executed and blocked, and all processes get exhausted and the forum becomes non-responsive.
In my experience, the issue here is the
UPDATE .... SELECT pattern where the UPDATE blocks during the (long) execution time of the SELECT. I’m not a real Postgres guru but maybe this could be avoided by (pseudo Ruby code)
array = SELECT ...
foreach array AS row