It’s a dedicated server with the following:
AMD Ryzen 7 3800X
64 GB ECC RAM @ 2666 Mhz
2 x 1.2 TB Intel P3600 NVMe SSD (ZFS RAID 1)
The VM running Discourse has been assigned 8 CPU cores and 32 GB RAM.
I believe I have discovered the issue with the first query, or at least a way to instruct the query planner to make the right decision. Here’s the query that would not complete in 16+ hours (this is for the First Quote badge):
SELECT ub.id
FROM user_badges ub
LEFT JOIN (
SELECT ids.user_id, q.post_id, p3.created_at granted_at
FROM
(
SELECT p1.user_id, MIN(q1.id) id
FROM quoted_posts q1
JOIN badge_posts p1 ON p1.id = q1.post_id
JOIN badge_posts p2 ON p2.id = q1.quoted_post_id
WHERE (TRUE OR ( p1.id IN (-1) ))
GROUP BY p1.user_id
) ids
JOIN quoted_posts q ON q.id = ids.id
JOIN badge_posts p3 ON q.post_id = p3.id
) q ON q.user_id = ub.user_id
AND (q.post_id = ub.post_id OR NOT TRUE)
WHERE ub.badge_id = 15 AND q.user_id IS NULL
If I add a single ORDER BY line in the right location, this query now completes in a few seconds:
SELECT ub.id
FROM user_badges ub
LEFT JOIN (
SELECT ids.user_id, q.post_id, p3.created_at granted_at
FROM
(
SELECT p1.user_id, MIN(q1.id) id
FROM quoted_posts q1
JOIN badge_posts p1 ON p1.id = q1.post_id
JOIN badge_posts p2 ON p2.id = q1.quoted_post_id
WHERE (TRUE OR ( p1.id IN (-1) ))
GROUP BY p1.user_id
) ids
JOIN quoted_posts q ON q.id = ids.id
JOIN badge_posts p3 ON q.post_id = p3.id
ORDER BY ids.user_id
) q ON q.user_id = ub.user_id
AND (q.post_id = ub.post_id OR NOT TRUE)
WHERE ub.badge_id = 15 AND q.user_id IS NULL
I feel like it should be smart enough to do this sort in the right place, but it seems it’s not… Still, the fix seems rather easy at this point.
Haven’t really started digging into the other query on the percent_rank yet.