We’ve had a recent influx of users and traffic surge, and with that, more searches performed simultaneously.
We’re starting to see search timeouts across about 1.2 million posts.
What are you larger forums doing Search-wise? Have you moved outside of core Discourse search feature or is it still performing well? Perhaps updated the search algo/queries to be more restrictive/specific? DB tuning?
What are your server specs and what resources seem most constrained?
I’m looking into potentially moving a board with about 20 million posts (public posts + private conversations) so I’m definitely interested in learning about potential performance issues with large boards.
It’s a long way from 1M posts to 20M posts, so there’s little advice that will apply to both of you. FWIW, I’ve got a site with 9M posts that has acceptable performance on a i7-7500U @ 2.7Ghz with 16GB of ram. There are a few other low volume sites on the same server.
A site using AWS database with 8GB of ram with 4M posts is having some struggle on searches especially.
We’re on AWS. Forums are running on 4 instances, I believe t3.larges, so 4 vcpus and 16 GB ram. We’re running Kubernetes on a custom docker image. Our DB is on it’s own instance, m4.large, so 2 vcpus and 8 GB ram. The ram on the db instances could be the bottleneck for sure.
The queries that are timing out are generally more generic terms, single words. We’re had some run for greater than 60s. Now, we have also seen an influx of traffic to our website in the recent month. Last month, the month over previous month traffic jumped 40%. We also moved from v1.9 to v2.4 during that time, and I know since 1.9, Search has added the autocomplete search, so I assume that’s more load on the db server.
We did some query analysis and the following query was running anywhere from 20-60s:
`SELECT "posts".*
FROM "posts"
JOIN (
SELECT *, row_number() over() row_number
FROM (
SELECT topics.id, min(posts.post_number) post_number
FROM "posts"
INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id"
INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL)
LEFT JOIN categories ON categories.id = topics.category_id
WHERE
("posts"."deleted_at" IS NULL)
AND "posts"."post_type" IN (1, 2, 3)
AND (topics.visible)
AND (topics.archetype <> 'private_message')
AND (post_search_data.search_data @@ TO_TSQUERY('english', '''a'':*ABD & ''price'':*ABD'))
AND (categories.id NOT IN (SELECT categories.id WHERE categories.search_priority = 1))
AND ((categories.id IS NULL) OR (NOT categories.read_restricted))
GROUP BY topics.id
ORDER BY
MAX((
TS_RANK_CD(
post_search_data.search_data,
TO_TSQUERY('english', '''a'':*ABD & ''price'':*ABD'),
1|32
) * (
CASE categories.search_priority
WHEN 2
THEN 0.6
WHEN 3
THEN 0.8
WHEN 4
THEN 1.2
WHEN 5
THEN 1.4
ELSE
CASE
WHEN topics.closed
THEN 0.9
ELSE 1
END
END
)
)) DESC,
topics.bumped_at DESC
LIMIT 6
OFFSET 0
) xxx
) x ON x.id = posts.topic_id AND x.post_number = posts.post_number
WHERE ("posts"."deleted_at" IS NULL)
ORDER BY row_number;
---
Sort (cost=495214.55..495214.56 rows=1 width=1177) (actual time=20529.725..20529.727 rows=6 loops=1)
Sort Key: (row_number() OVER (?))
Sort Method: quicksort Memory: 29kB
-> Nested Loop (cost=495164.08..495214.54 rows=1 width=1177) (actual time=20525.899..20529.703 rows=6 loops=1)
-> WindowAgg (cost=495163.65..495163.80 rows=6 width=16) (actual time=20520.976..20521.078 rows=6 loops=1)
-> Limit (cost=495163.65..495163.66 rows=6 width=24) (actual time=20520.969..20521.063 rows=6 loops=1)
-> Sort (cost=495163.65..495232.24 rows=27438 width=24) (actual time=20520.967..20520.969 rows=6 loops=1)
Sort Key: (max((ts_rank_cd(post_search_data.search_data, '''price'':*ABD'::tsquery, 33) * (CASE categories.search_priority WHEN 2 THEN 0.6 WHEN 3 THEN 0.8 WHEN 4 THEN 1.2 WHEN 5 THEN 1.4 ELSE CASE WHEN topics.closed THEN 0.9 ELSE '1'::numeric END END)::double precision))) DESC, topics.bumped_at DESC
Sort Method: top-N heapsort Memory: 25kB
-> GroupAggregate (cost=493642.90..494671.83 rows=27438 width=24) (actual time=19082.214..20506.763 rows=32951 loops=1)
Group Key: topics.id
-> Sort (cost=493642.90..493711.50 rows=27438 width=400) (actual time=19082.184..19283.907 rows=191436 loops=1)
Sort Key: topics.id
Sort Method: external merge Disk: 77632kB
-> Hash Left Join (cost=36655.60..486646.69 rows=27438 width=400) (actual time=1562.696..18611.724 rows=191436 loops=1)
Hash Cond: (topics.category_id = categories.id)
Filter: (((categories.id IS NULL) OR (NOT categories.read_restricted)) AND (NOT (SubPlan 1)))
-> Gather (cost=36645.63..486471.60 rows=58991 width=400) (actual time=1562.623..18249.349 rows=191436 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Nested Loop (cost=35645.63..479572.50 rows=24580 width=400) (actual time=1556.547..18541.793 rows=63812 loops=3)
-> Hash Join (cost=35645.20..328688.61 rows=157831 width=25) (actual time=1551.912..13356.416 rows=285279 loops=3)
Hash Cond: (posts_1.topic_id = topics.id)
-> Parallel Seq Scan on posts posts_1 (cost=0.00..286245.92 rows=504559 width=12) (actual time=0.280..11249.160 rows=404770 loops=3)
Filter: ((deleted_at IS NULL) AND (post_type = ANY ('{1,2,3}'::integer[])))
Rows Removed by Filter: 21884
-> Hash (cost=33938.80..33938.80 rows=92912 width=17) (actual time=1549.103..1549.103 rows=80351 loops=3)
Buckets: 65536 Batches: 2 Memory Usage: 2557kB
-> Seq Scan on topics (cost=0.00..33938.80 rows=92912 width=17) (actual time=0.010..1492.606 rows=80351 loops=3)
Filter: ((deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text))
Rows Removed by Filter: 216751
-> Index Scan using posts_search_pkey on post_search_data (cost=0.43..0.96 rows=1 width=383) (actual time=0.017..0.017 rows=0 loops=855836)
Index Cond: (post_id = posts_1.id)
Filter: (search_data @@ '''price'':*ABD'::tsquery)
Rows Removed by Filter: 1
-> Hash (cost=9.43..9.43 rows=43 width=9) (actual time=0.059..0.059 rows=43 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Seq Scan on categories (cost=0.00..9.43 rows=43 width=9) (actual time=0.007..0.045 rows=43 loops=1)
SubPlan 1
-> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=191436)
One-Time Filter: (categories.search_priority = 1)
-> Index Scan using index_posts_on_topic_id_and_post_number on posts (cost=0.43..8.45 rows=1 width=1169) (actual time=1.435..1.435 rows=1 loops=6)
Index Cond: ((topic_id = topics.id) AND (post_number = (min(posts_1.post_number))))
Filter: (deleted_at IS NULL)
Planning time: 3.508 ms
Execution time: 20541.411 ms
(46 rows)`