Hey Discourse devs,
I’m working on improving a full-text Postgres search and decided to have a peek through Discourse’s implementation, since our use cases are very similar.
I was wondering if any devs (or otherwise knowledgeable persons) might have some insight into the logic behind Discourse search (which, I have to admit, is much more logic-heavy than I was expecting it to be).
Some specific questions:
- Why the differentiation between a Post and its PostSearchData (i.e., why is putting the tsvector column straight onto the Post a bad idea)?
- What’s the reasoning behind having separate SearchData classes for each searchable type (PostSearchData, CategorySearchData, etc.), instead of making it a polymorphic relationship (i.e., a searchable has_one search_data)? Would the resulting generic SearchData table be too massive to work with?
- I see there’s a need to occasionally reindex the search (using the rebuild_problem_posts method); why is that? Does the search data go stale over time for some reason?
Are there any other specific things to be aware of while putting a search like this together?
Thanks so much!