Need help with Sphinx integration


#1

We want to move our site content to Discourse. To evaluate it, we set up a test site and loaded 200K dummy topics, and found that its built-in search returns results very slowly (10 to 30 seconds). We are thinking of integrating Sphinx. We are not very knowledgeable about RoR, Docker, or Discourse. If anyone here has integrated Sphinx, please help or guide us.


(Jeff Atwood) #2

With a lot of topics, you will need to enable the “search prefer recent posts” site setting.


#3

Hi Jeff, thanks for your fast reply. We already tried that option. On our kind of site, users search for the oldest posts more often than recent ones. Our current site has 1 million topics, and we’re quite sure that once we load that much data, the built-in search won’t perform as expected. If you can point us in the right direction on how to integrate Sphinx, it’d be a great help. We’re willing to pay if someone can do that. Please help us. Fast search is crucial for our site. Thank you again.


(Jay Pfaffman) #4

How much RAM does your server have?

How fast is the CPU?

Data is on SSD?


#5

Hi Jay, thanks for your reply. Our test server has 8 GB RAM, 4 vCores (3.5 GHz), and an 80 GB SSD. It’s a VPS, not a dedicated server. We don’t know the processor model. Let me know if you need any other info.

We really like Discourse and want to stick with it. Any help regarding the issue will be much appreciated.


(Jay Pfaffman) #6

And you did a standard install using discourse-setup?


(Jeff Atwood) #7

We don’t have any particular advice in this scenario. It is too bad “prioritize recent posts” doesn’t fit your use case. It does fit most discussion use cases we see, though, and you’ll note that Google itself puts a fair bit of weight on more recent content over very old content.


#8

Hi Jay. Yes, we followed this guide to install Discourse: discourse/INSTALL-cloud.md at master · discourse/discourse · GitHub

Thanks for your reply.


#9

Hi Jeff, thanks for your reply. We have already decided to stick with Discourse, so we must find a way. We’re still working on this issue and will update here if we overcome it.


(Jay Pfaffman) #10

I think what you need to do is experiment with the number of unicorn workers and with the database’s shared buffers and work_mem settings.
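As a rough starting point for that experimentation, here is a minimal Python sketch that turns a server's RAM and core count into candidate values. Everything in it is an assumption rather than official guidance: the rules of thumb (about a quarter of RAM for shared buffers, capped at 4 GB; about two unicorn workers per core, capped at 8), the 40 MB work_mem placeholder, and the key names `UNICORN_WORKERS`, `db_shared_buffers`, and `db_work_mem`, which I believe match the standard `containers/app.yml` but which you should check against your own file. The helper `suggest_settings` is hypothetical.

```python
def suggest_settings(ram_gb: int, cpu_cores: int, work_mem_mb: int = 40) -> dict:
    """Return candidate tuning values for a single-server Discourse install.

    These are rule-of-thumb starting points (assumptions), meant to be
    adjusted while measuring search latency, not definitive values.
    """
    # Assumed heuristic: small machines are limited by memory,
    # larger ones by CPU; never exceed 8 workers.
    if ram_gb <= 2:
        workers = 2 * ram_gb
    else:
        workers = 2 * cpu_cores
    workers = max(1, min(workers, 8))

    # Assumed heuristic: about 25% of RAM for PostgreSQL shared buffers,
    # capped at 4096 MB.
    shared_buffers_mb = min(256 * ram_gb, 4096)

    return {
        "UNICORN_WORKERS": workers,
        "db_shared_buffers": f"{shared_buffers_mb}MB",
        # work_mem is per sort/hash operation, so it is passed through
        # unchanged for you to raise gradually while testing.
        "db_work_mem": f"{work_mem_mb}MB",
    }


if __name__ == "__main__":
    # The test server described above: 8 GB RAM, 4 vCores.
    for key, value in suggest_settings(ram_gb=8, cpu_cores=4).items():
        print(f"{key}: {value}")
```

After changing the values in your container definition you would need to rebuild the container for them to take effect, then re-run the same slow searches to compare timings.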