Bad gateway 502 errors on some specific searches in some specific categories

(Andrew Waugh) #1

I’m not sure what to make of this:

Some specific searches give bad gateway errors.

Here is an example:

If I search “remove doorhandle” on our category “E-Type” it returns a list of hits.
If I search “remove doorhandle” on our category “XJ” it returns a list of hits.

If I search “remove engine” on our category “E-Type” it returns a list of hits.
If I search “remove engine” on our category “XJ” it fails with a bad gateway error.

Both of those categories are roughly the same post/thread count, and there should definitely be hits in both categories.

This is 100% consistent. “Engine removal” always fails on the XJ category. “replace doorhandle” never does.

Where do I start looking? The category in question is searchable for other terms.

(Sam Saffron) #2

I would imagine that the words engine and removal are extremely popular on a forum with close to 1 million topics that discusses cars :slight_smile:

My guess here is that the search is timing out and the server running this is slightly underpowered.

(Andrew Waugh) #3

You would be right imagining that.
The search was one of the things that the users were well pleased about…

But here is what I don’t understand: I can search for “Switch”, or “Carb”, or “Flywheel” (or combinations thereof) and it doesn’t time out. All of those searches should have to trawl through the same amount of data, so why on earth would it time out so consistently for one specific search, but not for others? If it timed out randomly for searches regardless of the term, then I could see that.

(Mittineague) #4

Maybe because searches are “OR”, not “AND”?

That is, using more words in the search phrase does not narrow the results, but broadens them.

That, and maybe the fact that words are stemmed and reduced to lexemes, could have something to do with it.
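To make the OR/AND and stemming point concrete, here is a toy sketch. It is not how Discourse or Postgres actually implements search; the corpus, `crude_stem`, `build_index` and `search` are all made up for illustration. It only shows that with OR semantics a second common word widens the result set, and that stemming makes “remove engine” and “Engine removal” hit the same entries.

```python
from collections import defaultdict

# Hypothetical mini-corpus: topic id -> text (not real forum data).
TOPICS = {
    1: "remove engine from XJ",
    2: "engine mount torque",
    3: "remove doorhandle",
    4: "removing the carb",
}


def crude_stem(word):
    """Rough stand-in for real stemming: lowercase and strip a few suffixes."""
    word = word.lower()
    for suffix in ("ing", "al", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(topics):
    """Map each stemmed word to the set of topic ids that contain it."""
    index = defaultdict(set)
    for topic_id, text in topics.items():
        for word in text.split():
            index[crude_stem(word)].add(topic_id)
    return index


def search(index, query, mode):
    """Combine per-word matches with a union ("or") or an intersection ("and")."""
    postings = [index.get(crude_stem(w), set()) for w in query.split()]
    if not postings:
        return set()
    if mode == "or":
        return set.union(*postings)
    return set.intersection(*postings)


INDEX = build_index(TOPICS)
print(sorted(search(INDEX, "remove engine", "or")))    # [1, 2, 3, 4]
print(sorted(search(INDEX, "remove engine", "and")))   # [1]
print(sorted(search(INDEX, "Engine removal", "and")))  # [1]
```

Either way, each word's full match list has to be read before the lists are combined, so two very common words cost more than one rare one.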

(Sam Saffron) #5

It is very hard for me to triage search performance without actually having admin rights on the box.

The first thing that jumps to mind is that you are running an import and we have seen imports not perform as well. The general fix is: back up the Discourse instance => install a new instance => restore the backup.

Statistics go completely crazy after giant imports, leading to lots of queries taking a lot longer than they really should.

(Andrew Waugh) #6

I realise this. I’m just trying to get my head around it. I’ll ask the guy who did the import if he did a backup>restore after the migration.

(Jeff Atwood) #7

You can also do a full Postgres statistics rebuild. Takes a while and locks the DB but it works.
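A minimal sketch of what that could look like, assuming shell access to the database host, `psql` on the path, and a database named `discourse` (all assumptions; adjust for your install). The helper names are hypothetical. Plain `ANALYZE` only recomputes the planner statistics, while `VACUUM FULL ANALYZE` also rewrites every table and holds exclusive locks while it runs, which is the slow, locking variant.

```python
import subprocess


def build_stats_command(database="discourse", full=False):
    """Return the psql argv that refreshes planner statistics for `database`."""
    # ANALYZE recomputes planner statistics only.
    # VACUUM FULL ANALYZE also rewrites each table and locks it exclusively.
    sql = "VACUUM FULL ANALYZE;" if full else "ANALYZE;"
    return ["psql", "--dbname", database, "--command", sql]


def rebuild_statistics(database="discourse", full=False):
    """Run the command; raises CalledProcessError if psql exits non-zero."""
    subprocess.run(build_stats_command(database, full), check=True)
```

Calling `rebuild_statistics()` as a user allowed to connect to the database (typically `postgres`) runs the lighter variant; pass `full=True` only in a maintenance window.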

(Andrew Waugh) #8

I just spoke to Gunnar (who did the import). He did a backup>restore before we went live. The box has 8GB.

I don’t have shell access, I’ve asked him to have a look at free and top.

(Andrew Waugh) #9

I saw a post detailing how to do a statistics rebuild just the other day, but I can’t for the life of me find it again. (help).