Escaping the "#" character in search

Is there an escape character for search?

One of our users searched for “pozidrive #4” and got no hits. Searching for “Pozidrive” works, but he was looking specifically for #4.

You just hit 3 bugs-in-one … you get the :jack_o_lantern: :hotsprings:

Your actual bug:

#4 is treated as a category filter: by adding it, search assumes you are trying to find all topics in the category with id 4, or something along those lines. # should not accept a category id, and anything that does not match should always just pass through as plain search text.

Two side bugs I discovered while answering this:

  • Our prettifier is badly interfering with search terms. “pozidrive” (typed with quotes, which get converted to smart quotes) returns zero results as a search term, most likely because of the quotes. I have to fix that.

  • Our # autocomplete is all messed up in the composer: type #2 followed by a space and the autocomplete is still open. It should always close unconditionally on space.

Quoted search strings work on our instance:

image

This is not about that, it is about “americandream” <= try searching for that sans quotes

@tgxworld I would like you to take the “cannotfindthis” problem. This is actually only a problem for our hosting.

Our default PG install returns:

local and image

[1] pry(main)> Post.exec_sql("select to_tsvector('english', '“test”')").to_a
=> [{"to_tsvector"=>"'test':1"}]

Our production pg

[1] pry(main)> Post.exec_sql("select to_tsvector('english', '“test”')").to_a
=> [{"to_tsvector"=>"'“test”':1"}]

And:

Local + Dev + Image

pry(main)> Post.exec_sql("select ts_debug('english', '“test”')").to_a
=> [{"ts_debug"=>"(blank,\"Space symbols\",“,{},,)"},
 {"ts_debug"=>"(asciiword,\"Word, all ASCII\",test,{english_stem},english_stem,{test})"},
 {"ts_debug"=>"(blank,\"Space symbols\",”,{},,)"}]

PRD

pry(main)> Post.exec_sql("select ts_debug('english', '“test”')").to_a
[{"ts_debug"=>"(word,\"Word, all letters\",“test”,{english_stem},english_stem,{“test”})"}]

So something about the PG install in our PRD containers fails to treat “smart quotes” as space characters, and the quotes end up inside the indexed word. This is going to cause some severe problems in search quality, so we need it fixed asap.
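To illustrate the effect, here is a minimal sketch of a defensive workaround: replacing typographic double quotes with spaces before the text reaches to_tsvector, so the parser sees them as separators regardless of how the PG install classifies them. The helper name `normalize_quotes` and the constant are hypothetical, and this is not what was actually deployed; the real fix is still to sort out the PG install.

```ruby
# Hypothetical helper, not part of Discourse: the name and the choice of
# characters are assumptions made for illustration only.
SMART_DOUBLE_QUOTES = /[“”]/

# Replaces typographic double quotes with spaces, then collapses runs of
# spaces and trims the ends, so "“test”" is handed to the indexer as "test".
def normalize_quotes(text)
  text.gsub(SMART_DOUBLE_QUOTES, " ").squeeze(" ").strip
end

puts normalize_quotes("“test”")
puts normalize_quotes("say “hello world” now")
```

Single typographic quotes are deliberately left alone in this sketch, since replacing ’ with a space would split words such as don’t.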

The OP is fixed per

https://github.com/discourse/discourse/commit/97fa64d8468ab91265c9a8b1b9f42e62c40ca8de

Keep in mind # in search is used as a special delimiter to search for tags and categories. I added a bypass here if there are zero matching categories and tags. However, if you search for, say, #docker on meta, it will not find this post because docker is a legit tag name.
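The bypass can be sketched roughly as follows. This is a simplified illustration rather than the code in the commit above: `parse_search`, `KNOWN_CATEGORIES`, and `KNOWN_TAGS` are hypothetical names, and the category and tag lists are made-up data.

```ruby
# Hypothetical sketch of the bypass; names and data are assumptions,
# not Discourse's actual implementation.
KNOWN_CATEGORIES = %w[support bug feature].freeze # assumed category slugs
KNOWN_TAGS       = %w[docker].freeze              # assumed tag names

# Splits a raw query into filters and plain search words. A "#name" term
# becomes a filter only when it matches a known category or tag; otherwise
# it passes through untouched as an ordinary search word.
def parse_search(raw)
  filters = []
  words = []
  raw.split.each do |term|
    name = term.start_with?("#") ? term[1..-1].downcase : nil
    if name && KNOWN_CATEGORIES.include?(name)
      filters << [:category, name]
    elsif name && KNOWN_TAGS.include?(name)
      filters << [:tag, name]
    else
      words << term # zero matches: keep "#4" as plain text
    end
  end
  { filters: filters, words: words }
end

p parse_search("pozidrive #4")
p parse_search("#docker install")
```

In this sketch "#4" stays a search word because nothing matches it, while "#docker" is consumed as a tag filter, which is the same limitation described above for posts that merely mention #docker.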

There is an open question to @codinghorror on whether we should allow unqualified #TAG, as opposed to requiring qualification for tags like the autocomplete adds (e.g. for tag search, require #docker::tag and always pass through #docker).

I am mixed on changing this because the bypass should be good enough ™

Leaving open so @tgxworld can sort out the extreme problem of “test” in our infrastructure and look at autocomplete which is weird.

@tgxworld feel free to split this up and close if you wish.

Closing this as I’m bringing the discussion into our internal forum.