Full text search and words delimited by punctuation


(zhongkeyuanict) #1

Hello everyone,

I wrote a topic containing the following message:
Caused by: java.io.IOException: Wrong number of partitions in keyset:99

But when I search for the keyword “IOException”, it does not match anything.
I can only find this topic with the keyword “java.io.IOException”.

Is there any way to find this topic with the keyword “IOException”?


(Jens Maier) #2

Hm, this doesn’t reproduce on meta. In fact, if you search for IOException, this very thread pops right up…

Does your forum use the default English locale or a translation? Did you customize PostgreSQL’s full text search settings?


(Luke S) #3

Not conclusive. OP has “IOException” listed in the search text as a single entity. Try for the last part of:
java.io.FileNotFoundException:

(I have no idea if this is a real exception)

EDIT: And I just tried it. Search didn’t find this thread.


(Jeff Atwood) #4

Hard to say… could be words with periods (or other punctuation) in between are not turned into tokens in Postgres?

Try finding the word beginning with the letter s in the below:

this.is.the.word.shamalom.in.the.middle.of.periods


(Jens Maier) #5

Yeah, Sailsman63 is correct. Searching this.is.the.word returns this thread, searching shamalom doesn’t.


(Alexander) #6

…except that now it will.


(Jens Maier) #7

Yeah, I guess it does. :sweat_smile:


(zhongkeyuanict) #8

Is there any way to fix this issue?

Or can I use a wildcard to get around it?


(Jens Maier) #9

This problem is caused by PostgreSQL’s built-in full text search functionality, which Discourse uses. Although Discourse could probably work around Postgres’ behaviour, there is nothing in Discourse itself to fix…

You can always change PostgreSQL’s full text search settings. Because these settings are part of the database, the changes would persist through Discourse upgrades and Docker container deployments.


(zhongkeyuanict) #10

I am a newcomer to PostgreSQL. Could you please tell me how to modify the search function?


(Jens Maier) #11

Sorry, that’s something I can’t really help you with.

Here’s the PostgreSQL 9.3 documentation regarding full text search: PostgreSQL: Documentation: 9.3: Full Text Search

The biggest problem appears to be that the default parser treats words with embedded dots as hostnames instead of (also?) splitting them into individual tokens:

discourse_development=# select alias, description, token from ts_debug('java.io.IOException');
 alias | description |        token
-------+-------------+---------------------
 host  | Host        | java.io.IOException

Tl;dr, the PostgreSQL full text parser is too smart for its own good and the recommended solution is to pre-parse documents in the application. :expressionless:
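
To make "pre-parse documents in the application" a bit more concrete, here is a minimal Python sketch of the idea. It is not what Discourse actually does, and `expand_punctuated_words` is a hypothetical helper name. The assumption is that the application appends the individual pieces of any punctuation-joined word to the text before handing it to the full text indexer, so that both the full dotted name and its parts become searchable.

```python
import re

# Hypothetical helper for illustration only; not part of Discourse or PostgreSQL.
# A run of characters that are neither word characters nor whitespace
# (dots, colons, slashes, ...) is treated as a separator inside a word.
_PUNCT_RUN = re.compile(r"[^\w\s]+")


def expand_punctuated_words(text: str) -> str:
    """Return the text followed by the pieces of every word that
    contains punctuation between word characters."""
    extra = []
    for word in text.split():
        # Drop empty pieces caused by leading or trailing punctuation.
        parts = [p for p in _PUNCT_RUN.split(word) if p]
        if len(parts) > 1:
            extra.extend(parts)
    # Keep the original text so searches for the full dotted name still match.
    return " ".join([text] + extra) if extra else text


if __name__ == "__main__":
    print(expand_punctuated_words("Caused by: java.io.IOException: Wrong number"))
```

The expanded string, rather than the raw post, would then be what gets indexed. Keeping the original text in front means the parser can still emit its host token for the dotted form, while the appended pieces are tokenized as ordinary words.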


(Sam Saffron) #12

Fixed per: