OK, this is absolutely not PR-welcome territory, because it is a very tricky problem to solve.
In particular, there are two very important things to keep in mind:
1. Diacritic stripping for search should be optional, and off by default for some languages such as Vietnamese. In Vietnamese you never want to strip diacritics, because you end up with nonsensical results.
2. The excerpts we show for search results should always keep the diacritics; otherwise it just looks like there are a bunch of silly spelling mistakes, even in French. This is a hard problem because we lean on PG to create the excerpts.
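To make the two constraints concrete, here is a rough Python sketch (not Discourse's actual code, which is Ruby and relies on PG for excerpts). It matches on a folded copy of the text but cuts the excerpt from the original, so the diacritics survive, and it skips folding entirely for opted-out locales. The names `NO_STRIP_LOCALES`, `fold_with_offsets`, `find_match` and the `strip_enabled` flag are all made up for illustration.

```python
import unicodedata

# Hypothetical opt-out list: locales where stripping is never applied.
NO_STRIP_LOCALES = {"vi"}


def strip_diacritics(text):
    """Decompose to NFD and drop combining marks (e.g. 'è' becomes 'e')."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold_with_offsets(text, strip):
    """Return the folded text plus, for each folded character,
    the index of the original character it came from."""
    folded, offsets = [], []
    for i, ch in enumerate(text):
        piece = strip_diacritics(ch) if strip else ch
        for out in piece.lower():
            folded.append(out)
            offsets.append(i)
    return "".join(folded), offsets


def find_match(original, query, locale, strip_enabled=False):
    """Search on folded text, but return the matching slice of the
    ORIGINAL text so the excerpt keeps its diacritics."""
    # Stripping is opt-in, and always off for locales on the opt-out list.
    strip = strip_enabled and locale not in NO_STRIP_LOCALES
    text = unicodedata.normalize("NFC", original)
    haystack, offsets = fold_with_offsets(text, strip)
    needle, _ = fold_with_offsets(unicodedata.normalize("NFC", query), strip)
    if not needle:
        return None
    pos = haystack.find(needle)
    if pos < 0:
        return None
    start = offsets[pos]
    end = offsets[pos + len(needle) - 1] + 1
    return text[start:end]
```

With stripping enabled, a French search for `creme` would match and the excerpt would still read `crème`, while in Vietnamese `ma` (ghost) would not match `má` (mother). The hard part in practice is that the excerpt is produced by PG rather than by application code like this, so the offset mapping is not directly available.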
Given (1), I am reverting this feature for now; given (2), this is something we will tackle for the 2.2 release.
Reverted per:
https://github.com/discourse/discourse/commit/9b7cab589ac15a034d1c0e700230c1b3f63f8ba0