Yes, this is pretty tricky. I can see how it is a problem for Vietnamese communities; the excerpts must look very confusing.
In Vietnamese, is there ever a reason to type without diacritics, e.g. type khong and mean không? I imagine it is a super duper hard no because the language is tone based, so that is the equivalent of me typing “dog” and meaning “milk”.
I think the best way forward here is to make diacritic stripping optional and turn this off in Vietnamese communities.
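To make the idea concrete, here is a minimal Python sketch of what optional diacritic stripping could look like. This is not the forum software's actual code; `strip_diacritics`, `normalize_for_search` and the `strip` flag are hypothetical names used only for illustration.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_for_search(text: str, strip: bool = True) -> str:
    """Hypothetical per-site switch: a Vietnamese community would pass strip=False."""
    return strip_diacritics(text) if strip else text


if __name__ == "__main__":
    # With stripping on, differently toned words collapse to the same string.
    print(normalize_for_search("không"))
    # With stripping off, the text is indexed exactly as written.
    print(normalize_for_search("không", strip=False))
```

With the flag off, a search for khong would no longer match không, which is the trade-off a site turning this off would be accepting.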
A bigger and more complex change is to amend it so excerpts are generated from the original cooked text and not the “normalized” cooked text.
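A rough sketch of that bigger change, again in Python and again not the real implementation: match against the stripped text, but keep a map back to the original so the excerpt shows the text as the author wrote it. `normalize_with_offsets` and `excerpt` are hypothetical helpers, and the sketch assumes the input is in composed (NFC) form.

```python
import unicodedata


def normalize_with_offsets(text: str):
    """Strip diacritics and record, for each output character,
    the index of the original character it came from."""
    chars, offsets = [], []
    for i, ch in enumerate(text):
        for piece in unicodedata.normalize("NFD", ch):
            if not unicodedata.combining(piece):
                chars.append(piece)
                offsets.append(i)
    return "".join(chars), offsets


def excerpt(original: str, term: str, radius: int = 10):
    """Find `term` in the stripped text, then slice the ORIGINAL text."""
    normalized, offsets = normalize_with_offsets(original)
    needle, _ = normalize_with_offsets(term)
    if not needle:
        return None
    pos = normalized.find(needle)
    if pos == -1:
        return None
    start = offsets[pos]
    end = offsets[pos + len(needle) - 1] + 1
    return original[max(0, start - radius):end + radius]


if __name__ == "__main__":
    # The match is found via the stripped form, but the excerpt keeps the diacritics.
    print(excerpt("Tôi không biết", "khong", radius=4))
```

The search still matches on the stripped form, so this would fix how excerpts look without changing which posts are found.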
FYI English speakers:
dấu: A sign
dàu: Head
dãu: Pudding
So yeah this is a pretty giant issue for Vietnamese.
The reason it looks “good” on old search results is that those posts have not been reindexed yet using the new algorithm. If you edit any of the posts there, the bug will pop up.