Yes, this is pretty tricky. I can see how it is a problem for Vietnamese communities; the excerpts must look very confusing.
In Vietnamese, is there ever a reason to type without diacritics, e.g. type khong and mean không? I imagine it is a super duper hard no because the language is tone based, so that is the equivalent of me typing “dog” and meaning “milk”.
I think the best way forward here is to make diacritic stripping optional and turn this off in Vietnamese communities.
A bigger and more complex change is to amend it so excerpts are generated from the cooked text rather than from the “normalized” cooked text.
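To make the first option concrete, here is a rough sketch in Python of what "optional diacritic stripping" could look like. It is not the actual search code; `strip_diacritics`, `normalize_for_search` and the `strip_marks` flag are hypothetical names I made up for illustration, with `strip_marks` standing in for a per-site setting.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks (tone and vowel marks) from text."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns a non-zero class for combining marks, so filter those out.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def normalize_for_search(text: str, strip_marks: bool = True) -> str:
    """Normalize text for indexing; strip_marks is a hypothetical site setting."""
    lowered = text.lower()
    if strip_marks:
        return strip_diacritics(lowered)
    # Keep the diacritics, but still use one consistent Unicode form.
    return unicodedata.normalize("NFC", lowered)


print(normalize_for_search("Không"))                     # khong
print(normalize_for_search("Không", strip_marks=False))  # không
# With stripping on, three different words collapse into the same string:
print({normalize_for_search(w) for w in ["dấu", "dàu", "dãu"]})  # {'dau'}
```

With the flag off, a Vietnamese site would index and excerpt the text with its tone marks intact, at the cost of no longer matching a search for khong against không.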
FYI for English speakers:
dấu: A sign
dàu: Head
dãu: Pudding
So yeah this is a pretty giant issue for Vietnamese.