When exporting the user list as a CSV file, any name that contains characters other than ASCII letters and digits comes out garbled when the file is opened:
I found two solutions to address this issue:
- Create a new XLSX file in Excel, import the data using the “From Text/CSV” option, and set the File Origin to Unicode (UTF-8).
- Use Notepad++ to open the CSV file and change the Encoding from UTF-8 to UTF-8-BOM (a small script that performs the same conversion is sketched after this list).
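For files that have already been exported, the conversion Notepad++ performs can also be scripted. This is just a sketch; the input and output file names are placeholders, not part of the actual export flow:
```js
// One-off conversion sketch: re-save an existing UTF-8 CSV as UTF-8 with BOM.
// The file names are placeholders.
const fs = require('fs');

const input = fs.readFileSync('users-export.csv');   // original CSV, assumed UTF-8 without BOM
const utf8Bom = Buffer.from([0xEF, 0xBB, 0xBF]);      // the UTF-8 BOM bytes

// Prepend the BOM only if the file does not already start with one.
const output = input.subarray(0, 3).equals(utf8Bom)
  ? input
  : Buffer.concat([utf8Bom, input]);

fs.writeFileSync('users-export-bom.csv', output);
```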
Result:
Both of these solutions resolve the symptom, but neither tackles the root cause.
Upon further research, I have come across a potential long-term solution to this problem:
```js
// Process to generate a CSV, with 'body' representing the data rows and 'header' as the header row.
const csv = body.reduce((prev, current) => `${prev}\r\n${current}`, header);
// Excel requires a Byte Order Mark (BOM) to indicate that the file is UTF-8 encoded.
const BOM = Buffer.from('\uFEFF');
// Concatenate the BOM with the generated CSV content.
const bomCsv = Buffer.concat([BOM, Buffer.from(csv)]);
// Convert the resulting buffer to a string and return it.
return bomCsv.toString();
```
Once the returned string is downloaded and opened in Excel, the Chinese content displays correctly in the spreadsheet.
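To show how this fits together, here is a minimal, self-contained sketch that wraps the snippet above in a helper and writes the result to disk; the function name `buildUserCsv`, the sample rows, and the output file name are illustrative assumptions, not part of the actual codebase:
```js
// Minimal sketch; buildUserCsv, the sample rows, and the file name are illustrative.
const fs = require('fs');

function buildUserCsv(header, body) {
  // Same logic as above: join rows with CRLF and prepend the UTF-8 BOM.
  const csv = body.reduce((prev, current) => `${prev}\r\n${current}`, header);
  const BOM = Buffer.from('\uFEFF');
  return Buffer.concat([BOM, Buffer.from(csv)]).toString();
}

const header = 'id,name';
const body = ['1,张三', '2,Müller'];

// Writing with UTF-8 encoding keeps the BOM bytes (EF BB BF) at the start of the file,
// so Excel detects the encoding and renders the non-ASCII names correctly.
fs.writeFileSync('users.csv', buildUserCsv(header, body), 'utf8');
```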
In essence, Excel needs explicit metadata to determine the encoding of a CSV file, and it relies on the BOM (Byte Order Mark) for this. Other spreadsheet software may simply ignore the BOM, but for Excel it is essential, so including it during CSV generation is the safest way to stay compatible across tools.
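As a quick sanity check (reusing the illustrative `users.csv` file from the sketch above), one can verify that a generated file really starts with the three UTF-8 BOM bytes, 0xEF 0xBB 0xBF:
```js
// Read the first three bytes and compare them against the UTF-8 BOM signature.
const fs = require('fs');

const firstBytes = fs.readFileSync('users.csv').subarray(0, 3);
console.log(firstBytes.equals(Buffer.from([0xEF, 0xBB, 0xBF]))); // true when the BOM is present
```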
Does your team plan to incorporate this fix in an upcoming version? Doing so could expand language support and reduce the occurrence of encoding-related bugs.
Thank you for your attention to this matter!