Ignore BOM on CSV when sending bulk invitations

(Barry van Oudtshoorn) #1

When you create a CSV in Excel, it includes by default what I think is a byte order mark (BOM) at the top of the file, to ensure that it’s parsed as UTF-8. Unfortunately, Discourse doesn’t handle this (and doesn’t read the file as UTF-8!), so the first email address always fails with an error like this:

(Note that the “j” is the first letter of the actual email address.)

I think Discourse should strip/ignore the BOM on CSVs – it’s probably pretty common that people would put these together in Excel, after all. :wink:
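To illustrate the suggestion above, here is a minimal sketch of reading an invite CSV while ignoring a leading UTF-8 BOM. It is written in Python purely for illustration and is not Discourse's actual code or fix; the function name `read_invite_emails` and the sample addresses are hypothetical, and it assumes the email address sits in the first column of each row.

```python
import csv
import io
from typing import List


def read_invite_emails(raw: bytes) -> List[str]:
    """Return the first column of each non-empty CSV row (hypothetical helper)."""
    # The "utf-8-sig" codec decodes UTF-8 and drops a leading BOM (EF BB BF)
    # when one is present; input without a BOM is decoded unchanged.
    text = raw.decode("utf-8-sig")
    rows = csv.reader(io.StringIO(text, newline=""))
    return [row[0].strip() for row in rows if row and row[0].strip()]


# Hypothetical sample: the kind of bytes Excel writes, BOM first.
sample = b"\xef\xbb\xbfjane@example.com\r\nsam@example.com\r\n"
emails = read_invite_emails(sample)
```

Decoding with a BOM-aware codec, rather than slicing off the first three bytes unconditionally, keeps files that were saved without a BOM working as well.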

(Jeff Atwood) #2

That might make sense, @techapj. I can confirm via hex editor that when an Excel 2016 file is saved as CSV, the BOM is there by default:


(Mittineague) #3

I went to review the W3C page to double-check my understanding that BOMs were only needed for UTF-16 and UTF-32, because UTF-8 has no big-/little-endian variants.
With HTML5, things have changed a bit since I last visited the page.
Older browsers could use the …BE and …LE HTTP headers, but now the only way is by reading the BOM.
tl;dr: for UTF-8, yes, strip the BOM.
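For anyone curious what telling the BOMs apart looks like at the byte level, here is a rough Python sketch using the constants from the standard `codecs` module. The helper name `split_bom` is hypothetical and this is only an illustration, not how Discourse or any browser implements it.

```python
import codecs
from typing import Optional, Tuple

# Check the 4-byte UTF-32 marks before the 2-byte UTF-16 ones,
# because BOM_UTF32_LE begins with the same bytes as BOM_UTF16_LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def split_bom(raw: bytes) -> Tuple[Optional[str], bytes]:
    """Return (encoding implied by the BOM or None, bytes with the BOM removed)."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name, raw[len(bom):]
    return None, raw


# A UTF-8 BOM carries no byte-order information, so it can simply be dropped.
encoding, body = split_bom(b"\xef\xbb\xbfj@example.com")
```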


(Arpit Jalan) #4

Fixed via:

(Arpit Jalan) closed #5