ISO-8859-1 error in static embedding

(Thomas Purchas)

I’m trying to integrate comment embedding from Discourse into a website that produces ISO-8859-1 encoded HTML.

I’m pretty sure it’s all working: the iframe appears and says it is loading comments. However, the topic retrieval and parsing is currently throwing a regex error (Job exception: Wrapped Encoding::CompatibilityError: incompatible encoding regexp match (UTF-8 regexp with ISO-8859-1 string)).

I can’t convert the site to UTF-8 (old PHP codebase). Is there any way of adding ISO-8859-1 support, or of re-encoding the retrieved pages into UTF-8 before attempting topic creation?
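
For reference, the error and the kind of re-encoding step asked about here can be reproduced in plain Ruby. This is only a minimal sketch under assumptions: `to_utf8` is a hypothetical helper, not something Discourse provides, the sample page and pattern are made up, and where such a step would hook into Discourse's page retrieval is not shown.

```ruby
# Hypothetical helper (not part of Discourse): relabel the fetched bytes with
# their real charset, then transcode them to UTF-8.
def to_utf8(raw_html, source_charset = "ISO-8859-1")
  raw_html.dup.force_encoding(source_charset).encode("UTF-8")
end

# Made-up sample page: "café" as ISO-8859-1 bytes (0xE9 is "é" in Latin-1).
latin1_page = "<title>caf\xE9</title>".b.force_encoding("ISO-8859-1")

# A regexp containing a non-ASCII literal is fixed to the source encoding (UTF-8).
utf8_pattern = /<title>(.*\u00E9.*)<\/title>/

begin
  latin1_page =~ utf8_pattern
rescue Encoding::CompatibilityError => e
  # Same class of error as the one reported by the job.
  puts "match failed: #{e.message}"
end

# After transcoding, the same pattern matches without raising.
utf8_page = to_utf8(latin1_page)
puts utf8_page.encoding
puts utf8_page[utf8_pattern, 1]
```

The point of the sketch is that the bytes themselves are fine. They only need to be labelled as ISO-8859-1 and transcoded before any UTF-8 regexp touches them.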

(Mittineague)

iconv somehow?

(Thomas Purchas)

That would seem like an obvious solution. But it’s difficult to change the charset in HTML, because it also changes what the browser sends back.

This means if I send out UTF-8, I need to make sure that I capture and convert anything sent back (form submissions etc).

The site I’m working on is old and was written by students (it’s a student society website), so making sure that I convert everything sent back and handle it properly internally is quite a lot of work.

I’m looking at various options to convert to UTF-8 in the future because the current encoding scheme used causes headaches everywhere, but there is no simple solution :frowning:

(Kane York)

One hack would be to detect whether the requester is Discourse, then add another output buffer with ob_start (it does use a stack, right?) and convert the page to UTF-8 right before finishing output, as well as translating the request parameters back to ISO-8859-1.