API returns error 500 (no further info) when non-convertible Unicode is in the body of a new topic

(Dylan Hunt) #1

(Originally spun off from this topic, which suggests friendlier API errors: Improve (add) friendly JSON API error messages (at least generic ones))

Creating a generic new topic:


Imagine “raw” contains a ¢ or ( ͡° ͜ʖ ͡°) or some other Unicode character. Using the API to create a new topic then results in a plain-Jane 500 error with no extra info:

{"headers":{"Server":"nginx","X-Request-Id":"blah-blah-blah-request-id123","X-Runtime":"0.155158","Connection":"close","Content-Length":"0","Date":"Sat, 07 Oct 2017 14:43:30 GMT","Content-Type":"text/html; charset=utf-8"},"responseXml":null,"responseJson":null,"cookies":[],"responseCode":500,"responseString":""}

Removing the Unicode does the trick. The cent sign (¢) above can be used to test it: adding it gets a 500, removing it gets a 200.

Nasty little bug ~ this breaks our reporting system, as chat logs often contain Unicode. Surely there’s a multitude of other uses for Unicode (not just lenny faces, but localization too).

*It’s important to note that SOME Unicode works, such as full-width commas (they are converted to regular-width commas).

(Jeff Atwood) #2

I suspect this has to do with how your request is encoded in the https call?
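
To make the encoding suspicion concrete, here is a minimal Python sketch (not from the thread) of how the same form field looks on the wire under two charsets. The field name `raw` comes from the original post. The idea that the client might fall back to Latin-1 is purely an assumption for illustration, since the thread never establishes what GameSparks actually sends.

```python
from urllib.parse import urlencode

# Hypothetical form payload; only the field name "raw" comes from the thread.
payload = {"raw": "Price: 5¢"}

# Percent-encode the form body using UTF-8 (urlencode's default).
utf8_body = urlencode(payload, encoding="utf-8")

# The same payload percent-encoded as Latin-1, an assumed client misconfiguration.
latin1_body = urlencode(payload, encoding="latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(utf8_body)    # raw=Price%3A+5%C2%A2
print(latin1_body)  # raw=Price%3A+5%A2
print(is_valid_utf8("¢".encode("utf-8")))    # True
print(is_valid_utf8("¢".encode("latin-1")))  # False
```

A server that expects UTF-8 cannot decode the lone `%A2` byte, which is one plausible way a character like ¢ could break a request while plain ASCII goes through.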

(Dylan Hunt) #3

This was encoded as a form, following the API docs. I haven’t looked at the docs for a while now, though – did you guys ever switch to application/json or is form still the best?

EDIT: Oh, you are probably talking about UTF-8 or something? Let me check out the default method, I’ll be back. I can’t seem to view or change the default encoding method on GameSparks, so I opened up a ticket:


(Dylan Hunt) #5

Setting the Content-Type header to { "Content-Type" : "application/json; charset=utf-8" }; returns error 400 without explanation.

Setting the Content-Type to just “charset=utf-8” goes back to returning 500 errors. I tried sending as both a form and application/json, and I tried many combinations. At this point, I suspect Discourse more than my side; I’ve never had an issue with Unicode on any other site so far.
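
For reference, a `Content-Type` value needs a media type before any parameters, so `charset=utf-8` on its own is not a valid header value. Below is a minimal Python sketch of building (not sending) a JSON request with an explicit UTF-8 body. The `/posts.json` path, the `title` field name, the example host, and the `build_topic_request` helper are assumptions for illustration; check them against the API docs for your Discourse version. Authentication is left out entirely.

```python
import json
from urllib.request import Request


def build_topic_request(base_url: str, title: str, raw: str,
                        path: str = "/posts.json") -> Request:
    """Build (but do not send) a JSON POST request with a UTF-8 body.

    The endpoint path and the "title" field name are assumptions here;
    authentication is deliberately left out.
    """
    # ensure_ascii=False keeps characters like ¢ as real UTF-8 bytes
    # instead of \uXXXX escapes; both forms are valid JSON.
    body = json.dumps({"title": title, "raw": raw}, ensure_ascii=False).encode("utf-8")
    return Request(
        base_url.rstrip("/") + path,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json; charset=utf-8"},
    )


req = build_topic_request("https://discourse.example.com", "Unicode test", "Costs 5¢")
print(req.get_method())                # POST
print(req.get_header("Content-type"))  # application/json; charset=utf-8
print(req.data)                        # the exact bytes that would be sent
```

Sending it would be a call to `urllib.request.urlopen(req)` once whatever authentication your site requires has been added.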

(Jeff Atwood) #6

Do you see anything on the Discourse side under /logs in your browser?

(Dylan Hunt) #7

Browser console logs (F12)? I couldn’t see anything unusual in the console.

(Jeff Atwood) #8

Nope, under /logs on your Discourse site URL.

(Dylan Hunt) #9

Interesting - 0 logs for this (I cleared then tried again).

{ "Content-Type" : "application/json; charset=utf-8" };

Returned 500 with no error

{ "Content-Type" : "charset=utf-8" };

Returned 403 with anonymous array response, “BAD CSRF”

Logs show nothing (They had some old stuff, so logs are functional; I cleared it to try it again), but…

  • SHOULD they say something, or does this mean it never delivered?
  • However, if it never delivered, would I actually see a BAD CSRF response? Googling this, it seems to be a Discourse-exclusive response.
  • And shouldn’t it log if I got anything other than 404?
  • Are logs delayed longer than 5m?

(Sam Saffron) #10

Try creating the exact same topic in the browser and look at what Chrome sends through. That is the first thing I would try.

(Dylan Hunt) #11

Works fine. I can create a new topic or reply directly in the browser right now using symbols like ¢ without an issue. It only fails when using the API.

(Sam Saffron) #12

Thing is, the web app uses the API; inspect the request it makes.
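
One way to act on this: copy the request body from the browser's Network tab, capture the body your API client sends, and compare the decoded fields. The following is a rough Python sketch. `diff_form_bodies` and both sample bodies are made up for illustration, and it assumes both requests are form-encoded.

```python
from urllib.parse import parse_qs


def diff_form_bodies(browser_body: str, client_body: str) -> dict:
    """Return the fields whose decoded values differ between two form bodies.

    Bytes that are not valid UTF-8 decode to U+FFFD (the replacement
    character), which makes encoding mismatches easy to spot.
    """
    browser = parse_qs(browser_body, keep_blank_values=True,
                       encoding="utf-8", errors="replace")
    client = parse_qs(client_body, keep_blank_values=True,
                      encoding="utf-8", errors="replace")
    return {
        field: (browser.get(field), client.get(field))
        for field in sorted(set(browser) | set(client))
        if browser.get(field) != client.get(field)
    }


# Hypothetical captures: the browser sends ¢ as UTF-8, the client as a single byte.
browser_body = "title=Unicode+test&raw=Costs+5%C2%A2"
client_body = "title=Unicode+test&raw=Costs+5%A2"

print(diff_form_bodies(browser_body, client_body))
```

An empty result means the two bodies decode to the same fields, in which case the difference is more likely in the headers or the authentication than in the body encoding.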

(Dylan Hunt) #13

Ah, I can’t see it with F12 – can you recommend the best Chrome extension for viewing Discourse requests?

(Jeff Atwood) #14

Yes you can – did you read the associated howto topic?

@jomaxro does this howto need updating? Can you validate it this week?

(Joshua Rosenfeld) #15

Sure, adding this to my list.

(Joshua Rosenfeld) #16

Looks good to me. Making one minor edit, but it shouldn’t have affected the #howto.