Issues with SSO and '~' character

Hi everyone,

So recently I discovered that if I include ‘~’ in a user’s bio, I get a base64decode error from Discourse. It can handle all sorts of other problematic characters just fine (spaces, =, %, &) but not ~ for some reason.

Anyone else encounter this issue?

My first thought is that perhaps my encoding could be wrong, but I haven’t been able to figure it out.

Here is my python implementation of the encoding:

return_payload = base64.urlsafe_b64encode(parse.urlencode(params).encode())

which is then put directly into ‘sso’ in the request (along with all the other necessary information):

resp = requests.post(
    ".../admin/users/sync_sso",
    data={'sso': return_payload, ...},
    headers={...}
)

I’ve updated my Discourse to the newest version (3.5.0.beta1-dev), but the issue still persists.

Thanks for any help!


It probably should be fixed, but that question is above my pay grade and skills. But out of pure practical curiosity: why would someone use a tilde in a bio?


Heh I guess that’s a reasonable question.

I run a multilingual forum and in other cultures ‘~’ is frequently used. As an example, in Korean, it’s often used at the end to soften your tone, like ‘If you have any questions, let me know~’


So this is a bug report and not a support request?


Is it? A bug is something that has been implemented but doesn’t work. This is more a question of whether it has been implemented at all, which would make it a feature request, if not a support question.

Yeah, I think Bug is appropriate. I believe I’m base64 encoding it correctly, so Discourse should decode it correctly too.

I think it is a bug (provided we can repro it)


It looks like urlsafe_b64encode replaces some characters in the base64 encoding. From the docs:

Encode bytes-like object s using the URL- and filesystem-safe alphabet, which substitutes - instead of + and _ instead of / in the standard Base64 alphabet, and return the encoded bytes. The result can still contain = .

That means the result isn’t standard base64, and won’t be compatible with Discourse’s decoding.

I’d recommend using the normal b64encode function instead. Your HTTP library should take care of the URL escaping if needed.
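To make the difference concrete, here is a minimal sketch using only Python's standard library. The three-tilde input is a made-up sample, chosen because its bits happen to land on the one Base64 symbol that differs between the two alphabets here; the `strict_decode_ok` helper is hypothetical and only stands in for a strict standard decoder, not for Discourse's actual decoding code.

```python
import base64
import binascii

# Made-up sample input: three tilde bytes (0x7E 0x7E 0x7E).
raw = b"~~~"

standard = base64.b64encode(raw)         # standard alphabet: uses + and /
urlsafe = base64.urlsafe_b64encode(raw)  # URL-safe alphabet: uses - and _

print(standard)  # b'fn5+'
print(urlsafe)   # b'fn5-'


def strict_decode_ok(data: bytes) -> bool:
    """Hypothetical helper: True if data is valid strict standard Base64."""
    try:
        base64.b64decode(data, validate=True)
        return True
    except binascii.Error:
        return False


print(strict_decode_ok(standard))  # True
print(strict_decode_ok(urlsafe))   # False
```

Whether a given payload breaks depends on where the `~` falls relative to the 3-byte Base64 groups, which would explain why other characters seemed to work fine.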


Upon further investigation, I was indeed encoding it wrong.

Here is what I ended up with, for posterity:

return_payload = base64.b64encode(parse.urlencode(kwargs).encode("utf-8"))
h = hmac.new(secret.encode("utf-8"), return_payload, digestmod=hashlib.sha256)
resp = requests.post(
    ".../admin/users/sync_sso",
    data={"sso": return_payload, "sig": h.hexdigest()},
    headers={...}
)

And if you’re doing the redirect, be sure to parse.urlencode that {“sso”…}.
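For the redirect case, a rough sketch of what that could look like is below. The function name `build_sso_query`, the sample parameters, and the secret are all hypothetical; the resulting query string would be appended to whatever return URL your SSO flow uses, which is not shown here.

```python
import base64
import hashlib
import hmac
from urllib import parse


def build_sso_query(params: dict, secret: str) -> str:
    """Hypothetical helper: build a signed, URL-escaped 'sso=...&sig=...' string."""
    # Standard (not URL-safe) Base64 of the URL-encoded user parameters.
    payload = base64.b64encode(parse.urlencode(params).encode("utf-8"))
    # HMAC-SHA256 over the Base64 payload, hex-encoded.
    sig = hmac.new(secret.encode("utf-8"), payload, digestmod=hashlib.sha256).hexdigest()
    # urlencode percent-escapes the +, / and = that standard Base64 can contain.
    return parse.urlencode({"sso": payload, "sig": sig})


# Sample values are made up for illustration.
query = build_sso_query({"external_id": "42", "bio": "let me know~"}, "my-secret")
print(query)
```

Escaping matters here because a raw `+` in a query string is usually read back as a space, which would corrupt the payload before it is ever Base64-decoded.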

Thanks for the help @sam and @david !
