Stripping Outlook Safe Link URLs?

Good morning - I’m an admin on a hosted (Business tier) forum for a software service, and I’m trying to gently encourage more of our users to post to our Discourse rather than emailing me directly. My approach has been to forward their messages to Discourse to create new topics.

Unfortunately, my email is a managed Outlook/Office 365 account with Safe Links enabled in Windows Defender, and I have no control over the policies. With posting-by-email enabled in Discourse, any links in the messages I forward from that account come out an awful sight in the resulting topic: each one is wrapped in one of Microsoft’s obfuscated Safe Link URLs (which also breaks Discourse’s link previews, etc.).

Is there any tool/option on the Discourse side to strip a posted Office365 Safe Link URL back down to the original link? An email setting I am missing or a plugin that could be used to set such a posting policy? I suspect this is just something I’ll have to fix manually when forwarding messages on the Outlook side, but thought it was worth asking.

Since Outlook rewrites URLs, you’ll need to gently encourage them to use Discourse by using Discourse yourself. :slight_smile:

Discourse will in many cases follow redirects to get to the URL to produce a onebox, but if that’s not working then I’m afraid that you’ll need to use the Discourse UX rather than your email client to create the topics.

…I do? But since there is a post-via-email feature, that would imply that there are times where it is convenient to do so :slight_smile: like, say, hitting “forward” on an existing message rather than having to manually copy-paste between UIs

Safe Links scans incoming email for known malicious hyperlinks. Scanned URLs are rewritten or wrapped using the Microsoft standard URL prefix: https://nam01.safelinks.protection.outlook.com .

We might be able to unwrap these, can you post some examples of how they’re rewritten?

Trawling my inbox I have two:

https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fvcacanada.com%2F&data=02%7C01%7Cinternalmedicine.altavista%40vca.com%7C8b234337846c4b26704608d859d038fb%7Ca2bdfa5ebe874736892b6db8bbf91546%7C1%7C0%7C637358098123527523&sdata=2b8GyZaj5cH3XNVMQ7KuIB8zgv2n4rEqX2MJutsQn2c%3D&reserved=0
https://nam01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.youtube.com%2Fembed%2F4qo27xcVS5I&data=02%7C01%7C%7C53e03ffe911442ddf50108d6492424d0%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636776816911303670&sdata=vzPKnv1BDVf6LL4z9P3faVYkxxcLcyjnnrLcch%2B1npA%3D&reserved=0

This does look pretty easy:

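import urllib.parse

def un_safelink(link):
    """Return the original URL from a Safe Links wrapper, or the input unchanged."""
    try:
        parsed = urllib.parse.urlparse(link)
        # The regional prefix varies (nam01 is only one of them), so match the suffix.
        if not parsed.netloc.endswith('.safelinks.protection.outlook.com'):
            return link
        query = urllib.parse.parse_qs(parsed.query)
        return query['url'][0]
    except Exception:
        # Malformed input or no url= parameter: leave the link as-is.
        return link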

I think they all follow a similar pattern, although it looks like in my case there’s also some kind of reference to my particular account, which is probably an enterprise-level subscription thing, e.g.:

https://nam12.safelinks.protection.outlook.com/?url=http%3A%2F%2Ftransformativemedia.swinburne.edu.au%2F&data=05%7C01%7Cethan.gates%40yale.edu%7Cd0c6ce8e6b4c44f1d9f508db243eda00%7Cdd8cbebb21394df8b4114e3e87abeb5c%7C0%7C0%7C638143624621704534%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=B5zPUFQ4qRQqSdmtimFQLCnosghaZENDmyaz2zZlw40%3D&reserved=0

:face_with_raised_eyebrow:

so much for documentation

Yes, looks like it encodes the email address of the sender or receiver into the metadata. Not that we need it:

{'url': ['http://transformativemedia.swinburne.edu.au/'],
 'data': ['05|01|frodo.baggins@yale.edu|d0c6ce8e6b4c44f1d9f508db243eda00|dd8cbebb21394df8b4114e3e87abeb5c|0|0|638143624621704534|Unknown|TWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0=|3000|||'],
 'sdata': ['B5zPUFQ4qRQqSdmtimFQLCnosghaZENDmyaz2zZlw40='],
 'reserved': ['0']}
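
For anyone who wants to poke at what else rides along in the wrapper, here is a minimal sketch. It builds a synthetic Safe Link (the address, IDs, and signature are made-up placeholders, not real values) and splits its query string the same way as above. The `inspect_safelink` name is hypothetical, and since the meaning of the individual `data` fields isn’t documented in this thread, they are only split, not interpreted.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def inspect_safelink(link):
    """Return (original_url, data_fields) for a Safe Links style URL."""
    query = parse_qs(urlsplit(link).query)
    original = query["url"][0]
    # The 'data' value is a pipe-separated list of fields.
    data_fields = query.get("data", [""])[0].split("|")
    return original, data_fields

# Synthetic example: every value below is a placeholder (assumption).
wrapped = "https://nam12.safelinks.protection.outlook.com/?" + urlencode({
    "url": "http://example.com/",
    "data": "05|01|user@example.com|0000|1111|0|0|638143624621704534",
    "sdata": "placeholder",
    "reserved": "0",
})

original, fields = inspect_safelink(wrapped)
print(original)   # the unwrapped target URL
print(fields[2])  # the third field, which is where the address appears
```

Only the `url` parameter is needed to recover the original link; the rest can be ignored.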

… I don’t think it does?

https://nam12.safelinks.protection.outlook.com/?url=http%3A%2F%2Ftransformativemedia.swinburne.edu.au%2F&data=05|01|ethan.gates%40yale.edu|d0c6ce8e6b4c44f1d9f508db243eda00|dd8cbebb21394df8b4114e3e87abeb5c|0|0|638143624621704534|Unknown|TWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D|3000|||&sdata=B5zPUFQ4qRQqSdmtimFQLCnosghaZENDmyaz2zZlw40%3D&reserved=0

Ah, that’s ugly. I saw in preview:

It should behave the same as:

https://transformativemedia.swinburne.edu.au

Though a site that oneboxes still works:

This might just be a matter of tweaking how we generate “inline titles” for these links.

ah that may have just been a bad example re: the previews/oneboxing, I will try out some other examples. That’s less of a concern than the unwieldy links/URLs themselves, in any case!

It depends on how “safe” Microsoft has made the URLs. Seems like it might be possible to make them work, though, so it seems I was too pessimistic about how hard the problem is. Sorry about that.

I’ve had several email exchanges with someone using Microsoft, and the rewritten URLs made the messages hard to read in my own mail reader.

Examples welcome, eh!

I would be curious to get input from other business-type users as to whether stripping this entirely would violate their security intent. One of the benefits of this feature is that a URL can later be deemed “unsafe”, and further attempts to click it will be blocked.

On the other hand, forum owners might be grumpy that Microsoft is “tracking all of their forum users activity”.

I can see two sides to an argument here.

A decent approach to take might be:

  • if a safelinked URL is pasted (on its own or inline)
    • resolve the onebox normally
    • if no onebox or title, make the “pretty text” the unwrapped link
    • the link target should still be the safelinked URL

i.e. it would look like:

https://transformativemedia.swinburne.edu.au
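
The approach in the list above could be sketched roughly like this. It is only an illustration in Python, not how Discourse actually renders links; the `pretty_safelink` name, the `title` argument (standing in for whatever the onebox/title lookup returns), and the Markdown output are all assumptions made for the example.

```python
from urllib.parse import urlsplit, parse_qs

SAFELINK_SUFFIX = ".safelinks.protection.outlook.com"

def pretty_safelink(link, title=None):
    """Render a Markdown link: readable text, but the wrapped URL as target."""
    parts = urlsplit(link)
    host = parts.hostname or ""
    if not host.endswith(SAFELINK_SUFFIX):
        return link  # not a Safe Link, leave it untouched
    targets = parse_qs(parts.query).get("url")
    if not targets:
        return link  # no url= parameter to unwrap
    # Prefer a resolved title; otherwise show the unwrapped URL as the text.
    text = title or targets[0]
    # The link target deliberately stays the safelinked URL.
    return f"[{text}]({link})"

wrapped = ("https://nam12.safelinks.protection.outlook.com/"
           "?url=http%3A%2F%2Ftransformativemedia.swinburne.edu.au%2F&reserved=0")
print(pretty_safelink(wrapped))
```

Keeping the wrapped URL as the target preserves the “block it later” behaviour discussed above, while readers only see the unwrapped address.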
