S3 CDN URL not being used on non-image uploads

Maybe related to S3 CDN URL ignored when uploading into posts.

At first I thought I had missed some configuration, but it’s reproducible here on meta:
https://meta.discourse.org/uploads/short-url/dw1U4hctATusBlHsUmWQXeme66j.csv (uploaded here) is a 302 redirect to
//assets-meta-cdck-prod-meta.s3.dualstack.us-west-1.amazonaws.com/original/3X/5/e/5ebb2cfb8cc907e8e8f7c6559a72d2f4a8ba2f8f.csv

Shouldn’t it redirect to https://d11a6trkgmumsb.cloudfront.net/original/3X/5/e/5ebb2cfb8cc907e8e8f7c6559a72d2f4a8ba2f8f.csv instead?

5 Likes

Hmmm there may possibly be an issue here @martin can you investigate? We should probably redirect to the CDN if possible.

5 Likes

This is a little bit tricky, I had a look at this just now. Basically we are always downloading from S3 with a presigned URL if we are doing a “force download”, which is what happens when we are clicking on the attachment link or clicking on the Download button on an image. This is so the appropriate content-disposition headers can be added:

attachment; filename="#{upload.original_filename}"; filename*=UTF-8''#{upload.original_filename}

I do not think it is possible to make the CDN URL behave in this way? The CDN URL for images is only used when displaying them inline, not when downloading them. Also in the case of secure images and attachments which have a private ACL the presigned URL must always be used.

2 Likes

@martin regarding Attachment links, I cannot see how they can be set to always “force download”. Right now when uploading a non-image file, the resulting rendered url is a short-url without the ?dl=1 param which is resolved to the value of the url attribute of the corresponding upload (which is not a presigned_url and also fails in our case as the url is not the s3 cdn_url but a constructed one that might only work for certain s3 providers).

Is there a way to always force download attachments, or is the current behavior actually a bug?

1 Like

Can you expand here, are you using Amazon S3, or a different provider? If so which?

1 Like

Yes, I seem to be using an unsupported S3 Provider (OVH Object Storage)

With an endpoint url of https://s3.de.ovh.cloud.net and a configured s3 cdn_url to https://storage.de.ovh.cloud.net/v1/<some-unique-user-id>/<the-bucket-name>

When short urls are resolved, discourse uses the endpoint url to construct the resulting url (which fails since it does not include the correct ovh user-id - the part in the s3 cdn_url).

Forcing short urls to have dl=1 would however construct a working presigned_url which would be fine for me.

I set these (s3 specific) environment variables within the container yaml:

DISCOURSE_USE_S3: true
DISCOURSE_S3_REGION: DE
DISCOURSE_S3_ENDPOINT: https://s3.de.cloud.ovh.net
DISCOURSE_S3_ACCESS_KEY_ID: ...
DISCOURSE_S3_SECRET_ACCESS_KEY: ...
DISCOURSE_S3_CDN_URL: https://storage.de.ovh.cloud.net/v1/<some-unique-user-id>/<the-bucket-name>
DISCOURSE_S3_UPLOAD_BUCKET: <the-bucket-name>
DISCOURSE_S3_INSTALL_CORS_RULE: false
1 Like

Maybe this would be out-of-scope for this topic, but if the current behavior is intended, how or where would one override short-url-rendering in frontend? I am thinking of simply hooking into the cooking-process of posts and then appending the query param to attachment short-urls. I am aware of how to write js-plugins for discourse, but most of the time I am just having a hard time finding the right Widget to “reopen” or function to override.

Ok so I found the place where Markdown for attachments is generated and as far as I understand the plugin API one cannot (easily) override it (I even think one should not do it).

So my initial thought of adding a ?dl=1 param to those urls seems like the wrong way to do it.

Regarding not-forcing downloads for resolved short-urls: If I understand the argument against public ACLs on S3 buckets correctly, one should either:

  1. serve files from S3 via a CDN (unfeasible for attachments as @martin pointed, as we may not be able to correctly set the filename for download in this case)
  2. create a presigned url for the S3 object

But the current behavior does neither and expects the S3 bucket to have a public ACL. This also seems to be the case for supported S3 providers (including amazon), so I’d ask why not make the force_download option in Discourse.store.url_for default to true when resolving short-urls for S3 stores?

1 Like

Having the same issue here, what I am excepting is that the file behind uploads/short-url should be download via or redirect to the S3 CDN URL.

2 Likes

We are facing the same issue with Cloudflare R2, as it doesn’t seem to allow using the direct S3 url without a presigned URL.
And R2 doesn’t support ACL for buckets.

It looks like discourse recently added an “s3 use cdn url for all uploads” which solves this issue.