i recently configured a Cloudflare R2 bucket for uploads and my chat thumbnail images were borked. after some digging i made a quick fix for my configuration, then found this topic: Cloudflare R2 Image URL Display Issue: Detailed Explanation and Fix. i then looked at other S3 upload bucket configurations and found that the bug isn't really Cloudflare-specific.
Description
when an external S3 or compatible object storage is configured for uploads, chat thumbnail images bypass the CDN and are loaded directly from the bucket URL.
for secure-by-default S3-compatible buckets such as Cloudflare R2, which block unauthenticated access to the raw bucket endpoint, chat thumbnails are broken and do not display.
the underlying issue is that the chat serializer fails to apply the s3_cdn_url setting to thumbnails. instead of routing the image through the configured CDN, it leaks the raw internal S3 bucket URL into the payloads sent to the browser.
Steps to reproduce
this is reproducible on Meta and other sites using S3 upload buckets:
- post an image in chat or a channel
- inspect the thumbnail image URL in the console
- click the image to get the larger original, and inspect the url
- compare it to the thumbnail
here is an example from a Meta chat
Thumbnail URL: from bucket
https://cdck-file-uploads-global.s3.dualstack.us-west-2.amazonaws.com/meta/optimized/4X/4/7/9/479815360e0e6e0cd9f4ba565891776e84aea532_2_375x500.jpeg
Original URL: via CDN
https://global.discourse-cdn.com/meta/original/4X/4/7/9/479815360e0e6e0cd9f4ba565891776e84aea532.jpeg
in the console, the html for the thumbnail <img> element has data-large-src set to the CloudFront CDN URL and src set to the raw AWS bucket URL.
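the comparison in the steps above can be scripted instead of eyeballed. this is a minimal sketch: servedFromCdn is a hypothetical helper (not part of Discourse), and the CDN host is simply taken from the example URLs above.

```typescript
// Hypothetical helper: reports whether a URL is served from the given CDN host.
function servedFromCdn(url: string, cdnHost: string): boolean {
  // Protocol-relative URLs ("//host/path") need a scheme before URL can parse them.
  const absolute = url.startsWith("//") ? `https:${url}` : url;
  return new URL(absolute).hostname === cdnHost;
}

// The two URLs from the Meta chat example above.
const thumbnail =
  "https://cdck-file-uploads-global.s3.dualstack.us-west-2.amazonaws.com/meta/optimized/4X/4/7/9/479815360e0e6e0cd9f4ba565891776e84aea532_2_375x500.jpeg";
const original =
  "https://global.discourse-cdn.com/meta/original/4X/4/7/9/479815360e0e6e0cd9f4ba565891776e84aea532.jpeg";

console.log(servedFromCdn(thumbnail, "global.discourse-cdn.com")); // false
console.log(servedFromCdn(original, "global.discourse-cdn.com")); // true
```

on your own site, swap in the host from your s3_cdn_url setting and the src / data-large-src values copied from the thumbnail element.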
Impact:
- broken images: for S3-compatible storage like Cloudflare R2 that is secure-by-default and blocks unauthenticated access to raw bucket endpoints, chat thumbnail images (optimized) do not load.
- bandwidth leaks: for AWS and other S3-compatible object storage that allows access to raw bucket endpoints, chat bypasses the CDN entirely, so all chat thumbnail traffic is billed as direct S3 egress.
- infrastructure leakage: the raw backend storage URLs (including internal bucket names and sometimes account IDs) are being exposed in the client JSON payloads.
PR:
i have a PR to fix the issue here:
looks like Sam added getURLWithCDN to the chat composer preview - however, i don’t think it makes it to the chat stream?
i wonder if the composer fix may have been failing for some S3 configurations as well, because getURLWithCDN crashes on protocol mismatches (// vs https://)? anyway, the above PR simply extends Sam's work by adding the wrappers to the stream and making the match protocol-agnostic.
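to illustrate what i mean by protocol-agnostic, here is a rough sketch. it is not the code from the PR and not Discourse's getURLWithCDN; withCdn, stripScheme, normalizeBase and the example.com hosts are all made up for illustration. the idea is to drop the scheme before comparing, so "//bucket" and "https://bucket" count as the same base, and to return the URL untouched when it doesn't match rather than throwing.

```typescript
// Remove a leading "http:" or "https:" so "https://host/x" and "//host/x" compare equal.
function stripScheme(url: string): string {
  return url.replace(/^https?:(?=\/\/)/i, "");
}

// Scheme-less base without trailing slashes, e.g. "//bucket.example.com".
function normalizeBase(base: string): string {
  return stripScheme(base).replace(/\/+$/, "");
}

// Hypothetical helper: swap the bucket base for the CDN base, whatever scheme either uses.
function withCdn(url: string, bucketBase: string, cdnBase: string): string {
  const bareUrl = stripScheme(url);
  const bareBucket = normalizeBase(bucketBase);
  // Match only on a path boundary, so "//bucket.example.com" does not
  // accidentally match "//bucket.example.com.other.host/...".
  if (bareUrl !== bareBucket && !bareUrl.startsWith(bareBucket + "/")) {
    return url; // not a bucket URL: leave it alone instead of throwing
  }
  // Keep the CDN base exactly as configured (including its scheme, if any).
  return cdnBase.replace(/\/+$/, "") + bareUrl.slice(bareBucket.length);
}

// Setting stored with https://, URL emitted as protocol-relative:
console.log(
  withCdn("//bucket.example.com/optimized/a.jpeg", "https://bucket.example.com", "https://cdn.example.com")
); // https://cdn.example.com/optimized/a.jpeg
```

the reverse mismatch (setting stored as // and URL emitted with https://) goes through the same path, and URLs on other hosts come back unchanged.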
Temporary workaround:
before i realized this was more than a Cloudflare issue, i made a lightweight theme component. it intercepts the raw S3 domains in the chat DOM and swaps them out for the proper CDN domain before the browser attempts to download them. this routes the traffic correctly and plugs the bandwidth leak. i adapted it to work for any S3-compatible object storage. it has just two settings: Raw S3 bucket URL and S3 CDN URL.
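the core of that kind of swap can be sketched roughly like this. it is not the component's actual source: rewriteImages, the HasSrc shape and the setting names rawBucketUrl / cdnUrl are placeholders i'm using here, and the example.com hosts are made up.

```typescript
// Minimal shape shared by HTMLImageElement and plain test objects.
interface HasSrc {
  src: string;
}

// Hypothetical setting names; the real component's settings may be named differently.
interface Settings {
  rawBucketUrl: string; // e.g. "https://bucket.example.com"
  cdnUrl: string; // e.g. "https://cdn.example.com"
}

// Rewrites, in place, every image whose src starts with the raw bucket URL.
// Returns how many images were changed.
function rewriteImages(images: HasSrc[], settings: Settings): number {
  const bucket = settings.rawBucketUrl.replace(/\/+$/, "");
  const cdn = settings.cdnUrl.replace(/\/+$/, "");
  let rewritten = 0;
  for (const img of images) {
    if (img.src.startsWith(bucket + "/")) {
      img.src = cdn + img.src.slice(bucket.length);
      rewritten++;
    }
  }
  return rewritten;
}

// Plain objects stand in for <img> elements here.
const images: HasSrc[] = [
  { src: "https://bucket.example.com/optimized/a_2_375x500.jpeg" },
  { src: "https://cdn.example.com/original/a.jpeg" },
];
const settings: Settings = {
  rawBucketUrl: "https://bucket.example.com/",
  cdnUrl: "https://cdn.example.com",
};

console.log(rewriteImages(images, settings)); // 1
console.log(images[0].src); // https://cdn.example.com/optimized/a_2_375x500.jpeg
```

in a browser the same function could be fed Array.from(document.querySelectorAll("img")), re-run whenever chat renders new messages (a MutationObserver is one option). which selector to use and how early the swap has to happen are assumptions you'd need to check against your own site.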
(no idea why github oneboxes are broken here) fixed now
