S3 image bandwidth costs are getting annoying

Cloudflare has been generally good for me as well. I don’t think it caches images on the free plan though?

mm, possibly, not sure how I could tell. All I know is that they tell me they’ve saved half my bandwidth. I do see they have some kind of non-free image-related offering.

I would note that the Cloudflare terms do explicitly state:

2.8 Limitation on Serving Non-HTML Content

The Services are offered primarily as a platform to cache and serve web pages and websites. Unless explicitly included as part of a Paid Service purchased by you, you agree to use the Services solely for the purpose of (i) serving web pages as viewed through a web browser or other functionally equivalent applications, including rendering Hypertext Markup Language (HTML) or other functional equivalents, and (ii) serving web APIs subject to the restrictions set forth in this Section 2.8. Use of the Services for serving video or a disproportionate percentage of pictures, audio files, or other non-HTML content is prohibited, unless purchased separately as part of a Paid Service or expressly allowed under our Supplemental Terms for a specific Service. If we determine you have breached this Section 2.8, we may immediately suspend or restrict your use of the Services, or limit End User access to certain of your resources through the Services.

This indicates to me that if you’re using Cloudflare as an asset CDN for Discourse, you’re likely breaking those terms, and they could in theory shut you down at any time.

Interesting - I think this probably does mean that they don’t voluntarily cache the images (or other attachments.) I notice a breakdown of cached content for my site which says JSON data is the top cached-and-served category. Possibly JSON is the forum content being transferred to the browser for display? Or the polling/notifications system??

It might be worth noting that Cloudflare also offers an object storage service called R2, which is perhaps intended for the S3 role. It’s potentially cheaper than S3 as it doesn’t have egress charges, AIUI.

I’m sure this will turn out to be a stupid question, but what about just getting a DigitalOcean server with lots of storage? They have very large transfer allotments as well (in the multiple TB depending on the droplet size).

Off the top of my head, my guess is that if everything is served from the same server it might slow down the site. Does this make sense?

Also, the other obvious downside is that you would be paying for storage you aren’t using until your user base grows. But S3 transfer charges are so high that you would probably still come out ahead over time.

Again, looking for people to poke reasonable holes in this suggestion as we are all trying to find the right balance.

How many GB do you need? How fast is it growing? How much data transfer do you need? These are the crucial questions.

It might be worth asking:

  • People using S3, how much storage did you need when you first chose to do that?

I think it will come down to cost and flexibility - I wouldn’t expect any performance problem. Using local storage on the instance will, I think, be more expensive and only comes in certain fixed sizes: you’ll always have some unused space which you’re paying for. But you can’t predict the future so you can’t really model the costs either way.
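To put rough numbers on that trade-off, here is a minimal Python sketch. It is not anyone's actual pricing: the per-GB storage price, the egress price, the fixed tier sizes and their monthly prices, the linear growth rate, and the guess that monthly egress is twice the stored volume are all made-up placeholders. Swap in a real provider's price list and your own growth guess.

```python
def object_storage_cost(stored_gb, egress_gb, price_per_gb, egress_price_per_gb):
    """Pay-as-you-go: you pay only for what you store and transfer."""
    return stored_gb * price_per_gb + egress_gb * egress_price_per_gb


def fixed_tier_cost(stored_gb, tiers):
    """Fixed-size disks: pick the cheapest tier that fits the data.

    `tiers` is a list of (size_gb, monthly_price) pairs.
    Returns (monthly_price, unused_gb), or None if nothing is big enough.
    """
    fitting = [(price, size) for size, price in tiers if size >= stored_gb]
    if not fitting:
        return None
    price, size = min(fitting)
    return price, size - stored_gb


def project(start_gb, growth_gb_per_month, months):
    """Linear growth guess: stored GB at the start of each month."""
    return [start_gb + growth_gb_per_month * m for m in range(months)]


# All numbers below are hypothetical placeholders, not real prices.
TIERS = [(50, 10.0), (100, 18.0), (200, 34.0)]
for month, gb in enumerate(project(start_gb=40, growth_gb_per_month=15, months=6)):
    pay_go = object_storage_cost(
        gb, egress_gb=gb * 2, price_per_gb=0.02, egress_price_per_gb=0.09
    )
    tier = fixed_tier_cost(gb, TIERS)
    print(month, gb, round(pay_go, 2), tier)
```

The second value returned by `fixed_tier_cost` is the unused space you are paying for, which is the part that grows or shrinks as you cross tier boundaries.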

For cases with modest storage needs it’s surely fine to use local storage, and it’s certainly simpler. Note that, as far as I understand it, migrating from local to object storage is straightforward, but migrating back from object storage to local storage is not. See MJK’s excellent opinionated guide:

Note that there are various storage providers each with their own pricing. Cloudflare also has an offering (without egress fees) but it’s not quite ready:
Configure an S3 compatible object storage provider for uploads

And of course, the various hosting companies will be competing on price, so shop around even if using local storage.

Backblaze is cheaper than S3 for a very similar object store service. Idk if Discourse has a client for it.

If you mean Backblaze B2, you can see it here: Configure an S3 compatible object storage provider for uploads

I’m getting too meta now, I guess, but S3 (or any similar service) is not too costly per se. Old images are. By my pulled-out-of-a-hat estimate, 97% of older images are just gathering cobwebs and are never shown, and that storage is expensive.

And no, I don’t know how that should be fixed. I know what should be done, but knowing is not enough…

The bandwidth costs come when there is a bigger audience out there downloading relatively new images. Let’s forget CDNs, because when a forum is not operating truly globally, those new images should be served from the VPS itself. When images get older and forgotten, they should move to S3 and free up some disk space.
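As a rough illustration of that idea, the sketch below only finds the candidates: files under an uploads directory that have not been modified for a given number of days. It is an assumption-laden sketch, not a Discourse feature. The directory path and the 365-day cutoff are hypothetical, modification time is only a stand-in for "nobody looks at this any more" (real view counts would need access logs), and the actual move to S3 plus any URL rewriting is left out.

```python
import os
import time


def cold_files(root, max_age_days, now=None):
    """Yield (path, size_bytes) for files not modified in max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            if st.st_mtime < cutoff:
                yield path, st.st_size


def summarize(root, max_age_days):
    """Return (file_count, total_bytes) of candidates for moving to S3."""
    count = total = 0
    for _path, size in cold_files(root, max_age_days):
        count += 1
        total += size
    return count, total


if __name__ == "__main__":
    # Hypothetical path; adjust to wherever your uploads actually live.
    uploads = "/path/to/uploads"
    if os.path.isdir(uploads):
        n, size = summarize(uploads, max_age_days=365)
        print(f"{n} files, {size / 1e9:.2f} GB could move to object storage")
```

Running it with a few different cutoffs gives a feel for how much disk space an age-based policy would actually free.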

Big guys do things differently, but they have money.