Use WebTorrent to load media objects

Hi! I’m an admin over at https://discuss.pixls.us, a community for photographers who use Free Software. We’re lucky enough to have our costs covered by community donations, but I’m not sure how long that will be sustainable as we continue to grow. I also take the job of spending people’s donations quite seriously and I’ve been trying to think of ways to reduce our costs.

Since we’re image makers, our forum is extremely image heavy. Our largest cost by far is Amazon S3 bandwidth (almost entirely egress). Our S3 storage costs are quite minimal.

It would be beneficial to us to be able to have those images and other media load via WebTorrent, à la PeerTube.

I see this feature as two things: (1) users who visit the site download media from other users using WebTorrent, and (2) some sort of RSS feed that I can subscribe to in a torrent client, so users can be “super seeders” that just provide the site with bandwidth.

You are using S3 without a CDN :scream::scream::scream:

You really should put a CDN, even if it's just a Cloudflare free plan, in front of the S3 uploads so you don’t drown in egress fees. If possible, enable Origin Shield for even less egress traffic.

Check Using Object Storage for Uploads (S3 Clones).

Literally just waited for @Falco to answer (:sweat_smile:) as he has done a great job getting S3 to Discourse standards.

I would go to a cheaper S3 clone option. DigitalOcean Spaces would be the de facto go-to after S3 if you have a lot of data, or maybe go with bigger storage-focused players like Backblaze or Wasabi to reduce pricing even more.

Doing this will remove most of the bandwidth cost, as egress is priced a lot cheaper there (Wasabi doesn’t charge for it).

Also, just for increased performance, I would add a CDN. I recently went searching for a cheap CDN, as my community is barely profitable via AdSense, and I found BunnyCDN. It's not the fastest, but it gets the job done and is quick to get working.

BunnyCDN Referral link :see_no_evil:

With this setup you’d reduce the bill a lot, either by just putting the CDN in front of AWS S3 or by using another S3 provider + CDN.

I believe @paperdigits has the exact amount of monthly egress traffic on the AWS bill, so we can use that to calculate the CDN cost.
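
For what it's worth, a rough sketch of that calculation could look like the following. Everything here is an assumption for illustration: the function name, the 1000 GB of monthly egress, the per-GB rates, and the 90% cache hit ratio are placeholders, not the real bill or any provider's actual pricing.

```python
def monthly_egress_cost(egress_gb, origin_rate, cdn_rate=0.0, cache_hit_ratio=0.0):
    """Estimate a monthly bandwidth bill in USD.

    egress_gb       -- total GB served to visitors per month
    origin_rate     -- USD per GB charged by the origin (e.g. the S3 bucket)
    cdn_rate        -- USD per GB charged by the CDN (0.0 means no CDN)
    cache_hit_ratio -- fraction of traffic answered from the CDN cache
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    # Only cache misses reach the origin and incur origin egress fees.
    origin_gb = egress_gb * (1.0 - cache_hit_ratio)
    # All visitor traffic passes through the CDN and is billed at its rate.
    return origin_gb * origin_rate + egress_gb * cdn_rate


if __name__ == "__main__":
    # Placeholder numbers only; substitute the figures from the real bill.
    without_cdn = monthly_egress_cost(1000, origin_rate=0.09)
    with_cdn = monthly_egress_cost(1000, origin_rate=0.09, cdn_rate=0.01,
                                   cache_hit_ratio=0.9)
    print(f"without CDN: ${without_cdn:.2f}, with CDN: ${with_cdn:.2f}")
```

The cache hit ratio is the number that matters most here, and it can only be measured after the CDN has been running for a while.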

I would stay with S3 and just add a CDN at first.

Thanks, I’ll look into the CDN.

WebTorrent would not cost us anything though :wink:

I’m almost sure that it would cost you more than anything. You would need to set up a P2P network using the Discourse server (or a seedbox) as the source of truth or main seeder for all the media.

That seedbox would need to have enough resources to handle one new connection opened by each request as each media file would be a different torrent file. I know WebRTC is pretty good and efficient, but opening a connection is usually the most resource intensive part of a web application.

Unless I’m missing something, this would be a non-trivial and resource-intensive setup for really active communities.

The torrents have a file seed (a web seed), so if the file is not available from the swarm, it is loaded from another source, e.g. S3.

The Discourse application only serves up the torrent/magnet link; it doesn’t seed. Users who have already loaded the images seed them, and if the media assets are also being seeded by traditional BitTorrent clients, those can be used too. The torrent client is browser/client side, so the load on the server is just serving up more JavaScript.
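
To make that concrete, here is a minimal sketch of the kind of magnet link the server would hand out, with the HTTP fallback source attached through the `ws` (web seed) parameter. It is written in Python purely for illustration; the helper name, info hash, tracker URL, and bucket URL are all made up, and a real integration would take them from the actual torrent metadata.

```python
from urllib.parse import quote


def build_magnet(info_hash, name, web_seed_url, trackers=()):
    """Build a magnet URI that carries an HTTP web seed as a fallback source."""
    parts = [
        f"xt=urn:btih:{info_hash}",        # torrent identity (info hash)
        f"dn={quote(name, safe='')}",      # display name
    ]
    # Trackers are how peers find each other.
    parts += [f"tr={quote(t, safe='')}" for t in trackers]
    # ws = web seed: a plain HTTP(S) URL serving the same file.
    parts.append(f"ws={quote(web_seed_url, safe='')}")
    return "magnet:?" + "&".join(parts)


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    print(build_magnet(
        info_hash="0123456789abcdef0123456789abcdef01234567",
        name="photo.jpg",
        web_seed_url="https://example-bucket.example.com/uploads/photo.jpg",
        trackers=["wss://tracker.example.com"],
    ))
```

The server's only job in this scheme is generating and serving that string; the transfer itself happens between clients or against the web seed.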

Would you add a disclaimer for users? For example, if I visit your page on my cellphone over a metered connection, how do I opt out so my limited data doesn't all get consumed because a post went viral?

You would keep using S3 as the main source to load the files, which wouldn’t really save you a lot of money.

You still need seeders serving the connections, and those seeders will be unreliable because of differing internet speeds.

Sure, you just add a toggle switch in a dialog, or make it part of the preferences, or use something like Privacy Badger to keep that library from even loading. Or detect if the browser is mobile and disable it altogether. There are quite a few options.
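
As a rough sketch of how those options could combine into one decision (the function and parameter names are hypothetical; the only real-world piece assumed here is the `Save-Data: on` request header that some browsers send when the user has turned on a data-saver mode):

```python
def p2p_enabled(user_pref_enabled, is_mobile, headers):
    """Decide whether to load the P2P library for this visitor.

    user_pref_enabled -- the toggle from the user's preferences
    is_mobile         -- result of whatever mobile detection the site uses
    headers           -- request headers as a dict
    """
    # Header names are case-insensitive, so normalise before looking up.
    normalised = {k.lower(): v for k, v in headers.items()}
    save_data = normalised.get("save-data", "").strip().lower() == "on"
    # Any single opt-out signal is enough to disable P2P.
    return user_pref_enabled and not is_mobile and not save_data
```

Defaulting to "off" whenever any signal says so keeps the metered-connection case safe even if the mobile detection is wrong.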

It works sort of like a cache. If the media isn’t being seeded, then it pulls from the web source (S3 in our case). However, since WebTorrent can also pull from traditional torrent clients like Transmission, rTorrent, etc., we can have a few seeders (like myself) who just seed a whole bunch of stuff on their home connections. You really only need a handful of seeders to make this viable.
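
The cache-like behaviour can be sketched roughly as below. This is a deliberate simplification with hypothetical names: a real torrent client fetches individual pieces from many peers in parallel rather than whole files one peer at a time, and the two fetch callables stand in for the actual network code.

```python
def load_media(media_id, swarm_peers, fetch_from_peer, fetch_from_origin):
    """Try the swarm first; fall back to the web source (e.g. S3) on a miss.

    Returns a (data, source) tuple where source is "swarm" or "origin".
    """
    for peer in swarm_peers:
        try:
            data = fetch_from_peer(peer, media_id)
        except OSError:
            # Peer went away or timed out; try the next one.
            continue
        if data is not None:
            return data, "swarm"
    # Nobody in the swarm had it, so this request costs origin egress.
    return fetch_from_origin(media_id), "origin"
```

Only the requests that end up in the last line are billed as S3 egress, which is why a handful of reliable seeders changes the picture.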

Indeed so, but we’d solve that problem as a community.

It’s not impossible, but now you've complicated the feature to a level that almost no feature has. The Discourse team just dropped support for IE11 because of the headache of maintaining that code. Imagine maintaining this kind of logic. Maybe if you go to #marketplace someone will go for it.

I don’t really see any advantage in having it in core, and I have asked for a lot of things to be added to core that are a lot easier to get done.

Those have been denied, rightfully, and with a clear and logical explanation (no complaints here).

I personally wouldn’t use it, nor do I have any interest in getting it to work. It’s easier, more reliable, and cheaper to put a CDN in front and forget about it.

Nor would the basic Cloudflare plan :wink: