Update Avatar Image Serving - Remove Proxy Method

With Discourse's 1RTT philosophy, I think it might be time to rewrite the Avatar image serving code.

Avatar images should be treated as any other image upload. Resized at upload, stored, and served directly from the file system/S3/CDN.

The current Discourse method uses a proxy method to serve Avatar images. This approach creates unnecessary HTTP round trips and IP address challenges.

Here is an overview of Avatar Requests:

  • Initial Discourse HTML is painted.
  • Browser detects an avatar image and requests the image from the Discourse server.
  • Discourse server acts as a proxy and requests the image from the local file store/S3/CDN.
  • Discourse server receives the image.
  • Discourse server sends the image to the browser.
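
To make the flow above concrete, here is a minimal Python sketch of the proxy pattern. `FileStore`, `serve_avatar_via_proxy` and the object key are hypothetical stand-ins, not Discourse code; the only point is that the application server fetches the image from the store and then relays the bytes to the browser.

```python
class FileStore:
    """Hypothetical stand-in for the local file store, S3, or a CDN."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.fetch_count = 0

    def fetch(self, key):
        # Each call represents one extra request made by the app server.
        self.fetch_count += 1
        return self.objects[key]


def serve_avatar_via_proxy(store, key):
    """Proxy pattern: the app server fetches the image, then relays it."""
    body = store.fetch(key)  # app server -> file store
    return {"status": 200, "body": body}  # app server -> browser


store = FileStore({"avatars/alice.png": b"png-bytes"})
response = serve_avatar_via_proxy(store, "avatars/alice.png")
```

With direct serving, the browser would request the stored object itself and the `store.fetch` hop through the application server would disappear.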

Every custom avatar incurs one additional HTTP round trip and the related server processing time. A typical topic or topic listing can have 30-plus custom avatar images. On my site, that results in 10K to 20K extra round trips per day, plus the related server load, that could be avoided.

Additionally, the avatar images are requested directly by the server. Any CDN-level protection (such as hot-link protection) therefore requires whitelisting the requesting IP address, which in practice means whitelisting gateway IP addresses rather than the server's own IP address. Hosting companies regularly make changes to balance network traffic, so my gateway IP address will change over time. The software should not depend on keeping an IP whitelist up to date for basic avatar images to work. This is based on the following incident in support: Avatar Proxy and CDN Hot-Link Protection.

From a performance and simplicity perspective, can we have avatar images served directly from Discourse's designated file store?


This is indeed a very convoluted part of the app @LotusJeff.

To address some of its shortcomings, we recently developed a way to reduce those round trips in our hosting, but I don't think we ever documented it here. Are there any public docs about it @david?


Yeah, we have a GlobalSetting for this, which you can enable by setting the environment variable DISCOURSE_REDIRECT_AVATAR_REQUESTS=true.

Then, instead of proxying, avatar requests will be served with a 302 redirect to the file store.

By itself... that's not really a good idea. It means browsers have to make two full HTTP round trips for every avatar. So, while it might solve your 'hotlinking protection' problem... I wouldn't recommend that you enable it. It will make the experience worse for your users.
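
As a rough illustration of the difference between the two modes, here is a small Python sketch. Only the variable name DISCOURSE_REDIRECT_AVATAR_REQUESTS comes from this thread; the function names, the way the flag is parsed, and the response shape are assumptions for illustration and not how Discourse implements it.

```python
import os


def redirect_enabled(env=None):
    """Assumed parsing: the flag is on only when the value is 'true'."""
    env = os.environ if env is None else env
    return env.get("DISCOURSE_REDIRECT_AVATAR_REQUESTS", "false").lower() == "true"


def avatar_response(file_store_url, fetch, env=None):
    """Return either a 302 to the file store or the proxied image bytes."""
    if redirect_enabled(env):
        # The browser now has to issue a second request to file_store_url.
        return {"status": 302, "headers": {"Location": file_store_url}, "body": b""}
    # Proxy mode: the app server fetches the bytes and relays them.
    return {"status": 200, "headers": {}, "body": fetch(file_store_url)}
```

In redirect mode the application server no longer touches the image bytes, but the browser pays for the extra request unless something closer to the user follows the redirect on its behalf.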

We use the setting on our discourse.org hosting, but we supplement it with a lambda running on our CloudFront CDN. It detects the 302 and performs the proxying itself. Essentially: we move the proxying from our application servers to the CDN.
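
The idea can be sketched in a few lines of Python. This is not the actual lambda; `edge_handle`, the response dictionaries and the injected `fetch` callable are hypothetical, and a real edge function would use the CDN provider's own event format.

```python
def edge_handle(origin_response, fetch):
    """If the origin answered with a 302, follow it at the edge.

    `fetch` is an injected callable that takes a URL and returns bytes.
    """
    if origin_response["status"] == 302:
        location = origin_response["headers"]["Location"]
        # The edge fetches the asset, so the browser sees a single 200.
        return {"status": 200, "headers": {}, "body": fetch(location)}
    # Anything else passes through unchanged.
    return origin_response
```

The browser still makes one request; the second hop happens between the CDN and the file store instead of on the application servers.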

As for the more general question of "can we change avatars to link directly to the asset": it's tricky because avatar URLs are baked into all historic posts (e.g. quotes). The dynamic /user-avatar/ URLs allow us to keep those working when a user changes their avatar. I'm afraid we don't have any plans to change that system.
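
Here is a simplified Python sketch of why a dynamic URL keeps old posts working. The path shape `/user-avatar/<username>/<size>`, the lookup table and the example URLs are hypothetical; the real routing is more involved.

```python
def resolve_user_avatar(path, current_upload):
    """Map a stable avatar path to whatever upload is current for that user."""
    prefix = "/user-avatar/"
    if not path.startswith(prefix):
        raise ValueError(f"not an avatar path: {path}")
    username = path[len(prefix):].split("/", 1)[0]
    return current_upload[username]


# Hypothetical lookup table: username -> current upload URL.
current_upload = {"alice": "https://files.example.com/uploads/1a2b.png"}
baked_path = "/user-avatar/alice/45"  # what an old baked post would contain

before = resolve_user_avatar(baked_path, current_upload)
current_upload["alice"] = "https://files.example.com/uploads/9f8e.png"  # avatar changed
after = resolve_user_avatar(baked_path, current_upload)
```

The baked path never changes, yet it resolves to the new upload after the user changes their avatar. A direct link to the asset would keep pointing at the old file.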

If there's an easy low-risk way we could make the existing proxying work for your use-case (e.g. add a GlobalSetting which inserts a specific HTTP header in any avatar-proxy requests), then we could consider accepting a PR for the change.
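
To give a feel for what such a setting might look like, here is a hedged Python sketch. The variable name DISCOURSE_AVATAR_PROXY_HEADER and its "Name: value" format are invented for illustration; no such setting exists as far as this thread says.

```python
def avatar_proxy_headers(env):
    """Build extra request headers from a hypothetical 'Name: value' setting."""
    raw = env.get("DISCOURSE_AVATAR_PROXY_HEADER", "")  # hypothetical name
    if ":" not in raw:
        return {}
    name, value = raw.split(":", 1)
    name, value = name.strip(), value.strip()
    # Ignore malformed values that have no header name.
    return {name: value} if name else {}
```

The CDN's hot-link protection could then allow requests carrying that header instead of relying on an IP whitelist.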


None of the above-mentioned options are feasible or preferred in my current environment. I would love to help solve this convoluted part of the application, but I am very new to this tech stack.

The only assistance I can provide now is using my old BA and Dev Mgt skills. So, in the spirit of feeling like I am helping, these are my thoughts.

When I look at any technical conundrum, I first look at possible assumptions that would make a solution more difficult.

One assumption is that an updated avatar image must be saved under a new filename. A person only has one avatar, and we don't keep a record of avatar names. I would suggest that when a person updates their avatar image, it is saved using the same filename as the existing avatar. This is basically what the /user-avatar/ link is doing. Instead of having a workaround, perform this logic during avatar creation/update. This would solve the historic posts' concerns for future baked posts.
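
A minimal Python sketch of what I mean, using a local directory as the file store. The directory layout and the `avatar_path` and `save_avatar` helpers are my own assumptions for illustration, not how Discourse stores uploads.

```python
import tempfile
from pathlib import Path


def avatar_path(root, user_id, size):
    """Deterministic location: one file per user and size (assumed layout)."""
    return Path(root) / "avatars" / str(user_id) / f"{size}.png"


def save_avatar(root, user_id, size, image_bytes):
    """Write the avatar, overwriting any previous image at the same path."""
    path = avatar_path(root, user_id, size)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(image_bytes)
    return path


with tempfile.TemporaryDirectory() as root:
    first = save_avatar(root, 42, 120, b"old-image")
    second = save_avatar(root, 42, 120, b"new-image")
    same_path = first == second
    final_bytes = second.read_bytes()
```

Both saves land on the same path, so a URL baked into an old post keeps pointing at the current image.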

Is this a big-bang change? No, this change could be implemented over months, slowly improving site performance. My dev teams always had an opportunistic code list for every code block. We wanted to make these improvements, but they were not critical enough to be done on their own. When the code was opened for development for some other reason, the developer also made the opportunistic changes. This minimized our testing and errors by reducing the number of times the code was opened and updated.

I would break this project into the following phases:

  1. Update avatar image saving logic to replace any image updates using the current filename.
  2. Replace all uses of avatar images with a standard function that utilizes Discourse's preferred file storage/delivery system. These could be implemented over months, slowly moving to the new avatar image presentation logic.
  3. Once the first two are completed, provide the community with the steps and code snippets to change out and rebake historical avatar images in baked HTML to their preferred filesystem. It would be up to the individual Discourse site owners to choose whether to implement this. (These code instructions and snippets would also be useful for making raw HTML changes.)
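
For phase 3, the kind of snippet I have in mind could look roughly like this Python sketch. The `/user-avatar/` path pattern, the regular expression and the `resolve` callback are assumptions on my part; a real migration would need to handle whatever URL shapes actually appear in baked HTML.

```python
import re

# Assumed pattern: avatar images referenced through a /user-avatar/ path.
AVATAR_SRC = re.compile(r'src="(/user-avatar/[^"]+)"')


def rewrite_avatar_urls(html, resolve):
    """Replace proxied avatar paths with direct file store URLs.

    `resolve` is a callable mapping the old path to the new direct URL.
    """
    return AVATAR_SRC.sub(lambda m: f'src="{resolve(m.group(1))}"', html)
```

Other `src` attributes are left untouched, so the rewrite could be run over each baked post in turn.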

I am sure you have considered all of this. I also know that fixing older code never makes it up the priority list.

If there is anything I can do to assist in this effort, please let me know.

P.S. I do appreciate the offer of a PR for a global header setting. Let me do some additional research and follow up with more defined requirements. Thank you for your assistance.

Unfortunately, if the file is mutable, it would mean we can no longer enable infinite caching of the asset in the CDN and end-user browsers.

There are strategies to make something like that work without hurting performance too much... e.g. the stale-while-revalidate directive. But that brings its own costs & UX implications.
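
To show the trade-off, here is a small Python sketch of the two Cache-Control styles. The directives `immutable` and `stale-while-revalidate` are standard; the function name and the specific lifetimes are illustrative assumptions, not values we use.

```python
def cache_control(url_is_immutable, max_age=31536000, fresh_for=300, revalidate_window=86400):
    """Pick a Cache-Control value depending on whether the URL can change content."""
    if url_is_immutable:
        # The content behind this URL never changes: cache it for as long as possible.
        return f"public, max-age={max_age}, immutable"
    # Same URL, changeable content: short freshness, then serve stale while refetching.
    return f"public, max-age={fresh_for}, stale-while-revalidate={revalidate_window}"
```

With the second form, a user who changes their avatar may keep seeing the old image until the freshness window expires and the background revalidation completes, which is one of the UX implications.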
