ETag header support

JackCA · December 20, 2019, 6:14pm

I feel like ETags would have a large positive performance impact on page loads, since most of the HTML pages are not cached. They would save the server from having to send the page again if the client has already downloaded it.

Has there been any thought around this?

marianord · December 21, 2019, 5:44am

I might be wrong, but Discourse is already heavily dependent on client-side JS, so what the client downloads is minimal. Almost everything is loaded on the first visit and then cached. I don’t really know how much ETags would improve that caching process.

For example, the first load of my page is ~800 KB, and the second one is ~40 KB.

Dannii · December 21, 2019, 1:44pm

Discourse is already quite well designed and set up for caching.

Most site assets (JS, CSS) have unique URLs that contain a hash of the asset and are regenerated each time you update the site, so they can have long cache times.
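As a rough illustration of the hashed-URL approach described above (not Discourse's actual asset pipeline), the sketch below derives a file name from a digest of the asset's bytes. The function name `fingerprinted_name`, the 16-character digest length, and the Cache-Control value are assumptions chosen for the example.

```python
import hashlib
from pathlib import PurePosixPath

# Hypothetical header value: because the URL changes whenever the content
# changes, the response can be cached for a long time without revalidation.
LONG_LIVED_CACHE_CONTROL = "public, max-age=31536000, immutable"


def fingerprinted_name(logical_name: str, content: bytes, digest_len: int = 16) -> str:
    """Return logical_name with a content digest inserted before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(logical_name)
    return str(path.with_name(f"{path.stem}-{digest}{path.suffix}"))


if __name__ == "__main__":
    old = fingerprinted_name("assets/app.js", b"console.log('v1');")
    new = fingerprinted_name("assets/app.js", b"console.log('v2');")
    print(old)
    print(new)
    print("Cache-Control:", LONG_LIVED_CACHE_CONTROL)
```

Any change to the asset produces a new URL, so a stale cached copy is simply never requested again. That is why this scheme does not need ETag revalidation.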

I think site uploads (images, avatars etc) also use unique URLs.

Most of the full pages that you can see are dynamic and should not be aggressively cached. It would be possible I guess to have the kind of ETag caching where it checks every page load if there are no new or edited posts. I’m not sure why the team decided not to do this.

JackCA · December 21, 2019, 4:25pm

I should have clarified: the assets are indeed cached well – what I’m talking about is the HTML document (first request).

Most of the full pages that you can see are dynamic and should not be aggressively cached. It would be possible I guess to have the kind of ETag caching where it checks every page load if there are no new or edited posts. I’m not sure why the team decided not to do this.

Yes, this is essentially what I’m talking about, but I don’t think ETags are generated by hand like that. They can be based on the raw HTML being served and can tell the client, “hey, this is exactly what you saw before, just use that.”
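A minimal sketch of that idea, assuming the ETag is simply a hash of the rendered response body: the server compares it with the client's If-None-Match header and answers 304 Not Modified on a match. The helper names `make_etag` and `respond` are hypothetical and not part of Discourse or Rails.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Build a quoted entity tag from a digest of the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a conditional GET."""
    etag = make_etag(body)
    # "no-cache" lets the browser store the page but forces revalidation.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        # Simplified parsing: split on commas, then drop any weak "W/" prefix
        # so that tags are compared weakly, as If-None-Match requires.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        candidates = [tag[2:] if tag.startswith("W/") else tag for tag in candidates]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # client copy is current; send no body
    return 200, headers, body


if __name__ == "__main__":
    page = b"<html><body>Categories</body></html>"
    status, headers, _ = respond(page, None)
    print(status, headers["ETag"])
    status, _, payload = respond(page, headers["ETag"])
    print(status, len(payload))
```

Note that in this sketch the server still has to render the page in order to hash it, so the saving is in bytes transferred rather than in rendering work.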

marianord · December 21, 2019, 7:08pm

The thing is that, to my understanding, this already happens with the JS on the client side, so you don’t have HTML going back and forth.

Dannii · December 22, 2019, 1:44am

The HTML loads JSON from the server, and that JSON request could use ETags. Currently it does not, though I’m not sure of the team’s argument for why.

JackCA · December 22, 2019, 4:46am

The first request to a page definitely has rendered content before it loads JSON from the server via XHR, which you’re right, is also happening.

You can verify this by looking at the “Document” type network request in Chrome DevTools; it should have (at least in my case) the categories already rendered.

Here’s an example of what’s rendered from the document request:

codinghorror · December 22, 2019, 4:57am

Your request is nonsensical because Discourse is a JavaScript app that does not retrieve HTML, all “pages” are built via executable JavaScript code in real time.

JackCA · December 22, 2019, 5:59am

Your request is nonsensical because Discourse is a JavaScript app that does not retrieve HTML…

I totally respect your experience and expertise here, but I’ve run dozens of JavaScript-rendered web applications that use ETags in the root response (if the content can be reused).

all “pages” are built via executable JavaScript code in real time.

The screenshot I posted above is the HTML that is returned before any clientside code runs, so there is certainly something on the backend (I’m assuming rails) serving this route.

Every single Discourse community I’ve looked at (besides this one) initially returns a JavaScript-less version of the site with all of the content rendered, presumably for crawlers.

Apologies if I’m way off here, but I don’t think I am being “non-sensical,” I may just be wrong.

codinghorror · December 22, 2019, 6:40pm

Only for crawler user agents, so this isn’t a useful observation.

JackCA · December 22, 2019, 7:04pm

Only for crawler user agents

That’s not what I see when I run this:

curl -H "User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36" https://community.midi.city/

That is not a crawler user agent and it’s returning the above payload.

Regardless, I think the answer to my request for ETags is a “no,” so thanks for the feedback, and maybe it will be reconsidered at some point.

codinghorror · December 22, 2019, 7:05pm

Correct, the answer is a hard, definitive no for both philosophical and technical reasons.

(Assets are a different issue, but using unique filenames with a GUID is a far superior approach, so ETags are kind of obsolete in general.)

Dannii · December 22, 2019, 11:11pm

Even for the API? I could understand for the smaller requests that it’s probably not worth it, but the topic views can be up to 20KB, which would add up. But then, not a lot of people are viewing topics repeatedly unless there are new posts…

Falco · December 23, 2019, 12:02am

That’s the point. For repeated views of the exact same content, if you are offline, we already render all the content from the browser cache without touching the server.

Upgrading that to load even if you are online involves cache invalidation, so naturally it’s hard.

Dannii · December 23, 2019, 12:05am

Oh, good to hear that works. I would’ve thought that the cache-control: no-cache, no-store header meant the API responses would never enter the browser’s cache.

Falco · December 23, 2019, 12:13am

They don’t. Well, it’s complicated. There are multiple caches in play.

It doesn’t enter the conventional browser cache everyone knows and loves. But there is a Cache Web API exposed to JS in browsers, which is used to cache responses in order to provide offline navigation of previously read content.
