Full site CDN acceleration for Discourse

(Sam Saffron) #1

Fastly, Cloudflare and a few other CDNs offer a mode where they accelerate dynamic content.

In a nutshell, you point your domain’s DNS at the CDN, and the CDN intelligently decides how to deal with each request.

  • Static content can be easily served from cache
  • Dynamic content can be routed to the site.
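
The split above can be sketched roughly as follows. This is only an illustration of the idea, not how Fastly or Cloudflare actually decide; the path prefixes and the function name are assumptions made up for the example.

```python
# Hypothetical sketch of an edge routing decision; not any real CDN's logic.
STATIC_PREFIXES = ("/assets/", "/images/", "/uploads/")  # assumed static paths


def edge_decision(method: str, path: str) -> str:
    """Return 'cache' for cacheable static requests, 'origin' for the rest."""
    # Only safe, read-only requests for static paths are served from cache.
    if method in ("GET", "HEAD") and path.startswith(STATIC_PREFIXES):
        return "cache"
    # Everything else is treated as dynamic and passed through to the site.
    return "origin"
```

With these assumed prefixes, `edge_decision("GET", "/assets/app.js")` is answered from cache, while a topic page or any POST goes through to the site.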

Full site acceleration has some advantages over only shipping static assets through a CDN, which is covered in the CDN howto:

  • You can elect for “shielding” that protects your site from traffic spikes.
  • Dynamic content can be accelerated using techniques like Railgun. (Note: in general our payload fits in 1 RTT, so this has less of an impact.)
  • SSL negotiation can happen at the edge, cutting down on expensive round trips for negotiation.

If you enable full site acceleration with a CDN, it is critical that you follow 3 rules:

  1. The “message bus” must be served from the origin.

  2. You need to set up X-Forwarded-For trust. For Cloudflare, add cloudflare.template.yml to your app.yml file.

  3. Be extra careful with features that apply optimisations to the site; something like Rocket Loader can stop Discourse from working. Discourse is already heavily optimised, so this is not needed.
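
To illustrate why rule 2 matters: behind a CDN, every connection reaches the origin from a CDN address, so the real visitor address has to be read from X-Forwarded-For, and only when the connection really came from the CDN. The sketch below shows the general idea; it is not what cloudflare.template.yml does internally, and the proxy range is a placeholder documentation network, not a real CDN range.

```python
import ipaddress

# Hypothetical proxy range (a documentation network). A real setup would use
# the address ranges published by the CDN.
TRUSTED_PROXIES = [ipaddress.ip_network("203.0.113.0/24")]


def is_trusted(ip: str) -> bool:
    """True if the address belongs to one of the trusted proxy ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in TRUSTED_PROXIES)


def client_ip(remote_addr: str, x_forwarded_for: str) -> str:
    """Pick the visitor address, honouring X-Forwarded-For only from proxies."""
    if not is_trusted(remote_addr):
        # Direct connection: the header could be spoofed, so ignore it.
        return remote_addr
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # Walk from the nearest hop backwards; the first untrusted one is the visitor.
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    return remote_addr
```

Without the trust list, a visitor connecting directly could claim any address simply by sending the header.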

To serve “long polling” requests from a different domain, set the Site Setting long polling base url to the origin server.

For example, if your site is on “http://forum.example.com”, you should set up http://forum-direct.example.com/ so that it points directly at the origin, and plug it into the site setting. If you don’t, your site will be broken.

If you are fronting Discourse using Varnish, you probably want to use the same trick and bypass Varnish for the message bus requests.

Boring technical notes:

Achieving a working message bus on a completely different domain is quite challenging. Our message bus is aware of which user is polling, but the other domain may have no cookie set up, so left untouched there are two issues. Firstly, you can’t even make standard ajax requests cross domain without a huge CORS dance.

Secondly, we needed a mechanism to inform the other domain who the user is so we can poll for the correct information.

When long polling base url is changed, Discourse ships an extra meta tag that shares a “cross domain” auth token. This token is passed using a custom header back to the message bus. The token expires after 7 days or as soon as the user logs off. In future we are probably going to amend it so the token has N uses and is automatically reissued after they pass.
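
The token lifetime described above (7 days, or until log off) can be pictured with a toy in-memory store. This is a hypothetical sketch for illustration only, not the Discourse implementation; the class and method names are made up.

```python
import secrets
import time

TOKEN_TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days, as described above


class CrossDomainTokens:
    """Toy in-memory token store; hypothetical, not Discourse's code."""

    def __init__(self):
        self._tokens = {}  # token -> (user_id, issued_at)

    def issue(self, user_id, now=None):
        """Create a random token for the user and remember when it was issued."""
        now = time.time() if now is None else now
        token = secrets.token_hex(16)
        self._tokens[token] = (user_id, now)
        return token

    def user_for(self, token, now=None):
        """Return the user id for a valid token, or None if unknown or expired."""
        now = time.time() if now is None else now
        entry = self._tokens.get(token)
        if entry is None:
            return None
        user_id, issued_at = entry
        if now - issued_at > TOKEN_TTL_SECONDS:
            del self._tokens[token]
            return None
        return user_id

    def log_off(self, user_id):
        """Drop every token that belongs to the user."""
        self._tokens = {t: e for t, e in self._tokens.items() if e[0] != user_id}
```

The polling domain would look up the token from the custom header with something like `user_for` and poll on behalf of that user, which is how it can identify the user without a cookie.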

You can see most of the implementation here: FEATURE: allow long polling to go to a different url · discourse/discourse@aa9b3bb · GitHub

(Erick Guan) #2

I don’t know what it means… fits in 1 RTT?

(Sam Saffron) #3

1 round trip. Read up about TCP congestion control, initial windows and so on.
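
As a rough worked example of the arithmetic: assuming an initial congestion window of 10 segments (the value from RFC 6928) and about 1460 bytes of payload per segment, roughly 14.6 KB fits in the first round trip. The function below is an idealised model that ignores the handshake, TLS and packet loss; the constants are assumptions, not measurements of Discourse.

```python
import math

MSS_BYTES = 1460      # assumed TCP payload per segment on a typical Ethernet path
INITIAL_WINDOW = 10   # assumed initial congestion window, in segments (RFC 6928)


def round_trips(payload_bytes: int) -> int:
    """Round trips needed to deliver a payload under idealised slow start."""
    segments = math.ceil(payload_bytes / MSS_BYTES)
    window, sent, trips = INITIAL_WINDOW, 0, 0
    while sent < segments:
        sent += window
        window *= 2  # slow start doubles the window every round trip
        trips += 1
    return trips
```

Under these assumptions a 14,000 byte response needs one round trip, while a 20,000 byte response needs two.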


(Renoir Boulanger) #4

UPDATE 2015-04-02: I just made the full example more complete.

I am confused by this definition.

But I think this should clarify it. Please don’t hesitate to tell me if I’m doing something wrong.

The confusing part is what happens if your site is at “http://some-origin.com/”. If you are behind Fastly, you have to use a CNAME entry, which means you need a subdomain name and not the top-level one.

Background: In DNS, the zone apex (e.g. “some-origin.com”) cannot have a CNAME record, because it already carries SOA and NS records. Since Fastly requires that we use a CNAME entry, we have no choice but to use a subdomain name.

Let’s say that we will then use “http://discourse.some-origin.com/” to serve our Discourse forum so we can use Fastly.

Now there’s this thing called “long polling”, which is basically an HTTP request that the server holds open for a long time before returning anything. If we use the Fastly or Varnish address, as Discourse would by default, Varnish will time out and “long polling” won’t work.

More background: Varnish has an option to step out of the way in known contexts through vcl_pipe, which is roughly a raw TCP socket. But Fastly doesn’t offer it because of the size of their setup.

Proposed setup

Let’s enable long polling and expose our site under Fastly. We’ll need two names, one pointing to Fastly and the other to the IP addresses we give Fastly within the service dashboard.

  1. discourse.some-origin.com, which is our desired Discourse site domain name
  2. discoursepolling.some-origin.com (pick any name), which we’ll configure in Discourse to directly access our public-facing frontend web server

In my case, I generally have many web apps running that are only accessible from my internal network. I refer to them as “upstream”, the same term NGINX uses in its config. Since the number of web apps you host on a site can fluctuate, you might still want the number of public IP addresses to remain stable. That’s why I set up an NGINX server in front that proxies to the internal web app servers. I refer to these NGINX servers as “frontends”.

Let’s say you have two public facing frontends running NGINX.

Those would be the ones you setup in Fastly like this.

Here we see two Backends in the Fastly panel at Configure -> Hosts.

Notice that in this example I’m using port 443 because my backends are configured to communicate between Fastly and my frontends through TLS. But you don’t need to.

Quoting @sam again:

[quote=“sam, post:1, topic:21467”]
To serve “long polling” requests from a different domain, set the Site Setting long polling base url to the origin server.
[/quote]

What this really means is that we would have to put one of those IP addresses in the Discourse settings.

What I’d recommend is to create a list of A entries for all your frontends.

In the end we need three things:

  1. What’s the public name that Fastly will serve
  2. Which IPs are the frontends
  3. Which hostname we want to use for long polling and we’ll add it to our VirtualHost

The zone file would look like this:

; The public facing URL
discourse.some-origin.com.  IN CNAME global.prod.fastly.net.

; The list of IP addresses you’d give to Fastly as origins/backends
frontends.some-origin.com.  IN A
frontends.some-origin.com.  IN A

; The long polling URL entry
discoursepolling.some-origin.com.  IN CNAME frontends.some-origin.com.

That way you can set up the “long polling base url” correctly without creating a single point of failure.

Then we can go into the Discourse admin area and set the “long polling base url” to our other domain name.

# /etc/nginx/sites-enabled/10-discourse

# Let’s redirect to SSL, in case somebody tries to access the direct IP with
# host header.
server {
    listen      80;
    server_name discoursepolling.some-origin.com discourse.some-origin.com;
    include     common_params;
    # $host keeps the name that was requested; $server_name would always be
    # the first name listed above.
    return      301 https://$host$request_uri;
}

server {
    listen      443 ssl;
    server_name discoursepolling.some-origin.com discourse.some-origin.com;
    # Rest of NGINX server block
    # Also, I would add a condition so that discoursepolling only serves
    # paths specific to polling.
    # #TODO; find paths specific to polling
}

To see if it works, look in your web browser’s developer tools “Network inspector” for /poll calls on discoursepolling.some-origin.com, and check that they return a 200 OK status code.
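
If you would rather check a list of requests copied out of the network inspector than eyeball it, the same test can be sketched as below. The helper name and the (url, status) input format are assumptions for the example.

```python
from urllib.parse import urlsplit


def poll_calls_ok(entries, polling_host):
    """Check that /poll calls reached polling_host and all returned 200.

    entries is an iterable of (url, status) pairs, for example copied by
    hand from the browser's network inspector.
    """
    polls = []
    for url, status in entries:
        parts = urlsplit(url)
        if parts.hostname == polling_host and parts.path.endswith("/poll"):
            polls.append(status)
    # No poll calls at all on the polling host also counts as a failure.
    return bool(polls) and all(status == 200 for status in polls)
```

A result of `False` means either that no poll request went to the polling host at all, or that at least one of them did not return 200.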

(Brahn) #5

To clarify something here: in a multisite configuration, should all sites use the same long polling url? It looks to me like this line is making that a requirement:


Edit: No wait, that doesn’t work.

base site: example.com
long polling url: origin.example.com

multisite 1: mysite.com

If mysite uses origin.example.com as the long polling address I get:

XMLHttpRequest cannot load https://origin.example.com/message-bus/634dd18187094c6c950c0bf14f74c239/poll. Response to preflight request doesn't pass access control check: The 'Access-Control-Allow-Origin' header has a value 'https://example.com' that is not equal to the supplied origin. Origin 'https://mysite.com' is therefore not allowed access.

If mysite uses its own long polling origin as the domain I get this:

XMLHttpRequest cannot load https://origin.mysite.com/message-bus/b35c9c8e958f44f78d0d4773dc6d75f3/poll. Response to preflight request doesn't pass access control check: The 'Access-Control-Allow-Origin' header has a value 'https://example.com' that is not equal to the supplied origin. Origin 'https://mysite.com' is therefore not allowed access.

Is this because of "Access-Control-Allow-Origin" => Discourse.base_url_no_prefix ?
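
Both errors above come down to the same browser check: the Access-Control-Allow-Origin value has to match the requesting page’s origin exactly. A minimal sketch of that rule (the function name is made up, and real browsers check more than this one header):

```python
def preflight_allows(request_origin, allow_origin_header, with_credentials=False):
    """Simplified version of the browser's Access-Control-Allow-Origin check."""
    if allow_origin_header == "*":
        # The wildcard is not accepted for requests that carry credentials.
        return not with_credentials
    # Otherwise the header must equal the page's origin exactly.
    return allow_origin_header == request_origin
```

With the values from the error messages, `preflight_allows("https://mysite.com", "https://example.com")` is `False` whichever polling host is used, because the header always names the base site.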


I have noticed there is no “cloudfront.template.yml” in discourse_docker/templates/. So I am wondering:
Can CloudFront work using the same techniques?


Also, can we use HTTP/2? Is the long polling stuff still needed when using HTTP/2?

(Felix Freiberger) #8

If you’re using the supported Docker-based install, HTTP/2 should be working automatically! :sunny:

Long polling is still needed for notifications to appear live.

(James Mc Mahon) #9

I think if you have CloudFront set up, it’s only delivering specific objects (images), rather than the site/application in its entirety with JS and so on.

So the only thing you need is to have the correct CloudFront URL for those images.