Embedding Discourse on blog: link to forum does not appear

embedding

(oblio) #1

I’m trying to embed Discourse using the standalone setup (standard installation). I’ve configured Discourse to allow dota.oblio360.com as an embeddable host, and I’m using this code snippet for the page itself:

<div id="discourse-comments"></div>

<script type="text/javascript">
  var discourseUrl = "https://forum.oblio360.com/",
      discourseEmbedUrl = "https://dota.oblio360.com{{ page.url }}";

  (function() {
    var d = document.createElement('script'); d.type = 'text/javascript'; d.async = true;
    d.src = discourseUrl + 'javascripts/embed.js';
    (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(d);
  })();
</script>

However, I’m seeing this in the logs:

Job exception: Wrapped RuntimeError: redirection forbidden: https://dota.oblio360.com/2015/08/10/zero-hero-storm-spirit -> http://dota.oblio360.com/2015/08/10/zero-hero-storm-spirit/

Is there a reason Discourse is trying to redirect to http?

The end result can be seen here: https://dota.oblio360.com/2015/08/10/zero-hero-storm-spirit/
-> “Loading discussion” but no link or anything.

The forum itself is here: https://forum.oblio360.com/

Also another question: what’s up with the loading time tags (10.5 ms & co)? Can I disable them?


(Robin Ward) #2

The site links you posted are 404ing. Did you figure it out?


(oblio) #3

Ah, I changed the time stamp of the post. Same error.

Try this: https://dota.oblio360.com/2015/07/15/zero-hero-storm-spirit/

Full stacktrace: Started GET "/embed/comments?embed_url=https%3A%2F%2Fdota.oblio360.com%2F2015%2F - Pastebin.com


(Robin Ward) #4

It looks like the error is that something is trying to redirect the HTTPS request to HTTP:

redirection forbidden: https://dota.oblio360.com/2015/07/15/zero-hero-storm-spirit -> http://dota.oblio360.com/2015/07/15/zero-hero-storm-spirit/

Do you have any idea what would be causing the request to redirect like that?


(oblio) #5

Well, I’ll try to explain the entire setup.

The blog is served by GitHub Pages. GitHub Pages doesn’t offer HTTPS, so I’m using Cloudflare as DNS/CDN/HTTPS proxy.
So Cloudflare is redirecting HTTPS to HTTP, but it is transparent. All the web requests only hit Cloudflare.

Discourse is hosted on a vanilla Ubuntu 14.04 instance on which I’ve followed the instructions for a standalone install of Discourse + SSL.

So there aren’t any extra layers between Discourse and the blogs/Cloudflare. At no point, from Discourse’s point of view, is dota.oblio360.com available through HTTP.

I’ve followed the steps from the page about embedding Discourse and that’s why I’m puzzled: there should be no reason to redirect from HTTPS to HTTP.

I’ve tried wget https://dota.oblio360.com and it shows no redirect…

I have no other proxy/server on the Discourse VM, except for Discourse itself.


(oblio) #6

In the end I just switched the blogs to HTTP as they’re supposed to be read-only, anyway. So embedding works now.

Maybe someone could clarify something about this entire process:

The browser connects to the CDN (Cloudflare in my case), via HTTPS. The certificate thing is a bit tricky since it is based on Cloudflare’s Universal SSL: Introducing Universal SSL

The forum is also on HTTPS, with its own category 1 certificate from StartSSL.

Why would Discourse ever think about connecting to the HTTP blog? It doesn’t even know about it. The request is made through AJAX from the browser, and the browser only knows about the HTTPS front end offered by Cloudflare. Discourse only knows about the HTTPS front end since that’s what the JavaScript embedding snippet tells it.

I’m still baffled by this :frowning:

Anyway, I worked around the problem. So things should be alright now.

Thanks for the cool software, I love the billion little details especially tied to conversations and fighting spam!


#7

I am running the same setup (including the Dota content, ha!), and I figured out what is causing this.

There are two reasons why this happens:

1. Discourse tries to fetch the url without the trailing slash

My embed url is set to: https://www.sentryguides.com/heroes/weaver/

In the error logs, I found this under env:

embed_url[https://www.sentryguides.com/heroes/weaver, https://www.sentryguides.com/heroes/weaver/]

Notice the missing trailing / in the first URL. It tries to fetch the first, but because redirects from https -> http are not allowed (see reason 2 below), it throws an exception and never fetches the second. I think this is a bug (@eviltrout?).

2. GitHub Pages redirects to regular HTTP

Since there is a trailing slash missing and we try to fetch a folder, Pages does a 301 redirect to http://www.sentryguides.com/heroes/weaver/. Cloudflare would instantly redirect to HTTPS again, but since Discourse already errored out at this point, we never get to that.

GitHub doesn’t really seem to care though; they replied to someone’s email about this (GitHub pages redirect from https to http · Issue #289 · isaacs/github · GitHub):

Sorry for the trouble. We don’t yet officially support HTTPS on
GitHub Pages. While it works in certain situations, we don’t recommend
relying on it, or promoting your pages using HTTPS URLs.

Even though GitHub doesn’t handle this correctly, I still think it is wrong of Discourse to even use this URL in the first place.


(Kane York) #8

Actually, what that means is that both urls were tried at least once, and resulted in the same error.

That sounds like a normal visitor would get a redirect loop, but I don’t see that. I can load the HTTPS version just fine.


#9

I didn’t read enough of the code to understand what is going on (and I don’t really know Ruby anyway), but at least there’s no log entry for the second URL.

No, what happens is this:

GET https /foo
301 http /foo/ <- GitHub
GET http /foo/ <- redirection forbidden
301 https /foo/ <- Cloudflare
GET https /foo/
200
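
The sequence above can be reproduced with a small simulation. This is only a sketch: the `routes` table, the `example.test` host, and the `follow` helper are hypothetical stand-ins for GitHub Pages, Cloudflare, and a fetcher that refuses https -> http redirects. It is not Discourse's actual code.

```javascript
// Hypothetical routing table mirroring the trace above; not real server config.
const routes = {
  "https://example.test/foo":  { status: 301, location: "http://example.test/foo/" },  // GitHub Pages
  "http://example.test/foo/":  { status: 301, location: "https://example.test/foo/" }, // Cloudflare
  "https://example.test/foo/": { status: 200 },
};

// Follow redirects; optionally refuse any hop that downgrades https -> http.
function follow(url, { forbidDowngrade }, maxHops = 5) {
  const hops = [url];
  for (let i = 0; i < maxHops; i++) {
    const res = routes[url];
    if (!res) return { ok: false, error: "not found", hops };
    if (res.status === 200) return { ok: true, hops };
    const next = res.location;
    if (forbidDowngrade && url.startsWith("https:") && next.startsWith("http:")) {
      return { ok: false, error: `redirection forbidden: ${url} -> ${next}`, hops };
    }
    hops.push(next);
    url = next;
  }
  return { ok: false, error: "too many redirects", hops };
}

// A strict fetcher stops at the first hop; a browser-like one reaches the 200.
const strict = follow("https://example.test/foo", { forbidDowngrade: true });
const browserLike = follow("https://example.test/foo", { forbidDowngrade: false });
console.log(strict.error);
console.log(browserLike.hops.join(" -> "));
```

In this model a normal visitor ends up on the HTTPS page after two redirects, while the strict fetcher fails on the first one. Requesting the URL with its trailing slash directly avoids both redirects.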


(Kane York) #10

That is the log entry for both urls. There’s an incident count, right? And several other env vars should be arrays too, such as the client IPs.


#11

Ok, found the right code: discourse/topic_embed.rb at 8a5f8d62b2e07575db3581a80212e024610cfbba · discourse/discourse · GitHub

What I don’t know is why it does what it does. In find_remote, it should fetch original_uri IMO; /foo and /foo/ are technically different resources (not that they should be, but that’s how HTTP works).


#12

A simple workaround is to just add a second trailing slash to the url.
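
Why a second slash would help can be sketched as follows, assuming the URL is normalized by stripping a single trailing slash before it is fetched. That assumption is inferred from the logged embed_url values earlier in this thread, not confirmed from the Discourse source, and `stripOneTrailingSlash` is a hypothetical name.

```javascript
// Hypothetical normalization: drop exactly one trailing slash, if present.
function stripOneTrailingSlash(url) {
  return url.replace(/\/$/, "");
}

// One trailing slash: the folder URL loses its slash, which triggers the redirect.
console.log(stripOneTrailingSlash("https://www.sentryguides.com/heroes/weaver/"));
// → https://www.sentryguides.com/heroes/weaver

// Two trailing slashes: one survives, so the fetched URL still ends in "/".
console.log(stripOneTrailingSlash("https://www.sentryguides.com/heroes/weaver//"));
// → https://www.sentryguides.com/heroes/weaver/
```

Under that assumption, the doubled slash means the URL that actually gets fetched still points at the folder, so GitHub Pages has no reason to issue the redirect to HTTP.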


(oblio) #13

Cool! I’ll switch everything back to HTTPS as soon as I can. Thanks for the help!


#14

@eviltrout Sorry for pinging you again, but I still think this is a bug and you wrote the code behind it. :wink:


(Robin Ward) #15

Sure, I think I realized the mistake I made here:

https://github.com/discourse/discourse/commit/e2b59195799ecefdd0410431e92dee84ceafbcfe


#16

Awesome, thank you very much!