Google Maps Onebox data is not cached

Test case: https://goo.gl/maps/R6nj3Qwf2LR2

https://goo.gl/maps/R6nj3Qwf2LR2

Rendering this onebox issues 10 network requests in sequence every time.


Update: I mean server-side network requests when calling Onebox.preview(map_url).to_s.

Are you talking about the pngs from Google?

Every special onebox is just an iframe where the remote site is responsible for everything and we can’t control it.

No, I’m talking about the server-side (google_maps_onebox.rb)

2 Likes

I am still struggling with this, how can the onebox gem fix it.

This markup is cached fine:

<div class="regular contents"><div class="cooked"><p>Test case: <a href="https://goo.gl/maps/R6nj3Qwf2LR2" rel="nofollow noopener">https://goo.gl/maps/R6nj3Qwf2LR2</a></p>

<div class="maps-onebox">
<div style="background:transparent;position:relative;width:690px;height:400px;top:400px;margin-top:-400px;"></div>
<iframe src="https://maps.google.com/maps?ll=51.5332441,-0.1281903&amp;z=17&amp;output=embed&amp;dg=ntvb&amp;q=Google+UK" width="690" height="400" frameborder="0" style="border:0">&amp;lt;img src="https://maps.googleapis.com/maps/api/staticmap?maptype=roadmap&amp;amp;amp;center=51.5332441,-0.1281903&amp;amp;amp;zoom=17&amp;amp;amp;size=690x400&amp;amp;amp;sensor=false" width="690" height="400"&amp;gt;</iframe>
</div>

<p>Rendering this onebox issues <strong>10</strong> network requests in sequence every time.</p></div><section class="post-menu-area clearfix"><nav class="post-controls clearfix"><div class="actions"><button class="widget-button toggle-like like no-text" aria-label="like this post" title="like this post"><i class="fa fa-heart" aria-hidden="true"></i></button><button class="widget-button share no-text" aria-label="share a link to this post" title="share a link to this post" data-share-url="/t/bug-google-maps-onebox-data-is-not-cached/59124?u=sam" data-post-number="1"><i class="fa fa-link" aria-hidden="true"></i></button><button class="widget-button show-more-actions no-text" aria-label="show more" title="show more"><i class="fa fa-ellipsis-h" aria-hidden="true"></i></button><button class="widget-button reply create fade-out" aria-label="begin composing a reply to this post" title="begin composing a reply to this post"><i class="fa fa-reply" aria-hidden="true"></i>Reply</button></div></nav></section></div>

Is there an option to amend the markup so somehow the iframe stops it barrage of requests and uses a local cache or something?

This is in no way a bug of any kind.

@codinghorror Every other onebox’s data is cached. For maps, though, Onebox.preview(maps_url).to_s makes network calls every time.

Correct behaviour:

Onebox.preview('https://xkcd.com/123').to_s # network call
Onebox.preview('https://xkcd.com/123').to_s # no network call because it's been cached

The maps onebox:

Onebox.preview('https://goo.gl/maps/R6nj3Qwf2LR2').to_s # 10 network calls
Onebox.preview('https://goo.gl/maps/R6nj3Qwf2LR2').to_s # 10 network calls

The library states it provides caching. The caching is not working for the maps onebox. How is it unreasonable to consider this a bug?

Even if you consider caching simply an optimization (I disagree but can see how you may think so), how is it “in no way a bug of any kind”? Performance issues are a kind of a bug. These 10 sequential network calls take about 5 seconds (I’ve tried on Linode London, Scaleway Amsterdam, and Heroku US).

Perhaps you’ve missed my post clarifying that this is about server-side? That would explain it.

I’m with @codinghorror: this isn’t a bug.

Consider the difference in response headers from the two requests:

$ wget -O /dev/null -S https://xkcd.com/123 |& grep Cache-Control
  Cache-Control: max-age=300
  Cache-Control: max-age=300
$ wget -O /dev/null -S https://goo.gl/maps/R6nj3Qwf2LR2 |& grep Cache-Control
  Cache-Control: no-cache, no-store, max-age=0, must-revalidate
  Cache-Control: private
  Cache-Control: no-cache, must-revalidate

(There are multiple headers because of redirects being followed)

It would be a bug if the onebox library cached data when the origin said not to. The other way around… not so much.

4 Likes

@mpalmer Good point, I haven’t realized the intent is to respect the cache headers.

However, the onebox currently does not respect the headers. Instead, it always caches everything forever.

https://github.com/discourse/onebox/blob/master/lib/onebox/engine.rb#L65

It only does the correct thing for Google Maps currently by accident. If Google Maps changes the cache headers tomorrow, it will no longer do the correct thing.

1 Like

I think you are totally misunderstanding how we use the onebox gem in Discourse.

https://github.com/discourse/discourse/blob/master/lib/oneboxer.rb#L147

In fact if you look at the actual backend code you will notice that we never use the built-in moneta cache thingy in the onebox gem.

In fact, my vote here is that we should just nuke the moneta dependency and dump cache logic built into the onebox gem and have consumers be in charge of it.

As to “allowing origin to participate in caching decisions”, well who knows, maybe an interface can be added that allows providers to pass that information on.

Regardless, Discourse uses a “dumb” cache that caches all oneboxes for 1 day unless you rebake a post or force the cache invalidation.

2 Likes

Yes let’s do that please. Can you add to your list @techapj?

1 Like

@sam I see, so, Discourse caches the view for 1 day regardless of the cache headers.

The onebox gem caches only the data necessary to generate the view, which has a much smaller footprint than caching the entire view, often an order of magnitude smaller.
The consumers of the gem would not be able to do that with the current interface.

The consumers of the gem also would not be able to implement cache header-aware caching because they do not have this information.

As to “allowing origin to participate in caching decisions”, well who knows, maybe an interface can be added that allows providers to pass that information on.

If caching is left to the gem users, the interface should be changed at the same time to enable the use cases above. Otherwise, the users of the gem are left with an unsolvable problem.

Thank you for the explanation! By the way, I do not use Discourse. Only posting here because the gem has Issues disabled on GitHub.

Thing is, there are currently 0 providers that are cache header aware, and even if the actual data required to generate a onebox is usually much smaller, we are still talking pretty tiny payloads of view.

Regardless, I am totally happy to nuke this built in caching, because I find it super complicated to follow in the code and just have preview return all the data required to make the decisions.

Onebox.preview("http://xyz") =>

{
    data: {} # raw data used to render the onebox,
    valid_till: ... # optionally return when it is valid till
    etc... 
}

Then you can cache data in the consumer, and call preview forcing data if you wish to bypass http

Feel free to work on a PR in this direction I think it is a valid change

6 Likes

@sam :thumbsup: Also, there needs to be a way to pass the cached data to onebox, e.g. by adding a data argument to Onebox.preview.

2 Likes

Done via

https://github.com/discourse/onebox/commit/8c61a33a6e4459fc267cecb7a2938fa58544bd55

6 Likes