Does the Onebox engine cache content locally?

faq-material

(Robert McIntosh) #1

Question: when Onebox pulls in content from a site, including title, image, description and (in some cases) price, is that information taken as a snapshot at the time it is posted, or is that information updated each time the thread is loaded?

I am interested in whether certain data fields that can change over time, such as price of a product on Amazon, would be fixed in a conversation, or whether that element would be updated as the original page is updated.

This is particularly relevant if, for example, a product is removed from that site, and the page no longer exists. Would the onebox in the discussion continue to show the original data, or will it return an error?

I know that images can be fetched, but is this the case for the other fields as well?


(David Taylor) #2

Most onebox implementations take a “snapshot” when the post is created. That snapshot is only updated if you “rebuild HTML” on the post, or when the periodic automatic ‘rebake’ runs (see the rebake old posts count site setting). This is the case for the Amazon onebox.

Some oneboxes make use of an iframe, so they should always show the up-to-date information from the site (e.g. YouTube).
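To make the snapshot behaviour concrete, here is a minimal Python sketch. It is not Discourse's actual code: `Post`, `fetch_preview`, `bake`, and the in-memory `site` dictionary are all hypothetical names used only for illustration. The point it shows is that the rendered preview is stored with the post and only changes when the post is baked again.

```python
from dataclasses import dataclass


@dataclass
class Post:
    raw: str          # what the author typed (here, just the link)
    cooked: str = ""  # rendered HTML stored alongside the post


def fetch_preview(url, site):
    """Hypothetical fetcher: reads the current title and price from a fake 'site' dict."""
    page = site[url]
    return f"<aside class='onebox'>{page['title']} - {page['price']}</aside>"


def bake(post, site):
    """Render the preview once and store the result on the post (the 'snapshot')."""
    post.cooked = fetch_preview(post.raw, site)


# Illustrative data only; the URL and prices are made up.
site = {"https://example.com/item": {"title": "Widget", "price": "$10"}}
post = Post(raw="https://example.com/item")
bake(post, site)

# The remote page changes, but the stored snapshot does not...
site["https://example.com/item"]["price"] = "$12"
stale = post.cooked

# ...until the post is baked again (a manual "rebuild HTML" or a rebake).
bake(post, site)
fresh = post.cooked
```

An iframe-based onebox sits outside this model: the stored HTML only points at the remote player, so the browser loads the live content on every view.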


(Robert McIntosh) #3

Thanks @David_Taylor

So, in the case of the standard implementation, a user posts a link (to Amazon) and a snapshot of that product is captured and placed on the site. This data remains until the post's HTML is rebuilt or a rebake happens.

Looking at the site settings, however, it seems that the default site setting is to rebake the latest ‘x’ posts every 15 minutes, so does that mean that Amazon is polled every 15 minutes for a change to the data?

rebake old posts count [250]:
Number of old posts to be rebaked every 15 minutes.

Presumably, if this is an iframe (such as YouTube), this is bespoke and shows live data each time.

I am asking as I am developing a new Discourse site and I am considering how to implement our own site as a onebox (probably as a plugin). However, when posting a link to an Amazon product from a previous conversation on this site, I got the following:

I cannot get this to update (and as you see it was 2 hours ago).

If you click on that title, you are taken to the correct page:


(David Taylor) #4

Yep, Discourse also caches any onebox requests for 24 hours, to prevent repeated requests to third-party sites like Amazon.
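As a rough illustration of why a rebake every 15 minutes does not mean Amazon is polled every 15 minutes, here is a small Python sketch of a time-limited cache. It is an assumption-laden model, not how Discourse implements it: `PreviewCache`, `fake_fetch`, and the injectable `clock` are hypothetical names, and only the 24-hour figure comes from the behaviour described above.

```python
import time

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24 hours, as described above


class PreviewCache:
    """Caches a preview per URL and only refetches once the entry has expired."""

    def __init__(self, fetch, ttl=CACHE_TTL_SECONDS, clock=time.time):
        self._fetch = fetch    # function that actually contacts the remote site
        self._ttl = ttl
        self._clock = clock    # injectable so the example can simulate time passing
        self._entries = {}     # url -> (stored_at, preview)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]    # still fresh: no request to the remote site
        preview = self._fetch(url)
        self._entries[url] = (now, preview)
        return preview


# Simulated usage: count how often the "remote site" is contacted.
calls = []
now = [0.0]


def fake_fetch(url):
    calls.append(url)
    return f"preview-{len(calls)}"


cache = PreviewCache(fake_fetch, clock=lambda: now[0])
first = cache.get("https://example.com/item")         # initial fetch
now[0] = 15 * 60                                      # a rebake 15 minutes later
after_rebake = cache.get("https://example.com/item")  # served from the cache
now[0] = 24 * 60 * 60                                 # 24 hours later
after_expiry = cache.get("https://example.com/item")  # entry expired, fetched again
```

Under this model, a rebake inside the 24-hour window re-renders the post from the cached preview, which matches a onebox that still looks stale two hours after posting.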

If you’re in a development environment and want to purge the cache, there are some suggestions in this thread. Do not purge the Redis cache on a production site; it will mess with lots of things:


(Sam Stickland) #5

Hi,

Would you accept a PR that creates an additional setting for whitebox hosts that shouldn’t be cached?

The 24-hour cache gives us some grief, but waiting for the 15-minute rebake would be fine.