How can I maximize Discourse's CDN hit/cache rate?

Hi everyone,

My current CDN setup is as follows:

  • One pull zone for JS
  • One pull zone for the logo and homepage graphics (assets that are accessed frequently)
  • One pull zone for S3 assets (hosted on Digital Ocean Spaces)

I'm using BunnyCDN. My forum's traffic is roughly 85% North America, 5% UK/France, and 10% Southeast Asia.

For the first two pull zones, I use all 34 of BunnyCDN's global edge servers, since those files are requested often enough to sustain high cache hit rates (92% and 99.8% respectively) while keeping latency low. Ideally I'd like to push the JS hit rate even higher, but the challenge is that low-traffic data centers see the JS so rarely (only a few times a month) that their hit rates are poor.

For the third pull zone (the S3 assets), I only use 10 edge servers in North America and Europe. Because user-generated S3 assets are sometimes accessed infrequently, I don't want too many CDN nodes dragging the hit rate down and hurting access speed. Ideally I'd add an edge server in Southeast Asia, but BunnyCDN doesn't let me hand-pick specific data centers. My cache hit rate on BunnyCDN is currently around 78%, which I think is pretty good, since things like someone viewing a full-size original image instead of the optimized one, or a search engine referring visitors to an old, rarely visited topic, can pull that percentage down significantly. I previously used CloudFront, where my hit rate was around 55%, though that may be because CloudFront's edge servers are spread more widely, or because my forum's traffic is small relative to its footprint. (I moved away from CloudFront because of cost; we're a hobbyist forum with very little revenue.)
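To see why a few rarely re-requested asset classes can drag the overall number down so much, here is a quick blended-hit-rate calculation. The request shares and per-class hit rates below are made-up, illustrative numbers, not measurements from this forum:

```python
# Illustrative (made-up) numbers: a small set of hot assets with high
# per-class hit rates plus a long tail of rarely requested ones can
# easily blend down to a figure near the ~78% mentioned above.
asset_classes = [
    # (share of requests, hit rate for that class)
    (0.70, 0.95),   # optimized images on active topics
    (0.20, 0.60),   # older topics reached via search engines
    (0.10, 0.10),   # full-size originals, almost never re-requested
]

# Overall hit rate is the request-weighted average of the class rates.
blended = sum(share * rate for share, rate in asset_classes)
print(f"blended hit rate: {blended:.1%}")  # 79.5%
```

The takeaway: the 10% of requests for cold originals costs almost 9 points of hit rate on its own, which matches the intuition that full-size images and old topics are what pull the percentage down.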

I'd love to hear from you all: do you have any strategies or methods for keeping cache hit rates high? Roughly what hit rates are you seeing?

Any suggestions for optimizing my setup to raise that percentage? Is there a budget-friendly CDN that lets me hand-pick edge server locations? Could this be done with edge rules? If so, I'd choose five regions (US West Coast, US East Coast, US South, UK, Singapore), since that's where my traffic is concentrated, and keep the hit rate high.

One idea I had: serve the optimized assets from S3 and the originals from Digital Ocean Spaces, but the software can't do that split out of the box.

Any other suggestions?

I see, so your idea is to reduce the number of geographical points to increase hit rate? Because if you have “one” location aka everything on one server, that’s a perfect hit rate. :wink:

I would imagine this comes down to knowing your specific audience, and where they are – so metrics would need to be gathered first about which geographical CDN points are being hit, then consolidate to the most used?
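As a sketch of that metrics-gathering step: if you can export request logs from your CDN, a per-PoP hit-rate breakdown is a few lines of Python. The `pop` and `cache_status` field names here are hypothetical, so adjust them to whatever your CDN's log export actually uses:

```python
from collections import defaultdict

def hit_rate_by_pop(rows):
    """Aggregate log rows into a per-PoP cache hit rate.

    Each row is assumed to be a dict with a "pop" (edge location) and a
    "cache_status" ("HIT"/"MISS") field -- hypothetical names, adapt to
    your CDN's log format.
    """
    counts = defaultdict(lambda: [0, 0])  # pop -> [hits, total requests]
    for row in rows:
        counts[row["pop"]][1] += 1
        if row["cache_status"] == "HIT":
            counts[row["pop"]][0] += 1
    return {pop: hits / total for pop, (hits, total) in counts.items()}

# Tiny fabricated sample, just to show the shape of the output:
sample = [
    {"pop": "SG", "cache_status": "MISS"},
    {"pop": "SG", "cache_status": "HIT"},
    {"pop": "NY", "cache_status": "HIT"},
    {"pop": "NY", "cache_status": "HIT"},
]
print(hit_rate_by_pop(sample))  # {'SG': 0.5, 'NY': 1.0}
```

Once you have this table, "consolidate to the most used" becomes a concrete decision: drop the PoPs whose hit rate and request volume are both low.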

If you want a higher hit rate then your choices are 1) fewer PoPs or 2) longer retention. The first choice is going to make the experience worse for some clients, the second is going to cost more money (if it’s even available).

This is exactly what you need to find out. What are your misses and where are they coming from?

The 5 PoPs closest to the majority of our traffic average a cache hit rate in the low 80s%, whereas for the other ones with more sporadic traffic, the lower the traffic, the lower the hit rate, sometimes below 50%. That's why I think consolidating the PoPs could bring the hit rate up, so the CDN doesn't always have to go back to the origin to fetch, which, speed-wise, is worse than just serving from the origin. It's a tradeoff between the additional latency of PoPs located farther away and the reduced latency from a higher cache hit rate at the PoP.
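That tradeoff can be put into a simple expected-latency model: on a hit you pay only the PoP round trip, on a miss the PoP still has to fetch from the origin before answering. The latency numbers below are hypothetical, chosen only to show the shape of the tradeoff:

```python
def expected_latency(pop_rtt, origin_rtt, hit_rate):
    """Expected response time in ms: hits cost the PoP round trip,
    misses cost the PoP round trip plus an origin fetch."""
    return hit_rate * pop_rtt + (1 - hit_rate) * (pop_rtt + origin_rtt)

# Hypothetical numbers: a nearby but sparsely used PoP vs. a farther,
# busier PoP with a much better hit rate, vs. skipping the CDN entirely.
near_sparse = expected_latency(pop_rtt=20, origin_rtt=180, hit_rate=0.45)
far_busy    = expected_latency(pop_rtt=60, origin_rtt=180, hit_rate=0.85)
direct      = 180  # going straight to the origin

# Note the miss path at the near PoP (20 + 180 = 200 ms) is slower than
# going to the origin directly (180 ms), which is the point made above.
print(f"near, sparse PoP: {near_sparse:.0f} ms")
print(f"far, busy PoP:    {far_busy:.0f} ms")
print(f"direct to origin: {direct:.0f} ms")
```

With these (made-up) inputs the far, busy PoP wins on average, which is exactly the argument for consolidating: a well-placed PoP with a high hit rate can beat a closer one that keeps missing.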

Longer retention is tougher to solve. That’s the lever that can bring up the hit rate for the high traffic PoPs, and I don’t necessarily have a solution for that, yet.

I'm curious about others' experience: does a high-70s% to low-80s% cache hit rate for user-uploaded assets sound low, or about right?

It really depends how often the Discourse instance is updated/deployed though. For us, we deploy a lot so that colors the data significantly.