Why did Discourse start running their own Redis instead of using Elasticache?

Continuing the discussion from Meta is moving to the Cloud :cloud_with_lightning::

This sounds like a really interesting story. I’d love to hear it.

Have you told it already anywhere?

5 Likes

We had two main pain points.

1. Instance types

Elasticache didn’t allow you to run HA Redis without Cluster Mode on instance types below M4.large.

Support for cluster mode in Ruby Redis libraries is still being ironed out. MessageBus also performs atomic operations on keys using Lua scripts, and because those keys could end up on different servers, that is not allowed in cluster mode. We do host some BIG instances, so we may revisit distributed writes for databases (Rails 6 is adding that for PostgreSQL), but we don't need this yet.
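To illustrate why multi-key Lua scripts are a problem in cluster mode, here is a minimal sketch of how Redis Cluster maps a key to one of its 16384 hash slots (CRC16 of the key, modulo 16384, with `{...}` hash tags honored). The key names are hypothetical and are not MessageBus's real keys; the point is only that two unrelated keys are not guaranteed to share a slot, while keys with the same hash tag always do.

```python
import binascii

SLOTS = 16384  # number of hash slots in Redis Cluster


def hash_slot(key: str) -> int:
    """Return the Redis Cluster hash slot for a key (CRC16/XMODEM mod 16384)."""
    data = key.encode()
    # Hash tags: if the key contains a non-empty {...} section,
    # only the text between the first '{' and the next '}' is hashed.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is the XMODEM CRC16 variant.
    return binascii.crc_hqx(data, 0) % SLOTS


def same_slot(*keys: str) -> bool:
    """True if every key maps to the same hash slot."""
    return len({hash_slot(k) for k in keys}) == 1


# Hypothetical key names, for illustration only.
backlog = "backlog:/latest"
counter = "backlog_id:/latest"
print(hash_slot(backlog), hash_slot(counter), same_slot(backlog, counter))

# With a shared hash tag, both keys are guaranteed to land in one slot.
print(same_slot("{/latest}:backlog", "{/latest}:backlog_id"))  # True
```

A script that touches several keys only runs in cluster mode when all of them hash to the same slot, so a library written for a single server would need its key naming reworked around hash tags first.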

We do host sites that need instance types far larger than that, but we also host many for which a t3.small is more than enough.

Moving to running our own Redis allows us to pick and play with any instance types available in the target region.

2. Read only mode

Discourse can keep a connection open to read-only nodes for both PostgreSQL and Redis.

That allows people to keep reading on the site when masters go down.

Elasticache did not make it straightforward to get endpoints for the replicas in the main cluster that keep pointing at the current replicas after a failover event.
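The read-only idea can be sketched roughly as follows. This is not Discourse's actual implementation; `FakeNode` and `FailoverClient` are hypothetical in-memory stand-ins, used only to show a client that keeps a replica connection around and serves reads from it once the primary stops answering.

```python
class ConnectionDown(Exception):
    """Raised by a node that cannot be reached."""


class FakeNode:
    """In-memory stand-in for a Redis server (hypothetical, for illustration)."""

    def __init__(self, data=None, writable=True):
        self.data = dict(data or {})
        self.writable = writable
        self.up = True

    def get(self, key):
        if not self.up:
            raise ConnectionDown()
        return self.data.get(key)

    def set(self, key, value):
        if not self.up:
            raise ConnectionDown()
        if not self.writable:
            raise PermissionError("replica is read-only")
        self.data[key] = value


class FailoverClient:
    """Reads from the primary, falling back to the replica in read-only mode."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica
        self.readonly = False

    def get(self, key):
        if not self.readonly:
            try:
                return self.primary.get(key)
            except ConnectionDown:
                # Primary is gone: switch to read-only and use the replica.
                self.readonly = True
        return self.replica.get(key)

    def set(self, key, value):
        if self.readonly:
            raise RuntimeError("site is in read-only mode")
        try:
            self.primary.set(key, value)
        except ConnectionDown:
            self.readonly = True
            raise RuntimeError("site is in read-only mode")


primary = FakeNode({"topic:1": "Hello"})
replica = FakeNode({"topic:1": "Hello"}, writable=False)
client = FailoverClient(primary, replica)

primary.up = False               # simulate the master going down
print(client.get("topic:1"))     # Hello (served by the replica)
print(client.readonly)           # True
```

For this to work against a managed service, the replica address handed to the client has to keep pointing at a live replica after a failover, which is the endpoint problem described above.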

16 Likes

Also, not to forget: this gives us significant cost savings.

When you use ElastiCache you pay for the instances plus a tax for the ElastiCache service. Additionally, we get much better control over utilisation, since we can run multiple Redis processes on a single instance if we wish.

4 Likes

Falco, based on the points you raised, it seems that the free Elasticache does not support Discourse's Redis requirements. So if I want to decouple a Discourse site, I need to find a Redis server elsewhere. Is my understanding correct?

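That's not right. Elasticache works fine with Discourse. Also, since 2019 AWS has made great strides in fixing both of the pain points described above. On AWS you can now choose smaller instance types, and replica endpoints are provided as well.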

2 Likes

Thanks, Falco. I'll give it a try!