Docker image update: Redis 6 and 25% smaller image size

Falco · November 26, 2020, 10:02pm

We just shipped a brand new container image that will be used on your next ./launcher rebuild app. As always, there is no need to change any configuration you have provided you followed our Discourse official Standard Installation. That said, there are new features that will help some installations out there.

Redis 6

We make heavy use of Redis in many places in Discourse, be it for cache, Sidekiq, MessageBus, Distributed Locks, Rate Limits All in all it’s been a rock solid choice for us.

However, at some very specific workloads, Redis could be a bottleneck. And because of Redis single threaded nature, coupled with our inability to use multiple instances due to our LUA scripts, it meant this was a hard bottleneck to workaround.

Thankfully, Redis 6 comes with support to using a thread pool for I/O operations, and during our tests it works very well with Discourse clusters bottlenecked by Redis.

So, if you are running on a machine with lots of CPU cores, and metrics showing Redis struggling to handle the load, you can now opt-in into using threads for writing operations via the app.yml params section:

params:
  redis_io_threads: "4" # 1 disables it, n>1 uses n-1 extra threads for IO writes

Smaller image

We opted to ship a large container image early on in the project, so we can make it easier for non-technical people to run Discourse, and handle all necessary dependencies, versioning, upgrades, etc.

That said, we recently went over 1GB for the compressed image, and that was a bit too much.

So in order to mitigate the ever increasing size of the image, we moved the Discourse source code from inside the image from a complete copy of the source code to a “shallow clone” containing only the most recent version of the code.

This changes makes the compressed image size 25% smaller, which results in less server space needed and faster rebuilds when a new image is released. It also should control the image growth over time.

We tested it on tests-passed/beta/stable, with both rebuilds and web updates and it doesn’t break any standard paths. However, users doing more exotic git stuff on the app.yml hooks may have to adapt their customizations.

merefield · November 27, 2020, 12:09pm

What happens if anything to the browser experience after such a redis upgrade? Any impact on cached assets? Does this get emptied as a result of the upgrade?

Falco · November 27, 2020, 2:21pm

Nothing.

Assets are saved into the local disk or object storage, and cached in the CDN. Redis doesn’t impact it.

The Redis data is kept during the upgrade.

codinghorror · November 28, 2020, 6:27am

What is the default value? 1?

Falco · November 28, 2020, 12:45pm

Yes. It comes from Redis own config file, where 1 means a single thread like the old version.

pfaffman · December 1, 2020, 8:30pm

I’ve got an instance where I have added:

after_redis:
  - replace:
      filename: "/etc/redis/redis.conf"
      from: /^databases.*/
      to: "databases 50"

And it’s failing to rebuild because:

25:M 01 Dec 2020 20:21:08.830 # FATAL: Data file was created with a Redis server configured to handle
 more than 16 databases. Exiting

Is there some other hook that I can snag to update the databases count before it tries to migrate or whatever it does?

Hmm. And now after the apparently failed rebuild, I see this in docker logs:

chgrp: invalid group: ‘syslog’

sam · December 1, 2020, 8:34pm

Why do you need more than 1 database?

pfaffman · December 1, 2020, 8:53pm

Multisite. Multiple instances using a single redis. I probably should have used a more generic redis container, but thought I’d stick with yours.

Falco · December 1, 2020, 8:56pm

Don’t multisite uses a single database and the standard Redis keyname spaces? AFAIK Redis databases are a thin layer and we have commands that go across their boundaries, so you should not rely on that.

sam · December 1, 2020, 8:59pm

Yeah multisite uses 1 database

pfaffman · December 1, 2020, 9:00pm

Oh. Then not multisite, just multiple instances running on one machine that each need a separate redis. I only upped the default 16 to 50 because I was too lazy to keep tight control over which redis databases were in use.

So I should run a separate redis container for each instance, I guess?

Falco · December 1, 2020, 9:02pm

Yes, otherwise you may run into stuff cross-talking.

pfaffman · December 1, 2020, 9:04pm

OH. Darn.

Thankfully, I learned this on a server that’s in use only for testing.

For the other sites, should I just give them a fresh redis and throw away what was scheduled? Do a backup/restore?

FWIW, I’ve not noticed any issues with crosstalk in the past year or so. And there is a way to set the DB. .

EDIT: Well, the good news is that I can enter the container, edit redis.conf and restart it and it starts working again.

If you’ve got a hint on how to move a site from DISCOURSE_REDIS_DB: 12 on one redis container to another redis container, I’d love to hear. Or maybe just don’t care about scheduled jobs?

Falco · December 1, 2020, 9:43pm

This. Discourse should reasonably survive a Redis flush. Some stuff is lost, but nothing critical.

pfaffman · December 1, 2020, 9:48pm

That’s what I’d though, as I don’t know any way that backups attempt to restore it (but there’s a lot I don’t know). It looks like I could do it like this: Copy all keys from one db to another in redis - Stack Overflow, and it appears to have worked in a test I just did, but it’ll be much easier to adjust my playbook to just create a new redis container and use it.

Thanks.

Now to figure out whether to run those redises on the database server or the web server. . .

pfaffman · December 3, 2020, 6:30pm

Then what is the db_id: 2 in the multisite config refer to?

Multisite configuration with Docker

  before_bundle_exec:
    - file:
        path: $home/config/multisite.yml
        contents: |
         secondsite:
           adapter: postgresql
           database: b_discourse
           pool: 25
           timeout: 5000
           db_id: 2
           host_names:
             - b.discourse.example.com

Falco · December 3, 2020, 6:34pm

A deprecated setting:

pfaffman · December 3, 2020, 6:35pm

LOL. Yes, It is indeed confusing!

Thanks. I’m working on a ‘multisite config with lets encrypt and no external reverse proxy’ topic, and when I do that, I’ll clean up the other one as well.

RGJ · December 3, 2020, 7:19pm

Just make sure to restart unicorn right after that, so it will recreate the scheduled tasks.
You will lose anything that is queued so you need to find a good moment to do this.

Ed_S · July 27, 2021, 7:30pm

Is this still working as it should be? Is there an easy one-liner to discover how large the compressed image is?

Topic		Replies	Views
How can I minimize the discourse Docker image size? Installation	7	137	August 14, 2025
Redis connection timed out Installation	30	9498	June 8, 2024
Cannot upgrade due to old version of docker Installation	5	1552	July 6, 2018
Stuck in a loop of freeing up space and filling it up again when rebuilding Installation server-resources	33	3770	June 8, 2024
Any advice on how to save space? Installation server-resources	25	1754	February 25, 2023

Docker image update: Redis 6 and 25% smaller image size

Redis 6

Smaller image

Related topics