AWS installation stuck in Read only mode

amoncadot · June 19, 2018, 3:56pm

Hello,

We’ve had our internal Discourse site for around 3 months now and suddenly it has gone into read only mode.

Based on other topics (Stuck in 'Read Only' Mode) I have already tried the following:

Login to docker instance: ./launcher enter app
Login to rails: rails c
Disable read only: Discourse.disable_readonly_mode(Discourse::USER_READONLY_MODE_KEY)
Quit rails: quit
Exit container: exit

After doing this, our application seems to come out of read only mode, then goes back to read only mode.

I tried to rebuild a container but now I am getting the following error:

"Caused by:
PG::ReadOnlySqlTransaction: ERROR:  cannot execute ALTER TABLE in a read-only transaction
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/rack-mini-profiler-1.0.0/lib/patches/db/pg.rb:92:in `async_exec'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/rack-mini-profiler-1.0.0/lib/patches/db/pg.rb:92:in `async_exec'"

I guess because it’s stuck in read only mode for some reason?

Current setup is:

4 containers (2 cannot rebuild based on above, 2 are still running from 2 weeks ago)
AWS
Elasticache Redis
RDS PostgreSQL

Regards,
amoncadot

amoncadot · June 19, 2018, 4:19pm

Hello,

I just followed the steps in "Inheriting discourse install - need some assistance" and still had no luck.

I have two containers still running while the other two are offline.

My worry is that if I stop those two containers then I won’t be able to log back into Discourse.

I’d like to fix this while those two containers are still running.

Kind regards,
amoncadot

fefrei · June 19, 2018, 5:55pm

Can you check whether you’re out of disk space?

amoncadot · June 19, 2018, 6:03pm

Hi,

The web servers are not out of disk space and RDS is fine also.

Regards,
amoncadot

amoncadot · June 19, 2018, 6:12pm

It says that Redis is in a read only state:

I, [2018-06-19T18:01:55.777804 #13]  INFO -- : > cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate'

No connection to db, unable to retrieve site settings! (normal when running db:create)
WARN: Redis is in a readonly state.' Performed a noop
Failed to report error: Connection lost (ECONNRESET) 2 Dropping undeliverable message: ERR Error running script (call to f_b06356ba4628144e123b652c99605b873107c9be): @user_script:14: @user_script: 14: -READONLY You can't write against a read only slave.

I have rebooted my Elasticache and failed over… yet Redis remains in a read only state… any ideas?

schleifer · June 19, 2018, 6:23pm

Is your ElastiCache setup Multi-AZ? That message suggests you are connecting to a secondary node in a cluster. Double check that the hostname you are using is the Primary Endpoint of the cluster.
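
One way to confirm which role a node currently holds is to read the `role:` field in the output of the Redis `INFO replication` command. The Python sketch below only parses that text. It assumes you have already captured the output yourself (for example with `redis-cli -h <your-endpoint> info replication`), and the function name `redis_role` is made up for illustration.

```python
def redis_role(info_text):
    """Return the value of the `role:` field from INFO replication output, or None."""
    for line in info_text.splitlines():
        line = line.strip()
        if line.startswith("role:"):
            return line.split(":", 1)[1].strip()
    return None


# Sample text shaped like INFO replication output (values are illustrative).
sample = "# Replication\r\nrole:slave\r\nmaster_link_status:up\r\n"

role = redis_role(sample)
if role != "master":
    print("This endpoint is a replica; writes will fail with READONLY.")
```

If the reported role is not `master`, the hostname in use is probably a node or reader endpoint rather than the Primary Endpoint.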

amoncadot · June 19, 2018, 6:35pm

Yes ElastiCache is Multi-AZ. Setup is:

I’ve just tried to use an entirely different Redis cache and again the build failed.

My app.yml specifies the primary endpoint of the cluster.

So far I have no containers running, just postgreSQL and redis cache.

I am going to try and use a snapshot of RDS this morning with a new Redis cache but if that fails I am not sure what else I can do since I have no containers running to access the UI.

Why did Discourse suddenly go into read only mode without manual intervention?

schleifer · June 19, 2018, 6:43pm

The site can switch to read-only mode when databases return errors like “Redis is in a readonly state”.

There are multiple types of READONLY depending on the trigger. Discourse.disable_readonly_mode(Discourse::USER_READONLY_MODE_KEY) will only turn off one; you have to pass the other keys to turn off the other types.

amoncadot · June 19, 2018, 6:58pm

Ah okay.

When I had containers running, I disabled all three modes/keys you listed. The site came out of read-only mode briefly and then went straight back, which is why I have now moved on to trying a different cache.

I had seen that the keys solution had worked for other people but for some reason it did not work for our application.

codinghorror · June 19, 2018, 11:14pm

Note that this is wildly on the enterprise side of complex setups, so there’s a limited amount we can help here.

amoncadot · June 20, 2018, 7:05am

Hi Jeff,

Thanks for the compliment. It’s nice to hear that the co-founder of Discourse/Stackoverflow considers our environment as enterprise even before it has been released.

Problem solved. The issue was that Amazon Aurora was being used as RDS and by default this creates a cluster with two database instances inside - one primary and one replica.

Sometime yesterday an automatic failover occurred. In our app.yml, I had set the DISCOURSE_DB_HOST: parameter to the endpoint of an individual DATABASE instance, not the CLUSTER endpoint. The failover turned that instance into a read-only replica, which locked Discourse into read-only mode.

If anyone is running a similar setup:

EC2 instances with docker containers
Redis ElastiCache
Amazon Aurora RDS (PostgreSQL underneath)

Check that /var/discourse/containers/app.yml contains:

DISCOURSE_DB_HOST: RDS Cluster Endpoint (Go to RDS > Clusters > Access your cluster > Under Cluster endpoint is the endpoint you need to specify)
DISCOURSE_DB_PORT: RDS Cluster Port
DISCOURSE_REDIS_HOST: ElastiCache Primary Endpoint (Go to ElastiCache > Redis > Toggle the “play” shape button beside your Redis Cluster Name > Under Primary Endpoint is the endpoint you need to specify)
DISCOURSE_REDIS_PORT: Redis Cluster Port
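
As a rough sanity check on the DISCOURSE_DB_HOST value, Aurora hostnames can usually be told apart by their second label: cluster (writer) endpoints start with `cluster-`, reader endpoints with `cluster-ro-`, and instance endpoints have neither prefix. The Python sketch below relies on that naming pattern. The function name `aurora_endpoint_kind` and the example hostname are hypothetical, and it only recognises hostnames ending in `.rds.amazonaws.com`.

```python
def aurora_endpoint_kind(host):
    """Classify an Aurora hostname by the naming pattern of its second label."""
    host = host.lower()
    labels = host.split(".")
    if len(labels) < 2 or not host.endswith(".rds.amazonaws.com"):
        return "unknown"
    second = labels[1]
    # Check the more specific prefixes before the plain "cluster-" prefix.
    if second.startswith("cluster-ro-"):
        return "reader"
    if second.startswith("cluster-custom-"):
        return "custom"
    if second.startswith("cluster-"):
        return "cluster"
    return "instance"


# Hypothetical value copied from DISCOURSE_DB_HOST in app.yml.
db_host = "mydb.cluster-abc123example.us-east-1.rds.amazonaws.com"

kind = aurora_endpoint_kind(db_host)
if kind != "cluster":
    print(f"DISCOURSE_DB_HOST looks like a {kind} endpoint, not the cluster endpoint.")
```

A hostname classified as `instance` or `reader` can become (or already is) read-only after a failover, which is the situation described above.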

Hope this helps someone!

@codinghorror Is there a way to run ./launcher rebuild app without pulling down the latest codebase from Discourse?

itsbhanusharma · June 20, 2018, 7:24am

You can pin the build number in your yml

amoncadot · June 20, 2018, 8:28am

Hi Bhanu,

Do you have a demonstration on how to do this?

itsbhanusharma · June 20, 2018, 8:32am

Yes,

Enable and use the version directive in your yml! It’s disabled by default, which leaves the build pinned to tests-passed.

amoncadot · June 20, 2018, 8:37am

Hi Bhanu,

That “tests-passed” refers to a specific branch on Git, correct? I can see other branches, such as stable, where commits are still being pushed quite frequently.

Is there any way we can remain on a specific version of Discourse even when we run a container rebuild? We want to be able to control our main Discourse codebase and update/test it within a release environment before deploying to prod.

itsbhanusharma · June 20, 2018, 8:48am

tests-passed will be the branch that is the most recent known working instance, not necessarily stable.

I would recommend you pin your dead containers to the version your active containers are on (you can probably see the version in your docker-manager) and then bring those back up first.

This is exactly why you’d pin your yml to only fetch a specific version, not tests-passed or stable.

Cameron_D · June 20, 2018, 8:57am

For the record, the version value is passed directly to a git checkout, so it can be any commit hash, branch or tag.

https://github.com/discourse/discourse_docker/blob/master/templates/web.template.yml#L115-L116

Edit: No onebox

techAPJ · June 20, 2018, 11:50am

Should be fixed now.

amoncadot · July 5, 2018, 8:35am

Thanks.

So to build a specific version of Discourse, all I’ll need to do is specify a commit hash as the version value?

Example of web.template.yml:

params:
  # Building from branch "stable" with latest commit
  version: 849b4b56853756a24f0646c04e733e5af7cc2a2b

This will then be picked up by:

- git fetch origin $version
- git checkout $version

Is this correct?

Cameron_D · July 6, 2018, 4:47am

That should work, yes.
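
For anyone who wants to script the pin, here is a minimal Python sketch that sets the `version:` line in the text of a container definition, uncommenting it if necessary. It assumes the file has a single `version:` or `#version:` line under `params:`, as in the example above. The function name `pin_version` and the sample text are illustrative only, and the result should be reviewed before it is written back to disk.

```python
import re


def pin_version(yml_text, git_ref):
    """Set the `version:` param to git_ref, uncommenting the line if needed."""
    # Match an optionally commented-out `version:` line and keep its indentation.
    pattern = re.compile(r"^([ \t]*)#?[ \t]*version:.*$", re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}version: {git_ref}", yml_text, count=1
    )
    if count == 0:
        raise ValueError("no version line found")
    return new_text


# Illustrative fragment; the real file contains many more settings.
sample = "params:\n  #version: tests-passed\n"

print(pin_version(sample, "849b4b56853756a24f0646c04e733e5af7cc2a2b"))
```

Because the value is passed to git checkout, `git_ref` can be a commit hash, a branch, or a tag.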
