Bootstrap app container in 2 steps

In this topic @sam said:

Internally when we deploy we use a “pre-bootstrapped” image as our base image and then simply run “assets:precompile” and “rake db:migrate” as our only bit of custom bootstrapping code.

Is there an official template for doing that? It seems that there isn’t.

To achieve it, I split web.template.yml into web.template.validate.yml, web.template.build.yml and web.template.run.yml.

The validate template only validates environment variables and the like (these checks are at the beginning of the original template).

I use the build template only for the bootstrap. It downloads the Discourse repository from git, installs the Ruby dependencies and plugins, and creates some files.

The run template is launched with the rebuild option. It references the image generated by bootstrapping the build template (it can reference a local image as well as a remote one in a registry like Docker Hub). It runs the rake tasks for the db migration and the asset precompilation, and creates the stateful directory (/shared) and its subdirectories.

The way it is now, doing everything in one step, the app.yml line:

 - "templates/web.template.yml"

would now be equivalent to:

 - "templates/web.template.validate.yml"
 - "templates/web.template.build.yml"
 - "templates/web.template.run.yml"

The files I use instead are:

app-build.yml

 - "templates/web.template.build.yml"

(has only the above template)

app-run.yml

 - "templates/web.template.validate.yml"
 - "templates/web.template.run.yml"

(has other templates, like postgres, redis, letsencrypt, and so on)

If you look at the new templates, they are basically the original template divided into parts, with nothing added or removed. The 2 rake tasks now run last, but I don’t think that affects the overall script, because the other tasks basically just create files. The /shared directory is also created at a different point, but I don’t see a problem, because it isn’t used when pulling the repositories and installing the Ruby dependencies.
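One rough way to check that nothing was added or dropped is to compare the lines of the original template with the combined lines of the three split files. The sketch below is only an assumption about how such a check could look: the helper names `normalized_lines` and `templates_match` are hypothetical, and because it ignores ordering, indentation and duplicates it can only detect added or missing lines, not ordering problems.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct non-blank lines of the given files,
# with leading and trailing whitespace removed.
normalized_lines() {
  cat "$@" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' | sort -u
}

# Hypothetical helper: succeed when the original template (first argument)
# and the split templates (remaining arguments) contain the same set of lines.
templates_match() {
  original="$1"
  shift
  [ "$(normalized_lines "$original")" = "$(normalized_lines "$@")" ]
}
```

It would be called with the original file first and the split files after it, for example `templates_match templates/web.template.yml templates/web.template.validate.yml templates/web.template.build.yml templates/web.template.run.yml && echo "same set of lines"`.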

Full files:

(I use variables that come from Ansible in the container files to define the values dynamically based on some environment variables; that’s why there are variables between {{ }}, but I think it’s clear what the files are about. The template files don’t use Ansible variables, they are just raw yaml files.)

containers/app-build.yml

## this is the all-in-one, standalone Discourse Docker build image template
##
## BE *VERY* CAREFUL WHEN EDITING!
## YAML FILES ARE SUPER SUPER SENSITIVE TO MISTAKES IN WHITESPACE OR ALIGNMENT!
## visit http://www.yamllint.com/ to validate this file as needed

templates:
  - "templates/web.template.build.yml"
params:
  db_default_text_search_config: "{{ text_search_config | default('pg_catalog.english', true) }}"

  ## Set db_shared_buffers to a max of 25% of the total memory.
  ## will be set automatically by bootstrap based on detected RAM, or you can override
  db_shared_buffers: "{{ db_shared_buffers | default('128MB', true) }}"

  ## can improve sorting performance, but adds memory usage per-connection
  #db_work_mem: "40MB"

  ## Which Git revision should this container use? (default: tests-passed)
  version: {{ version | default('tests-passed', true) }}

## Plugins go here
## see https://meta.discourse.org/t/19157 for details
hooks:
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - git clone https://github.com/discourse/docker_manager.git

## Any custom commands to run after building
run:
  - exec: echo "Beginning of custom commands"
  ## If you want to set the 'From' email address for your first registration, uncomment and change:
  ## After getting the first signup email, re-comment the line. It only needs to run once.
  #- exec: rails r "SiteSetting.notification_email='info@unconfigured.discourse.org'"
  - exec: echo "End of custom commands"

containers/app-run.yml

## this is the all-in-one, standalone Discourse Docker container template
##
## After making changes to this file, you MUST rebuild
## /var/discourse/launcher rebuild app
##
## BE *VERY* CAREFUL WHEN EDITING!
## YAML FILES ARE SUPER SUPER SENSITIVE TO MISTAKES IN WHITESPACE OR ALIGNMENT!
## visit http://www.yamllint.com/ to validate this file as needed

base_image: "{{ base_image | default('local_discourse/app-build') }}"

templates:
  - "templates/postgres.template.yml"
  - "templates/redis.template.yml"
  - "templates/web.template.validate.yml"
  - "templates/web.template.run.yml"
  - "templates/web.ratelimited.template.yml"
## Uncomment these two lines if you wish to add Lets Encrypt (https)
  {{ use_ssl | ternary('', '#') }}- "templates/web.ssl.template.yml"
  {{ use_ssl | ternary('', '#') }}- "templates/web.letsencrypt.ssl.template.yml"

## which TCP/IP ports should this container expose?
## If you want Discourse to share a port with another webserver like Apache or nginx,
## see https://meta.discourse.org/t/17247 for details
expose:
  - "{{ http_port | default(80, true) }}:80"   # http
  - "{{ https_port | default(443, true) }}:443" # https

env:
  LANG: {{ lang | default('en_US.UTF-8', true) }}
  DISCOURSE_DEFAULT_LOCALE: {{ locale | default('en', true) }}

  ## How many concurrent web requests are supported? Depends on memory and CPU cores.
  ## will be set automatically by bootstrap based on detected CPUs, or you can override
  UNICORN_WORKERS: {{ workers | default(2, true) }}

  ## TODO: The domain name this Discourse instance will respond to
  ## Required. Discourse will not work with a bare IP number.
  DISCOURSE_HOSTNAME: {{ hostname }}

  ## Uncomment if you want the container to be started with the same
  ## hostname (-h option) as specified above (default "$hostname-$config")
  #DOCKER_USE_HOSTNAME: true

  ## TODO: List of comma delimited emails that will be made admin and developer
  ## on initial signup example 'user1@example.com,user2@example.com'
  DISCOURSE_DEVELOPER_EMAILS: '{{ email }}'

  ## TODO: The SMTP mail server used to validate new accounts and send notifications
  # SMTP ADDRESS, username, and password are required
  # WARNING the char '#' in SMTP password can cause problems!
  DISCOURSE_SMTP_ADDRESS: {{ smtp_address }}
  DISCOURSE_SMTP_PORT: {{ smtp_port }}
  DISCOURSE_SMTP_USER_NAME: {{ smtp_user_name }}
  DISCOURSE_SMTP_PASSWORD: "{{ smtp_pass }}"
  DISCOURSE_SMTP_ENABLE_START_TLS: {{ start_tls | default('true', true) }}

  ## If you added the Lets Encrypt template, uncomment below to get a free SSL certificate
  {{ use_ssl | ternary('', '#') }}LETSENCRYPT_ACCOUNT_EMAIL: {{ ssl_email | default('', true) }}

  ## The CDN address for this Discourse instance (configured to pull)
  ## see https://meta.discourse.org/t/14857 for details
  {{ use_cdn | ternary('', '#') }}DISCOURSE_CDN_URL: {{ cdn_url | default('', true) }}

## The Docker container is stateless; all data is stored in /shared
volumes:
  - volume:
      host: {{ host_shared_volume | default('/var/discourse/shared/app', true) }}
      guest: /shared
  - volume:
      host: {{ host_log_volume | default('/var/discourse/shared/app/log/var-log', true) }}
      guest: /var/log

## Any custom commands to run after building
run:
  - exec: echo "Beginning of custom commands"
  ## If you want to set the 'From' email address for your first registration, uncomment and change:
  ## After getting the first signup email, re-comment the line. It only needs to run once.
  #- exec: rails r "SiteSetting.notification_email='info@unconfigured.discourse.org'"
  - exec: echo "End of custom commands"

templates/web.template.validate.yml

run:
  - exec: thpoff echo "thpoff is installed!"
  - exec: /usr/local/bin/ruby -e 'if ENV["DISCOURSE_SMTP_ADDRESS"] == "smtp.example.com"; puts "Aborting! Mail is not configured!"; exit 1; end'
  - exec: /usr/local/bin/ruby -e 'if ENV["DISCOURSE_HOSTNAME"] == "discourse.example.com"; puts "Aborting! Domain is not configured!"; exit 1; end'
  - exec: /usr/local/bin/ruby -e 'if (ENV["DISCOURSE_CDN_URL"] || "")[0..1] == "//"; puts "Aborting! CDN must have a protocol specified. Once fixed you should rebake your posts now to correct all posts."; exit 1; end'
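For anyone who wants to run comparable checks before invoking the launcher, the validations above can be approximated in plain shell. This is a sketch, not part of discourse_docker: the function name `validate_discourse_env` is hypothetical, and it only mirrors the three placeholder checks (it skips the thpoff check).

```shell
#!/bin/sh
# Hypothetical pre-flight check mirroring the ruby one-liners in
# web.template.validate.yml. Returns non-zero on the first failed check.
validate_discourse_env() {
  if [ "${DISCOURSE_SMTP_ADDRESS:-}" = "smtp.example.com" ]; then
    echo "Aborting! Mail is not configured!"
    return 1
  fi
  if [ "${DISCOURSE_HOSTNAME:-}" = "discourse.example.com" ]; then
    echo "Aborting! Domain is not configured!"
    return 1
  fi
  # A CDN URL that starts with "//" has no protocol.
  case "${DISCOURSE_CDN_URL:-}" in
    //*)
      echo "Aborting! CDN must have a protocol specified."
      return 1
      ;;
  esac
  return 0
}
```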

templates/web.template.build.yml

env:
  # You can have redis on a different box
  RAILS_ENV: 'production'
  UNICORN_WORKERS: 3
  UNICORN_SIDEKIQS: 1
  # this gives us very good cache coverage, 96 -> 99
  # in practice it is 1-2% perf improvement
  RUBY_GLOBAL_METHOD_CACHE_SIZE: 131072
  # stop heap doubling in size so aggressively, this conserves memory
  RUBY_GC_HEAP_GROWTH_MAX_SLOTS: 40000
  RUBY_GC_HEAP_INIT_SLOTS: 400000
  RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR: 1.5
  
params:
  # SSH key is required for remote access into the container
  version: tests-passed

  home: /var/www/discourse
  upload_size: 10m

run:
  - exec: chown -R discourse /home/discourse
  # TODO: move to base image (anacron can not be fired up using rc.d)
  - exec: rm -f /etc/cron.d/anacron
  - file:
     path: /etc/cron.d/anacron
     contents: |
        SHELL=/bin/sh
        PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

        30 7    * * *   root	/usr/sbin/anacron -s >/dev/null
  - file:
     path: /etc/runit/1.d/copy-env
     chmod: "+x"
     contents: |
        #!/bin/bash
        env > ~/boot_env
        conf=/var/www/discourse/config/discourse.conf

        # find DISCOURSE_ env vars, strip the leader, lowercase the key
        /usr/local/bin/ruby -e 'ENV.each{|k,v| puts "#{$1.downcase} = '\''#{v}'\''" if k =~ /^DISCOURSE_(.*)/}' > $conf

  - file:
     path: /etc/service/unicorn/run
     chmod: "+x"
     contents: |
        #!/bin/bash
        exec 2>&1
        # redis
        # postgres
        cd $home
        chown -R discourse:www-data /shared/log/rails
        LD_PRELOAD=$RUBY_ALLOCATOR HOME=/home/discourse USER=discourse exec thpoff chpst -u discourse:www-data -U discourse:www-data bundle exec config/unicorn_launcher -E production -c config/unicorn.conf.rb

  - file:
     path: /etc/service/nginx/run
     chmod: "+x"
     contents: |
        #!/bin/sh
        exec 2>&1
        exec /usr/sbin/nginx

  - file:
     path: /etc/runit/3.d/01-nginx
     chmod: "+x"
     contents: |
       #!/bin/bash
       sv stop nginx

  - file:
     path: /etc/runit/3.d/02-unicorn
     chmod: "+x"
     contents: |
       #!/bin/bash
       sv stop unicorn

  - exec:
      cd: $home
      hook: code
      cmd:
        - git reset --hard
        - git clean -f
        - git remote set-branches --add origin master
        - git pull
        - git fetch origin $version
        - git checkout $version
        - mkdir -p tmp/pids
        - mkdir -p tmp/sockets
        - touch tmp/.gitkeep

  - exec:
      cmd:
        - "cp $home/config/nginx.sample.conf /etc/nginx/conf.d/discourse.conf"
        - "rm /etc/nginx/sites-enabled/default"
        - "mkdir -p /var/nginx/cache"

  - replace:
      filename: /etc/nginx/nginx.conf
      from: pid /run/nginx.pid;
      to: daemon off;

  - replace:
      filename: "/etc/nginx/conf.d/discourse.conf"
      from: /upstream[^\}]+\}/m
      to: "upstream discourse {
        server 127.0.0.1:3000;
      }"

  - replace:
      filename: "/etc/nginx/conf.d/discourse.conf"
      from: /server_name.+$/
      to: server_name _ ;

  - replace:
      filename: "/etc/nginx/conf.d/discourse.conf"
      from: /client_max_body_size.+$/
      to: client_max_body_size $upload_size ;

  - exec:
      cmd: echo "done configuring web"
      hook: web_config

  - exec:
      cd: $home
      hook: web
      cmd:
        # ensure we are on latest bundler
        - gem update bundler
        - find $home ! -user discourse -exec chown discourse {} \+

  - exec:
      cd: $home
      hook: bundle_exec
      cmd:
        - su discourse -c 'bundle install --deployment --retry 3 --jobs 4 --verbose --without test development'
        
  - file:
     path: /usr/local/bin/discourse
     chmod: +x
     contents: |
       #!/bin/bash
       (cd /var/www/discourse && RAILS_ENV=production sudo -H -E -u discourse bundle exec script/discourse "$@")

  - file:
     path: /usr/local/bin/rails
     chmod: +x
     contents: |
       #!/bin/bash
       # If they requested a console, load pry instead
       if [ "$*" == "c" -o "$*" == "console" ]
       then
        (cd /var/www/discourse && RAILS_ENV=production sudo -H -E -u discourse bundle exec pry -r ./config/environment)
       else
        (cd /var/www/discourse && RAILS_ENV=production sudo -H -E -u discourse bundle exec script/rails "$@")
       fi

  - file:
     path: /usr/local/bin/rake
     chmod: +x
     contents: |
       #!/bin/bash
       (cd /var/www/discourse && RAILS_ENV=production sudo -H -E -u discourse bundle exec bin/rake "$@")

  - file:
     path: /usr/local/bin/rbtrace
     chmod: +x
     contents: |
       #!/bin/bash
       (cd /var/www/discourse && RAILS_ENV=production sudo -H -E -u discourse bundle exec rbtrace "$@")

  - file:
     path: /usr/local/bin/stackprof
     chmod: +x
     contents: |
       #!/bin/bash
       (cd /var/www/discourse && RAILS_ENV=production sudo -H -E -u discourse bundle exec stackprof "$@")

  - file:
     path: /etc/update-motd.d/10-web
     chmod: +x
     contents: |
       #!/bin/bash
       echo
       echo Use: rails, rake or discourse to execute commands in production
       echo

  - file:
     path: /etc/logrotate.d/rails
     contents: |
        /shared/log/rails/*.log
        {
                rotate 7
                dateext
                daily
                missingok
                delaycompress
                compress
                postrotate
                sv 1 unicorn
                endscript
        }

  - file:
     path: /etc/logrotate.d/nginx
     contents: |
        /var/log/nginx/*.log {
          daily
          missingok
          rotate 7
          compress
          delaycompress
          create 0644 www-data www-data
          sharedscripts
          postrotate
            sv 1 nginx
          endscript
        }

  # move state out of the container; this fancy setup is done to support rapid rebuilds of containers,
  # we store anacron and logrotate state outside the container to ensure it's maintained across builds
  # later move this snippet into an initialization script
  # we also ensure all the symlinks we need to /shared are in place in the correct structure
  # this allows us to bootstrap on one machine and then run on another
  - file:
      path: /etc/runit/1.d/00-ensure-links
      chmod: +x
      contents: |
        #!/bin/bash
        if [[ ! -L /var/lib/logrotate ]]; then
          rm -fr /var/lib/logrotate
          mkdir -p /shared/state/logrotate
          ln -s /shared/state/logrotate /var/lib/logrotate
        fi
        if [[ ! -L /var/spool/anacron ]]; then
          rm -fr /var/spool/anacron
          mkdir -p /shared/state/anacron-spool
          ln -s /shared/state/anacron-spool /var/spool/anacron
        fi
        if [[ ! -d /shared/log/rails ]]; then
          mkdir -p /shared/log/rails
          chown -R discourse:www-data /shared/log/rails
        fi
        if [[ ! -d /shared/uploads ]]; then
          mkdir -p /shared/uploads
          chown -R discourse:www-data /shared/uploads
        fi
        if [[ ! -d /shared/backups ]]; then
          mkdir -p /shared/backups
          chown -R discourse:www-data /shared/backups
        fi

        rm -rf /shared/tmp/{backups,restores}
        mkdir -p /shared/tmp/{backups,restores}
        chown -R discourse:www-data /shared/tmp/{backups,restores}

  # change login directory to Discourse home
  - file:
     path: /root/.bash_profile
     chmod: 644
     contents: |
        cd $home

templates/web.template.run.yml

env:
  # You can have redis on a different box
  RAILS_ENV: 'production'
  UNICORN_WORKERS: 3
  UNICORN_SIDEKIQS: 1
  # this gives us very good cache coverage, 96 -> 99
  # in practice it is 1-2% perf improvement
  RUBY_GLOBAL_METHOD_CACHE_SIZE: 131072
  # stop heap doubling in size so aggressively, this conserves memory
  RUBY_GC_HEAP_GROWTH_MAX_SLOTS: 40000
  RUBY_GC_HEAP_INIT_SLOTS: 400000
  RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR: 1.5

  DISCOURSE_DB_SOCKET: /var/run/postgresql
  DISCOURSE_DB_HOST:
  DISCOURSE_DB_PORT:

params:
  home: /var/www/discourse

run:
  - exec:
      cd: $home
      cmd:
        - mkdir -p                    /shared/log/rails
        - bash -c "touch -a           /shared/log/rails/{production,production_errors,unicorn.stdout,unicorn.stderr,sidekiq}.log"
        - bash -c "ln    -s           /shared/log/rails/{production,production_errors,unicorn.stdout,unicorn.stderr,sidekiq}.log $home/log"
        - bash -c "mkdir -p           /shared/{uploads,backups}"
        - bash -c "ln    -s           /shared/{uploads,backups} $home/public"
        - bash -c "mkdir -p           /shared/tmp/{backups,restores}"
        - bash -c "ln    -s           /shared/tmp/{backups,restores} $home/tmp"
        - chown -R discourse:www-data /shared/log/rails /shared/uploads /shared/backups /shared/tmp

  - exec:
      cd: $home
      hook: db_migrate
      cmd:
        - su discourse -c 'bundle exec rake db:migrate'
        
  - exec:
      cd: $home
      hook: assets_precompile
      cmd:
        - su discourse -c 'bundle exec rake assets:precompile'

After setting up the files containers/app-build.yml and containers/app-run.yml I can start / rebuild discourse with just:

cd /var/discourse
./launcher bootstrap app-build
./launcher rebuild app-run

To add plugins I can include them in the app-build.yml file and run the above commands.

Another approach is to bootstrap the app-build image (this can even be done locally), push it to a registry, and then reference that image in app-run.yml. Rebuilding would then be just:

cd /var/discourse
./launcher rebuild app-run

(assuming the base image is in the correct version)
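The push itself is not shown above. A minimal sketch follows, assuming `./launcher bootstrap app-build` has produced the local image local_discourse/app-build (the default base_image in app-run.yml). The registry path and tag are hypothetical examples, not values from my setup. `docker tag` and `docker push` are standard Docker commands; the script only prints them unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Assumptions: the registry path and tag below are placeholders.
LOCAL_IMAGE="${LOCAL_IMAGE:-local_discourse/app-build}"
REMOTE_IMAGE="${REMOTE_IMAGE:-registry.example.com/myorg/discourse-base:example-tag}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Tag the bootstrapped image with the registry name, then push it.
push_base_image() {
  run docker tag "$LOCAL_IMAGE" "$REMOTE_IMAGE" &&
  run docker push "$REMOTE_IMAGE"
}

push_base_image
```

The base_image value in app-run.yml would then point at the pushed name instead of the local one.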


The advantage of running in 2 steps instead of 1 (it still ends up as a single container, so it is transparent for the end user) is less downtime: the steps that download the git repository and install the Ruby dependencies (the build step) can run before the server is stopped. On a $5 droplet I tested, this reduced the downtime from 6 minutes to 4. The asset precompilation is the main culprit of the remaining 4 minutes because, unfortunately, it depends on the database and must run after the db migration.
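The ordering behind this saving can be sketched as a small wrapper: the bootstrap of app-build runs while the old container is still serving requests, and the rebuild of app-run (which stops the site) starts only if the bootstrap succeeded. The wrapper function and its DRY_RUN switch are illustrative assumptions; the two launcher commands are the ones used in this post.

```shell
#!/bin/sh
DISCOURSE_DIR="${DISCOURSE_DIR:-/var/discourse}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

two_step_rebuild() {
  # Only change directory when the commands are really executed.
  if [ "$DRY_RUN" != "1" ]; then
    cd "$DISCOURSE_DIR" || return 1
  fi
  # Step 1: build the base image; the running site is not touched.
  run ./launcher bootstrap app-build || return 1
  # Step 2: rebuild the app container; downtime starts here.
  run ./launcher rebuild app-run
}

two_step_rebuild
```

If the bootstrap fails, the function returns before the rebuild, so the running site is never stopped for a build that could not complete.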

The advantage of bootstrapping remotely instead of locally (both give the reduced downtime) is that you can use the same image for different environments, like staging and production, as well as for different clients (if they use the same plugins), because the environment variables are used only in the 2nd step.

The downside (of remote bootstrap vs. local) is that you mustn’t upgrade Discourse through the UI, and when adding plugins you must first push the new image with the plugin to the registry. I don’t actually see this as a downside, though, because I would first test both the new Discourse version and new plugins in a staging environment to avoid surprises. Using the same remote image for staging and production avoids cases like testing in staging and then hitting errors in production because the Discourse repository or some plugin had a new commit in the meantime (the image would have the exact same files, which avoids a lot of edge cases between staging and production due to version or commit mismatch).


What I would like:

Although I don’t see the templates changing frequently in the discourse_docker repo, I would like to ask the Discourse team to create templates (they could even use the ones I posted above, or I could open a PR) to provide official support for deploying the app container in 2 steps (I could adapt it for a multi-container setup, but my request is only for the 1-container setup).

The reasons are:

  1. To have less downtime when rebuilding discourse.
  2. To use the same base image in staging and production to avoid problems caused by different commits in the Discourse repository or in the plugins directory when updating Discourse.

I don’t know how the Discourse team will see this request, but I hope to have conveyed my thoughts, as well as the advantages of the change (it doesn’t need to change the default 1-step way; it only needs this 2-step way to be supported). I also haven’t added or removed tasks in the web template, just divided it into parts so that it wouldn’t be a big change from the way it is currently done.


This sounds like a very valid feature request. Has anything happened in the meantime? Is there any activity on GitHub about it, for example?

Please do keep in mind that internally here we have an extensive multisite setup with multiple containers to minimize downtime. The easiest way for self-hosters to minimize downtime is to run a two-container install (you can bootstrap the new, updated container before stopping the old one and starting the new one). This does increase the technical complexity of the site quite a bit, so it’s not for everyone.
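For comparison, the two-container update mentioned here is usually described as the sequence below. It is a sketch under assumptions: the web container is named web_only, the data container is left untouched, and the commands are only printed unless DRY_RUN=0 is set. `bootstrap`, `destroy` and `start` are existing launcher subcommands; the wrapper around them is hypothetical.

```shell
#!/bin/sh
DISCOURSE_DIR="${DISCOURSE_DIR:-/var/discourse}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

update_web_only() {
  # Only change directory when the commands are really executed.
  if [ "$DRY_RUN" != "1" ]; then
    cd "$DISCOURSE_DIR" || return 1
  fi
  # Build the new image while the old web container keeps serving requests.
  run ./launcher bootstrap web_only || return 1
  # Swap containers: downtime is limited to these two commands.
  run ./launcher destroy web_only || return 1
  run ./launcher start web_only
}

update_web_only
```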


The discourse team didn’t contact me (perhaps this feature would increase the complexity of the project, but I think very little), so for now there is nothing (but the above should work anyway, although it wouldn’t be in the official discourse_docker repository).

@justin Actually my main reason isn’t even to minimize downtime, but to keep a base image with all the Ruby gems, plugins and the Discourse repository already installed. That image could be used in different environments, like staging and production, and even for different projects if they use the same plugins, running only the db migration and asset precompilation tasks (which must run on the target machine), and it would avoid problems when upgrading Discourse.

The reduced downtime would be more of a side benefit, and a welcome one. That said, in the end it’s up to the Discourse team whether this feature should be included or not, but I would be very pleased.

As someone who spends an insane amount of time supporting novice sysadmins, I think it increases the complexity very much. It takes an enormous amount of work to support just the one dead-simple, foolproof configuration. Even managing a two-container install and knowing to replace app with web_only (not to mention knowing how and when to update the data container) increases the complexity at least fourfold.

This, I think, is the big win.


@pfaffman If you take a look at the changes I made, I haven’t changed the instructions of the web template in discourse_docker, just separated them into 3 files, and that is actually the only thing I requested: one of them runs instructions that are environment-agnostic (installing the Ruby gems, the Discourse and plugin repositories, and so on), another does some validations, and the 3rd does environment-specific work, like the db migration and the asset precompilation.

The complexity I think it adds is that when something is added to those files, the Discourse team would have to consider whether it is environment-specific (for example, depending on data in the db) and, based on that, decide which file it belongs in.

The official install would work the same way it works now, except that it would include the 3 templates in the app.yml file (or make the templates/web.template.yml file include those 3, if it can be done, and in this case, from a user POV, nothing would be changed), instead of doing it in more than 1 step (it wouldn’t have the benefits of the 2 step install, but would work exactly how it’s done currently, so no breaking changes).

I know that sysadmin stuff can be really complex, but I tried to do it in the most straightforward way, based on how it’s currently done, to avoid adding much complexity (that’s why I haven’t mentioned k8s, separating the services in a docker-compose file, or similar approaches, because they would require big changes; instead I based it on how the official install works).
