Yes
Whatever ENV you specify when you finally do your docker run on the image … will take effect.
And doing those things at every boot of the container is reasonable?
Some people like this pattern; I do not. That is why Rails introduced some super fancy locking around migrations, to ensure they never run concurrently when a cluster of 20 app servers tries to run migrations at the same time.
Some people just boot the same image and run the migrations in some out-of-band process.
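A hedged sketch of that out-of-band pattern, assuming the standard install's image name (local_discourse/app), an env file, and the usual in-container paths; adjust for your setup:

```shell
# Run the migrations once, from a throwaway container using the same image,
# instead of at every container boot. local_discourse/app and the env file
# are assumptions based on the standard install.
docker run --rm --env-file ./discourse-env local_discourse/app \
  bash -c "cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate'"
```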
But each container should still rake assets:precompile at least once?
Depends on how you host those.
If you have an NFS share of some sort (like AWS EFS), or if you upload the assets to an object storage service, that only needs to be done once, at bootstrap.
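As an illustration of the object-storage variant (the bucket name and cache-control value are assumptions, not from this thread):

```shell
# Precompile once at bootstrap, then push the compiled assets to object
# storage so each web container doesn't need to redo or store the work.
cd /var/www/discourse
RAILS_ENV=production bundle exec rake assets:precompile
aws s3 sync public/assets "s3://my-discourse-assets/assets" \
  --cache-control max-age=31536000
```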
Thanks everyone for all the helpful replies. I have a few more questions about how to do this properly.
I'm trying to prepare a minimal app.yaml for bootstrapping the build, containing only the information needed to build the container. Some of it is clearly container-runtime related, such as the volume mounts and port mappings. But I'm less sure about the environment variables. I suppose I'll just try it and see, but are those environment variables used during the container build (injected somehow into the Dockerfile), or only at container runtime? If the latter, I'll make sure to move them into the appropriate Kubernetes configuration files.
Second, some people here have mentioned pushing the image to a private container registry. Is that required? Put differently, does the built image contain any secrets that shouldn't be published to a public registry like Docker Hub? (We don't have a private registry yet, and I'd like to avoid setting one up.)
Finally, does app.yaml have a setting that controls the name of the created container? This is more of a finishing touch, but it would be nice to have.
Thanks in advance for your help! (And sorry for bumping an old thread. It's the first result on Google when searching for how to install Discourse on Kubernetes.)
@Geoffrey_Challen You can create an image with the Discourse repository and plugins, installing the Ruby gems and other dependencies, and push it to a registry (like DockerHub). That image should be environment-agnostic, and it can be public (unless you include private plugins or something like that). This base image can be used for both staging and production, and even for different projects (if they use the same plugins).
However, steps like precompiling the assets, running the database migrations, and generating the SSL certificates should be executed on the target machine to produce the final image.
I don't really know how to integrate this into a Kubernetes cluster. I took the conservative approach, following the Discourse team's official guide, and split it into those two steps.
This is the part I'm not sure I understand correctly. Aren't these operations performed automatically, on demand, inside the container? I'd like to be able to just push to our cloud without ever needing shell access on that machine, just as I rarely (if ever) need to get inside the Discourse Docker container.
You don't need to enter the container explicitly. What I meant is that you can't generate a precompiled image (for example in a CI pipeline) and use it directly, because that part has to be executed on the target machine, where the database lives (this can be automated, but I haven't done it in Kubernetes yet, although I have done it with Ansible).
Ah, OK. I'm using the all-in-one template that includes the database in the container. That suits our use case, supporting classes of anywhere from 10 to 1000 students, and at least for my classes the all-in-one template has worked well in this configuration. So the database lives inside the container.
But in any case, doesn't Discourse run the database migrations and other setup steps when the container starts?
Are you sure the database is inside the container? Or do you mean the RDBMS (PostgreSQL, in this case)? The officially supported install keeps the database outside the container (that's the expected behavior), mapping a volume from inside the container to the outside (that is, the host). Besides, after a rebuild the container is recreated, so all of its data would be lost.
If the database really is inside the container, I'm not sure how you upgrade following the official install, because the launcher script seems to create and destroy containers several times during a rebuild (and runs them with --rm), which means that once the container stops you would lose all your data, including the database.
I haven't tried changing how rebuilds work, but assuming you could modify the configuration so that everything runs inside the container without recreating it, you should be able to push the container to a registry (make sure it's private, since it contains secrets). That said, for several reasons (some already mentioned above), I wouldn't recommend it.
The standard install includes nginx, rails, postgres, and redis inside the container. It uses external volumes to store the postgres and redis data. Those volumes are not destroyed on rebuild or upgrade.
Yes, it just seemed odd that he said the database was inside the container, unless he changed how the standard install works, or he meant PostgreSQL rather than the database itself.
No, the migration and asset compilation steps happen during ./launcher bootstrap, once the plugins have been resolved. After that, the container can be restarted as many times as needed, the web processes can be split across multiple machines, and so on.
Ideally, the setup would look like this: run ./launcher bootstrap, rename the resulting image (local_discourse is not a good name here) and push it to a private registry using a timestamp-based tag rather than latest, and then roll the deployment over to the new tag.
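A sketch of that tag-and-roll idea; the registry, image, and deployment names are placeholders, and the docker/kubectl commands are shown as comments rather than executed:

```shell
# Build a timestamp-based tag instead of "latest".
TAG="$(date -u +%Y%m%d-%H%M%S)"
echo "discourse:${TAG}"
# Then, roughly (placeholder names throughout):
# docker tag local_discourse registry.example.com/discourse:"$TAG"
# docker push registry.example.com/discourse:"$TAG"
# kubectl set image deployment/discourse \
#   discourse=registry.example.com/discourse:"$TAG"
```

Using a timestamp tag (rather than latest) means every rollout is pinned to a specific image, so a rollback is just rolling the deployment back to the previous tag.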
Postgres runs inside the container. With the standard install template it persists its data outside the container, but it runs inside. Same for Redis. I think the confusion is that when I said the database runs inside the container, I meant the database server, even though the database files live outside the container. (But the database files themselves don't "run", so I thought my phrasing was clear. Apparently not clear enough.)
Side note: the data isn't even necessarily persisted outside the container, unless you configure Docker to bind-mount that directory. I skipped that step during initial setup, which was probably a bad idea, since at that point the database contents don't survive container restarts.
This is making more sense to me now, particularly after reading through the long linked discussions about docker-compose, the launcher script, and so on.
Here's what I'd like to be able to do:
1. Run ./launcher bootstrap to create a fat Discourse image containing all of its dependencies (postgres, redis, etc.).
2. Rerun ./launcher bootstrap to update the image and redeploy, without destroying the data (obviously).
My understanding is that this fat Discourse image shouldn't depend on any external services. However, for the data to survive container upgrades, the PostgreSQL database files have to be stored outside the container. That's fine: I can create a Kubernetes persistent volume for them.
Now, I foresee only one problem. The vast majority of what happens during ./launcher bootstrap only touches files inside the container. Precompiling assets, for example. That's fine, because the results are stored in the container and don't need to survive upgrades.
The big exception here is the database migrations. That step needs access to the database that will be used once bootstrapping completes. So to me this seems like the biggest obstacle to easily deploying a fat Discourse image to the cloud.
I've noticed @sam mention several times that they redeploy Discourse for their customers using roughly the workflow I described above. But I suspect this works because their Discourse images are configured to use a database server (and probably Redis) running on their cluster. That makes sense for supporting multiple deployments, but it's not what I want here. It would also mean that the bootstrap process can modify the production database, or perhaps the database migration step is simply skipped, since database upgrades and migrations are handled externally. @sam: can you confirm?
Anyway, for me this means I need to find a way to run the database migrations when the container starts, rather than during ./launcher bootstrap. I suppose one approach at that point might be: run ./launcher bootstrap locally to build the fat Discourse container, mounting a volume pointing at an empty local database, since that database won't be used later. That would get everything inside the container ready, except for the unfinished PostgreSQL work.
You might be interested in a multisite configuration.
You're facing two main problems here: Discourse isn't ready for Kubernetes, so custom code will be required. And you're stepping into the Discourse team's main source of revenue (hosting lots of forums), so the level of support you'll get drops off.
My recommendation: use a multisite configuration, statically scheduled on a VM entirely outside your cluster. (Or use a Service with Type=ExternalName pointing at the VM, to keep a consistent Ingress configuration.)
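For the ExternalName suggestion, a minimal sketch (the service name, host name, and namespace are placeholders):

```shell
# Create an ExternalName Service so in-cluster Ingress rules can point at a
# Discourse instance running on a VM outside the cluster.
kubectl create service externalname discourse \
  --external-name discourse-vm.example.com -n ikp
```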
OK… I managed to figure out one way of doing this. I'm not 100% happy with it, but it does work, and it may be appealing to others who are trying a simple single-container fat (includes postgres, redis, etc.) Discourse image deployment to Kubernetes.
After examining the bootstrap process it became clear to me that unfortunately it mixes in two different kinds of operations—ones that only affect the underlying container, and others that poke out into the surrounding environment, mainly through the /shared volume mount where the postgres data files live. Rather than trying to tease these steps apart, it seems more sane to just run the bootstrapping steps in the environment where the container is actually going to be deployed.
Unfortunately, launcher bootstrap wants to create a container and use Docker. So running launcher inside another container (for example, in a container running on our cloud) means either tangling with a Docker-in-Docker setup (doable, but not considered best practices) or exposing the underlying Docker daemon. I’m not even sure that that second approach would work, since I think that it would interpret a volume mount against the node’s local filesystem, whereas in our scenario we want to volume mount /shared to a persistent Kubernetes volume. Maybe the Docker-in-Docker route would work, but then you’d also have a weird triple volume mount from inside the nested container into the outer container and from there to the persistent Kubernetes volume. That sounds… unwise.
However, launcher bootstrap essentially creates one large .yml file by processing the templates value in app.yml, and then passes that to the Discourse base image when it finishes the bootstrap process. So if we can extract that merged configuration file, we can generate the configuration on any machine, and then we only need to figure out how to pass it to a container we start in the cloud.
So as an overview, here are the steps we are going to follow:
1. Add a dump command to launcher that writes the merged bootstrap configuration to STDOUT, and use it to generate a bootstrap.yml file.
2. Build a small variant of the Discourse base image that bootstraps itself at startup (by running pups on that configuration) and then starts Discourse.
3. Mount bootstrap.yml into the container using a Kubernetes ConfigMap.
Here's the required change to launcher to support a dump command that writes the merged configuration to STDOUT:
run_dump() {
set_template_info
echo "$input"
}
(Note that this command is available in our fork of discourse_docker.)
So the first step is to use the new launcher dump command added above to create our bootstrap configuration:
# Substitute whatever your container configuration is called for app
./launcher dump app > bootstrap.yml
Next we need a container that knows to run pups to bootstrap the container before booting via /sbin/boot. I used the following Dockerfile to make a tiny change to the base discourse image:
FROM discourse/base:2.0.20191219-2109
COPY scripts/bootstrap.sh /
CMD bash bootstrap.sh
Where scripts/bootstrap.sh contains:
cd /pups/ && /pups/bin/pups --stdin < /bootstrap/bootstrap.yml && /sbin/boot
I published this as geoffreychallen/discourse_base:2.0.20191219-2109. (Note that you could probably also accomplish the same thing by modifying the boot command of the base Discourse docker image, but I was having a hard time getting that to work with the shell redirection required to get pups to read the configuration file.)
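One possible refinement of the bootstrap.sh above (my sketch, not part of the original setup): skip the pups run on plain restarts by keeping a checksum of the merged configuration on the persistent volume. Caveat: a new base image with an unchanged bootstrap.yml would also be skipped, so this trades some correctness for restart speed.

```shell
#!/bin/bash
# Only re-run the pups bootstrap when /bootstrap/bootstrap.yml has changed,
# using a checksum stored on the persistent /shared volume.
set -e
SUM="$(md5sum /bootstrap/bootstrap.yml | cut -d' ' -f1)"
MARKER=/shared/.bootstrap-checksum
if [ ! -f "$MARKER" ] || [ "$(cat "$MARKER")" != "$SUM" ]; then
  cd /pups && /pups/bin/pups --stdin < /bootstrap/bootstrap.yml
  echo "$SUM" > "$MARKER"
fi
exec /sbin/boot
```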
Now we need our Kubernetes configuration. Mine looks like this:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: kotlin-forum-pvc
namespace: ikp
spec:
storageClassName: rook-ceph-block
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 64Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: kotlin-forum-deployment
namespace: ikp
spec:
replicas: 0
selector:
matchLabels:
app: kotlin-forum
template:
metadata:
labels:
app: kotlin-forum
spec:
volumes:
- name: kotlin-forum
persistentVolumeClaim:
claimName: kotlin-forum-pvc
- name: bootstrap
configMap:
name: kotlin-forum-bootstrap
containers:
- name: kotlin-forum
image: geoffreychallen/discourse_base:2.0.20191219-2109
imagePullPolicy: Always
volumeMounts:
- name: kotlin-forum
mountPath: /shared/
- name: bootstrap
mountPath: /bootstrap/
ports:
- containerPort: 80
env:
- name: TZ
value: "America/Chicago"
- name: LANG
value: en_US.UTF-8
- name: RAILS_ENV
value: production
- name: UNICORN_WORKERS
value: "3"
- name: UNICORN_SIDEKIQS
value: "1"
- name: RUBY_GLOBAL_METHOD_CACHE_SIZE
value: "131072"
- name: RUBY_GC_HEAP_GROWTH_MAX_SLOTS
value: "40000"
- name: RUBY_GC_HEAP_INIT_SLOTS
value: "400000"
- name: RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR
value: "1.5"
- name: DISCOURSE_DB_SOCKET
value: /var/run/postgresql
- name: DISCOURSE_DEFAULT_LOCALE
value: en
- name: DISCOURSE_HOSTNAME
value: kotlin-forum.cs.illinois.edu
- name: DISCOURSE_DEVELOPER_EMAILS
value: challen@illinois.edu
- name: DISCOURSE_SMTP_ADDRESS
value: outbound-relays.techservices.illinois.edu
- name: DISCOURSE_SMTP_PORT
value: "25"
---
apiVersion: v1
kind: Service
metadata:
name: kotlin-forum
namespace: ikp
spec:
type: NodePort
ports:
- name: http
port: 80
targetPort: 80
selector:
app: kotlin-forum
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
namespace: ikp
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "true"
name: kotlin-forum-ingress
spec:
rules:
- host: kotlin-forum.cs.illinois.edu
http:
paths:
- backend:
serviceName: kotlin-forum
servicePort: 80
Yours will look different. Note that I'm terminating HTTPS upstream, hence the modifications to the Ingress configuration. I also like to put everything in one file, delete pieces that don't work as I iterate, and then let Kubernetes skip duplicates on the next kubectl create -f. Also note that I set replicas: 0 so that the deployment doesn't start as soon as it's configured. That's because we have one bit of additional configuration to finish.
I copied the list of environment variables from what I saw being passed to the container by launcher start. I don’t know if all of these are necessary and others may be missing depending on your configuration. YMMV.
Note that we have two volume mounts pointing into the container: the first is for postgres, configured as a persistent volume that will survive pod restarts. The second is a ConfigMap, created like this:
kubectl create configmap kotlin-forum-bootstrap --from-file=bootstrap.yml=<path/to/bootstrap.yml>
Where kotlin-forum-bootstrap needs to match your Kubernetes configuration and path/to/bootstrap.yml is the path to the bootstrap.yml file we created using launcher dump above.
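On redeploys, the plain create will fail once the ConfigMap already exists; with a reasonably recent kubectl you can make the step idempotent like this (same names and namespace as the manifests above):

```shell
# Create-or-update the bootstrap ConfigMap by piping a client-side dry run
# through kubectl apply.
kubectl create configmap kotlin-forum-bootstrap \
  --from-file=bootstrap.yml=bootstrap.yml -n ikp \
  --dry-run=client -o yaml | kubectl apply -f -
```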
Once your configmap is in place, you should be able to scale your deployment to one replica and see Discourse booting and running the same bootstrap process that launcher bootstrap would have performed. That takes a few minutes. When that is done, your Discourse installation will boot.
A few other notes on things I ran into on the way to getting this (at least for now) fully configured:
- Discourse needs the X-Forwarded headers, including X-Forwarded-For, X-Forwarded-Proto, and X-Forwarded-Port. Not passing them will result in strange authentication errors when trying to use Google login, and probably other login providers.
- The nginx ingress controller must be configured to pass those headers along by setting use-forwarded-headers in the global config map. This took me a while to get right, since at least several times I edited the wrong configuration map, and then expected my ingress containers to restart when the configuration map changed. (They didn't.)
- To update the deployed installation, you regenerate the bootstrap.yml file, update the config map, and then restart the container (easiest by scaling to 0 and then back to 1 replica).
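The scale-down/scale-up restart, as concrete commands (deployment name and namespace taken from the manifests in this post):

```shell
# Restart the pod so it re-runs the bootstrap with the updated ConfigMap.
kubectl scale deployment/kotlin-forum-deployment --replicas=0 -n ikp
kubectl scale deployment/kotlin-forum-deployment --replicas=1 -n ikp
# On newer kubectl, `kubectl rollout restart` is an alternative, though with
# a single replica it incurs the same bootstrap downtime.
```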
This does incur a bit of downtime, since the bootstrapping happens when the container starts, before Discourse itself boots. But this seems inevitable to me in cases where you need to update the configuration and/or change the base image. launcher rebuild is documented as stop; bootstrap; start, meaning that the bootstrap process will still cause downtime even if performed using the launcher script.
This fat-container Discourse deployment pattern would be much easier to support if the launcher script more cleanly separated (a) bootstrap steps that can be performed offline and only affect the files in the container from (b) bootstrap steps that modify or need access to the database or other state outside the container. The approach described above is a bit frustrating, because you watch all kinds of JS uglification, asset minification, and other work that could have been done with the previous deployment still running… but it is just too mixed in with other things (like database migrations) that can't be done without access to the database. I briefly thought about creating a container that would only perform the steps in templates/postgres.yml, but then noticed that the database migrations were being done by the web template, and thought about plugins, and then just gave up.
With better separation, redeployment for fat containers could work something like this:
That would result in a bit less downtime. It’s probably not worth the effort for that reason alone, but I can imagine that this might also simplify more complex deployment scenarios involving shared databases or whatever.
That makes more sense. It's what I did when I split the bootstrap phase into two steps. The first step can run in an isolated environment (a CI pipeline, for example) and produces a base image with the Discourse repository, the gems, and the plugins; the second step has to run on the target machine (or at least with access to the production database) to run the database migrations and generate the assets (that happens during bootstrap, not when starting the container).
Yes, that would be awesome. I've already asked for it, but I'm not sure whether or when it will be implemented.
Doing this in a fully isolated environment would be difficult, because the asset precompilation task needs access to the database (for custom CSS and so on). But it would be ideal if the database-dependent parts could be executed separately, and the assets that don't depend on the database could be precompiled independently. I'm not sure how technically feasible that is, though.
This is basically what I do in the Kubernetes environments I deploy. I can't imagine how you would use k8s without separate data and web containers (or external PostgreSQL and Redis of some other kind); the environments I deploy for customers all use GCP resources for this.
Also, there's an environment variable, skip_post_migration_updates, that you'll want to know about for true zero-downtime upgrades. It's described here.
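A heavily hedged sketch of how that two-phase migration idea is typically applied; the variable name comes from the post above, and the exact invocation is an assumption, so check the linked description before relying on it:

```shell
# Phase 1, before rolling out new containers: run only the safe migrations.
SKIP_POST_MIGRATION_UPDATES=1 bundle exec rake db:migrate
# Phase 2, after the rollout: run the remaining post-deployment migrations.
bundle exec rake db:migrate
```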