Stuck in a loop between cleaning up space and filling it back up

My rebuild failed due to insufficient disk space, so I need to free some up. But I'm stuck in a loop: ./launcher cleanup frees enough space to get above 5 GB available. I then start a rebuild, but the rebuild eats up the space I just freed and can't complete. Details below.

How do I get my site running again?

$ sudo ./launcher cleanup

The following command will
- Delete all docker images for old containers
- Delete all stopped and orphan containers

Are you sure (Y/n):

Starting Cleanup (bytes free 3931580)
Finished Cleanup (bytes free 5903356)
$ sudo ./launcher rebuild app
WARNING: Docker version 17.05.0-ce deprecated, recommend upgrade to 17.06.2 or newer.

WARNING: We are about to start downloading the Discourse base image
This process may take anywhere between a few minutes to an hour, depending on your network speed

Please be patient

Unable to find image 'discourse/base:2.0.20180802' locally
2.0.20180802: Pulling from discourse/base
8ee29e426c26: Pulling fs layer
6e83b260b73b: Pulling fs layer
e26b65fd1143: Pulling fs layer
40dca07f8222: Pulling fs layer
b420ae9e10b3: Pulling fs layer
b89ccfe9dadc: Pulling fs layer
40dca07f8222: Waiting
b420ae9e10b3: Waiting
b89ccfe9dadc: Waiting
e26b65fd1143: Verifying Checksum
e26b65fd1143: Download complete
6e83b260b73b: Verifying Checksum
6e83b260b73b: Download complete
b420ae9e10b3: Verifying Checksum
b420ae9e10b3: Download complete
40dca07f8222: Verifying Checksum
40dca07f8222: Download complete
8ee29e426c26: Verifying Checksum
8ee29e426c26: Download complete
8ee29e426c26: Pull complete
6e83b260b73b: Pull complete
e26b65fd1143: Pull complete
40dca07f8222: Pull complete
b420ae9e10b3: Pull complete
b89ccfe9dadc: Verifying Checksum
b89ccfe9dadc: Download complete
b89ccfe9dadc: Pull complete
Digest: sha256:be738714169c78e371f93bfa1079f750475b0910567d4f86fa50d6e66910b656
Status: Downloaded newer image for discourse/base:2.0.20180802
You have less than 5GB of free space on the disk where /var/lib/docker is located. You will need more space to continue.
Filesystem              Size  Used Avail Use% Mounted on
/dev/mapper/vg-lv_root   19G   14G  3.8G  79% /

Would you like to attempt to recover space by cleaning docker images and containers in the system? (y/N)y
WARNING! This will remove:
        - all stopped containers
        - all volumes not used by at least one container
        - all networks not used by at least one container
        - all dangling images
Are you sure you want to continue? [y/N] y
Total reclaimed space: 0B
If the cleanup was successful, you may try again now
$
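For reference, the launcher refuses to build when the disk holding /var/lib/docker has less than 5 GB free, which is the check tripping above. A minimal sketch of that logic (the `has_enough_space` name is hypothetical, not a launcher internal, and it assumes the figure is in 1K blocks as `df` reports):

```shell
# has_enough_space AVAIL_KB — succeeds if at least 5 GB is available.
# Assumption: AVAIL_KB is the "Avail" column of `df`, i.e. 1K blocks.
has_enough_space() {
  avail_kb="$1"
  [ "$avail_kb" -ge $((5 * 1024 * 1024)) ]  # 5 GB expressed in 1K blocks
}

# In practice you'd feed it the root filesystem's available space, e.g.:
#   has_enough_space "$(df --output=avail / | tail -1)" || echo "too tight"
```

Note how the cleanup above left 5903356 blocks free (just over the threshold), but the base image download then pushed the disk back under it.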

Clean up some more space so that the build has enough breathing room to complete. Docker’s cleanup system isn’t always great at purging old images, so sometimes I have to do a docker images followed by a long docker rmi <ID> <ID> <ID> ....

How do I know which images I can delete?

$ docker images
REPOSITORY                      TAG                 IMAGE ID            CREATED             SIZE
discourse/base                  2.0.20180802        d6f8b6029227        4 weeks ago         1.74GB
local_discourse/web_only        latest              0301af96b2e8        5 months ago        2.95GB
local_discourse/data            latest              a8b8c3da644a        8 months ago        1.85GB
local_discourse/mail-receiver   latest              a322c9207234        8 months ago        142MB
discourse/base                  2.0.20171204        64d62a045a4e        8 months ago        1.81GB
discourse/mail-receiver         1.1.2               44042627246b        14 months ago       142MB
samsaffron/docker-gc            latest              54ca424ca8d6        2 years ago         57.7MB

Anything not in use by a running container is usually safe enough, as far as Discourse is concerned, because it’ll be re-downloaded and/or rebuilt when you do the needful. There’s not a huge pile of images there, though; it’s probably time for you to get a disk upgrade.
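One way to answer that mechanically is to compare `docker images` against what containers actually reference. A sketch (the `list_unused` helper is hypothetical; in practice you'd feed it the output of `docker images --format '{{.Repository}}:{{.Tag}} {{.ID}}'` and `docker ps -a --format '{{.Image}}'`):

```shell
# list_unused IMAGES IN_USE — print the "repo:tag id" lines from IMAGES
# whose repo:tag does not appear in IN_USE (one image name per line).
# Pure text processing, so it runs without talking to the Docker daemon.
list_unused() {
  images="$1"; in_use="$2"
  printf '%s\n' "$images" | while read -r name id; do
    printf '%s\n' "$in_use" | grep -Fqx "$name" || echo "$name $id"
  done
}
```

Anything it prints is a candidate for docker rmi, with the usual caveat that Discourse will re-download its base image on the next rebuild.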

So deleting discourse containers is always safe?

Not really. This is a tiny and very low activity forum that has hardly grown in the past months…


docker rmi 64d62a045a4e gives me

Error response from daemon: conflict: unable to delete 64d62a045a4e (cannot be forced) - image has dependent child images
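That error means some other local image was built on top of 64d62a045a4e (here, the data container's image, which was bootstrapped from the old base). You can locate the children by dumping each image's recorded parent and matching on the ID. A sketch (the `find_children` helper is hypothetical; feed it the output of `docker inspect --format '{{.Id}} {{.Parent}}' $(docker images -aq)`, noting that the Parent field is only populated for locally built images):

```shell
# find_children PAIRS ID — print image ids whose recorded parent contains ID.
# PAIRS holds one "image-id parent-id" pair per line.
find_children() {
  printf '%s\n' "$1" | awk -v p="$2" 'index($2, p) { print $1 }'
}
```

The child image has to be removed (or its container destroyed and the image cleaned up) before the base can be deleted.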

Is there any way I can stop it from downloading the latest discourse base image every time I try to rebuild or start the app? I’d like it to just use the old one for now so that I can go to bed…

It only downloads the image if it isn’t already present locally; we really only download each image once, and we only bump the required base image in launcher once every few months. There are ways to pin a specific base image, BUT you do not want to do that, for a rainbow of reasons.

Okay, so is there any way I can get rid of one of those images? I surely don’t need two base images?

How much disk space do you have?

Can you delete some backups?

Sure…

Have you run ./launcher cleanup? It will purge all images not in use.

$ df -h
Filesystem              Size  Used Avail Use% Mounted on
udev                    476M     0  476M   0% /dev
tmpfs                   100M  8.9M   91M   9% /run
/dev/mapper/vg-lv_root   19G   14G  4.1G  77% /
tmpfs                   497M  1.1M  496M   1% /dev/shm
tmpfs                   5.0M     0  5.0M   0% /run/lock
tmpfs                   497M     0  497M   0% /sys/fs/cgroup
/dev/sda1               461M  160M  278M  37% /boot
tmpfs                   100M     0  100M   0% /run/user/1000

Deleted them already.

Yes, I tried that, and I get the “Total reclaimed space: 0B” result shown above.

But when I try to rebuild or start the app, the base images that was presumably deleted by cleanup gets downloaded again and I’m back to where I started.

My suggestion here, then, if you are ULTRA tight on space, is just to go nuclear and start from scratch.

What is the output of docker ps, you basically need to kill your app container so you can free up the old base image.

$ docker ps
CONTAINER ID        IMAGE                           COMMAND             CREATED             STATUS              PORTS                                      NAMES
1fba0860cbc3        local_discourse/web_only        "/sbin/boot"        5 months ago        Up 29 minutes       0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   web_only
aa6b422d88ca        local_discourse/data            "/sbin/boot"        8 months ago        Up 29 minutes                                                  data
2940a1603151        local_discourse/mail-receiver   "/sbin/boot"        8 months ago        Up 29 minutes       0.0.0.0:25->25/tcp                         mail-receiver

What do you mean by that? Will I need a discourse backup? Cause I don’t have one…

docker rm -f aa6b422d88ca
docker rm -f 1fba0860cbc3
./launcher cleanup 
./launcher rebuild data
./launcher rebuild web_only

But really the root cause here is that you are just way too tight on disk space. I would recommend doubling the size of /dev/mapper/vg-lv_root
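Since the root filesystem already sits on LVM, growing it in place is straightforward once the volume group has free extents (or a new disk is added to it). A sketch of that ops step, assuming an ext4 or xfs root as the df output suggests; this is illustrative, so check `vgs` output before running anything:

```shell
# Check whether the volume group has free extents first:
vgs
# Grow the logical volume by 19G and resize the filesystem in one step
# (-r invokes the appropriate resize tool, e.g. resize2fs, for you):
lvextend -r -L +19G /dev/mapper/vg-lv_root
```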

The last step (rebuild web_only) fails:

Bundle complete! 110 Gemfile dependencies, 201 gems now installed.
Gems in the group development were not installed.
Bundled gems are installed into `./vendor/bundle`

I, [2018-08-31T00:50:57.518677 #13]  INFO -- : > cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate'
rake aborted!
Gem::LoadError: can't activate public_suffix-2.0.5, already activated public_suffix-3.0.2
/var/www/discourse/lib/plugin_gem.rb:18:in `load'
/var/www/discourse/lib/plugin/instance.rb:501:in `gem'
/var/www/discourse/plugins/discourse-sync-to-dropbox/plugin.rb:7:in `activate!'
/var/www/discourse/lib/plugin/instance.rb:431:in `instance_eval'
/var/www/discourse/lib/plugin/instance.rb:431:in `activate!'
lib/discourse.rb:164:in `block in activate_plugins!'
lib/discourse.rb:161:in `each'
lib/discourse.rb:161:in `activate_plugins!'
/var/www/discourse/config/application.rb:223:in `<class:Application>'
/var/www/discourse/config/application.rb:39:in `<module:Discourse>'
/var/www/discourse/config/application.rb:38:in `<top (required)>'
/var/www/discourse/Rakefile:5:in `require'
/var/www/discourse/Rakefile:5:in `<top (required)>'
/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/rake-12.3.1/exe/rake:27:in `<top (required)>'
/usr/local/bin/bundle:23:in `load'
/usr/local/bin/bundle:23:in `<main>'
(See full trace by running task with --trace)
I, [2018-08-31T00:51:18.343389 #13]  INFO -- : gem install geocoder -v 1.4.4 -i /var/www/discourse/plugins/discourse-locations/gems/2.5.1 --no-document --ignore-dependencies
Successfully installed geocoder-1.4.4
1 gem installed
gem install public_suffix -v 2.0.5 -i /var/www/discourse/plugins/discourse-sync-to-dropbox/gems/2.5.1 --no-document --ignore-dependencies
Successfully installed public_suffix-2.0.5
1 gem installed




FAILED
--------------------
Pups::ExecError: cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate' failed with return #<Process::Status: pid 507 exit 1>
Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"bundle_exec", "cmd"=>["su discourse -c 'bundle install --deployment --verbose --without test --without development --retry 3 --jobs 4'", "su discourse -c 'bundle exec rake db:migrate'", "su discourse -c 'bundle exec rake assets:precompile'"]}
36b568dc13935b0491b99afd4763776899dad8d57a1df149633b9c7c02fd3c0b
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one

Trying without the dropbox plugin now…

Hmmmm … are you on tests-passed, stable, or beta?

Also, remove discourse-sync-to-dropbox asap. It looks like it is not working at the moment.
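For anyone following along: a plugin is removed by deleting its clone line from the container definition and rebuilding. A sketch of the relevant section (assuming the plugin was added to containers/web_only.yml with a git clone line of the usual shape):

```yaml
hooks:
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - git clone https://github.com/discourse/docker_manager.git
          # delete the discourse-sync-to-dropbox clone line here,
          # then run: ./launcher rebuild web_only
```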

Phew! After removing the dropbox plugin my site is finally back up and running. Thanks a lot to everyone for your support!

Since I never changed that, I guess I’m on tests-passed.

I guess you’re right, but how does that relate to discourse’s minimum requirements of 10 GB HDD?

10GB is ambitious and absolutely not enough if you are running separate data and web containers.

Ah, I see. I don’t think that was mentioned anywhere. I’ll check tomorrow where that information might be usefully added.

However, the situation seems to look much brighter now:

$ df -h
Filesystem              Size  Used Avail Use% Mounted on
udev                    476M     0  476M   0% /dev
tmpfs                   100M   12M   88M  12% /run
/dev/mapper/vg-lv_root   19G  8.4G  9.0G  49% /
tmpfs                   497M  1.4M  495M   1% /dev/shm
tmpfs                   5.0M     0  5.0M   0% /run/lock
tmpfs                   497M     0  497M   0% /sys/fs/cgroup
/dev/sda1               461M  160M  278M  37% /boot
tmpfs                   100M     0  100M   0% /run/user/1000

Not sure where all the space came from but it looks good, doesn’t it? Do you still think I need to upgrade the server?

The data container was using the old image, so you had a situation where:

“data container” was saying HEY I need the old Discourse base image
“web container” was saying I need the new image

Not sure where all the space came from but this is a chunk of it.