Problems with Discourse install in VMWare


(RBoy) #21

Anyways this is all FYI/A as appropriate. I’ve started over, removed Docker and Discourse, and am setting it up again, but this time I’m using these instructions from Docker’s website:

You may also want to see this:


(Robby O'Connor) #22

I’ve been saying that get.docker.com was silently deprecated… but very few people listen :slight_smile:


(RBoy) #23

Issues aside, a word to the team: I’m exploring the features of Discourse and one can immediately see how deeply thoughtful the team is and what an outstanding product they have built here. Kudos to you all!


(Jeff Atwood) #24

If it is silent that is Docker’s problem. They need to fix their process.


(Jay Pfaffman) #25

And few speak


(RBoy) #26

@sam can you share what setup you’re using for your VM? I’m using Ubuntu 16.04 running in VMWare but after a few reboots it just stops working (nothing but simple reboots).

qaz@ubuntu:/var/discourse$ sudo ./launcher stop app
+ /usr/bin/docker stop -t 10 app
app
qaz@ubuntu:/var/discourse$ sudo ./launcher start app
starting up existing container
+ /usr/bin/docker start app
Error response from daemon: mkdir /var/run/docker/libcontainerd/containerd/a11bbb9dfc6a0e04a5bcd27424cfdc5f431bd34c1498a45b121edab95debb126: file exists
Error: failed to start containers: app

(Sam Saffron) #27

I don’t know what is going on with your docker install.

What version are you running? I would strongly recommend


(RBoy) #28

That’s exactly what I did: uninstalled it, nuked the directories, rebooted and reinstalled EVERYTHING, including Discourse. Everything worked fine last night. This morning I rebooted and couldn’t connect to the website anymore; I tried to restart the app as shown above and that’s what I get.

Can you describe your setup please? I’m wondering what the deal is here. I’m trying it with 1 CPU, 2GB RAM and a 25GB HD. RAM usage is about 80%; swap usage is about 0% (7MB out of 3GB).


(RBoy) #29

And I rebooted again and it’s working again. So what do you think is going on?


(Sam Saffron) #30

hmmm … what does

sudo docker ps -a
sudo docker version 

return

Where is this installed? On a local desktop? A server class machine?


(RBoy) #31

CONTAINER ID   IMAGE                 COMMAND        CREATED        STATUS         PORTS                                      NAMES
a11bbb9dfc6a   local_discourse/app   "/sbin/boot"   12 hours ago   Up 5 minutes   0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   app

Client:
Version: 17.03.1-ce
API version: 1.27
Go version: go1.7.5
Git commit: c6d412e
Built: Mon Mar 27 17:14:09 2017
OS/Arch: linux/amd64

Server:
Version: 17.03.1-ce
API version: 1.27 (minimum version 1.12)
Go version: go1.7.5
Git commit: c6d412e
Built: Mon Mar 27 17:14:09 2017
OS/Arch: linux/amd64
Experimental: false

It’s an old server with 2 CPUs and 6GB RAM; I have had other server VMs running on it without any issue for years.


(Sam Saffron) #32

Well then, discourse is running according to docker.

What does this do

docker rm -f a11bbb9dfc6a 

After that do:

./launcher start app

How is disk space on your VM? How is memory?


(RBoy) #33

VM has 2GB RAM (80% used) with 3GB SWAP (7MB used) and 25GB with 13GB free (it’s Ubuntu + LXDE + Docker + Discourse, that’s it!).

I think I may know what’s going on: Docker + Discourse is EXTREMELY heavy on resources (especially RAM and disk IO).
I had another VM running, which I have now shut down. (Actually I’ve been running up to 3 VMs on this “old” server without issue for years, but each with only 512MB or 1GB RAM and 1 CPU; other web servers, including IIS, are very lightweight.)

I realized that the disk IO generated by Docker + Discourse was excessive, which was causing the entire system to slow down. I had a look at the previous errors and, after googling a bit (I’m still new to Docker, as you can see), I think Docker was timing out on all of its operations. It was expecting, say, 10 seconds to shut down Discourse, but due to the excessive IO everything was taking more like 30 seconds, so it would end up killing processes prematurely (see moby/moby issue #25246 on GitHub, “Stop docker-engine timeout and cannot start container after start docker-engine back”). That leaves behind artifacts which wouldn’t be there if everything had finished in time, and those can cause problems when trying to restart the container (or after a reboot).
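If the timeout theory is right, one thing to try on a slow disk is stopping the container by hand with a longer grace period than the 10 seconds visible in the launcher output earlier in the thread. A minimal sketch, assuming the container is named `app` as above; the function name and the 60 second default are made up for illustration:

```shell
# DOCKER can be overridden (e.g. with "echo docker") to preview the command
# instead of running it.
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: stop a container, giving it more time to shut down.
stop_with_grace() {
  container="$1"
  grace_seconds="${2:-60}"   # assumed default; tune to how slow the disk is
  # -t is the number of seconds docker waits before killing the container
  $DOCKER stop -t "$grace_seconds" "$container"
}
```

Running `stop_with_grace app` would then wait up to 60 seconds before the container is killed. This only changes the manual stop; it does not change what the launcher itself does.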

I’m going to give it more resources and see how it does. Basically it may just be that Discourse needs a good deal of server resources to run, much, much more than a regular web server.

Thanks for staying with me on this one Sam and good learning for other folks too.

Can you share your VMWare VM setup (RAM/CPU/Disk etc) just for comparative purposes?


(Sam Saffron) #34

Curious, these are the results I get on Digital Ocean:

~# sudo fdisk -l 

  Device Boot      Start         End      Blocks   Id  System
/dev/vda1               1    83890175    41945087+  ee  GPT


~# sudo hdparm -t /dev/vda1

/dev/vda1:
 Timing buffered disk reads: 1422 MB in  3.00 seconds = 473.91 MB/sec

What do you get there?


(RBoy) #35

Device     Boot Start      End  Sectors Size Id Type
/dev/sda1  *     2048 50331647 50329600  24G 83 Linux

Timing buffered disk reads: 248 MB in 3.07 seconds = 80.68 MB/sec

Do keep in mind that this is now AFTER shutting down the other VMs and dedicating more resources to this one, so it was probably a good deal slower earlier when the timeouts were happening; I’m guessing more like 50% to 70% slower than this. This disk speed is working well: I’m getting page load response times of about 100ms ± 30%.

Were you running your VMs on Digital Ocean or bare metal?


(Sam Saffron) #36

That is stock digital ocean, what you get for 20 dollars a month, shared virtualized cloud hosting.

My own dev box is a fair bit faster:


/dev/sda:
 Timing buffered disk reads: 2422 MB in  3.00 seconds = 807.14 MB/sec

So what you have is a server with IO roughly 10 times slower than what I am running in dev and almost 6 times slower than what you get from Digital Ocean for 20 bucks a month.
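The comparison is simply the ratio of the `hdparm -t` figures quoted in this thread. A small sketch of the arithmetic; the `ratio` helper is made up for illustration and relies on `awk` being available:

```shell
# Hypothetical helper: divide one MB/sec figure by another, to one decimal.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

ratio 807.14 80.68   # dev box vs. the VM        -> 10.0
ratio 473.91 80.68   # Digital Ocean vs. the VM  -> 5.9
```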


(Sam Saffron) #37

What we clearly need here is:

./launcher diagnose

Which would do basic checks of IO, memory, disk space, Docker version and so on.

In fact I think we need this so much I am going to add this to my list.
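Until something like that exists, the checks run by hand in this thread can be collected in one place. The following is only a sketch of the idea, not the launcher’s implementation: the `diagnose_commands` name, the default device `/dev/sda` and the default install path `/var/discourse` are assumptions, and the function only prints the commands so they can be reviewed before running.

```shell
# Hypothetical helper: print the diagnostic commands used in this thread,
# one per line, for a given disk device and Discourse install directory.
diagnose_commands() {
  device="${1:-/dev/sda}"              # assumed default device
  install_dir="${2:-/var/discourse}"   # assumed default install path
  cat <<EOF
free -m
df -h $install_dir
docker version
docker ps -a
hdparm -t $device
EOF
}
```

To actually run them, the list can be piped to a shell, for example `diagnose_commands /dev/sda | sudo sh -x`, which echoes each command before its output.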


(Jeff Atwood) #38

Basically “go buy a SSD for $200” :wink:


(RBoy) #39

Oh yes, no doubt, for production one should have dedicated resources, and Digital Ocean sounds like a good setup.

As for the results, at least we got to know the threshold for even trying to get Discourse up and running to play around with. I’m guessing that at the time, with other VMs and other stuff happening during a reboot, the disk IO dropped to probably 10-20MB/s or lower, which is why it started timing out. I’m guessing that’s the threshold: if you’re getting under 30-50MB/s of consistent throughput, you need to get more resources for Discourse.

That is absolutely great stuff
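The rough threshold guessed at above can be turned into a quick check against an `hdparm -t` result. A minimal sketch, assuming a cutoff of 50 in the MB/sec units hdparm reports (the upper end of that guess, not an official Discourse requirement); both function names are made up for illustration:

```shell
# Pull the throughput figure out of hdparm -t output (second-to-last field).
hdparm_rate() {
  awk '/Timing buffered disk reads/ { print $(NF-1) }'
}

# Print OK or LOW depending on whether the rate meets the assumed minimum.
check_rate() {
  rate="$1"
  min="${2:-50}"   # assumed cutoff taken from the guess in this thread
  if awk -v r="$rate" -v m="$min" 'BEGIN { exit !(r >= m) }'; then
    echo "OK: $rate MB/sec"
  else
    echo "LOW: $rate MB/sec (below $min)"
  fi
}

# Example using the figure posted earlier in this thread:
line=' Timing buffered disk reads: 248 MB in 3.07 seconds = 80.68 MB/sec'
check_rate "$(printf '%s\n' "$line" | hdparm_rate)"
```

On a live system the input would come from `sudo hdparm -t /dev/sda | hdparm_rate` instead of the pasted line.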


(RBoy) #40

Yes that’s a given (I have a few just need to swap em out) - tail end of the list now probably moving up the list. Better yet is to use Digital Ocean and let them worry about this stuff.

But Sam hit the key point: a diagnostic would be fabulous. I figured this out by looking at the resource dashboard I had set up to monitor all the params, and everything was just red (vis-a-vis the other VMs).
Basically discourse is awesome but you need to be ready to handle awesome with an awesome amount of resources :slight_smile: