Installation notes for Discourse on Bash for Windows

Here are some rough notes on how I went about installing Discourse on Bash for Windows:

Install required packages

Mostly a cut-and-paste job from our docker base image.


curl http://apt.postgresql.org/pub/repos/apt/ACCC4CF8.asc | sudo apt-key add -
echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -sc)-pgdg main" | sudo tee /etc/apt/sources.list.d/postgres.list

sudo apt-get update

sudo apt-get install build-essential git wget \
      libxslt1-dev libcurl4-openssl-dev \
      libssl-dev libyaml-dev libtool \
      libxml2-dev gawk  \
      postgresql-client-9.5  libpq-dev libreadline-dev \
      language-pack-en \
      psmisc whois redis-server \
      advancecomp jhead jpegoptim libjpeg-turbo-progs optipng libjemalloc-dev imagemagick

Installing rbenv

git clone https://github.com/rbenv/rbenv.git ~/.rbenv
cd ~/.rbenv && src/configure && make -C src
echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(rbenv init -)"' >> ~/.bashrc

Exit the shell and start it again so the new environment is loaded.

Install ruby

rbenv install 2.3.1

echo 2.3.1 > ~/.rbenv/version

Make a :cocktail:, this will take a while cause the filesystem is very slow.

Install Postgres on Windows

https://www.postgresql.org/download/windows/

This is required cause installing Postgres inside Bash for Windows is not yet supported.

Install Discourse

git clone https://github.com/discourse/discourse.git

gem install bundler
cd discourse
bundle install

# correct bundler permissions per http://stackoverflow.com/questions/37211007/installing-rails-on-ubuntu-bash-windows-10
chmod -R +t ~/.bundle/cache

bundle install

Hack database config

Edit config/database.yml:

Add the following into test and development sections:

host: localhost
username: postgres
password: YOURPWD
port: 5432

Start up redis

redis-server --bind 0.0.0.0

Migrate Discourse

bin/rake db:create db:migrate

Discourse now runs! hooray :confetti_ball: :champagne:

At the end of this process Discourse is running on Windows without ANY VM installed. This is quite impressive.

bundle exec puma

The good news!

Discourse runs fine. It's a little bit fiddly, but all the gems install and the test suite runs.

Once booted up Discourse is quite fast and usable.

The bad news!

The filesystem mapping technology keeps the original files in an NTFS partition and performs internal mapping, pretending this filesystem is actually a Linux native one. Unfortunately NTFS + mapping overhead means that it is much slower than using ext4fs directly or some other native solution.

The effect of this in real terms is:

  • Running our test suite is slower

Bash on Ubuntu on Windows takes 8:20 to run the test suite. My ubuntu VM takes 6:15. Same hardware.

  • Installing ruby / gems and so on is a lot slower

Ruby has lots of little files. I did not time it, but compiling Ruby from source feels at least twice as slow.

  • Booting Discourse takes more than twice as long:
# Linux VM 
sam@ubuntu discourse % time bin/rails r "puts 'hi'"
hi

2.54s user 0.50s system 99% cpu 3.059 total


# Bash on Ubuntu on Windows

sam@SAM-PC:~/discourse$ time bin/rails r "puts 'hi'"
hi

real    0m7.123s
user    0m3.063s
sys     0m4.000s

7.12 seconds vs 3.06 seconds

This means that starting up a dev web server and opening up a rails console take much longer.

This is a huge issue, it is big enough for me to recommend against this setup for now.
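For a rough sense of where the extra boot time goes, the two `time` outputs above can be compared field by field. This is only a sketch: the numbers are copied from the runs above, and the hash names are made up for illustration.

```ruby
# Timings in seconds, copied from the two `time bin/rails r "puts 'hi'"` runs above.
linux_vm = { total: 3.059, user: 2.54,  sys: 0.50 }
wsl      = { total: 7.123, user: 3.063, sys: 4.000 }

# Slowdown factor of WSL relative to the Linux VM for each measurement.
slowdown = wsl.map { |key, seconds| [key, (seconds / linux_vm[key]).round(2)] }.to_h

slowdown.each { |key, factor| puts "#{key}: #{factor}x" }
```

User time barely moves while system time grows by a much larger factor, which is at least consistent with the filesystem layer being the bottleneck.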

There is an open ticket that, once resolved, will allow Bash on Windows to run native filesystems. Once that is implemented we should revisit these results:

https://github.com/Microsoft/BashOnWindows/issues/131

Note: All tests were done on the beta version of Bash on Ubuntu on Windows that is bundled with Windows 10 anniversary edition (made public on 2nd of August 2016)


Awesome stuff, Sam. Thanks for making this work :grinning:

We are aware of the file system challenges you highlight. Aiming to make big improvements in future releases.

This is one of the reasons we are recommending NOT to rely on Bash/WSL for server scenarios in this first release, but this is a great dev scenario.


I just upgraded to the Creators Update, with the latest WSL.

The good news

Postgres can now be installed inside WSL and run as a service, no need to run it on the host :confetti_ball: . Overall everything is much more polished, you can see tasks that are running in WSL in the task manager, it feels like one cohesive unit.

The not so good news

Filesystem access is still terrible. In fact, in this round of testing on my new computer (which uses BitLocker), bin/rake routes takes 35 seconds; the same command takes 3.2 seconds on the same computer under VMware.

My recommendation here is to keep observing, but still use a VM for now for any real dev work needed.


Did you leave feedback with Microsoft on this? They currently seem to be quite responsive on WSL issues :slight_smile:


If you can provide a filesystem benchmark I can run it, @sam. My system is close to optimal and I am fully up to date.

I think bonnie++ will do the trick…

I ran:

sudo -i
mkdir /root/test
bonnie++ -d /root/test -u root

Output is a bit cryptic, however there is a perl tool that converts it to readable HTML. You would need to run it both in WSL and on the same machine in a VM.

It took about 5 minutes to run in my VM, it has been 5 minutes now and it feels like no progress was made in WSL. I am going to terminate it now and may run it over lunch … or just give up cause it is so egregiously bad.

This is the result from my VM

20 minutes later, this is the result of me running bonnie++ in WSL

root@windows:~# bonnie++ -d /root/test -u root
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...Bonnie: drastic I/O error (rmdir): Directory not empty
Cleaning up test directory after error.

I guess if we cannot run benchmarking tools, the best I have to offer is:

  1. How long does it take to compile Ruby (which is heavily filesystem focused)
  2. How long does it take to run rake routes which is almost entirely filesystem focused
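As a third option, a tiny small-file benchmark can be written in plain Ruby. The following is only a sketch under assumptions: the helper name `small_file_benchmark`, the file count and the payload size are all made up for illustration, and it is no substitute for a real tool like bonnie++. It would need to be run both in WSL and in a VM on the same machine to be comparable.

```ruby
require "benchmark"
require "tmpdir"

# Hypothetical helper: time create / stat / read / delete over many small files.
# base_dir picks which filesystem is exercised (defaults to the system temp dir).
def small_file_benchmark(file_count: 2_000, payload: "x" * 128, base_dir: Dir.tmpdir)
  results = {}
  # The temporary directory is removed automatically when the block exits.
  Dir.mktmpdir("fsbench", base_dir) do |dir|
    paths = Array.new(file_count) { |i| File.join(dir, "file_#{i}.txt") }
    results[:create] = Benchmark.realtime { paths.each { |path| File.write(path, payload) } }
    results[:stat]   = Benchmark.realtime { paths.each { |path| File.stat(path) } }
    results[:read]   = Benchmark.realtime { paths.each { |path| File.read(path) } }
    results[:delete] = Benchmark.realtime { paths.each { |path| File.delete(path) } }
  end
  results
end

timings = small_file_benchmark
timings.each { |phase, seconds| puts format("%-6s %.3fs", phase, seconds) }
```

Passing `base_dir:` lets you point it at a specific directory, for example one inside the Discourse checkout, so the same filesystem is measured in both environments.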

I think it is worth stressing just how intense the IO is during the time you boot up a Rails web application:

Take this little sample script from your rails root:

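puts Process.pid  # print the PID so the tracer can attach to it
gets              # wait for Enter, giving you time to start the tracer
require File.expand_path("../config/environment", __FILE__)  # boot the Rails app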

In another window run:

sudo perf trace -s -p WHATEVER_THE_PID_IS
ruby (22987), 898453 events, 100.0%, 0.000 msec

   syscall            calls    total       min       avg       max      stddev
                               (msec)    (msec)    (msec)    (msec)        (%)
   --------------- -------- --------- --------- --------- ---------     ------
   read                6949  1072.940     0.000     0.154    32.835      5.61%
   write                 19     1.497     0.004     0.079     0.855     55.02%
   open              358969  2075.510     0.002     0.006    24.415      1.45%
   close               6736    26.939     0.002     0.004     0.102      1.46%
   stat               11169    94.922     0.002     0.008     2.574      4.95%
   fstat               9003    25.918     0.002     0.003     0.087      0.95%
   lstat              35925   163.917     0.002     0.005     2.782      3.04%
   poll                  16    12.934     0.003     0.808     6.429     56.12%
   lseek                271     0.770     0.002     0.003     0.022      5.36%
   mmap                 214     2.053     0.003     0.010     0.176      9.45%
   mprotect             174     2.077     0.003     0.012     0.065      5.19%
   munmap                15     0.592     0.007     0.039     0.160     29.49%
   brk                 1462    26.240     0.003     0.018     0.247      1.74%
   rt_sigaction           3     0.018     0.002     0.006     0.013     57.71%
   rt_sigprocmask        12     0.043     0.002     0.004     0.012     22.55%
   ioctl               3242     9.450     0.002     0.003     0.033      1.37%
   access                21     0.123     0.003     0.006     0.018     17.40%
   select                29    65.635     0.003     2.263    43.631     66.60%
   sched_yield            6     0.032     0.003     0.005     0.013     30.14%
   mremap                 2     0.079     0.023     0.039     0.055     41.03%
   sendfile               2    11.890     4.990     5.945     6.900     16.07%
   socket                 6     0.049     0.004     0.008     0.012     13.02%
   connect                6     0.285     0.013     0.047     0.095     28.69%
   sendto                18     0.517     0.004     0.029     0.066     14.21%
   recvfrom              43     0.351     0.003     0.008     0.029     13.37%
   getsockname            1     0.003     0.003     0.003     0.003      0.00%
   setsockopt             3     0.012     0.004     0.004     0.004      1.18%
   getsockopt             4     0.038     0.003     0.010     0.030     71.75%
   clone                  4     0.100     0.021     0.025     0.032     10.75%
   vfork                  6     1.678     0.216     0.280     0.382      9.12%
   wait4                  6     0.068     0.006     0.011     0.018     18.45%
   fcntl               2666     7.253     0.002     0.003     0.067      1.66%
   getdents            1070    61.229     0.002     0.057    32.941     53.93%
   getcwd               410     4.933     0.002     0.012     0.153      3.85%
   chdir                 34     0.261     0.003     0.008     0.026     13.11%
   mkdir                  2     0.329     0.017     0.164     0.312     89.62%
   getrusage              1     0.060     0.060     0.060     0.060      0.00%
   getuid              2662    11.740     0.002     0.004     0.065      1.89%
   getgid              2661     6.467     0.002     0.002     0.025      1.52%
   geteuid             2663     6.773     0.002     0.003     0.032      1.78%
   getegid             2661     6.163     0.002     0.002     0.031      1.31%
   getresuid              6     0.014     0.002     0.002     0.002      1.22%
   getresgid              6     0.013     0.002     0.002     0.002      0.71%
   futex                 36     3.614     0.002     0.100     2.040     56.53%
   tgkill                 1     0.035     0.035     0.035     0.035      0.00%
   pipe2                 12     0.075     0.003     0.006     0.018     22.15%
   getrandom              1     0.019     0.019     0.019     0.019      0.00%

So … the interesting stat here is that open is called 358969 times, and stat and family (stat, fstat, lstat) are called about 56 thousand times. When booting Discourse (or any Rails app) there is carnage on the filesystem. It is probably one of the best “read” filesystem benchmarks out there, cause it reads from tons of files in tons of directories and so on.
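The totals can be checked directly against the table. This is just arithmetic on the call counts copied from the perf trace output above; the variable names are made up for illustration.

```ruby
# Call counts copied from the perf trace table above.
calls = {
  open: 358_969, read: 6_949,
  stat: 11_169, fstat: 9_003, lstat: 35_925
}

# Total for "stat and family".
stat_family = calls.values_at(:stat, :fstat, :lstat).sum

# How many open calls there are for every read call.
opens_per_read = (calls[:open].to_f / calls[:read]).round(1)

puts "stat family calls: #{stat_family}"
puts "opens per read:    #{opens_per_read}"
```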


Wait there’s more open calls than read + getdents + fstat + fcntl + ioctl combined - what the heck is Ruby doing with all those open files?

And only 6k closes.

Or are those failed open calls?


This is an awesome question and is left as an exercise for the reader :slight_smile:
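One plausible explanation, offered as an assumption rather than something confirmed in this thread: `require` resolves a feature name by trying each `$LOAD_PATH` directory in turn, so with many gems on the load path most attempts miss before one hits. The sketch below simulates that search with temporary directories and counts the misses. `probe_load_path` is hypothetical and is not Ruby's actual implementation (it checks with `File.file?` rather than opening the file).

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical simulation of a load-path search: try each directory in order
# and count how many probes miss before the file is found.
def probe_load_path(load_path, feature)
  misses = 0
  load_path.each do |dir|
    candidate = File.join(dir, "#{feature}.rb")
    return { found: candidate, misses: misses } if File.file?(candidate)
    misses += 1
  end
  { found: nil, misses: misses }
end

total_misses = 0
Dir.mktmpdir("loadpath") do |root|
  # 50 fake "gem" directories, each containing exactly one library file.
  load_path = Array.new(50) do |i|
    dir = File.join(root, "gem_#{i}", "lib")
    FileUtils.mkdir_p(dir)
    File.write(File.join(dir, "lib_#{i}.rb"), "")
    dir
  end

  # Resolve every library once and add up the failed probes.
  50.times do |i|
    total_misses += probe_load_path(load_path, "lib_#{i}")[:misses]
  end
end

puts "failed probes for 50 lookups: #{total_misses}"
```

Even in this toy setup the failed probes far outnumber the successful lookups, and the gap grows with the square of the number of directories. If the real search behaves similarly, it would also explain why so few of the open calls have a matching close.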

Here’s what I got with a blazing fast 960 Pro SSD filesystem

jeff@WUMPUS:~$ bonnie++ -f
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...Bonnie: drastic I/O error (rmdir): Directory not empty
Cleaning up test directory after error.

For comparison:

root@WUMPUS:~# dd bs=1M count=512 if=/dev/zero of=test conv=fdatasync
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 0.4312 s, 1.2 GB/s

We should at the very least make them aware of this topic. What’s the default contact point for WSL feedback? Please feel free to contact them on our behalf :slightly_smiling_face:

@bitcrazed is the PM. He has replied here, and I am pretty sure he is watching this topic and is aware of the problems.

Improvements are planned :wink:


@bitcrazed @shanselman a few months ago we upgraded Discourse to use: https://github.com/Shopify/bootsnap

This heavily reduced file IO during boot, which results in acceptable, albeit a bit slower, boot performance on Windows in WSL.

Locally I see Discourse boot times of

  • 2.2 seconds on a VMWare Ubuntu VM
  • 4.5 seconds on the same machine in WSL

Without bootsnap I was seeing 31 second boot times!

This is quite enormous! It means that WSL is an acceptable option for local development of Discourse (and other Rails applications).

Sure, it can be faster, but given how ubiquitous, integrated and easy to set up it is, this is an endorsable option! Also, keep in mind, Rails is embracing bootsnap and will ship with it from the next version onwards!
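To illustrate why caching paths cuts file IO so much, here is a much-simplified sketch of the general idea: list each load-path directory once, remember where every file lives, and answer later lookups from memory. This is not bootsnap's actual implementation, and `build_path_index` is a hypothetical helper written only for this example.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical, much-simplified path cache: scan every load-path directory
# once and remember where each feature lives.
def build_path_index(load_path)
  index = {}
  load_path.each do |dir|
    Dir.children(dir).each do |entry|
      next unless entry.end_with?(".rb")
      # The first directory on the load path wins, as with a normal search.
      index[File.basename(entry, ".rb")] ||= File.join(dir, entry)
    end
  end
  index
end

resolved = nil
scans = 0
Dir.mktmpdir("pathcache") do |root|
  # 50 fake "gem" directories, each containing exactly one library file.
  load_path = Array.new(50) do |i|
    dir = File.join(root, "gem_#{i}", "lib")
    FileUtils.mkdir_p(dir)
    File.write(File.join(dir, "lib_#{i}.rb"), "")
    dir
  end

  index = build_path_index(load_path) # one directory listing per entry
  scans = load_path.size
  resolved = index.fetch("lib_49")    # hash lookup, no filesystem probing
end

puts "directories scanned once: #{scans}"
puts "lib_49 resolved to: #{resolved}"
```

The up-front cost is one directory listing per load-path entry; after that, resolving a name no longer touches the filesystem at all, which matters most on a filesystem where every probe is expensive.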


FTR I just installed Discourse in WSL and I was able to install postgresql 10 with no issues at all.

Unicorn still doesn’t work so you have to use puma but other than that, I’d say the UX is acceptable :+1:


@bitcrazed one huge pain point we have now is that we cannot use Unicorn at all in WSL per:

https://github.com/Microsoft/WSL/issues/1982#issuecomment-412730481

https://bogomips.org/unicorn-public/CAAtdryOtTO8HGTeKLy_JbeRhWLC7JZpCABVpbpEK+z67JHx=ew@mail.gmail.com/T/#u

This means we need special instructions for developers on WSL.


Hey Sam – best thing to do for WSL issues is file an issue (or update an existing one reporting the same issue) on our GitHub: https://github.com/microsoft/wsl, and to ping taraj who can triage with the team.

Rich.


No probs … done per: https://github.com/Microsoft/WSL/issues/3496


Many thanks Sam. Any chance you can throw an strace up there too? We find them invaluable in tracing exactly what’s failing in your specific setup.

Rich.