QUnit tests won't pass in discourse_dev docker image

I’m trying to use the docker image for tests both on my Mac and on Travis. For a while now the same QUnit tests have been failing in both environments, and I can’t figure out why.

The steps I’m taking are the same on my mac and on travis:

./bin/docker/boot_dev # Load the Discourse development docker image, and set up
./bin/docker/bundle install --retry=3 --jobs=3 # Install dependencies
./bin/docker/psql "-c 'ALTER USER discourse WITH SUPERUSER;'"
RAILS_ENV=test ./bin/docker/rake db:drop db:create db:migrate # Migrate the database

(the psql hack is due to this issue).

Every time, I get these test failures:

Module Failed: Acceptance: Composer
  Test Failed: Tests the Composer controls
    Assertion Failed: clicking the toggle hides the preview
      Expected: true, Actual: false
  :49
Module Failed: Acceptance: Modal
  Test Failed: modal
    Assertion Failed: there is no modal at first
      Expected: true, Actual: false
  :49
Module Failed: Acceptance: Search - Full Page
  Test Failed: update username through advanced search ui
    Assertion Failed: "autocomplete" popup is visible
      Expected: true, Actual: false    
    Assertion Failed: "autocomplete" popup has an entry for "admin"
      Expected: true, Actual: false    
    Assertion Failed: Error: Element .search-advanced-options .autocomplete ul li a:first not found.
      Expected: true, Actual: false
  Test Failed: update category through advanced search ui
    Assertion Failed: has "faq" populated
      Expected: true, Actual: false    
    Assertion Failed: has updated search term to "none #faq"
      Expected: none #faq, Actual: none
  :49
Module Failed: Acceptance: Topic
  Test Failed: Reply as new message
    Assertion Failed: it fills up the composer with the right user to start the PM to
      Expected: someguy, Actual:     
    Assertion Failed: it fills up the composer with the right user to start the PM to
      Expected: test, Actual:     
    Assertion Failed: it fills up the composer with the right group to start the PM to
      Expected: Group, Actual: 
  :49
Time: 134884ms, Total: 2029, Passed: 2019, Failed: 10
phantomjs /src/vendor/assets/javascripts/run-qunit.js http://localhost:60099/qunit 200000

Does anyone have any ideas what could cause this? Is there some dependency that’s out of date in the discourse_dev image that doesn’t get updated with bundle? I’ve tried apt-get update; apt-get upgrade but that doesn’t seem to help.

You can see one of the failing travis jobs here - qunit tests are around line 6730.
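
To check that two environments really fail on the same tests, the failed test names can be pulled out of saved copies of the output and compared. This is a minimal sketch: it assumes each run's output was saved to a file, and the function names `failed_tests` and `same_failures` are hypothetical helpers, not part of Discourse.

```shell
# failed_tests LOGFILE
# Print the unique names from "Test Failed:" lines in a saved QUnit log.
# LC_ALL=C keeps the sort order identical across machines.
failed_tests() {
  grep 'Test Failed:' "$1" | sed 's/^.*Test Failed: *//' | LC_ALL=C sort -u
}

# same_failures LOG_A LOG_B
# Succeed (exit 0) when both logs report the same set of failed tests.
same_failures() {
  a="$(failed_tests "$1")"
  b="$(failed_tests "$2")"
  [ "$a" = "$b" ]
}

# Example (file names are placeholders):
#   same_failures mac.log travis.log && echo "same failures in both runs"
```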

Works fine for me on Linux with the latest tests-passed branch checked out:

~/Projects/discourse sudo d/bundle exec rake "qunit:test['200000']"
phantomjs /src/vendor/assets/javascripts/run-qunit.js http://localhost:60099/qunit 200000
Unable to access network

  phantomjs://code/run-qunit.js:34
Puma starting in single mode...
* Version 3.6.0 (ruby 2.3.3-p222), codename: Sleepy Sunday Serenity
* Min threads: 0, max threads: 16
* Environment: development
* Listening on tcp://0.0.0.0:60099
Use Ctrl-C to stop
phantomjs /src/vendor/assets/javascripts/run-qunit.js http://localhost:60099/qunit 200000

Running: {}

...............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

Time: 159725ms, Total: 2098, Passed: 2098, Failed: 0

Tests Passed

I’m not sure this helps a great deal…

Yeah, it also works fine for me running the tests natively on my Mac. Running natively on Travis also seems to be fine (judging by Discourse’s own CI tests).

It’s just the discourse_dev image that seems to be breaking things for me, and I can’t figure out what’s different about it… :confused:

Try installing the latest version of PhantomJS in the image. I believe we’re running an outdated version.
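
A quick way to see whether the PhantomJS inside the image is older than expected is a version comparison. This is only a sketch: the `2.1.1` threshold is an assumption for illustration (not a documented Discourse requirement), the helper names are hypothetical, and it relies on `sort -V` from GNU coreutils.

```shell
# version_lt A B
# Succeed (exit 0) when version A is strictly older than version B.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Assumed minimum version, overridable from the environment.
MIN_PHANTOMJS="${MIN_PHANTOMJS:-2.1.1}"

# check_phantomjs VERSION
# Report whether VERSION meets the assumed minimum; exit 1 if it is older.
check_phantomjs() {
  installed="$1"
  if version_lt "$installed" "$MIN_PHANTOMJS"; then
    echo "PhantomJS $installed is older than $MIN_PHANTOMJS"
    return 1
  fi
  echo "PhantomJS $installed is recent enough"
}

# Inside the container:
#   check_phantomjs "$(phantomjs --version)"
```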

We should get that fixed. I thought @Falco posted a new dev image recently, no?

I’ll give that a try - thanks :slight_smile:

I was under the impression that you guys used the docker image for the internal CI server that runs before pushing to tests-passed?

Or is this thread out of date?

The boot_dev is using a somewhat old image, just a sec.

Sorry, that’s what I meant. Running the discourse_dev image in docker (in Linux) the tests pass fine for me… I see how I wasn’t as clear as I could have been.

~/Projects/discourse sudo docker images
REPOSITORY                   TAG                 IMAGE ID            CREATED             SIZE
[...]
discourse/discourse_dev      latest              bed996f08446        5 months ago        1.67GB
[...]
~/Projects/discourse sudo docker ps
CONTAINER ID        IMAGE                            COMMAND             CREATED             STATUS              PORTS                                            NAMES
bba77cff4858        discourse/discourse_dev:latest   "/sbin/boot"        3 hours ago         Up 3 hours          0.0.0.0:1080->1080/tcp, 0.0.0.0:3000->3000/tcp   discourse_dev

I frequently do d/shutdown_dev, d/reset_db, d/boot_dev cycles. Could that be affecting things?

Yes, though we should be a bit smarter about checking once a week for a new dev image, or something along those lines.

OK, boot_dev --init will use the latest image:

https://github.com/discourse/discourse/commit/2e152f4d39e3b0f46769851d5840b30624d17b39

Be sure to remove the old image first. Run docker ps to check whether the old container is still running.
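
Clearing out the old image could look roughly like the sketch below. It assumes the container is named `discourse_dev` and the image is `discourse/discourse_dev:latest` (as in the `docker ps` output earlier in the thread); `refresh_dev_image` is a hypothetical helper, and setting `DOCKER=echo` prints the commands instead of running them.

```shell
# DOCKER can be overridden (e.g. DOCKER=echo) to preview the commands.
DOCKER="${DOCKER:-docker}"

# refresh_dev_image [CONTAINER] [IMAGE]
# Stop and remove the old container, drop the stale image, pull a fresh one.
refresh_dev_image() {
  container="${1:-discourse_dev}"
  image="${2:-discourse/discourse_dev:latest}"

  # These may fail if the container or image is already gone; that is fine.
  $DOCKER stop "$container" || true
  $DOCKER rm "$container" || true
  $DOCKER rmi "$image" || true

  # Fetch the current image.
  $DOCKER pull "$image"
}

# Preview only:   DOCKER=echo; refresh_dev_image
# Afterwards:     ./bin/docker/boot_dev --init
```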

Thanks @Falco. That new image certainly initialises quicker now that all the dependencies are up-to-date.

When I ran boot_dev --init I got this error:

Migrating database...
rake aborted!
Gem::LoadError: You have already activated rake 12.0.0, but your Gemfile requires rake 11.3.0. Prepending `bundle exec` to your command may solve this.
/src/config/boot.rb:11:in `<top (required)>'
/src/config/application.rb:1:in `<top (required)>'
/src/rakefile:5:in `<top (required)>'
LoadError: cannot load such file -- bundler/setup
/src/config/boot.rb:11:in `<top (required)>'
/src/config/application.rb:1:in `<top (required)>'
/src/rakefile:5:in `<top (required)>'
(See full trace by running task with --trace)
rake aborted!

Following the error’s advice, I amended bin/docker/rake to use bundle exec rake, and that worked better. If that’s the right fix, I’ve opened a PR for it here.

Unfortunately, the change doesn’t seem to have fixed the issue I’m having with QUnit. I’m still getting those same tests failing :frowning:

@david Can you try again with this fix?

https://github.com/discourse/discourse/commit/6c0a29698bed3e66eecc33bf083815e407c13c40

I was fixing a problem with discourse/discourse_test where running the QUnit tests with RAILS_ENV=test would cause them to fail on the first run. Subsequent runs would pass as they should.

Sadly, that hasn’t fixed this issue. I’m getting fairly consistent failures for those 4 tests when running with docker_test or docker_dev.

These are the same ones that seem to fail a lot on Travis, but I can’t pinpoint exactly why. I can’t repro the failures consistently enough to try and fix it either… :frowning:

I wonder whether it’s something to do with disk I/O speed, as that would explain why I have issues on Mac, but @LeoMcA is fine on Linux (Mac docker volume mounts are a bit rubbish). I would also imagine that Travis’s disk I/O performance isn’t great. That’s purely speculation though - I have no evidence.

Running stuff natively on my mac, without Docker, is absolutely fine. That’s how I’m doing development at the moment.

@david Can you do me a favor and run the following commands?

docker run -ti --rm -e LOAD_PLUGINS=1 -e JS_ONLY=1 -e COMMIT_HASH=c86028f9a53b389a3163bf87e2f1cc77b21af503 -e LD_PRELOAD=/usr/lib/libjemalloc.so.1 --entrypoint="/bin/bash" discourse/discourse_test:release

Please let me know if the tests fail for you. I can reproduce the test failure with discourse_dev as well.

Think I’ve ruled that out - I stuck the Discourse filesystem and postgres data onto an old (very slow) USB stick and it still works fine. (as an aside: never do this, it’s painfully slow!)

That drops me into a bash session. I’m trying it without the --entrypoint bit, will let you know if it works.

Oops, yeah, please run it without the --entrypoint flag.

If I run that command, it gets past the database migration, and then gives up with

Warming up Rails server
rake aborted!
Errno::EADDRNOTAVAIL: Failed to open TCP connection to localhost:60099 (Cannot assign requested address - connect(2) for "localhost" port 60099)
/var/www/discourse/lib/tasks/qunit.rake:64:in `block in <main>'
/var/www/discourse/bundle/ruby/2.4.0/gems/rake-12.0.0/exe/rake:27:in `<top (required)>'
/usr/local/bin/bundle:22:in `load'
/usr/local/bin/bundle:22:in `<main>'
Errno::EADDRNOTAVAIL: Cannot assign requested address - connect(2) for "localhost" port 60099
/var/www/discourse/lib/tasks/qunit.rake:64:in `block in <main>'
/var/www/discourse/bundle/ruby/2.4.0/gems/rake-12.0.0/exe/rake:27:in `<top (required)>'
/usr/local/bin/bundle:22:in `load'
/usr/local/bin/bundle:22:in `<main>'
Tasks: TOP => qunit:test
(See full trace by running task with --trace)
The latest bundler is 1.15.3, but you are currently running 1.15.1.
To update, run `gem install bundler`
Terminating
LOG:  received smart shutdown request
LOG:  autovacuum launcher shutting down
209:signal-handler (1500994721) Received SIGTERM scheduling shutdown...
LOG:  shutting down
LOG:  database system is shut down
209:M 25 Jul 14:58:41.166 # User requested shutdown...
209:M 25 Jul 14:58:41.166 # Redis is now ready to exit, bye bye...
The latest bundler is 1.15.3, but you are currently running 1.15.1.
To update, run `gem install bundler`

I have no other docker containers running, so I don’t understand why it is having issues with ports…


The earlier test run that gave me 4 failures was with

docker run -ti --rm -e JS_ONLY=1 discourse/discourse_test:release

although I now realise that will be running an old version of discourse since I didn’t specify a commit hash.

I think this is a case of bugs on top of bugs :laughing:

I’ve fixed this by adjusting the “server warmup” checks in the qunit rake task:
https://github.com/discourse/discourse/pull/4995
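
For anyone scripting around the same problem, the general idea of a warmup check is to retry a probe until the server answers, rather than failing on the first refused connection. The sketch below is not the change made in the PR (that lives in the Ruby rake task); it is a generic shell retry loop, and `wait_for` is a hypothetical helper.

```shell
# wait_for ATTEMPTS DELAY COMMAND [ARGS...]
# Run COMMAND up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Succeed as soon as COMMAND succeeds; fail if every attempt fails.
wait_for() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example probe, assuming curl is available and the server listens on the
# port shown in the logs above:
#   wait_for 60 1 curl --silent --fail --output /dev/null http://localhost:60099/qunit
```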

So… If I run

docker run -it --rm -e LOAD_PLUGINS=1 -e JS_ONLY=1 -e COMMIT_HASH=origin/tests-passed --entrypoint=/bin/bash discourse/discourse_test:release

Then

su discourse
git remote update
git checkout origin/tests-passed
vi lib/tasks/qunit.rake # and make the PR change manually
bundle
bundle exec rake docker:test

It works and all the tests pass :tada:

Once that PR is merged I’ll run it without the manual hackery to check it’s not just a fluke.

Awesome, just tested this locally and it’s working. Merging the PR.

:goodnews: Hooray!

So, just to confirm, I can now run

docker run -it -e JS_ONLY=1 -e COMMIT_HASH=origin/master discourse/discourse_test:release

and get consistent qunit passes. So it looks like @tgxworld fixed the issue I created this topic about, but then another bug got in the way.
