Some network requests fail on every first-time page load


(Christopher Nelson) #1

I’m using 1.9.x beta 13, which appears to be the very latest. I have a private install running under Docker, using the subfolder method documented elsewhere here.

Every time a user goes to the site for the first time with a new browser (or after doing a hard refresh), one or two requests time out. The server is not heavily loaded, and the problem is completely reproducible with wget.

Any suggestions would be appreciated.

wget 'http://mepsfoundation.bethel.jw.org/forum/assets/fontawesome-webfont-2adefcbc041e7d18fcf2d417879dc5a09997aa64d675b7a3c4b6ce33da13f3fe.woff2?http://mepsfoundation.bethel.jw.org/forum/&2&v=4.7.0'
--2017-10-26 13:44:07--  http://mepsfoundation.bethel.jw.org/forum/assets/fontawesome-webfont-2adefcbc041e7d18fcf2d417879dc5a09997aa64d675b7a3c4b6ce33da13f3fe.woff2?http://mepsfoundation.bethel.jw.org/forum/&2&v=4.7.0
Resolving mepsfoundation.bethel.jw.org (mepsfoundation.bethel.jw.org)... 10.114.23.89
Connecting to mepsfoundation.bethel.jw.org (mepsfoundation.bethel.jw.org)|10.114.23.89|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 77160 (75K) [application/octet-stream]
Saving to: ‘fontawesome-webfont-2adefcbc041e7d18fcf2d417879dc5a09997aa64d675b7a3c4b6ce33da13f3fe.woff2?http:%2F%2Fmepsfoundation.bethel.jw.org%2Fforum%2F&2&v=4.7.0’

fontawesome-webfont-2adefcbc041e7d18fcf2d417879dc5a09997aa64d675b7a3c4b6ce33da1  42%[==================================================================================>                                                                                                                    ]  31.65K  --.-KB/s    in 65s     

2017-10-26 13:45:12 (501 B/s) - Connection closed at byte 32409. Retrying.

--2017-10-26 13:45:13--  (try: 2)  http://mepsfoundation.bethel.jw.org/forum/assets/fontawesome-webfont-2adefcbc041e7d18fcf2d417879dc5a09997aa64d675b7a3c4b6ce33da13f3fe.woff2?http://mepsfoundation.bethel.jw.org/forum/&2&v=4.7.0
Connecting to mepsfoundation.bethel.jw.org (mepsfoundation.bethel.jw.org)|10.114.23.89|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 77160 (75K), 44751 (44K) remaining [application/octet-stream]
Saving to: ‘fontawesome-webfont-2adefcbc041e7d18fcf2d417879dc5a09997aa64d675b7a3c4b6ce33da13f3fe.woff2?http:%2F%2Fmepsfoundation.bethel.jw.org%2Fforum%2F&2&v=4.7.0’

fontawesome-webfont-2adefcbc041e7d18fcf2d417879dc5a09997aa64d675b7a3c4b6ce33da1 100%[+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++===================================================================================================================>]  75.35K  --.-KB/s    in 0.006s  

2017-10-26 13:45:13 (7.47 MB/s) - ‘fontawesome-webfont-2adefcbc041e7d18fcf2d417879dc5a09997aa64d675b7a3c4b6ce33da13f3fe.woff2?http:%2F%2Fmepsfoundation.bethel.jw.org%2Fforum%2F&2&v=4.7.0’ saved [77160/77160]
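
For anyone who wants to reproduce this without wget, here is a minimal Python sketch of the same check: it fetches a URL once and compares the declared Content-Length against the bytes that actually arrive. The function names and the 90-second timeout are illustrative assumptions, not anything taken from Discourse.

```python
import http.client
import socket
import urllib.request


def describe_transfer(declared, received):
    """Summarize a transfer from the declared Content-Length and bytes received."""
    if declared is None:
        return f"received {received} bytes (no Content-Length declared)"
    if received == declared:
        return f"complete: {received}/{declared} bytes"
    return f"truncated: {received}/{declared} bytes ({declared - received} missing)"


def check_asset(url, timeout=90):
    """Fetch url once and report whether the body matches Content-Length.

    The timeout is an assumed value, chosen to outlast the roughly 65 second
    stall seen in the wget output above.
    """
    declared = None
    received = 0
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            header = resp.headers.get("Content-Length")
            declared = int(header) if header is not None else None
            while True:
                chunk = resp.read(8192)
                if not chunk:
                    break  # end of body, or the server closed the connection early
                received += len(chunk)
    except http.client.IncompleteRead as exc:
        # Some Python versions raise instead of returning a short read.
        received += len(exc.partial)
    except (socket.timeout, ConnectionError):
        # Stalled or reset mid-transfer: report what arrived so far.
        pass
    return describe_transfer(declared, received)
```

Calling `check_asset(...)` with the asset URL from the wget command should report a truncated transfer whenever the first attempt stalls the way it does above.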

(Jeff Atwood) #2

Beta 3 is very far from latest.


(Christopher Nelson) #3

Sorry, that was a typo. I actually have 1.9.0.beta13+149


(Christopher Nelson) #4

I have discovered that removing the ?http://mepsfoundation.bethel.jw.org/forum/&2&v=4.7.0 query string seems to make the problem go away. Of course, that’s not particularly useful for me since I am not generating the links.

Also, I have tried with and without subfolder. I am using nginx as the server on the virtual machine serving up the site. I have tried forwarding through the domain socket, and over TCP. I have tried disabling rate limiting and adjusting memory and unicorn workers. I’m kind of at a loss as to what else to look for.


(Christopher Nelson) #5

I have discovered a workaround that works for me. I’m not sure what the root cause of the problem is, but it appears that static resources send approximately half their contents and then time out. I have seen errors which reported “incorrect size”, but I’m not sure if that’s real or just a side effect of the transfer terminating too soon. In any case, if you retry the download (especially with resume-capable clients) it will nearly always work.
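
To illustrate what a resume-capable client does, here is a minimal Python sketch. `fetch_from` is a hypothetical callable standing in for one HTTP request that asks for the body starting at a byte offset (for example with a `Range: bytes=<offset>-` header) and returns whatever bytes actually arrived. `max_tries` is likewise an assumed parameter.

```python
def download_with_resume(fetch_from, total_length, max_tries=5):
    """Assemble a complete body by re-requesting from the last received byte.

    fetch_from(offset) is a hypothetical callable returning the bytes the
    server delivered starting at offset, possibly fewer than were requested.
    """
    body = bytearray()
    for _ in range(max_tries):
        # Ask only for what is still missing, like wget's second try (206).
        body.extend(fetch_from(len(body)))
        if len(body) >= total_length:
            return bytes(body[:total_length])
    raise IOError(
        f"gave up after {max_tries} tries at byte {len(body)} of {total_length}"
    )
```

With a server that cuts the first response short, the second call starts at the byte where the first one stopped, so the two pieces together make a complete file.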

Consequently, I have put an nginx cache in front of the Discourse part of the site which covers forum/assets/*. That way nginx retries the item until it succeeds, and then caches a complete item. All clients retrieve from the outside nginx server, and after the first client, all others retrieve items immediately.
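
The effect of that setup can be sketched in a few lines of Python. This is only an illustration of the idea (retry until the body is complete, then serve the stored copy), not a description of how nginx implements its cache. The class name and the `fetch` callable, which is assumed to return the declared length together with the bytes received, are hypothetical.

```python
class CompleteOnlyCache:
    """Cache that stores a response only once its body is complete."""

    def __init__(self, fetch, max_tries=5):
        # Hypothetical upstream call: fetch(path) -> (declared_length, body_bytes)
        self._fetch = fetch
        self._max_tries = max_tries
        self._store = {}

    def get(self, path):
        # Later clients are served from the cache without touching upstream.
        if path in self._store:
            return self._store[path]
        for _ in range(self._max_tries):
            declared, body = self._fetch(path)
            if len(body) == declared:
                self._store[path] = body  # never cache a truncated body
                return body
        raise IOError(f"no complete copy of {path} after {self._max_tries} tries")
```

Only the first request for a path pays the cost of the retries. Every later request for the same path gets the complete stored copy.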

I tried to investigate the problem in the docker image, but it wasn’t clear why the transfer was failing. This effectively resolves the problem for me.


(Sam Saffron) #6

Can you try latest? We were serving static content out of Rails by mistake after the Rails 5.1 upgrade. We fixed that a couple of days ago.


(Christopher Nelson) #7

I updated to the latest version and it was still kind of odd, but I did not have an opportunity to disable the outside caching when I performed the test. I will try to test this later and let you know.


(Christopher Nelson) #8

I recently disabled the outside caching, and this is still a problem. Every time I upgrade, it makes users think the site has gone down when it’s really just random load failures. Stopping the page load after a minute or so and then hitting refresh (not hard refresh) always fixes the problem.


(Sam Saffron) #9

Well, upgrading from the command line (./launcher rebuild) is not seamless unless:

  1. You have a dedicated database and redis container
  2. You have multiple web containers
  3. You have a load balancer in front of the web containers

Do you have 1 - 3 going?


(Christopher Nelson) #10

I’m not upgrading from the command line; I use the web upgrader. I’m not talking about the downtime due to the upgrade. I’m talking about the fact that once the upgrade is finished, most clients experience persistent failures downloading assets.


(Christopher Nelson) #11

This is still a problem with v2.0.0.beta3 +145. It’s quite reproducible. After I run the upgrade and discourse has restarted, the first client that connects will hang indefinitely waiting for some resources.

If I open a new tab and connect, the site will load immediately but some resources will be slow. Eventually, everything arrives for later clients. However, the first client never recovers.


(Christopher Nelson) #12

I just updated to the latest beta version, and I’m having the same problems. The root of the problem generally seems to be content length mismatches. The webserver in the discourse docker container has problems serving up assets. Service requests appear to generally work. The log is:

(index):34 GET http://mepsfoundation.bethel.jw.org/forum/assets/ember_jquery-27e777857b8c0730dacfe09cb11711365d21a5db4f9ee0b85d494e4259cf6cda.js net::ERR_CONTENT_LENGTH_MISMATCH
preload-store-ec90ffab9d7a6d9e507dda7cf7343e9d50b8bce624f7f44486ac8fd6b9814309.js:1 Uncaught ReferenceError: define is not defined
    at preload-store-ec90ffab9d7a6d9e507dda7cf7343e9d50b8bce624f7f44486ac8fd6b9814309.js:1
(anonymous) @ preload-store-ec90ffab9d7a6d9e507dda7cf7343e9d50b8bce624f7f44486ac8fd6b9814309.js:1
_vendor-fcfb2b5247e3716df46f307580cb540b5fc2978c700950ee7e85c77ecc0b5da3.js:1486 Uncaught ReferenceError: Ember is not defined
    at _vendor-fcfb2b5247e3716df46f307580cb540b5fc2978c700950ee7e85c77ecc0b5da3.js:1486
    at _vendor-fcfb2b5247e3716df46f307580cb540b5fc2978c700950ee7e85c77ecc0b5da3.js:1488
(anonymous) @ _vendor-fcfb2b5247e3716df46f307580cb540b5fc2978c700950ee7e85c77ecc0b5da3.js:1486
(anonymous) @ _vendor-fcfb2b5247e3716df46f307580cb540b5fc2978c700950ee7e85c77ecc0b5da3.js:1488
_vendor-fcfb2b5247e3716df46f307580cb540b5fc2978c700950ee7e85c77ecc0b5da3.js:33 Uncaught ReferenceError: $ is not defined
    at window.onerror (_vendor-fcfb2b5247e3716df46f307580cb540b5fc2978c700950ee7e85c77ecc0b5da3.js:33)
window.onerror @ _vendor-fcfb2b5247e3716df46f307580cb540b5fc2978c700950ee7e85c77ecc0b5da3.js:33
error (async)
(anonymous) @ _vendor-fcfb2b5247e3716df46f307580cb540b5fc2978c700950ee7e85c77ecc0b5da3.js:10
(anonymous) @ _vendor-fcfb2b5247e3716df46f307580cb540b5fc2978c700950ee7e85c77ecc0b5da3.js:36
(index):40 GET http://mepsfoundation.bethel.jw.org/forum/assets/pretty-text-bundle-d8a2e94eb658e9772d1ffdfcc0c3c5d31eba61f2c36e5f86a0d59f2d76e795ba.js net::ERR_CONTENT_LENGTH_MISMATCH
application-d68f066b365a02a9f49f4c34a2e60a9e79a15ab40389a0cc43217519b962e5da.js:1 Uncaught ReferenceError: define is not defined
    at application-d68f066b365a02a9f49f4c34a2e60a9e79a15ab40389a0cc43217519b962e5da.js:1
(anonymous) @ application-d68f066b365a02a9f49f4c34a2e60a9e79a15ab40389a0cc43217519b962e5da.js:1
_plugin-18bdcf59fdb3e559b437b61626c79ceada79ad8994c97d272ef413226bef7e49.js:58 Uncaught ReferenceError: jQuery is not defined
    at _plugin-18bdcf59fdb3e559b437b61626c79ceada79ad8994c97d272ef413226bef7e49.js:58
(anonymous) @ _plugin-18bdcf59fdb3e559b437b61626c79ceada79ad8994c97d272ef413226bef7e49.js:58
plugin-third-party-067f2e1f1274234b3dfead587a69961a4873c3916fcf74e24bd0ec0cf4d0275f.js:1 Uncaught ReferenceError: define is not defined
    at plugin-third-party-067f2e1f1274234b3dfead587a69961a4873c3916fcf74e24bd0ec0cf4d0275f.js:1
(anonymous) @ plugin-third-party-067f2e1f1274234b3dfead587a69961a4873c3916fcf74e24bd0ec0cf4d0275f.js:1
(index):50 GET http://mepsfoundation.bethel.jw.org/forum/assets/admin-df2063af5c4693fff66917c700756e4fd7644887948651199a4c0398445a012b.js net::ERR_CONTENT_LENGTH_MISMATCH
(index):198 Uncaught ReferenceError: require is not defined
    at (index):198
    at (index):209
(anonymous) @ (index):198
(anonymous) @ (index):209
(index):215 Uncaught ReferenceError: Ember is not defined
    at (index):215
(anonymous) @ (index):215
(index):226 Uncaught ReferenceError: require is not defined
    at (index):226
    at (index):242
(anonymous) @ (index):226
(anonymous) @ (index):242
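
Most of the lines in that log look like knock-on errors: once a script fails to download, everything that depends on it reports a ReferenceError. A small Python sketch to separate the failed downloads from the follow-on errors, assuming the console output has been saved as text (the function name and regular expressions are illustrative):

```python
import re

# Lines where the browser reports a body shorter than its Content-Length.
MISMATCH = re.compile(r"GET (\S+) net::ERR_CONTENT_LENGTH_MISMATCH")
# Follow-on errors from scripts that needed a global that never loaded.
REFERENCE = re.compile(r"Uncaught ReferenceError: (\S+) is not defined")


def summarize_console(log_text):
    """Return (failed asset file names, undefined globals) in order seen."""
    failed = []
    undefined = []
    for line in log_text.splitlines():
        match = MISMATCH.search(line)
        if match:
            # Keep only the file name part of the asset URL.
            failed.append(match.group(1).rsplit("/", 1)[-1])
            continue
        match = REFERENCE.search(line)
        if match and match.group(1) not in undefined:
            undefined.append(match.group(1))
    return failed, undefined
```

Run over the log above, the first list should contain only the three assets that hit ERR_CONTENT_LENGTH_MISMATCH (ember_jquery, pretty-text-bundle and admin), and the second the globals that were missing as a result.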