Sporadic missing letter avatars (cache problem?)

I’ve been seeing a sporadic problem with avatars loading as well.
Seems to always be new members that I haven’t “seen” before.
I’ve been having cable connection problems lately (finally found it was due to a sharply bent telephone line in the telephone base) so I had thought it was at my end only.

In all cases (except when the connection had dropped) a page refresh has brought up the avatars.

It hasn’t happened consistently enough (often, but not consistently) and I haven’t tried debugging, but I’ll look more deeply next time it happens.

EDIT
And it just happened. I went to https://meta.discourse.org/t/vagrant-appears-not-be-setting-a-real-localhost/34248

And got this until I reloaded the page.

Checking the mark-up before and after looks the same, so maybe it’s some kind of cache issue?

initial page load

<div class="topic-avatar">

        <a href="/users/gustavo_souza" classnames="trigger-user-card main-avatar" data-user-card="gustavo_souza">
<img alt="" src="https://avatars.discourse.org/v2/letter/g/41988e/45.png" class="avatar" title="gustavo_souza" height="45" width="45">
</a>

      <!---->
    </div>

after page reload

<div class="topic-avatar">

        <a href="/users/gustavo_souza" classnames="trigger-user-card main-avatar" data-user-card="gustavo_souza">
<img alt="" src="https://avatars.discourse.org/v2/letter/g/41988e/45.png" class="avatar" title="gustavo_souza" height="45" width="45">
</a>

      <!---->
    </div>

Interesting… thanks for that additional info, @Mittineague. I don’t believe I’ve seen that particular behaviour, but I’ll keep a closer eye out in the future. I’d say you’re right that it’s a caching issue, but tracking down exactly where the bug lives is going to be fun times. Does the Chrome devtools console show any errors loading the avatar URL? I find that when an asset fails to load, the exact problem is usually reported in the console, so that might give us some clues to start with.
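
If the console doesn’t show anything, another way to gather evidence is to fetch the avatar URL repeatedly outside the browser and flag any response that isn’t a non-empty PNG. A minimal sketch, assuming Python 3.9+ and only the standard library; `classify_response` and `poll_avatar` are hypothetical helper names made up for illustration, not part of Discourse or the avatar service:

```python
import time
import urllib.error
import urllib.request

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of every PNG file


def classify_response(status: int, body: bytes) -> str:
    """Describe what an avatar response looks like (hypothetical helper)."""
    if status != 200:
        return f"http-{status}"
    if not body:
        return "empty-body"  # what a cached 0-byte file would look like
    if not body.startswith(PNG_SIGNATURE):
        return "not-png"
    return "ok"


def poll_avatar(url: str, attempts: int = 5, delay: float = 1.0) -> list[str]:
    """Fetch `url` several times and return one verdict per attempt."""
    verdicts = []
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                verdicts.append(classify_response(resp.status, resp.read()))
        except urllib.error.HTTPError as exc:
            # The server answered, but with an error status.
            verdicts.append(classify_response(exc.code, b""))
        except OSError as exc:
            # DNS failure, dropped connection, timeout, and so on.
            verdicts.append(f"network-error: {exc}")
        time.sleep(delay)
    return verdicts
```

Calling `poll_avatar` with the avatar URL from the post above would return a list of verdicts. Anything other than `"ok"` would be worth noting alongside the time it happened.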

I’ve been running Firefox with the Network panel open - on two Discourse forums - waiting for it to happen again.

I’m becoming more convinced this isn’t a Bug, but rather a Heisenbug.


(sorry for hijacking, and I have Win 7 until I get the nerve to do the upgrade to Win 10)

I still have a feeling it’s cache related.

It hasn’t happened again for me with the “post” avatar yet, but it just happened with a User Card avatar.


@mpalmer - is it possible that there’s some interim state where:

  • a request for the avatar is pending
  • the target file has been created but not written to
  • another request comes in and reads the newly created 0-byte file; this gets cached
  • the original request completes
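
To make the hypothesised sequence concrete, here is a minimal sketch in Python (standard library only). It is an illustration of the general race, not the avatar service’s actual code: the function name, the file name, and the placeholder bytes are all made up, and it assumes a writer that creates the target file in place rather than via a temporary file.

```python
import os
import tempfile


def demonstrate_empty_read() -> tuple[int, int]:
    """Return (bytes seen mid-write, bytes seen after the write completes)."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "45.png")  # hypothetical avatar file

        # First request: creates the target file directly (now 0 bytes long).
        writer = open(target, "wb")
        try:
            # Second request: reads the file before any data has been written.
            with open(target, "rb") as reader:
                early = len(reader.read())

            # First request: finishes writing the image.
            writer.write(b"\x89PNG placeholder image bytes")
        finally:
            writer.close()

        # Any later read sees the full file, which matches "reload fixes it".
        with open(target, "rb") as reader:
            late = len(reader.read())

        return early, late
```

If the early, empty read were what ended up in a cache, the avatar would stay broken until the cache entry was bypassed or refreshed.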

That sounds entirely plausible, @riking. I’ll go through the code this morning and try and confirm your hypothesis. Thanks.

An update on this topic:

  1. I’ve gone through the code, and confirmed that the avatar file is generated in a temporary location and then moved into place, which should mean it’s atomic and there’s no possibility of the file being empty at any point. The creation and the move both happen on one filesystem, too.
  2. I’ve greatly increased the amount of logging we’re doing on the avatar service. So, @Mittineague, if you hit the problem again on any forum that’s using our letter avatars service, please give me a time (to within a couple of minutes), the exact URL that misbehaved and then came good, and, preferably, the IP address you were using. With that, I should be able to see exactly what’s going wrong.
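
For anyone following along, the write-to-temp-then-move pattern from point 1 can be sketched roughly like this. This is a generic Python illustration under my own assumptions, not the avatar service’s code, and `write_atomically` is a hypothetical name:

```python
import contextlib
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Write `data` to `path` so readers see the old file or the new one, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))

    # Create the temp file in the SAME directory, so the final move stays on
    # one filesystem (a cross-filesystem move would not be a plain rename).
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the move

        # Swap the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the move.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
```

With this shape, a concurrent reader either finds no file at `path` (or the previous version), or finds the complete new one. It should never find a freshly created 0-byte file.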

Around 11:20 AM Eastern Daylight Time

<a href="/users/stephan2307" classnames="trigger-user-card main-avatar" data-user-card="stephan2307">
<img alt="" src="https://avatars.discourse.org/letter/s/34f0e0/45.png" class="avatar" title="stephan2307" height="45" width="45">
</a>

Hovering over the img src in dev tools shows a tooltip with
“Could not load the image”

71.192.198.203
or IPV6
2601:19b:4100:1ac9:b9da:50c1:4d32:ed5d


Ah, you’re a legend, @Mittineague! That’ll do very nicely.

Hmm, I’m having trouble finding any relevant-looking requests in the logs. Can you PM me the source IP address of the requests? Also, for clarity, which Eastern Daylight Time are you using? (I’m currently assuming UTC-0400, America/New_York).
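
For reference, under that UTC-0400 assumption the reported time works out to about 15:20 UTC. A small sketch of the conversion, with a search window of a couple of minutes either side; the date is a placeholder (only the 11:20 AM time comes from the report) and `utc_search_window` is a hypothetical helper name:

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for Eastern Daylight Time, assumed to be UTC-0400.
EDT = timezone(timedelta(hours=-4), "EDT")


def utc_search_window(local_time: datetime, slack_minutes: int = 2):
    """Return (start, end) in UTC, `slack_minutes` either side of `local_time`."""
    centre = local_time.astimezone(timezone.utc)
    slack = timedelta(minutes=slack_minutes)
    return centre - slack, centre + slack


# Placeholder date; the real date of the report is not stated here.
reported = datetime(2015, 7, 1, 11, 20, tzinfo=EDT)
start, end = utc_search_window(reported)
print(start.isoformat(), "to", end.isoformat())
```

If the reporter actually meant a different offset, the window shifts by that difference, which could explain requests not showing up where expected.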

We’ve had a memory leak in the avatars process since forever, but I set up a weekly task to restart the avatars container about a month ago, so all should be well. I’ve verified that memory has been under control since then.
