502 error - chatables

I got a 502 error here on Meta today. Apart from the pop-up, I didn’t notice any unexpected behavior. I am not really sure what caused this, and I can’t reproduce it consistently, but I was able to trigger it a few times.

Here is what I did:

  1. I used the + in the sidebar to open the DM chat with someone I talked to before but who is currently not in my sidebar.
  2. I used the full-screen chat button.
  3. I changed the size of the browser window so it was smaller.
  4. I made the browser window full screen again.
  5. I switched from full-screen chat back to the small chat window.
  6. About 7 seconds later, I saw the 502 error pop-up.

The browser console showed:

That’s all I have. I hope someone can make more of it than I can. If it helps, I have a video showing how I reproduced this.

Maybe it’s caused by this step? Is it possible that it happens when you search for someone in that chat filter?

I think I found the step I need to reproduce it: when I type “L” into the chat filter, the error appears about 30 seconds later.

There was a bad query used to return the count of users with chat enabled in the chat group serializer. It was taking ~30s for your account, which is the request timeout on our hosting (which is why you were getting the error “randomly”).
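
As an illustration of why a count query can be cheap or expensive depending on how it is written, here is a minimal Python sketch using SQLite. It is not the actual query or fix: the `users` table, the `chat_enabled` column, and both function names are assumptions made up for this example.

```python
import sqlite3


def build_db(n_users: int = 1000) -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical `users` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, chat_enabled INTEGER NOT NULL)"
    )
    # Every second user has chat enabled (made-up data for the sketch).
    conn.executemany(
        "INSERT INTO users (chat_enabled) VALUES (?)",
        [(i % 2,) for i in range(n_users)],
    )
    return conn


def count_by_loading(conn: sqlite3.Connection) -> int:
    """Slow pattern: fetch every matching row into memory, then count them."""
    rows = conn.execute("SELECT id FROM users WHERE chat_enabled = 1").fetchall()
    return len(rows)


def count_in_db(conn: sqlite3.Connection) -> int:
    """Cheaper pattern: let the database count and return a single number."""
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM users WHERE chat_enabled = 1"
    ).fetchone()
    return n
```

Both functions return the same number, but the first one transfers one row per user, so its cost grows with the size of the group being counted.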

Hmm, I see that it has been merged. Does that mean the error should no longer occur?

If I root-caused it correctly, yes :sweat_smile:

Is it still happening?

It is, though maybe less frequently. It’s strange; sometimes it takes a few seconds and the users are shown, and sometimes it fails.

:sad_but_relieved_face:

When it happens, can you show the network tab and the request that’s taking a long time?

Sigh. You ask that as if it only took two mouse clicks.

I’ll try.

That’s what I mean by saying that sometimes it takes a few seconds, and sometimes it fails:

image

Ok, I figured it out :sweat_smile:

My first fix only addressed part of the issue :man_facepalming: There was another inefficient database query happening when searching for groups in the chat filter. Depending on which groups matched your search term, the query could take a very long time to complete – sometimes exceeding our request timeout.

Interestingly, this only affected “regular” users and not “admins”, which is why I wasn’t able to reproduce it myself :thinking:

When searching for groups, results are returned alphabetically. Admins can see all groups, so their first 10 results for “L” were small groups starting with ‘a’ (like “ai-personas” and other non-public groups). Regular users have more limited visibility, so their results included the large trust level groups :grimacing:, which is what caused the slow query.

Regular user sees:

  • trust_level_0: 62,506 users
  • trust_level_1: 34,494 users
  • trust_level_2: 4,727 users
  • trust_level_3: 39 users
  • trust_level_4: 13 users
  • plus some smaller groups

Total: ~102,000 users to load :collision:

Admin sees:

  • a*****: 4 users
  • a*****: 76 users
  • a*****: 0 users
  • a*****: 2 users
  • ai-personas: 138 users
  • etc.

Total: ~240 users to load :relieved:
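
The difference between the two result pages can be sketched in a few lines of Python. This is only a model of the behavior described above, not the real implementation: the `Group` class, the `first_page` and `users_to_load` helpers, the placeholder names for the anonymized groups, and the five filler groups are all assumptions. The user counts for the trust level groups and the first five admin-only groups are the ones listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    name: str
    user_count: int
    public: bool


# Groups every user can see (counts taken from the list above).
TRUST_LEVELS = [
    Group("trust_level_0", 62_506, True),
    Group("trust_level_1", 34_494, True),
    Group("trust_level_2", 4_727, True),
    Group("trust_level_3", 39, True),
    Group("trust_level_4", 13, True),
]

# Non-public groups; the real names are anonymized, so these are placeholders.
HIDDEN = [Group(f"a-hidden-{i}", n, False) for i, n in enumerate([4, 76, 0, 2])]
HIDDEN.append(Group("ai-personas", 138, False))
# Hypothetical filler standing in for the "etc." so admins get a full page.
HIDDEN += [Group(f"b-filler-{i}", 4, False) for i in range(5)]

ALL_GROUPS = TRUST_LEVELS + HIDDEN


def first_page(groups, is_admin, limit=10):
    """Return the first `limit` visible groups in alphabetical order."""
    visible = [g for g in groups if is_admin or g.public]
    return sorted(visible, key=lambda g: g.name)[:limit]


def users_to_load(groups, is_admin, limit=10):
    """Total number of users behind the groups on the first result page."""
    return sum(g.user_count for g in first_page(groups, is_admin, limit))
```

With this data, `users_to_load(ALL_GROUPS, is_admin=False)` sums only the five trust level groups, while the admin's page fills up with small non-public groups before the trust level groups are ever reached.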

Why does searching for “L” return that group?

Just me failing at “anonymizing” the data and coming up with an example :man_facepalming:

The full name of the group contains an L, so I wasn’t sure if that was the reason, or if it was a random example.