Infinite requests to `/presence/update`

Ok, it might not be a bug per se, and maybe it was designed like that, but I have some questions about how the presence is being updated on the JS side, in the file app/assets/javascripts/discourse/app/services/presence.js.

What I understood so far is that the JS client:

  • Throttles updates to one request per second while the queue of updates is not empty
  • Then sends one request every 30 seconds when there are no changes

What I noticed is that if the server errors for any reason, the JS client will keep making requests every second. It also logs errors in the console: Uncaught (in promise) at line 565.
Shouldn’t it stop at some point, after N errors, so as not to overload the server?
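
The scheduling behaviour described above can be sketched as a small decision function. This is an illustrative model only, not the real presence.js API; the function name is hypothetical, and the 1-second and 30-second values are the ones discussed here:

```javascript
// Illustrative model of the client's scheduling decision (hypothetical names).
const DEBOUNCE_MS = 1000;           // flush/retry delay while events are queued
const PRESENCE_INTERVAL_MS = 30000; // periodic check-in while present but idle

function nextUpdateDelay(queuedEvents, presentChannels) {
  if (queuedEvents.length > 0) {
    return DEBOUNCE_MS;           // queued events: update the server soon
  } else if (presentChannels.size > 0) {
    return PRESENCE_INTERVAL_MS;  // idle but present: periodic keep-alive
  }
  return null;                    // not present anywhere: no timer needed
}

console.log(nextUpdateDelay([{ type: "enter" }], new Set())); // 1000
console.log(nextUpdateDelay([], new Set(["/presence/reply"]))); // 30000
```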

How to reproduce?

Simulate an error in the update method of presence_controller.rb. (In my case I rendered status 405 just for the sake of the demo.)

  def update

    # JS client is designed to throttle to one request per second
    # When no changes are being made, it makes one request every 30 seconds
    RateLimiter.new(nil, "update-presence-#{current_user.id}", 20, 10.seconds).performed!
    render json: { error: "Not authorized" }, status: 405
    return
    ...

Then, on the client, go to any screen with presence management. I chose the topic reply screen and started replying.

The result is that it keeps going forever, one request every second.


Also, is it really necessary to recompute the presence every 30 seconds? Because there is already:

  • a computation when an enter action is taken
  • a computation when a leave action is taken
  • and a computation when the userPresence changes

Is the feature request here to “back off” on error, so that we would back off all the way to 30 seconds or so on repeated errors…


@AhmedLoud is this the issue that you fixed via FIX: use `_presentChannels.size` instead of `_presentChannels.length`… · discourse/discourse@589add7 · GitHub, or is it a separate problem?


@david sorry for the misunderstanding.

My first topic post was not related to this: FIX: use `_presentChannels.size` instead of `_presentChannels.length`… · discourse/discourse@589add7 · GitHub

Even without the scheduled update every 30 seconds, if there is any error from the backend the algorithm will retry every second, because the queue of events will not be empty:

In _scheduleNextUpdate:

   } else if (this._queuedEvents.length > 0) {
      this._cancelTimer();
      cancel(this._debounceTimer);
      this._debounceTimer = debounce(
        this,
        this._throttledUpdateServer,
        this._presenceDebounceMs // This is 1 sec I believe
      );

In fact, in the _updateServer method, events are put back in the queue on error:

  } catch (e) {
      // Put the failed events back in the queue for next time
      this._queuedEvents.unshift(...queue);

This was just something I noticed afterwards… When everything goes well (meaning no errors), is it really necessary to keep updating the server every 30 seconds?

} else if (
      !this._nextUpdateTimer &&
      this._presentChannels.size > 0 &&
      !isTesting()
    ) {
      this._nextUpdateTimer = discourseLater(
        this,
        this._throttledUpdateServer,
        PRESENCE_INTERVAL_S * 1000 // This is 30 seconds
      );
    }

I don’t believe there is a back-off. It keeps retrying every second, without stopping, when there is any error. But maybe I missed something.


Ok, I think I understood why it needs to “check in” again every 30 seconds.

There is a background job, PresenceChannelAutoLeave, that is triggered every minute and makes expired members leave the channel.

It is also documented in presence_channel.rb, for the present(user:, client_id:) method:

    # Mark a user's client as present in this channel. The client_id should be unique per
    # browser tab. This method should be called repeatedly (at least once every DEFAULT_TIMEOUT)
    # while the user is present in the channel.
    def present(user:, client_id:)
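
So the 30-second client pings are the keep-alive that prevents the server-side expiry. A hypothetical sketch of the expiry check the auto-leave job performs, assuming a per-client last-seen timestamp and a 60-second timeout (both are assumptions for illustration, not the actual PresenceChannelAutoLeave code):

```javascript
// Clients that have not checked in within the timeout are considered expired.
const DEFAULT_TIMEOUT_MS = 60 * 1000; // assumed timeout for this sketch

function expiredClients(lastSeenByClient, now) {
  return [...lastSeenByClient]
    .filter(([, lastSeen]) => now - lastSeen > DEFAULT_TIMEOUT_MS)
    .map(([clientId]) => clientId);
}

const lastSeen = new Map([
  ["tab-a", 100_000], // checked in 20 s ago: still present
  ["tab-b", 0],       // silent for two minutes: auto-leave
]);
console.log(expiredClients(lastSeen, 120_000)); // ["tab-b"]
```

With a 30-second ping against a 60-second timeout, one lost request still leaves a second chance to check in before the job removes the client.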

Still wondering about the error path that retries every second.

We were already handling 429 ‘rate limited’ errors correctly. But for other, unexpected, errors I’ve added some exponential backoff logic (capped at 30s). Thanks for bringing this to our attention @AhmedLoud
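
Exponential backoff capped at 30 s, as described in the fix, can be sketched as follows. This is an illustration of the general technique, not the actual code that landed; the function name and the 1-second initial delay are assumptions:

```javascript
// Double the retry delay after each consecutive error, capped at 30 s.
const INITIAL_BACKOFF_MS = 1000;  // assumed first retry delay
const MAX_BACKOFF_MS = 30000;     // cap mentioned in the fix

function backoffDelay(consecutiveErrors) {
  return Math.min(INITIAL_BACKOFF_MS * 2 ** consecutiveErrors, MAX_BACKOFF_MS);
}

console.log([0, 1, 2, 3, 4, 5, 6].map(backoffDelay));
// [1000, 2000, 4000, 8000, 16000, 30000, 30000]
```

On the first error the client retries after a second as before, but repeated failures quickly settle at the same 30-second cadence as the normal idle check-in, so a broken server sees no more traffic than a healthy one.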


This topic was automatically closed after 9 days. New replies are no longer allowed.