Forum skips posts, recovering requires F5/full page refresh

Occasionally a topic’s autoloading ajaxification will skip a post, and when it does it never recovers and loads it.
Since a topic’s read status is based on the last post # you’ve read, you’ll never find out that you skipped a post unless you either 1) get a notification because the skipped post was actually a reply to you, or 2) click the link icon on every single ajaxed post to make sure they loaded in sequence without skipping a post id.
The only way I’ve found to recover the skipped post is to do a full page refresh.

When the topic receives a fresh post whose post_id skips a number, it should re-query the server to verify that the id it missed really should have been skipped (old posts may have been deleted, so a skipped id is not always an error)… Or at least do SOMETHING to maintain synchronization.
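To make the suggestion concrete, here is a rough sketch of the gap check I have in mind. It is not Discourse code: `findMissingPostNumbers` and its arguments are names I made up, and it works on post numbers rather than ids on the assumption that the client knows the numbers of the posts it has already loaded. Whatever it returns would be the list to ask the server about before marking the topic as read.

```javascript
// Hypothetical helper (not part of Discourse): list the post numbers that lie
// between the highest post already in the stream and a newly arrived post.
// Each returned number is either a deleted post or a post the client missed,
// so the server would have to be asked which one it is.
function findMissingPostNumbers(loadedPostNumbers, incomingPostNumber) {
  // An empty stream is treated as "nothing loaded yet".
  const highest = loadedPostNumbers.length
    ? Math.max(...loadedPostNumbers)
    : 0;

  const missing = [];
  for (let n = highest + 1; n < incomingPostNumber; n++) {
    missing.push(n);
  }
  return missing;
}

// Posts 1-4 are loaded and post 6 arrives: post 5 needs to be verified.
console.log(findMissingPostNumbers([1, 2, 3, 4], 6)); // [ 5 ]
```

An empty result means the new post follows directly on the last loaded one and no extra request is needed.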

Not a bug unless you have repro steps that work on try or meta.

This sounds like a race condition… which would be insanely hard to repro predictably.

I’ve added a delay to Post#publish_change_to_clients! to simulate a spike in network latency that affects posts made by user A. Setup: user A and B are both at the bottom of the same topic. User A posts. Within the artificial delay, user B posts. After the message bus has caught up, the posts in user B’s post stream are out of order and a loading indicator remains visible permanently:

(The first post has post number 5, the second one has post number 6.)

Additionally, this error message pops up in the JS console: Error: Assertion Failed: Attempted to register a view with an id already in use: post-cloak-5. Backtrace from the server logs:

EmberError@http://majestix:3000/assets/development/ember.js?body=1:13741:17
Ember.assert@http://majestix:3000/assets/development/ember.js?body=1:3903:15
.enter@http://majestix:3000/assets/development/ember.js?body=1:41726:1
View<._transitionTo@http://majestix:3000/assets/development/ember.js?body=1:43714:35
Renderer.prototype.didInsertElement@http://majestix:3000/assets/development/ember.js?body=1:40038:9
Renderer_renderTree@http://majestix:3000/assets/development/ember.js?body=1:10129:11
.ensureChildrenAreInDOM@http://majestix:3000/assets/development/ember.js?body=1:41304:13
ContainerView<._ensureChildrenAreInDOM@http://majestix:3000/assets/development/ember.js?body=1:41267:9
Queue.prototype.invoke@http://majestix:3000/assets/development/ember.js?body=1:849:11
Queue.prototype.flush@http://majestix:3000/assets/development/ember.js?body=1:914:13
DeferredActionQueues.prototype.flush@http://majestix:3000/assets/development/ember.js?body=1:719:13
Backburner.prototype.end@http://majestix:3000/assets/development/ember.js?body=1:144:11
Backburner.prototype.run@http://majestix:3000/assets/development/ember.js?body=1:199:15
run@http://majestix:3000/assets/development/ember.js?body=1:17897:14
Discourse.Ajax<.ajax/performAjax/args.success@http://majestix:3000/assets/discourse/mixins/ajax.js?body=1:66:9
jQuery.Callbacks/fire@http://majestix:3000/assets/development/jquery-2.1.1.js?body=1:3074:10
jQuery.Callbacks/self.fireWith@http://majestix:3000/assets/development/jquery-2.1.1.js?body=1:3186:7
done@http://majestix:3000/assets/development/jquery-2.1.1.js?body=1:8252:5
.send/callback/<@http://majestix:3000/assets/development/jquery-2.1.1.js?body=1:8599:1

The post stream clearly does not handle out-of-order updates well. I’ve only tested under development conditions; the net effect might be different in production mode, if the out-of-order post lags more than one post behind, or if all reordered posts arrive through the message bus instead of one being generated locally.

Repro steps involve being in a highly active forum where you’re likely to get caught in a race condition or have failed post loads from other posters in the same topic. This forum is hardly active enough to cause it with any regularity, unless you want me to spam this place or try.d hoping I can “win” this race.

I know the insinuation here is that our forums are broken by our own doing. But this bug was reported quite a long time ago and has persisted for months.

I have seen this, very rarely, myself. Will look at adding some sort of algorithm that stabilises stuff.

@jens keep in mind, message bus has guaranteed ordering per channel; this is much more likely timing-related.

@darkmatter keep in mind, I flushed redis a few days ago on your site, which would have caused a fair amount of weirdness until you refresh browsers (plus the import has been wreaking havoc).

Wrong @username? :smiley:

Anyway, the first out-of-order message is composed locally and doesn’t arrive through the message bus, so message bus ordering doesn’t apply. Also, even if all messages were received through the message bus, it still doesn’t apply since posts are not guaranteed to be pushed into the message bus in the same order they were stored in the database and had their post number allocated. (Requires unicorn with at least 2 workers.)
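For illustration, this is roughly what tolerating out-of-order arrival on the client could look like. It is only a sketch under my own assumptions, not the actual post stream code: `insertPostInOrder` is a hypothetical function, and posts are plain objects with a `post_number` field. It keeps the array sorted by post number wherever the post came from (composed locally or pushed by the message bus) and refuses duplicates, which would presumably also avoid registering the same view id twice.

```javascript
// Hypothetical sketch (not the real Discourse post stream): insert a post into
// an array that is kept sorted by post_number. Returns false, leaving the
// array unchanged, if a post with that number is already present.
function insertPostInOrder(posts, post) {
  // Binary search for the first index whose post_number is >= the new one.
  let lo = 0;
  let hi = posts.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (posts[mid].post_number < post.post_number) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }

  // Duplicate: the same post arrived twice (e.g. locally and via the bus).
  if (lo < posts.length && posts[lo].post_number === post.post_number) {
    return false;
  }

  posts.splice(lo, 0, post);
  return true;
}

// User B's own post (6) is appended locally before user A's delayed post (5).
const stream = [{ post_number: 4 }];
insertPostInOrder(stream, { post_number: 6 });
insertPostInOrder(stream, { post_number: 5 });
console.log(stream.map((p) => p.post_number)); // [ 4, 5, 6 ]
```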

Yeah, I was quite confused for a (short) while. =)
