I think I solved this one.
https://github.com/discourse/discourse/commit/87ec11e298e950203966a988fae4d1a9e197f9d7
We had the data when storing post timings, but weren’t using it. There are still a few cases where the count won’t be 100% accurate, but it’s so much better than counting every post in a topic as read.
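To illustrate the difference, here is a rough sketch in Python. It is not the code from the commit: the function names and the shape of the timings data are made up for the example, and it assumes each timing is reported as a (post number, milliseconds) pair.

```python
def count_whole_topic(highest_post_number):
    # Old behaviour as described above: every post in the topic
    # is counted as read, no matter what the user actually saw.
    return highest_post_number


def count_newly_read(timings, already_read):
    # Hypothetical replacement: count only the posts that show up
    # in the timings being stored and were not read before.
    # timings: iterable of (post_number, msecs) pairs
    # already_read: set of post numbers the user already has timings for
    seen = {post_number for post_number, _msecs in timings}
    return len(seen - already_read)


# A 50-post topic where the user scrolled past posts 1-3
# and had already read post 2 on an earlier visit.
timings = [(1, 1200), (2, 800), (3, 500)]
print(count_whole_topic(50))             # 50
print(count_newly_read(timings, {2}))    # 2
```

A sketch like this would still miscount in some cases, for example if the "already read" set is stale, which matches the "not 100% accurate" caveat.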
As for repairing existing stats… Maybe we don’t?