Redefining "Top" scores

Not sure how much interest there is in this, but I have a few suggestions for tweaking the Top algorithm to increase engagement and provide a deeper discussion experience for users.

Give weight to the following (all adjustable in settings, so each forum can customize them; a rough scoring sketch follows the list):

Actual content of Topic:

  • Total Views & Views of Original Post (Lower rating if total views are significantly lower than Original Post views)
  • Original Post likes
  • Total likes (measured relative to the number of replies, giving a higher or lower rating accordingly)
  • Total number of Replies (fewer than 3 gets a negative rating)
  • Number of posts with 2 or more replies (higher number gives higher rating)
  • (Likes + Replies + Views) over time
  • Exceptional number of likes for any post (outlier post might be interesting)
  • Subscriptions to Topic (watched, tracked or muted), subscriptions in the past hour
  • Median Reading Time compared to estimated Reading Time, adjusted for number of posts
  • Analysis of median Reading time (decline or uptake over past 24 hours, past 12 hours, past 3 hours, past hour)
  • Number of users that have replied in topic, adjusted for number of posts
  • Number of users online in topic
  • Number of posts bookmarked by users TL2 or higher, adjusted for number of posts
  • Writing time for posts within topic, especially for Original Post
  • Number of tags (1 to 3 tags get a higher rating)
  • Number of posts viewed in topic before closing on average, adjusted for number of posts
  • Questions without solutions (higher weighting for a certain time window)
  • Allowance for bumping old topics, in case they disappeared unfairly
  • Original Poster reputation (level, badges, solutions, blocks, flags, likes)
  • Referral traffic (higher score from within forum, lower score for outside)
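A minimal sketch of how these signals might combine, assuming each signal has already been normalized to roughly 0-1 and the weights come from per-forum settings. All names and values here are hypothetical, not Discourse internals:

```python
# Hypothetical sketch: combine normalized topic signals into a Top score.
# Signal names and default weights are illustrative only.

DEFAULT_WEIGHTS = {
    "op_likes": 3.0,
    "likes_per_reply": 2.0,
    "reply_count": 1.5,
    "deep_reply_chains": 2.0,   # posts with 2 or more replies
    "median_read_ratio": 2.5,   # median reading time vs. estimated time
    "bookmarks_tl2_plus": 1.5,
    "unsolved_question": 1.0,
    "op_reputation": 1.0,
}

def top_score(signals: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of normalized signals; missing signals count as 0."""
    return sum(w * signals.get(name, 0.0) for name, w in weights.items())

# Example: a topic with a strong original post and an active reply tree.
topic = {"op_likes": 0.8, "likes_per_reply": 0.6, "reply_count": 0.5,
         "deep_reply_chains": 0.7, "median_read_ratio": 0.9}
print(round(top_score(topic), 2))
```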

Since we are measuring these for a specific time period, we would need to compute median metrics for each month to account for user growth and weight the above against the monthly numbers. Also, maybe we could exclude TL0 views/likes/etc. to protect against gaming of the algorithm.
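For the monthly normalization, one simple option (again hypothetical) is to divide each raw metric by that month's site-wide median, so the weights stay stable as the community grows:

```python
def normalize(raw: float, monthly_median: float) -> float:
    """Scale a raw metric against the site median for the month,
    so overall traffic growth doesn't inflate every topic's score."""
    if monthly_median <= 0:
        return 0.0
    return raw / monthly_median

# 120 views in a month where the median topic gets 80 views -> 1.5
print(normalize(120, 80))
```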

If we wanted to go the extra mile and personalize the feed for each user, we could also give weight to the following (a sketch follows the list):

  • User Interests (tracking/watching categories/subcategories and tags)
  • Visits by User (categories/subcategories & tags of topics ordered by times visited, negative rating for ignored categories in feed)
  • Time since the topic was posted, relative to the user’s last visit (slight priority for recent posts within the selected time period - today, week, month, quarter, year)
  • Topics posted by the users you have liked most, and topics whose original posts were liked by your top 5 most-liked users
  • Potential user interests (categories/subcategories the user does not frequent or watch/track) - lower rating if the user continues to ignore them
  • Show content based on analysis of last 30 interactions of User
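A personalization layer could then adjust the community score per user. A rough sketch with the same hypothetical naming:

```python
def personalized_score(base: float, user: dict, topic: dict) -> float:
    """Hypothetical per-user adjustment of the community Top score."""
    score = base
    if topic["category"] in user.get("watched_categories", set()):
        score *= 1.3    # declared interest boosts the topic
    if topic["category"] in user.get("ignored_categories", set()):
        score *= 0.5    # negative rating for ignored categories
    if topic["op_id"] in user.get("top_liked_users", set()):
        score *= 1.2    # topics started by your most-liked users
    return score

user = {"watched_categories": {"dev"}, "top_liked_users": {42}}
print(personalized_score(8.0, user, {"category": "dev", "op_id": 42}))
```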

If we wanted to go full crazy without getting into Machine Learning, we could also do a discussion analysis. This could then also be used when summarizing a topic to expand the better posts.

I have borrowed the following analysis text from an LMS I worked with - we can draft a similar algorithm for analyzing discussions:

Substantive posts

The substantive posts metric counts responses or replies that contribute to the discussion’s development. A substantive post contains sentences that establish or support a user’s position or ask thoughtful questions. These posts also show critical thinking or sophisticated composition, based on word choice and variety.

Non-substantive posts may be short or underdeveloped. A user should expand on their post and explain a position to make the response or reply substantial.
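As a rough illustration only (the LMS’s actual rules aren’t published), a heuristic for "substantive" might check length plus a position or question cue:

```python
import re

# Guessed cues that a post establishes or supports a position.
POSITION_CUES = ("i think", "i believe", "in my experience", "because",
                 "for example", "on the other hand")

def is_substantive(post: str, min_sentences: int = 2) -> bool:
    """Heuristic guess: enough sentences plus a position or question cue."""
    sentences = [s for s in re.split(r"[.!?]+", post) if s.strip()]
    has_cue = "?" in post or any(c in post.lower() for c in POSITION_CUES)
    return len(sentences) >= min_sentences and has_cue

print(is_substantive("I agree."))                                    # False
print(is_substantive("I think this fails at scale. For example, "
                     "median read time collapses past 200 posts."))  # True
```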

Sentence complexity

Sentence complexity is measured by the number of sentences, words, and syllables in each response. We look at the complexity of words and how often the words are used. This measurement is a linguistic standard called Flesch-Kincaid. The complexity of each user’s total posts is represented by a grade level from 1st grade to 16th grade. Content with a Flesch-Kincaid grade level of 10 should be easily understood by a person in 10th grade.
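The Flesch-Kincaid grade level itself is a published formula, so it can be computed directly; the syllable counter below is a common rough approximation:

```python
import re

def count_syllables(word: str) -> int:
    """Rough approximation: count vowel groups, minimum 1 per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words) - 15.59)

# Simple prose scores a low grade level.
print(round(fk_grade("The cat sat on the mat. It was happy there."), 1))
```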

Lexical variation

Lexical variation analyzes the substance of a user’s responses or replies based on the words they’ve used.

Content words carry meaning in a user’s response or reply. These words show a user’s feelings or thoughts regarding the prompt. When compared with total word count, content words help show the lexical density of a user’s responses and replies. A high count can indicate more sophisticated writing.

Functional words tie the semantic elements of a sentence together and indicate proper grammar. Prepositions, conjunctions, pronouns, and articles are functional words.

Think of functional words as the glue that holds a user’s response together. The words may not have substantial meaning themselves.
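Lexical density reduces to a simple ratio. A sketch with a small hand-picked stoplist of functional words (a real analysis would presumably use part-of-speech tagging):

```python
FUNCTIONAL = {"the", "a", "an", "and", "or", "but", "of", "in", "on", "to",
              "it", "is", "are", "was", "i", "you", "he", "she", "they", "we"}

def lexical_density(text: str) -> float:
    """Share of content (non-functional) words in the total word count."""
    words = [w.lower().strip(".,!?") for w in text.split()]
    words = [w for w in words if w]
    content = [w for w in words if w not in FUNCTIONAL]
    return len(content) / len(words) if words else 0.0

print(round(lexical_density("The algorithm rewards long, thoughtful replies."), 2))
```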

Critical thinking

The critical thinking metric flags words and phrases within a user’s total posts that demonstrate critical thinking. Twelve dictionaries are used to identify these words, which then fall into one of the weighted categories of critical thinking:

  • Argue a position
  • Include supporting data
  • Cite literature or experience
  • Evaluate
  • Summarize
  • Reference data
  • Offer a hypothesis

How we measure critical thinking

The weighted counts of the words and phrases in each category are combined and then compared to the site average to create the critical thinking score. The score is the difference between the user’s critical thinking and the site average.

The score falls in a decimal range of -1 to 1. A negative score means the user’s critical thinking is below the site average. A positive score means the user’s critical thinking is above the site average. A score close to 0 means the user’s critical thinking is at the site average level. These scores are represented by a range of low to high:

  • -1 to -0.06 = Low
  • -0.06 to -0.03 = Below Average
  • -0.03 to 0.03 = Average
  • 0.03 to 0.06 = Above Average
  • 0.06 to 1 = High

Critical thinking is represented visually to show each user’s score compared to the site average.
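Based on the description above, a sketch of the computation. The category weights and dictionary contents aren’t published, so these values are placeholders, and the scaling into the -1..1 band is a guess:

```python
# Placeholder weights per critical-thinking category; real values unknown.
CATEGORY_WEIGHTS = {
    "argue_position": 1.0, "supporting_data": 0.9, "cite_literature": 0.7,
    "evaluate": 0.8, "summarize": 0.5, "reference_data": 0.7,
    "hypothesis": 0.9,
}

def critical_thinking_score(user_counts: dict, site_avg: float) -> float:
    """Weighted category hits vs. the site average, clamped to [-1, 1]."""
    weighted = sum(CATEGORY_WEIGHTS[c] * n for c, n in user_counts.items())
    if site_avg <= 0:
        return 0.0
    return max(-1.0, min(1.0, (weighted - site_avg) / site_avg))

def band(score: float) -> str:
    """Map a score to the low-to-high bands listed above."""
    if score < -0.06: return "Low"
    if score < -0.03: return "Below Average"
    if score <= 0.03: return "Average"
    if score <= 0.06: return "Above Average"
    return "High"

print(band(critical_thinking_score({"argue_position": 3, "evaluate": 1}, 3.5)))
```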

Examples:

Empirical research shows disagreeing displays a higher level of critical thinking than agreeing. In a discussion, the statement “I agree with John” receives a score of 0.113, while “I disagree with John” receives a score of 0.260.

If users summarize a passage but add no opinion or argument, they score lower than others who argue a position.

If users cite literature, they receive a lower score than others who offer a hypothesis.

Word variation

Word variation measures the number of unique words in a user’s submission as a percentage. A higher percentage of unique words can show that the user’s composition contains multiple ideas, significantly supports a position, or engages other users to think about other perspectives.

We can compare the user’s percentage to the site average.
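Word variation is the simplest metric of the set, a unique-word ratio:

```python
def word_variation(text: str) -> float:
    """Unique words as a percentage of total words."""
    words = [w.lower().strip(".,!?") for w in text.split() if w.strip(".,!?")]
    return 100 * len(set(words)) / len(words) if words else 0.0

print(round(word_variation("the quick fox jumped over the lazy dog"), 1))  # 87.5
```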

3 Likes

This sounds incredibly complicated. What evidence do you have that the current much simpler methods are not good enough?

3 Likes

It’s not that the current system isn’t good enough - it works well, but like any system it can be made better, in particular by making personalised recommendations unique to every user. Interest, relationships, and usage should be taken into account to surface the most interesting posts, since ‘interesting’ is so subjective. Refining the user feed this way could make for even deeper discussion and engagement.

Discourse currently uses the following criteria when deciding which topics are most interesting to a user:

  • Likes
  • Number of Posts
  • Original Post likes
  • Views

It already has a bunch of data that could better filter the most interesting posts:

Reading time:
If users are spending a lot of time reading a topic (adjusted for number of posts), it would tend to be more interesting than a topic users skim or close after reading a few posts.

Topic with multiple replies to posts:
A topic that has a lot of replies to posts might have an interesting discussion going on.

Subscriptions to topic:
If users are tracking/watching a topic with interest, it would have more utility than usual.

Number of posts bookmarked:
If a lot of posts are getting bookmarked, the topic would tend to have some important information.

High number of likes:
If a single post has an unusual number of likes compared to the median, it would tend to have community interest.

Number of users online:
A large number of users lurking on a topic might indicate an interesting discussion.

OP Reputation:
A user with a higher trust level and reputation (badges, solutions, likes) would tend to have higher quality topics. Similarly, a user who has been flagged multiple times for example, would tend to post lower quality topics.

Referral traffic:
If a topic is getting a lot of outside referral traffic, it would have something of interest to the community.
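Most of these signals reduce to a ratio against a baseline. For instance, a hypothetical reading-time signal adjusted for post count:

```python
def read_time_signal(total_read_seconds: float, post_count: int,
                     expected_seconds_per_post: float = 30.0) -> float:
    """Hypothetical: actual vs. expected reading time per post.
    Above 1 suggests readers linger; below 1 suggests skimming."""
    if post_count == 0:
        return 0.0
    return (total_read_seconds / post_count) / expected_seconds_per_post
```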

Additionally, a user would be more interested in content that is tailored to their taste. For instance:

User interest:
If a user is tracking/watching a category/subcategory, they would be more likely to find a topic within those interesting.

Liked users:
If I consistently like a user’s posts, I tend to find their content more interesting. Showing topics started by the most liked users would tend to be of interest.

Potential user interest:
On the other hand, users could also be shown a small number of topics from categories they do not watch/track, in case that is something of potential interest.

As with the current system, admins would be able to set the weights and so customize it to their taste.

3 Likes

I find your lists very interesting IMO (I probably have no humble opinion), but I can’t see them working because, as @codinghorror says, they sound “incredibly complicated” without a lot more refining of the focus.

I also got confused, which is why I’d like to see your suggestions more clearly presented.

2 Likes