Not sure how much interest there is in this, but I have a few suggestions for tweaking the Top algorithm to increase engagement and provide a deeper discussion experience for the user.
Give weight to the following (all adjustable in settings, so each forum can customize them):
Actual content of Topic:
- Total Views & Views of Original Post (Lower rating if total views are significantly lower than Original Post views)
- Original Post likes
- Total likes (measured relative to the number of replies, giving a higher or lower rating accordingly)
- Total number of Replies (fewer than 3 gets a negative rating)
- Number of posts with 2 or more replies (higher number gives higher rating)
- (Likes + Replies + Views) over time
- Exceptional number of likes for any single post (an outlier post might be interesting)
- Subscriptions to Topic (watched, tracked or muted), subscriptions in the past hour
- Median Reading Time compared to estimated Reading Time, adjusted for number of posts
- Analysis of median Reading Time (decline or uptick over the past 24 hours, past 12 hours, past 3 hours, past hour)
- Number of users that have replied in topic, adjusted for number of posts
- Number of users online in topic
- Number of posts bookmarked by users TL2 or higher, adjusted for number of posts
- Writing time for posts within topic, especially for Original Post
- Number of tags (1 to 3 tags get a higher rating)
- Number of posts viewed in topic before closing on average, adjusted for number of posts
- Questions without solutions (higher weighting for a certain period of time)
- Allowance for bumping old topics, in case they disappeared unfairly
- Original Poster reputation (level, badges, solutions, blocks, flags, likes)
- Referral traffic (higher score from within forum, lower score for outside)
Since we are measuring these for a specific time period, we would need to track median metrics for each month to account for user growth and weight the above against those monthly numbers. Also, maybe we could exclude TL0 views/likes/etc. to protect against algorithm gaming. A rough sketch of how a few of these signals could be combined into a single score follows.
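This is purely illustrative - every field name, weight, and threshold below is made up to show the shape of the idea, not an existing Discourse setting or API:

```python
from dataclasses import dataclass

@dataclass
class TopicStats:
    """A few of the per-topic signals from the list above (hypothetical fields)."""
    total_views: int
    op_views: int            # views of the Original Post
    op_likes: int
    total_likes: int
    reply_count: int
    posts_with_replies: int  # posts that themselves received 2+ replies
    watchers_last_hour: int  # new subscriptions in the past hour

# Hypothetical per-forum weights; the idea is that admins could tune
# each of these in site settings.
WEIGHTS = {
    "views": 1.0,
    "op_likes": 2.0,
    "likes_per_reply": 3.0,
    "reply_depth": 2.5,
    "fresh_watchers": 4.0,
}

def topic_score(t: TopicStats, monthly_median_views: float) -> float:
    """Fold a handful of the signals above into a single number."""
    score = 0.0

    # Normalize views against the monthly median so user growth
    # does not inflate scores over time.
    score += WEIGHTS["views"] * (t.total_views / max(monthly_median_views, 1.0))

    # Lower rating if total views are significantly lower than OP views.
    if t.total_views < 0.5 * t.op_views:
        score -= 1.0

    score += WEIGHTS["op_likes"] * t.op_likes

    # Total likes measured relative to the number of replies.
    if t.reply_count > 0:
        score += WEIGHTS["likes_per_reply"] * (t.total_likes / t.reply_count)

    # Fewer than 3 replies gets a negative rating.
    if t.reply_count < 3:
        score -= 2.0

    # Posts with 2+ replies of their own suggest deeper discussion.
    score += WEIGHTS["reply_depth"] * t.posts_with_replies

    # Recent subscriptions hint at rising interest.
    score += WEIGHTS["fresh_watchers"] * t.watchers_last_hour

    return score
```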
If we wanted to go the extra mile and personalize the feed for each user, we could also give weight to the following (a rough sketch follows the list):
- User Interests (tracking/watching categories/subcategories and tags)
- Visits by User (categories/subcategories & tags of topics ordered by times visited, negative rating for ignored categories in feed)
- Time since the Topic was posted, measured from the user's last visit (slight priority for recent posts, within the selected time period - today, week, month, quarter, year)
- Topics posted by the users you have liked most, and topics whose original posts were liked by the top 5 users you have liked
- Potential User Interests (categories/subcategories the user does not yet frequent or watch/track) - lower rating if the user continues to ignore them
- Content chosen based on an analysis of the user's last 30 interactions
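Here is that sketch - a minimal example of how these per-user signals could adjust the base topic score from earlier. All field names and multipliers are invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    """Per-user signals from the list above (hypothetical fields)."""
    watched_tags: set = field(default_factory=set)
    watched_categories: set = field(default_factory=set)
    ignored_categories: set = field(default_factory=set)
    liked_authors: set = field(default_factory=set)  # e.g. top 5 users this user has liked

def personalized_score(base_score: float, topic_tags: set, topic_category: str,
                       author: str, profile: UserProfile) -> float:
    """Nudge the base topic score up or down for one specific user."""
    score = base_score

    # Boost topics in categories and tags the user watches or tracks.
    if topic_category in profile.watched_categories:
        score *= 1.3
    if topic_tags & profile.watched_tags:
        score *= 1.2

    # Negative rating for categories the user has chosen to ignore.
    if topic_category in profile.ignored_categories:
        score *= 0.5

    # Topics posted by authors this user has liked the most.
    if author in profile.liked_authors:
        score *= 1.25

    return score
```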
If we wanted to go full crazy without getting into Machine Learning, we could also do a discussion analysis. This could then also be used when summarizing a topic to expand the better posts.
I have borrowed the following analysis text from an LMS I worked with - we can draft a similar algorithm for analyzing discussions:
Substantive posts
The substantive posts metric counts the responses or replies that contribute to the discussion's development. A substantive post contains sentences that establish or support a user's position or ask thoughtful questions. These posts also show critical thinking or sophisticated composition, based on word choice and variety.
Non-substantive posts may be short or underdeveloped. A user should expand on their post and explain a position to make the response or reply substantial.
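Just to illustrate the idea (the real analysis clearly goes well beyond this), a crude heuristic for flagging a post as potentially substantive might look like the sketch below - the thresholds are invented:

```python
def looks_substantive(post_text: str, min_words: int = 40) -> bool:
    """Crude proxy for a 'substantive' post: long enough, with some
    word variety or a question. The real analysis would also look at
    position-taking, supporting arguments, and composition quality."""
    words = post_text.split()
    if len(words) < min_words:
        return False
    unique_ratio = len({w.lower() for w in words}) / len(words)
    asks_question = "?" in post_text
    return unique_ratio > 0.5 or asks_question
```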
Sentence complexity
Sentence complexity is measured by the number of sentences, words, and syllables in each response. We look at the complexity of words and how often the words are used. This measurement is a linguistic standard called Flesch-Kincaid. The complexity of each user’s total posts is represented by a grade level from 1st grade to 16th grade. Content with a Flesch-Kincaid grade level of 10 should be easily understood by a person in 10th grade.
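For reference, the Flesch-Kincaid grade level is just a formula over sentence, word, and syllable counts. A rough sketch, with a naive vowel-group syllable counter that is only an approximation:

```python
import re

def count_syllables(word: str) -> int:
    """Very rough estimate: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)
```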
Lexical variation
Lexical variation analyzes the substance of a user’s responses or replies based on the words they’ve used.
Content words carry meaning in a user’s response or reply. These words show a user’s feelings or thoughts regarding the prompt. When compared with total word count, content words help show the lexical density of a user’s responses and replies. A high count can indicate more sophisticated writing.
Functional words tie the semantic elements of a sentence together and indicate proper grammar. Prepositions, conjunctions, pronouns, and articles are functional words.
Think of functional words as the glue that holds a user’s response together. The words may not have substantial meaning themselves.
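Lexical density can be approximated as content words over total words. A minimal sketch, using a deliberately tiny and incomplete list of function words just to show the idea:

```python
# Deliberately small, incomplete set of function words (prepositions,
# conjunctions, pronouns, articles); a real implementation would use a
# proper list.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "if", "of", "in", "on", "at",
    "to", "for", "with", "by", "from", "it", "he", "she", "they", "we",
    "you", "i", "this", "that", "these", "those",
}

def lexical_density(text: str) -> float:
    """Fraction of words that are content words rather than function words."""
    words = [w.lower().strip(".,!?;:") for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    content = [w for w in words if w not in FUNCTION_WORDS]
    return len(content) / len(words)
```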
Critical thinking
The critical thinking metric identifies words and phrases within a user's total posts that demonstrate critical thinking. Twelve dictionaries are used to identify the words, which then fall into one of the weighted categories of critical thinking:
Argue a position
Include supporting data
Cite literature or experience
Evaluate
Summarize
Reference data
Offer a hypothesis
How we measure critical thinking
The weighted counts of the words and phrases in each category are combined and then compared to the site average to create the critical thinking score. The score is the difference between the user's critical thinking and the site average.
The score falls in a decimal range of -1 to 1. A negative score means the user’s critical thinking is below the site average. A positive score means the user’s critical thinking is above the site average. A score close to 0 means the user’s critical thinking is at the site average level. These scores are represented by a range of low to high:
-1 to -0.06 = Low
-0.06 to -0.03 = Below Average
-0.03 to 0.03 = Average
0.03 to 0.06 = Above Average
0.06 to 1 = High
Critical thinking is represented visually to show each user’s score compared to the site average.
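A minimal sketch of the score-to-band mapping described above - it only covers the final bucketing step, not the twelve dictionaries or the weighting, and it assumes the inputs are already normalized so the difference lands in the -1 to 1 range:

```python
def critical_thinking_band(user_weighted: float, site_average: float) -> str:
    """Map the difference between a user's weighted critical-thinking count
    and the site average onto the low-to-high bands listed above."""
    score = max(-1.0, min(1.0, user_weighted - site_average))
    if score < -0.06:
        return "Low"
    elif score < -0.03:
        return "Below Average"
    elif score <= 0.03:
        return "Average"
    elif score <= 0.06:
        return "Above Average"
    return "High"
```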
Examples:
Empirical research shows disagreeing displays a higher level of critical thinking than agreeing. In a discussion, the statement “I agree with John” receives a score of 0.113, while “I disagree with John” receives a score of 0.260.
If users summarize a passage but add no opinion or argument, they score lower than others who argue a position.
If users cite literature, they receive a lower score than others who offer a hypothesis.
Word variation
Word variation measures the number of unique words in a user’s submission as a percentage. A higher percentage of unique words can show that the user’s composition contains multiple ideas, significantly supports a position, or engages other users to think about other perspectives.
We can compare the user’s percentage to the site average.
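Word variation is essentially a unique-word (type/token) percentage, which is simple to compute:

```python
def word_variation(text: str) -> float:
    """Percentage of unique words in a submission."""
    words = [w.lower().strip(".,!?;:") for w in text.split() if w.strip(".,!?;:")]
    if not words:
        return 0.0
    return 100.0 * len(set(words)) / len(words)

# e.g. word_variation("The quick brown fox jumps over the lazy dog") ≈ 88.9
# (8 unique words out of 9), which we could then compare to the site average.
```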