Find the users which are more likely to become TL3

Is there a method to get a sorted list of users who might become Trust Level 3?
Maybe using the Data Explorer plugin and a query (?)

5 Likes

That’s an interesting question. There will be no specific query since each member ‘earns’ TL3 through interaction and engagement, but you could monitor the TL2 users to see who are visiting most regularly and engaging with more content I suppose.

Is there a reason to find this in advance rather than waiting to see who earns it?

Some third party measurement tools create ‘leaderboards’ that might point to members who have been particularly active. Is that what you need?

In the past I have also created queries in excel based on data exported from the Users table and custom Data Explorer queries. I didn’t look at what you are asking, but did create monitoring tools to look at different types of activity, such as reading and posting, to better segment my members.

Let us know what you are trying to achieve and maybe we could come up with cleverer suggestions.

(also, this should probably be moved to #community where we can discuss these topics)

7 Likes

I thought about TL3 Requirements
Checking which users have the most number of :white_check_mark: Requirements and sort by that

3 Likes

You could do that, write a query to track the key fields and limit it to current TL2 users

There are loads of great query ideas for Data Explorer in another thread, and it seems that you could do a version of the User Directory, with the TL2 limit, to answer your query.

Still interested to know what you are trying to achieve in trying to ‘predict’ TL3 before it happens. Sounds like Minority Report :wink:

7 Likes

Though I have no doubt that some form of query could be put together, I have serious doubt that the amount of work needed would justify the questionable value of the results.

It is one thing to do this per user from a members admin user page, a whole different story for many accounts all at once.

At best, there will be many “moving parts” to take into consideration and arbitrary values to be decided on.

Some criteria could be used to reduce the “haystack”. i.e. only TL2 accounts that are not already TL3, accounts that are activated and not suspended. That might help somewhat.

Because many of the requirements may have been tweaked from their default values, those values would be needed to base a members values against.

Even then, most member values are unpredictable and unstable. eg. Likes could be given / received at any time changing a “0 - requirements not met” to “requirements met” in a heartbeat. Similar with flags given / received.

And what constitutes an “almost TL3” state? How many of the 12 requirements are already met? A percentage? eg.

if (value < requirement) 
 && ((value / requirement) > arbitrary_percent) { 

The “all time” values should in theory stagnate or increase only. But the “100 days” could be a problem. Should an algorithm somehow “drop” values associated with older days when it is trying to predict values for future days?

Anyway, long story short, if you can put together exact detailed specifications for how such a feature could work it would make it easier for someone to come up with the code needed to meet those specs.

I have the beginnings of this progress towards level 3 report which allows admins to view the progress of users so far, which I want to then use to email out messages of encouragement to users who are close (We like to promote TL3 user who share our tone of voice to moderators)

Someone kindly sent me the the trust level 3 requirements rb file which has helped a lot. however my limited knowledge and understanding how to convert the fields within the document into sql is limited, Maybe someone can help finish it off?

This is what I have so far.

Data Explorer Query

-- [params]
-- int :from_days_ago = 0
-- int :duration_days = 100

with
t as (
  select 
    CURRENT_TIMESTAMP - ((:from_days_ago + :duration_days) * (INTERVAL '1 days')) as start,
    CURRENT_TIMESTAMP - (:from_days_ago * (INTERVAL '1 days')) as end

),

-- Users
pr AS (
SELECT user_id, 
        count(1) as visits
FROM user_visits, t
WHERE visited_at > t.start
  and visited_at < t.end
GROUP BY user_id
ORDER BY visits DESC
),

-- Visits (all time)
vi as (
    select user_id, 
        count(1) as visits
    from user_visits, t
    group by user_id
),

-- Topics replied to
trt as (
    select user_id,
           count(distinct topic_id) as topic_id
    from posts, t
    where created_at > t.start
      and created_at < t.end
    group by user_id
),

-- Topics Viewed All Time
tva as (
    select user_id,
           count(topic_id) as topic_id
    from posts
    group by user_id
),

-- Posts Read
pra as (
    select user_id, 
        sum(posts_read) as posts_read
    from user_visits, t
    where visited_at > t.start
        and visited_at < t.end
    group by user_id
),

-- Posts Read All Time
prat as (
    select user_id, 
        sum(posts_read) as posts_read
    from user_visits, t
    group by user_id
)



SELECT  pr.user_id,
        coalesce(pr.visits,0) as "Visits",
        coalesce(trt.topic_id,0) as "Topic replied to",
        coalesce(tva.topic_id,0) as "Topic viewed (AT)",
        coalesce(pra.posts_read,0) as "Posts Read",
        coalesce(prat.posts_read,0) as "Posts Read (AT)"
    

FROM pr
left join vi using (user_id)
left join trt using (user_id)
left join tva using (user_id)
left join pra using (user_id)
left join prat using (user_id)





ORDER BY
  pr.visits DESC
7 Likes

Great start here, thanks!

I made a few tweaks / fixes:

  • Added ‘posts_read > 0’ condition for more accurate user visits calculation
  • Removed ‘visits (all time)’ which didn’t seem to be necessary
  • Fixed ‘topics viewed’ calculations which was using the wrong table
  • Added current trust level (to only get tl2 users)
  • Added where clauses for other relevant conditions, set at 50% of current threshold

Also parameterized a bunch of things so you can set your own values for each of the required metrics (since they may vary forum by forum), and also set a threshold percentage to show only users who meet at least that % of ALL the metrics.

So for example the below by default lists only tl2 users who meet 50% or more of all the requirements for visits, topics replied to, topics viewed, posts read…you could set it to 30% or 85% or whatever if it seems to be returning too many or too few results.

I did not add the requirements for likes given/received, or for flags/silences/suspensions. For us at least, the latter are super rare anyway, and I figure likes is one of the easier barriers to get people over if they know about it (some people just barely ever give likes). So this works pretty well for us. But the rest of the requirements could be added if you wanted.

For reference, on our forum we have ~1,000 TL2 users, ~10 TL3 users, and this query identifies ~30 ‘potential/almost TL3’ users with the 50% threshold.

-- [params]
-- int :from_days_ago = 0
-- int :duration_days = 100
-- int :trust_level = 2
-- int :threshold = 50
-- int :visits = 50
-- int :topics_replied_to = 10
-- int :topics_viewed = 76
-- int :topics_viewed_all_time = 200
-- int :posts_read = 755
-- int :posts_read_all_time = 500

-- NOTES
-- trust_level      show current TL2 users only
-- threshold        show only at users >= this percentage of all above metrics
-- topics_viewed    depends on total # of topics (default 25%)
-- posts_read       depends on total # of posts (default 25%)

WITH
t AS (
SELECT 
    CURRENT_TIMESTAMP - ((:from_days_ago + :duration_days) * (INTERVAL '1 days')) AS start,
    CURRENT_TIMESTAMP - (:from_days_ago * (INTERVAL '1 days')) AS end
),

-- User Visits
pr AS (
SELECT user_id, 
    count(1) as visits
FROM user_visits, t
WHERE visited_at > t.start
    AND visited_at < t.end
    AND posts_read > 0
GROUP BY user_id
ORDER BY visits DESC
),

-- Topics Replied To
trt AS (
SELECT user_id,
    COUNT(distinct topic_id) AS topic_id
FROM posts, t
WHERE created_at > t.start
    AND created_at < t.end
GROUP BY user_id
),

-- Topics Viewed
tva AS (
SELECT user_id,
    COUNT(distinct topic_id) AS topic_id
FROM topic_views, t
WHERE viewed_at > t.start
    AND viewed_at < t.end
GROUP BY user_id
),

-- Topics Viewed (All Time)
tvat AS (
SELECT user_id,
    COUNT(distinct topic_id) AS topic_id
FROM topic_views
GROUP BY user_id
),

-- Posts Read
pra AS (
SELECT user_id, 
    SUM(posts_read) AS posts_read
FROM user_visits, t
WHERE visited_at > t.start
    AND visited_at < t.end
GROUP BY user_id
),

-- Posts Read (All Time)
prat AS (
SELECT user_id, 
    SUM(posts_read) AS posts_read
FROM user_visits, t
GROUP BY user_id
),

-- Current Trust Level
tl AS (
SELECT id,
    trust_level
FROM users
)

SELECT pr.user_id,
    -- tl.trust_level AS "Trust Level",
    coalesce(pr.visits,0) AS "Visits",
    coalesce(trt.topic_id,0) AS "Topic Replied To",
    coalesce(tva.topic_id,0) AS "Topics Viewed",
    coalesce(tvat.topic_id,0) AS "Topics Viewed (AT)",
    coalesce(pra.posts_read,0) AS "Posts Read",
    coalesce(prat.posts_read,0) AS "Posts Read (AT)"
FROM pr

LEFT JOIN trt USING (user_id)
LEFT JOIN tva USING (user_id)
LEFT JOIN tvat USING (user_id)
LEFT JOIN pra USING (user_id)
LEFT JOIN prat USING (user_id)
LEFT JOIN tl ON (tl.id = pr.user_id)

WHERE
tl.trust_level = :trust_level
AND pr.visits >= :visits * :threshold / 100
AND trt.topic_id >= :topics_replied_to * :threshold / 100
AND tva.topic_id >= :topics_viewed * :threshold / 100
AND tvat.topic_id >= :topics_viewed_all_time * :threshold / 100
AND pra.posts_read >= :posts_read * :threshold / 100
AND prat.posts_read >= :posts_read_all_time * :threshold / 100

ORDER BY
pr.visits DESC
6 Likes

This seems to be exactly what I am searching for, however, I get the following error when executing the query:

PG::QueryCanceled: ERROR:  canceling statement due to statement timeout

Any ideas how to make this work?

2 Likes

Hello,

This report returns

  • Total topics
  • Total topics in AT

but the requirements for TL3 use

  • Total topics excluding private messages.
  • Total topics in AT excluding private messages.

Anyone knows how to adapt the query to exclude the private messages?

Thanks in advance

1 Like

Yes, tried that: 60, 80, 95, 99 -> no effect at all, always the same error message.

1 Like

Here’s a modified version with no parameters: I have lazily (efficiently?) hard-coded the default TL3 requirements into the query. I have expanded the data set slightly to include likes given and received, although not likes/unique days or likes/unique users. Flags, silences and suspensions are also still missing.

This is a gap report, so it does the subtraction to show you what each user is missing.

In a couple places, it doesn’t exactly match what shows up on the user admin page:

  • likes given (my count is somehow higher)
  • posts last 100 days (my count is lower)

There’s a good bit of guesswork in my count of the last-100-days-posts

But in case it’s helpful:

with
t as (
  select 
    CURRENT_TIMESTAMP - ((0 + 100) * (INTERVAL '1 days')) as start,
    CURRENT_TIMESTAMP - (0 * (INTERVAL '1 days')) as end

),

-- Topic count last 100 days 25%
tclhd AS (
    SELECT LEAST(floor(count(id)*.25)::REAL,500) as all_topics
    FROM topics, t
    WHERE created_at > t.start 
        AND archetype = 'regular'
        AND deleted_at is null
),

-- Post count last 100 days 25%
pclhd AS (
SELECT LEAST(FLOOR(count(id)*.25)::REAL,20000) AS all_posts
FROM t, posts
WHERE posts.created_at > start 
    AND posts.deleted_at is null
    AND posts.hidden_at is null
    AND posts.last_editor_id >0  -- Omit Discobot & System
    AND (action_code is null OR action_code != 'assigned')

),

-- Users
pr AS (
    SELECT user_id, 
        count(1) as visits
    FROM user_visits, t
    WHERE visited_at > t.start
      AND visited_at < t.end
    GROUP BY user_id
    ORDER BY visits DESC
),


-- Topics replied to
trt as (
    select user_id,
           count(distinct topic_id) as topic_id
    from posts, t
    where created_at > t.start
      and created_at < t.end
    group by user_id
),

-- Topics Viewed All Time
tvat as (
    select tv.user_id,
        COUNT(distinct tv.topic_id) AS topic_id
    FROM topic_views tv
    LEFT JOIN topics t on tv.topic_id=t.id
    WHERE 
        t.archetype = 'regular'
        AND t.deleted_at is null
    group by tv.user_id
),

-- Topics Viewed
tva AS (
SELECT tv.user_id,
    COUNT(distinct tv.topic_id) AS topic_id
FROM t, topic_views tv
    LEFT JOIN topics on topic_id=topics.id
    WHERE 
        topics.archetype = 'regular'
        AND topics.deleted_at is null
        AND viewed_at > t.start
        AND viewed_at < t.end
GROUP BY tv.user_id
),

-- Posts Read
pra as (
    select user_id, 
        sum(posts_read) as posts_read
    from user_visits, t
    where visited_at > t.start
        and visited_at < t.end
    group by user_id
),

-- Posts Read All Time
prat as (
    select user_id, 
        sum(posts_read) as posts_read
    from user_visits, t
    group by user_id
),

-- Current Trust Level
tl AS (
SELECT id,
    trust_level
FROM users
),

likes AS (
SELECT user_id,
    likes_given, likes_received
from user_stats
)


SELECT  pr.user_id, 
        greatest(50-coalesce(pr.visits,0),0) as "Days visited gap",
        greatest(10-coalesce(trt.topic_id,0), 0)  as "Topic reply gap",
        greatest(tclhd.all_topics-coalesce(tva.topic_id,0),0) AS "Topics Viewed gap",
        greatest(200-coalesce(tvat.topic_id,0),0) as "Topic viewed (AT) gap",
        greatest(pclhd.all_posts - coalesce(pra.posts_read,0),0) as "Posts Read gap",
        greatest(500-coalesce(prat.posts_read,0),0) as "Posts Read (AT) gap",
        greatest(30-likes.likes_given,0) as "Likes given gap",
        greatest(20-likes.likes_received,0) as "Likes received gap"

FROM pclhd, tclhd, pr
left join trt using (user_id)
LEFT JOIN tva USING (user_id)
left join tvat using (user_id)
left join pra using (user_id)
left join prat using (user_id)
LEFT JOIN likes using (user_id)
LEFT JOIN tl ON (tl.id = pr.user_id)

WHERE tl.trust_level = 2

ORDER BY
  pr.visits DESC
4 Likes

Thanks for your version!
However, I’m getting back the same error

PG::QueryCanceled: ERROR:  canceling statement due to statement timeout

I guess we have a problem then with our installation.

3 Likes

For me this currently runs in 8,379.4 ms, so close to the limit, I suppose. You must have a larger community.

Adding LIMIT 50 at the very end slices off 1k ms for me. Maybe you could play around with that until you get something back.

4 Likes

This is slightly more efficient. If it still doesn’t run for you, then you can try stripping out some of the columns, with their associated joins and queries.

EDIT Okay, I finally got my join types straight (it’s been a while). This updated query is much more efficient

with
t as (
  select 
    CURRENT_TIMESTAMP - ((0 + 100) * (INTERVAL '1 days')) as start,
    CURRENT_TIMESTAMP - (0 * (INTERVAL '1 days')) as end
),

-- Topic count last 100 days 25%
-- lesser of 25% topics created in last 100 days 
-- OR 500, the system default max requirement for TL3
tclhd AS (
    SELECT LEAST(floor(count(id)*.25)::REAL,500) as all_topics
    FROM topics, t
    WHERE created_at > t.start 
        AND archetype = 'regular'
        AND deleted_at is null
),

-- Post count last 100 days 25%
-- lesser of 25% posts created in last 100 days 
-- OR 20k, the system default max requirement for TL3
pclhd AS (
SELECT LEAST(FLOOR(count(id)*.25)::REAL,20000) AS all_posts
FROM t, posts
WHERE posts.created_at > start 
    AND posts.deleted_at is null
    AND posts.hidden_at is null
    AND posts.last_editor_id >0  -- Omit Discobot & System
    AND (action_code is null OR action_code != 'assigned')
),

-- Trust Level 2 users
tl AS (
SELECT id as user_id, trust_level
FROM users
WHERE trust_level = 2
),

-- Users, visits & posts read last 100 days
pr AS (
    SELECT user_id, 
        count(1) as visits,
        sum(posts_read) as posts_read
    FROM t, user_visits
    INNER JOIN tl using (user_id)
    WHERE visited_at > t.start
      AND visited_at < t.end
    GROUP BY user_id
    ORDER BY visits DESC
),

-- Posts Read All Time
prat as (
    select user_id, 
        sum(posts_read) as posts_read
    from t, user_visits
    INNER JOIN tl using (user_id)
    group by user_id
),

-- Topics replied to
trt as (
    select user_id,
           count(distinct topic_id) as topic_id
    from t, posts
    INNER JOIN tl using (user_id)
    where posts.created_at > t.start
      and posts.created_at < t.end
    group by user_id
),

-- Topics Viewed All Time
tvat as (
    select tv.user_id,
        COUNT(distinct tv.topic_id) AS topic_id
    FROM topic_views tv
    LEFT JOIN topics t on tv.topic_id=t.id
    INNER JOIN tl on tv.user_id=tl.user_id
    WHERE 
        t.archetype = 'regular'
        AND t.deleted_at is null
    group by tv.user_id
),

-- Topics Viewed
tva AS (
SELECT tv.user_id,
    COUNT(distinct tv.topic_id) AS topic_id
FROM t, topic_views tv
    LEFT JOIN topics on topic_id=topics.id
    INNER JOIN tl on tv.user_id=tl.user_id
    WHERE 
        topics.archetype = 'regular'
        AND topics.deleted_at is null
        AND viewed_at > t.start
        AND viewed_at < t.end
GROUP BY tv.user_id
),

likes AS (
    SELECT user_id,
        likes_given, likes_received
    from user_stats
    INNER JOIN tl using (user_id)
)


SELECT  pr.user_id, 
        greatest(50-coalesce(pr.visits,0),0) as "Days visited gap",
        greatest(10-coalesce(trt.topic_id,0), 0)  as "Topic reply gap",
        greatest(tclhd.all_topics-coalesce(tva.topic_id,0),0) AS "Topics Viewed gap",
        greatest(200-coalesce(tvat.topic_id,0),0) as "Topic viewed (AT) gap",
        greatest(pclhd.all_posts - coalesce(pr.posts_read,0),0) as "Posts Read gap",
        greatest(500-coalesce(prat.posts_read,0),0) as "Posts Read (AT) gap",
        greatest(30-likes.likes_given,0) as "Likes given gap",
        greatest(20-likes.likes_received,0) as "Likes received gap"

FROM pclhd, tclhd, pr
left join trt using (user_id)
LEFT JOIN tva USING (user_id)
left join tvat using (user_id)
left join prat using (user_id)
LEFT JOIN likes using (user_id)


ORDER BY
  pr.visits DESC
  
LIMIT 50
6 Likes

Aaand… here’s the TL2 gap report:

with

-- Trust Level 1 users
tl AS (
    SELECT id as user_id, trust_level, last_seen_at
    FROM users
    WHERE trust_level = 1
),

-- Users seen in last 3mo + visits, posts read, reading time
pr AS (
    SELECT user_id, 
        count(1) as visits,
        sum(posts_read) as posts_read,
        SUM(time_read)/60 as minutes_reading_time,
        DATE(last_seen_at) AS last_seen
    FROM user_visits
    INNER JOIN tl using (user_id)
    WHERE DATE(last_seen_at) >= CURRENT_DATE - INTERVAL '3 month'
    GROUP BY user_id, last_seen
    ORDER BY visits, last_seen DESC
),

-- Topics replied to
trt as (
    select posts.user_id,
           count(distinct topic_id) as replied_count
    from posts
    INNER JOIN tl using (user_id)
    INNER JOIN topics ON topics.id = posts.topic_id
    WHERE topics.user_id <> posts.user_id
        AND posts.deleted_at IS NULL AND topics.deleted_at IS NULL
--        AND topics.archetype <> 'private_message'
        AND archetype = 'regular'
    GROUP BY posts.user_id
    ORDER BY replied_count DESC
),

-- Topics Viewed All Time
tvat as (
    select tv.user_id,
        COUNT(distinct tv.topic_id) AS topic_id
    FROM topic_views tv
    LEFT JOIN topics t on tv.topic_id=t.id
    INNER JOIN tl on tv.user_id=tl.user_id
    WHERE 
        t.archetype = 'regular'
        AND t.deleted_at is null
    group by tv.user_id
),

likes AS (
    SELECT user_id,
        likes_given, likes_received
    from user_stats
    INNER JOIN tl using (user_id)
)

SELECT  pr.user_id, 
        pr.last_seen as "Last seen",
        -- days visited: 15
        greatest(15-coalesce(pr.visits,0),0) as "Days visited gap",
        -- topic replies: 3
        greatest(3-coalesce(trt.replied_count,0), 0)  as "Topic reply gap",
        -- topics entered: 20
        greatest(20-coalesce(tvat.topic_id,0),0) as "Topic viewed gap",
        -- posts read: 100
        greatest(100-coalesce(pr.posts_read,0),0) as "Posts Read gap",
        -- time spent reading posts: 60min
        greatest(60-pr.minutes_reading_time,0) as "Reading time gap",
        -- likes given: 1
        greatest(1-likes.likes_given,0) as "Likes given gap",
        -- likes received: 1
        greatest(1-likes.likes_received,0) as "Likes received gap"

FROM pr
left join trt using (user_id)
left join tvat using (user_id)
LEFT JOIN likes using (user_id)


ORDER BY
  pr.visits DESC
  
LIMIT 500
2 Likes