SQL: The most N used words per user (speak their language!)

So, here it is:

“Speak their language” SQL query

This query fetches 60 lexemes that are most frequently used by users in their messages from last 90 days, per user.

Notes

  • users are ordered by the number of words they posted in the last 90 days, most active posters first
  • only those messages are included that were posted in the last 90 days
  • instead of counting word tokens, the query counts lexems, so that “dogs” and “dog” are interpreted as the same word

Application

  • Control your language. I noticed I use the word “flag” too often. Maybe not a good habit for an admin who tries to be friendly!
  • Find who cares about what the most. For instance:
    • From the screenshot, I can see that user shcher86 used word cheese 517 times, cottage cheese 133 times and recipe 202 times. Now I know what to ask her about :wink:
    • Admin Brovarchanka really cares about goats (353), and also uses the :wink: smile all the time. I should care more about users and goats too, rather than flags!

SQL Query

WITH

posters AS (
  SELECT SUM(p.word_count) as total_words,
    u.*
    
  FROM users u
  LEFT JOIN posts p ON p.user_id = u.id
  WHERE p.created_at > now()::date - 91
  GROUP BY u.id
)


SELECT u.id as user_id,
  
  (SELECT string_agg(word || ' - ' || nentry, ' ') FROM

    (SELECT * FROM ts_stat('
    
      SELECT search_data
      FROM post_search_data psd
      LEFT JOIN posts p ON p.id = psd.post_id
      WHERE p.created_at > now()::date - 91
        AND p.user_id = ' || u.id)

    ORDER BY nentry DESC, ndoc DESC, word
    LIMIT 60
    ) freq
  ) as top_words
  
  
FROM posters u
ORDER BY u.total_words DESC
LIMIT 100

Tuning

  • LIMIT 100 limits to the top 100 posters; adjust to your needs
  • p.created_at > now()::date - 91 limits analysis to posts in last 90 days:
    91 = 90 days
    90 = 89 days
    … and so on
    You have to change it in 2 places in the query
  • LIMIT 60 limits output to 60 lexemes, that’s where you may change it if needs be

Enjoy learning your users’ language!
P.S. And don’t ask your users to flag too often :wink:

11 Likes