What cool data explorer queries have you come up with?

data-explorer

(Dave McClure) #67

User participation

Shows the following columns over a specified time range (in days):

       user_id,
       username,
       name,
       email,
       visits, 
       topics_viewed,
       posts_read, 
       posts_created,
       topics_created,
       topics_with_replies,
       likes_given,
       likes_received

Full query:

-- [params]
-- int :from_days_ago = 0
-- int :duration_days = 30

with
t as (
  select 
    CURRENT_TIMESTAMP - ((:from_days_ago + :duration_days) * (INTERVAL '1 days')) as start,
    CURRENT_TIMESTAMP - (:from_days_ago * (INTERVAL '1 days')) as end
),
pr as (
    select user_id, 
        count(1) as visits,
        sum(posts_read) as posts_read
    from user_visits, t
    where posts_read > 0
    and visited_at > t.start
    and visited_at < t.end
    group by user_id
),
pc as (
  select user_id, 
      count(1) as posts_created
  from posts, t
  where created_at > t.start
  and created_at < t.end
  group by user_id
),
ttopics as (
 select user_id, posts_count
  from topics, t
  where created_at > t.start
  and created_at < t.end
),
tc as (
  select user_id, 
      count(1) as topics_created
  from ttopics
  group by user_id
),
twr as (
  select user_id, 
      count(1) as topics_with_replies
  from ttopics
  where posts_count > 1
  group by user_id
),
tv as (
 select user_id, 
        count(distinct(topic_id)) as topics_viewed
  from topic_views, t
  where viewed_at > t.start
  and viewed_at < t.end
  group by user_id
),
likes as (
  select 
      post_actions.user_id as given_by_user_id, 
      posts.user_id as received_by_user_id
  from t, post_actions
  left join posts
  on post_actions.post_id = posts.id
  where post_actions.created_at > t.start
  and post_actions.created_at < t.end
  and post_action_type_id = 2

),
lg as (
  select given_by_user_id as user_id, 
      count(1) as likes_given
  from likes
  group by user_id
),
lr as (
  select received_by_user_id as user_id, 
      count(1) as likes_received
  from likes
  group by user_id
),
e as (
  select email, user_id
  from user_emails u
  where u.primary = true
)
select pr.user_id,
       username,
       name,
       email,
       visits, 
       coalesce(topics_viewed,0) as topics_viewed,
       coalesce(posts_read,0) as posts_read, 
       coalesce(posts_created,0) as posts_created,
       coalesce(topics_created,0) as topics_created,
       coalesce(topics_with_replies,0) as topics_with_replies,
       coalesce(likes_given,0) as likes_given,
       coalesce(likes_received,0) as likes_received
from pr
left join tv using (user_id)
left join pc using (user_id)
left join tc using (user_id)
left join twr using (user_id)
left join lg using (user_id)
left join lr using (user_id)
left join e using (user_id)
left join users on pr.user_id = users.id
order by 
  visits desc, 
  posts_read desc, 
  posts_created desc

Bug or bot? A 'new' user showing as read a lot of posts
Attempting to create Monthly Leaderboard query - head exploding
(Dave McClure) #68

Participation Histograms

(Output from these used to make graphs shown in this post).

Posts Read User Counts

Number of users who have read at least N posts over the specified time period.

-- [params]
-- int :from_days_ago = 0
-- int :duration_days = 28

with
t as (
  select 
    current_date::timestamp - ((:from_days_ago + :duration_days) * (INTERVAL '1 days')) as start,
    current_date::timestamp - (:from_days_ago * (INTERVAL '1 days')) as end
),
read_visits as (
    select user_id, 
        count(1) as visits,
        sum(posts_read) as posts_read
    from user_visits, t
    where posts_read >= 1
    and visited_at > t.start
    and visited_at < t.end
    group by user_id
)
select
      case when posts_read >= 1 and posts_read <= 1    then '0001'
          when posts_read >= 2 and posts_read <= 3   then '0002 - 03'
          when posts_read >= 4 and posts_read <= 7   then '0004 - 07'
          when posts_read >= 8 and posts_read <= 15  then '0008 - 15'
          when posts_read >= 16 and posts_read <= 40  then '0016 - 31'
          when posts_read >= 32 and posts_read <= 63  then '0032 - 63'
          when posts_read >= 64 and posts_read <= 127  then '0064 - 127'
          when posts_read >= 128 and posts_read <= 255  then '0128 - 255'
          when posts_read >= 256 and posts_read <= 511  then '0256 - 511'
          when posts_read >= 512 and posts_read <= 1023  then '0512 - 1023'
          when posts_read >= 1024 and posts_read <= 2047  then '1024 - 2047'
          else '>= 2048'
      end as num_posts_read,
      count(*) as num_users
  from
      read_visits
  group by num_posts_read
  order by num_posts_read

Posts Created User Counts

Number of users who have created at least N posts over the specified time period.

-- [params]
-- int :from_days_ago = 0
-- int :duration_days = 28

with
t as (
  select 
    current_date::timestamp - ((:from_days_ago + :duration_days) * (INTERVAL '1 days')) as start,
    current_date::timestamp - (:from_days_ago * (INTERVAL '1 days')) as end
),
pc as (
  select user_id, 
      count(1) as posts_created
  from posts, t
  where created_at > t.start
  and created_at < t.end
  group by user_id
)
select
      case when posts_created >= 1 and posts_created <= 1    then '01'
          when posts_created >= 2 and posts_created <= 3   then '02 - 03'
          when posts_created >= 4 and posts_created <= 7   then '04 - 07'
          when posts_created >= 8 and posts_created <= 15  then '08 - 15'
          when posts_created >= 16 and posts_created <= 31  then '16 - 31'
          when posts_created >= 32 and posts_created <= 63  then '32 - 63'
          when posts_created >= 64 and posts_created <= 127  then '64 - 127'
          when posts_created >= 128 and posts_created <= 255  then '128 - 255'
          else '> 256'
      end as num_posts_created,
      count(*) as num_users
  from
      pc
  group by num_posts_created
  order by num_posts_created

(Carson) #70

Hi, can anyone provide a data explorer query that provides a list of all new users within the past week and month? It’d be ideal if the format was so they were already @mentioned, making it easy to copy-paste it into a new post that welcomes these new members to the community.


#71

@outofthebox

This should work ok for you -

    select concat ('@', username)
    from users
    where created_at >= CURRENT_DATE - INTERVAL '1 week'
    order by created_at desc 

Just change the 1 week to 1 month or what ever interval you want - you can then export as csv and should be able to copy and paste that into a post.


(Robert McIntosh) #72

There is a badge for “First Onebox” so I assume this means that the database tracks what users post links in their posts - is that true?

Is it possible to query the database to see how many links that users have posted?

More interestingly, would it be possible to get a list of the URLs so we can track where they linked to? I would be interested to know how many members have posted to a specific URL (of ours)


(Jeremy M) #73

I updated to include column header and parameter so you can easily change the interval:

-- [params]
-- string :interval = 1 week

select concat ('@', username) as "new users"
    from users
    where created_at >= CURRENT_DATE - INTERVAL :interval
    order by created_at desc

(Holly Ross) #82

Hey @dmcclure - Thanks for the great example

This has been immensely helpful to us. I am still working on my SQL street cred (as in: I see what you are doing, sort of, but I can’t seem to replicate). Would you be willing to help me incorporate a couple more types of participation into this query? Specifically we are looking to add bookmarks and give/receive solutions checkbox to this mix.

Also - A Question:

For the days parameters in that query, I want to make sure I am interpreting this correctly. If I enter 0/30 respectively, that gives me the counts for all users who have done any of those activities in the last 30 days. If I enter 30/60, then I get all users who have done those activities in the period 30 to 60 days ago.

Is that right?

Thanks so much!


(Ben Leong) #83

Hi! I’m looking for a bit of help modifying a query to return usernames, instead of user ID numbers.

This is what I’ve been using so far (modified from @DavidGNavas’ excellent Network Chart thread)

WITH pairs AS (
    SELECT p.user_id liked, pa.user_id liker
    FROM post_actions pa
    LEFT JOIN posts p ON p.id = pa.post_id
    LEFT JOIN topics t ON t.id = p.topic_id
    LEFT JOIN categories c ON c.id = t.category_id
    WHERE post_action_type_id = 2
    AND c.id = 47
)
SELECT liker liker_user_id, liked liked_user_id, count(*)
FROM pairs
GROUP BY liked, liker
ORDER BY count DESC

Which I’ve then used a very clunky vlookup to match user names with ID numbers…

Once I have a list of pairs (likers & liked) expressed as usernames, it lets me follow the method used in the Network Chart thread to generate these sorts of charts - showing activity within specific categories, as we have very distinct sub-communities on different parts of our forums:

The live charts are a valuable way of exploring our different sub-communities, quickly identifying influential members and clusters of people.

For example, compare our musician community with our graphic designers - there’s a big difference between the two :slight_smile:


(Dave McClure) #84

30/60 would get you the 60-day period 30-90 days ago. It’d be lovely if there were a way to use a date picker instead, but I’m not aware of one :slight_smile:


(Dave McClure) #85

Here’s a modified version that does that:

WITH pairs AS (
    SELECT p.user_id liked_id, pa.user_id liker_id
    FROM post_actions pa
    LEFT JOIN posts p ON p.id = pa.post_id
    LEFT JOIN topics t ON t.id = p.topic_id
    LEFT JOIN categories c ON c.id = t.category_id
    WHERE post_action_type_id = 2
    AND c.id = 1
)
SELECT 
  liker.username as liker,
  liked.username as liked,
  count(*)
FROM pairs
LEFT JOIN users liker
ON liker_id = liker.id
LEFT JOIN users liked
ON liked_id = liked.id
GROUP BY liked, liker
ORDER BY count DESC

(Ben Leong) #86

Thanks @mcwumbly - that is perfect :slight_smile:

I’ll start putting together network charts for all the other bits of our forums, now the process doesn’t require a workaround to get hold of usernames.


(Chris Beach) #90

A pattern I’ve spotted on my forum: one or two users habitually “like” posts that break our community guidelines.

These users may not intentionally be trolling, but their “likes” are serving to encourage bad behaviour.

Here’s a query to identify the users who have “liked” the most flagged posts, where the flags have been “agreed” by a moderator:

SELECT likes.user_id, count(*) as count 
FROM post_actions pa 
    JOIN post_action_types pat ON pa.post_action_type_id = pat.id
    JOIN post_actions likes ON likes.post_id = pa.post_id AND
        likes.post_action_type_id = 2
WHERE 
    pat.is_flag AND
    pat.name_key NOT IN ('notify_user') AND
    pa.agreed_by_id IS NOT NULL
GROUP BY 
    likes.user_id
ORDER BY 
    count DESC
LIMIT 100

#91

Here are some incredibly useful queries that @Simon_Cossar wrote:

New topics

Lists all new topics created with a given month, ordered by category and creation_date. The query accepts a ‘months_ago’ parameter. It defaults to 0 to give you the stats for the current month.

-- [params]
-- int :months_ago = 1

WITH query_period as (
    SELECT
        date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
        date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
    t.id as topic_id,
    t.category_id
FROM topics t
RIGHT JOIN query_period qp
    ON t.created_at >= qp.period_start
        AND t.created_at <= qp.period_end
WHERE t.user_id > 0
    AND t.category_id IS NOT NULL
ORDER BY t.category_id, t.created_at DESC

New topics per category (by month)

Returns the number of new topics created in each category for a given month, ordered by topic_count. The query accepts a ‘months ago’ parameter, defaults to 0 to give the results for the current month.

-- [params]
-- int :months_ago = 1

WITH query_period as (
    SELECT
        date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
        date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
    t.category_id,
    count(1) as topic_count
FROM topics t
RIGHT JOIN query_period qp
    ON t.created_at >= qp.period_start
        AND t.created_at <= qp.period_end
WHERE t.user_id > 0
    AND t.category_id IS NOT NULL
GROUP BY t.category_id
ORDER BY topic_count DESC

Top 50 active topics (and their associated categories)

Returns the top 50 active topics per month. It’s based on the number of replies created for a topic in a given month. The query accepts a ‘months_ago’ parameter, defaults to 0 to give results for the current month.

This is from @vinothkannans

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
t.id as topic_id,
t.category_id,
COUNT(p.id) as reply_count
FROM topics t
JOIN posts p
ON t.id = p.topic_id
JOIN query_period qp
ON p.created_at >= qp.period_start
AND p.created_at <= qp.period_end
WHERE t.archetype = 'regular'
AND t.user_id > 0
GROUP BY t.id
ORDER BY COUNT(p.id) DESC, t.score DESC
LIMIT 50

Solved count by members (monthly)

Count of topics solved by regular members (defaults to last month)

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
ua.user_id,
count(1) AS solved_count
FROM user_actions ua
INNER JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
INNER JOIN users u
ON u.id = ua.user_id
WHERE ua.action_type = 15
AND u.admin = 'f'
AND u.moderator = 'f'
GROUP BY ua.user_id
ORDER BY solved_count DESC

Solved count by staff (monthly)

Count of topics solved by staff(defaults to last month)

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
ua.user_id,
count(1) AS solved_count
FROM user_actions ua
INNER JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
INNER JOIN users u
ON u.id = ua.user_id
WHERE ua.action_type = 15
AND (u.admin = 't' OR u.moderator = 't')
GROUP BY ua.user_id
ORDER BY solved_count DESC

Topics solved by regular users (lists topics)

Returns topics solved by regular users over a given monthly period, ordered by solution_date. The query accepts a ‘months_ago’ parameter, defaults to 0 to give the results for the current month.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
ua.target_topic_id,
ua.target_post_id
FROM user_actions ua
INNER JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
INNER JOIN users u
ON u.id = ua.user_id
WHERE ua.action_type = 15
AND (u.admin = 'f' AND u.moderator = 'f')
ORDER BY ua.created_at DESC

Topics solved by staff (lists topics)

Returns topics solved by staff over a given monthly period, ordered by solution_date. The query accepts a ‘months_ago’ parameter, defaults to 0 to give the results for the current month.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
ua.target_topic_id,
ua.target_post_id
FROM user_actions ua
INNER JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
INNER JOIN users u
ON u.id = ua.user_id
WHERE ua.action_type = 15
AND (u.admin = 't' OR u.moderator = 't')
ORDER BY ua.created_at DESC

Top 50 posters

Returns the top 50 posters for a given monthly period. Results are ordered by post_count. It accepts a ‘months_ago’ parameter, defaults to 0 to give results for the current month.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
),

user_posts_in_period AS (
SELECT
p.user_id
FROM posts p
INNER JOIN query_period qp
ON p.created_at >= qp.period_start
AND p.created_at <= qp.period_end
WHERE p.user_id > 0
)

SELECT
up.user_id,
count(1) as post_count
FROM user_posts_in_period up
GROUP BY up.user_id
ORDER BY post_count DESC
LIMIT 50

Top 50 likers

Returns the top 50 likers for a given monthly period. Results are ordered by like_count. It accepts a ‘months_ago’ parameter, defaults to 0 to give results for the current month.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
ua.user_id,
count(1) AS like_count
FROM user_actions ua
INNER JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
WHERE ua.action_type = 1
GROUP BY ua.user_id
ORDER BY like_count DESC
LIMIT 50

Revised queries

Time to respond by staff (monthly)

Average time to first staff response for topics created by regular users in a given time period for a hard-coded array of categories. The categories array can be changed by altering this line: AND t.category_id = ANY ('{46,25,43,40,44,35,22,7,20,17,6,12}'::int[]). It’s possible to alter the query so that the category array can be supplied as a string parameter, written in the form {1, 2, 3}.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
),
staff_responses AS (
SELECT
DISTINCT ON (p.topic_id)
p.topic_id,
p.created_at,
t.category_id,
DATE_TRUNC('minute', p.created_at - t.created_at) AS response_time
FROM posts p
JOIN topics t
ON t.id = p.topic_id
AND t.category_id = ANY ('{46,25,43,40,44,35,22,7,20,17,6,12}'::int[])
JOIN users u
ON u.id = p.user_id
WHERE p.post_number > 1
AND u.admin = 't' OR u.moderator = 't'
ORDER BY p.topic_id, p.created_at
),
user_topics AS (
SELECT
t.id
FROM topics t
JOIN users u
ON u.id = t.user_id
WHERE u.admin = 'f' AND u.moderator = 'f'
)

SELECT
sr.category_id,
AVG(sr.response_time) AS "Average First Response Time",
COUNT(1) AS "Topics Responded to"
FROM staff_responses sr
JOIN query_period qp
ON sr.created_at >= qp.period_start
AND sr.created_at <= qp.period_end
JOIN user_topics t
ON t.id = sr.topic_id
GROUP BY sr.category_id

Topics solved by staff

The number of topics solved by staff for a given time period, broken down by categories. The categories array can be altered by editing this line: WHERE t.category_id = ANY ('{46,25,43,40,44,35,22,7,20,17,6,12}'::int[]). It is possible to alter the query so that it returns results for all categories, or so that the categories array is supplied as a string parameter in the form {1, 2, 3}.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
t.category_id,
COUNT(1) AS solved_count
FROM user_actions ua
JOIN topics t
ON t.id = ua.target_topic_id
JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
JOIN users u
ON u.id = ua.user_id
WHERE t.category_id = ANY ('{46,25,43,40,44,35,22,7,20,17,6,12}'::int[])
AND ua.action_type = 15
AND (u.admin = 't' OR u.moderator = 't')
GROUP BY t.category_id
ORDER BY solved_count DESC

Topics solved by regular users

(There is probably a way to combine this query with the staff query.)

The number of topics solved by staff for a given time period, broken down by categories. The categories array can be altered by editing this line: WHERE t.category_id = ANY ('{46,25,43,40,44,35,22,7,20,17,6,12}'::int[]). It is possible to alter the query so that it returns results for all categories, or so that the categories array is supplied as a string parameter in the form {1, 2, 3}.

-- [params]
-- int :months_ago = 1

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
t.category_id,
COUNT(1) AS solved_count
FROM user_actions ua
JOIN topics t
ON t.id = ua.target_topic_id
JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
JOIN users u
ON u.id = ua.user_id
WHERE t.category_id = ANY ('{46,25,43,40,44,35,22,7,20,17,6,12}'::int[])
AND ua.action_type = 15
AND (u.admin = 'f' AND u.moderator = 'f')
GROUP BY t.category_id
ORDER BY solved_count DESC

Query for Unsolved Topics
(Robert McIntosh) #97

It may not be the best way to do it (I still need to find a way to tell it to convert minutes to “Daily”, “Weekly” etc) but I find this information useful in order to know what digests my members are receiving - plus I analyse the results in excel to monitor the % who turn off digests.

SELECT users.id AS "ID",
users.username AS "Username",
user_options.email_digests AS "Digests",
user_options.digest_after_minutes AS "Timing"
FROM users
JOIN  user_options ON users.id =  user_options.user_id
ORDER BY ID

(Jp Robotshop) #98

Indeed :slight_smile: Kinda struggled with it but found a way to make it work. So, here you go!

-- [params]
-- int :months_ago = 0

WITH query_period AS (
SELECT
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' as period_start,
date_trunc('month', CURRENT_DATE) - INTERVAL ':months_ago months' + INTERVAL '1 month' - INTERVAL '1 second' as period_end
)

SELECT
t.category_id,

SUM (CASE WHEN (u.admin = 'false' OR u.moderator = 'false') THEN 1 ELSE 0 END) AS "Solved by Members",
SUM (CASE WHEN (u.admin = 't' OR u.moderator = 't') THEN 1 ELSE 0 END) AS "Solved by Staff"

FROM user_actions ua
JOIN topics t
ON t.id = ua.target_topic_id
JOIN query_period qp
ON ua.created_at >= qp.period_start
AND ua.created_at <= qp.period_end
JOIN users u
ON u.id = ua.user_id
WHERE t.category_id = ANY ('{[LIST CATEGORY IDS]}'::int[])
AND ua.action_type = 15

GROUP BY t.category_id ORDER BY t.category_id ASC

(Kyle Selby) #100

Hello all,

I am attempting to filter out a group I created named “Employees” from this data explorer tab, what am I doing wrong? How do I exclude group that I have created?:

    SELECT 
    sum(p.score) / count(p) as "average score per post", 
    count(p.id) as post_count, 
    p.user_id
FROM posts p
JOIN users u ON u.id = p.user_id
WHERE p.created_at >= CURRENT_DATE - INTERVAL '6 month'
  AND NOT u.admin
  AND NOT u.employees
  AND u.active
GROUP by user_id, u.views
HAVING count(p.id) > 10
ORDER BY sum(p.score) / count(p) DESC
LIMIT 20

(Brendan Schlagel) #101

The users table has primary_group_id (integer) so you could use that in your query if “Employees” is set as primary group for these users. Otherwise I think you’d have to do a separate join w/ the groups table. (Note — I don’t know the details of table structure off the top of my head but the little search thing on the right when editing a Data Explorer query is super useful for this!)


(Simon Cossar) #102

To get a list of users that excludes the members of a group, you could try something like this. This will exclude the members of the ‘employees’ group. It should be possible to rework your query to use this.

WITH group_users AS (
SELECT user_id
FROM group_users gu
JOIN groups g
ON g.id = gu.group_id
WHERE g.name = 'employees'
)

SELECT
u.id AS user_id
FROM users u
WHERE u.id NOT IN (SELECT * FROM group_users)
ORDER BY user_id

#111

How best do you think to fetch 10% of user to user PM messages in the last 30 days I’m trying to find a “randomise” function that could be called upon?

I run a website where people talk about largely illegal stuff - drugs mainly. As a platform we explicitly don’t allow sales or anything like that - I’m looking for some kind of “random call” (Sorry I don’t code I don’t know precisely how to articulate what I’m asking) randomised/fair way to defensibly say we “Spot check” PMs to make sure the rules are being followed.

I don’t obviously want to arbitrarily pick on people for this. We have flagged words and phrases and that does work but some people are incidentally getting around it, I would like (probably need) to be able to say we have community flags, moderators, watched words and “This” spot checking - I’m just not sure where to get the randomness … parameter from if it exists.

I’m not begging for free code/scripts either btw I’m just not sure how to ask for what I want if it’s going to the market place or is feasible.

You’d call the API to get that user’s messages. And messages are not stored as message records they are just normal posts, linked with topics, so you need to distinguish between post posts, and message posts.

Hmmmmmmmmmmmmmm

Right I think we’re going to call a crypto rand function to fetch topics that only have say 2 users in it and see how that goes, will get back with findings if anyone else runs communities where you might have legal need to spot check PMs in a fair way.


(Mittineague) #112

The terminology is a bit off. Posts belong to Topics. There are no “message” posts, but … there are “message” topics.

When you query the topics table, test against the archetype field for the value private_message.