What cool data explorer queries have you come up with?

data-explorer

(Jeff Atwood) #108

It definitely would; if you are not reading topics… well, what else is there to do in Discourse?


(Carson) #109

I don’t know! Does ‘read time’ track time spent writing posts?


(Jeff Atwood) #110

No, but post count would. There’s no way someone is writing posts for more than minutes a day; reading should dominate.


#111

How best do you think to fetch 10% of user to user PM messages in the last 30 days I’m trying to find a “randomise” function that could be called upon?

I run a website where people talk about largely illegal stuff - drugs mainly. As a platform we explicitly don’t allow sales or anything like that - I’m looking for some kind of “random call” (Sorry I don’t code I don’t know precisely how to articulate what I’m asking) randomised/fair way to defensibly say we “Spot check” PMs to make sure the rules are being followed.

I don’t obviously want to arbitrarily pick on people for this. We have flagged words and phrases and that does work but some people are incidentally getting around it, I would like (probably need) to be able to say we have community flags, moderators, watched words and “This” spot checking - I’m just not sure where to get the randomness … parameter from if it exists.

I’m not begging for free code/scripts either btw I’m just not sure how to ask for what I want if it’s going to the market place or is feasible.

You’d call the API to get that user’s messages. And messages are not stored as message records they are just normal posts, linked with topics, so you need to distinguish between post posts, and message posts.

Hmmmmmmmmmmmmmm

Right I think we’re going to call a crypto rand function to fetch topics that only have say 2 users in it and see how that goes, will get back with findings if anyone else runs communities where you might have legal need to spot check PMs in a fair way.


(Mittineague) #112

The terminology is a bit off. Posts belong to Topics. There are no “message” posts, but … there are “message” topics.

When you query the topics table, test against the archetype field for the value private_message.


#113

I like your idea.

I have your situation, and I control the PMs with this query:

That should help you :slight_smile:


#114

Oh that’s awesome mate, I can export those results then draw against them with random.org and check those users pms for being cheeky. I think random actually lets your save the draw list for reference as well if you pay them a small fee so there’s proof of no foul play on the admin side.

Nice one @SidV


#115

I have a new idea @DNSTARS !

What if you can “search” in the PMs terms like “buy” or “drugs:lol:

Check this out:

-- [params]
-- int :limit = 10
-- string :term = %term%
SELECT p.user_id, p.topic_id, p.post_number, p.raw, p.created_at::date
FROM posts p
LEFT JOIN topics t on t.id = p.topic_id
WHERE t.archetype = 'private_message'
  AND t.title <> 'WELCOME-TITLE-from-discobot'
  AND p.created_at::date > now()::date - 8
  AND p.raw ILIKE :term
ORDER BY p.created_at DESC
LIMIT :limit

Replace WELCOME-TITLE-from-discobot by the exact title of your welcome message, so that all welcome messages generated automatically are excluded.

The query’s idea source is from guru’s @meglio posted here.

I only add some little changes :wink:

Query list updated :rocket:


#116

It stumbled over how to handle the emoji, our welcome is :robot: Greeetings! so just left it as greetings and let it run and it largely works - we have about 20+ catch alls phrases though on the watched words log/flag section.

Because someone for example could be just chatting about the best LSD experience they ever had to a friend which is fine, what is not fine is “Where do I get” “How do I make” “Who has the best?” so we use the baked in flagging system to catch those and for that purpose it works well.

Legally though since there is “more” we could be doing to be doing everything “Reasonably” expected of us if it ever came to it, I can now say the community flags inappropriate content (as they are good and know the rules), we moderate, we have auto system flags, watched phrases and terms and we randomly spot check PMs for untoward behaviour.

So - yes your second one works, but is (without sounding mean) redundant for us specifically, since talking about drugs isn’t illegal - at least where we are, but trying to score or manufacture or whatever else is which we want no part of.

Thanks again mate, that first draw is minimal manual work and can be given to a member of staff to deal with and it’s systematic. :peace_symbol:


(John Waltrip) #117

I’m trying to put together a query that will show users of a group that don’t have a particular badge, like Certified.

Something along the lines of "give me all the users from group VIP where badge Certified does not exist. Thank you.


#118

Well, let’s try this:

WITH exclude_badge AS (
SELECT gu.user_id
FROM badges b, user_badges ub, users u, group_users gu
WHERE u.id = ub.user_id
AND ub.badge_id = b.id
AND u.id = gu.user_id
AND b.name = 'Certified'
AND gu.id = 10 
)

SELECT
u.id AS user_id
FROM users u
WHERE u.id NOT IN (SELECT * FROM exclude_badge)
ORDER BY user_id
LIMIT 10

(John Waltrip) #119

Thanks for your help Sid, I made a couple of tweaks and got it to work with the following:

SELECT gu.user_id
FROM badges b, user_badges ub, users u, group_users gu
WHERE u.id = ub.user_id
AND ub.badge_id = b.id
AND u.id = gu.user_id
AND b.name = 'Certified'
AND gu.group_id = 42
)

SELECT
u.id AS user_id
FROM users u, group_users gu
WHERE u.id = gu.user_id
and gu.group_id = 42
and u.id NOT IN (SELECT * FROM exclude_badge)
ORDER BY user_id
LIMIT 100

#120

Great! Let’s add params and update my query list :wink:

Final version:


(Charles) #121

I’m looking for a query that would give me usage/participation stats for all users in a specific group. We have volunteers who help the community and have them all in primary group “ninjas.” I’m looking to get a table with data that includes things like: visits, topics read, posts read, topics created, topics replied to, private message count, likes given, likes received, etc. I’d like to be able to input the time period and the group name as a param. I started with this query which I found in this topic, but not sure how to limit it to a specific group.

-- [params]
-- int :from_days_ago = 0
-- int :duration_days = 30

with
t as (
  select 
    CURRENT_TIMESTAMP - ((:from_days_ago + :duration_days) * (INTERVAL '1 days')) as start,
    CURRENT_TIMESTAMP - (:from_days_ago * (INTERVAL '1 days')) as end
),
pr as (
    select user_id, 
        count(1) as visits,
        sum(posts_read) as posts_read
    from user_visits, t
    where posts_read > 0
    and visited_at > t.start
    and visited_at < t.end
    group by user_id
),
pc as (
  select user_id, 
      count(1) as posts_created
  from posts, t
  where created_at > t.start
  and created_at < t.end
  group by user_id
),
ttopics as (
 select user_id, posts_count
  from topics, t
  where created_at > t.start
  and created_at < t.end
),
tc as (
  select user_id, 
      count(1) as topics_created
  from ttopics
  group by user_id
),
twr as (
  select user_id, 
      count(1) as topics_with_replies
  from ttopics
  where posts_count > 1
  group by user_id
),
tv as (
 select user_id, 
        count(distinct(topic_id)) as topics_viewed
  from topic_views, t
  where viewed_at > t.start
  and viewed_at < t.end
  group by user_id
),
likes as (
  select 
      post_actions.user_id as given_by_user_id, 
      posts.user_id as received_by_user_id
  from t, post_actions
  left join posts
  on post_actions.post_id = posts.id
  where post_actions.created_at > t.start
  and post_actions.created_at < t.end
  and post_action_type_id = 2

),
lg as (
  select given_by_user_id as user_id, 
      count(1) as likes_given
  from likes
  group by user_id
),
lr as (
  select received_by_user_id as user_id, 
      count(1) as likes_received
  from likes
  group by user_id
),
e as (
  select email, user_id
  from user_emails u
  where u.primary = true
)
select pr.user_id,
       username,
       name,
       email,
       visits, 
       coalesce(topics_viewed,0) as topics_viewed,
       coalesce(posts_read,0) as posts_read, 
       coalesce(posts_created,0) as posts_created,
       coalesce(topics_created,0) as topics_created,
       coalesce(topics_with_replies,0) as topics_with_replies,
       coalesce(likes_given,0) as likes_given,
       coalesce(likes_received,0) as likes_received
from pr
left join tv using (user_id)
left join pc using (user_id)
left join tc using (user_id)
left join twr using (user_id)
left join lg using (user_id)
left join lr using (user_id)
left join e using (user_id)
left join users on pr.user_id = users.id
order by 
  visits desc, 
  posts_read desc, 
  posts_created desc

This query has pretty much everything in it. I’d like to see if time_read could be added as a column.


#122

Lets try this:

  1. Define params:
-- int :limit = 10
-- int :group_id = 3

I always use 10 (ten) as a limit, but you can change that.
And the “group_id” here is 3 (for staff group). Change as you need.

  1. Lets see the query:
-- [params]
-- int :from_days_ago = 0
-- int :duration_days = 30
-- int :limit = 10
-- int :group_id = 3

with
t as (
  select 
    CURRENT_TIMESTAMP - ((:from_days_ago + :duration_days) * (INTERVAL '1 days')) as start,
    CURRENT_TIMESTAMP - (:from_days_ago * (INTERVAL '1 days')) as end
),
pr as (
    select user_id, 
        count(1) as visits,
        sum(posts_read) as posts_read
    from user_visits, t
    where posts_read > 0
    and visited_at > t.start
    and visited_at < t.end
    group by user_id
),
pc as (
  select user_id, 
      count(1) as posts_created
  from posts, t
  where created_at > t.start
  and created_at < t.end
  group by user_id
),
ttopics as (
 select user_id, posts_count
  from topics, t
  where created_at > t.start
  and created_at < t.end
),
tc as (
  select user_id, 
      count(1) as topics_created
  from ttopics
  group by user_id
),
twr as (
  select user_id, 
      count(1) as topics_with_replies
  from ttopics
  where posts_count > 1
  group by user_id
),
tv as (
 select user_id, 
        count(distinct(topic_id)) as topics_viewed
  from topic_views, t
  where viewed_at > t.start
  and viewed_at < t.end
  group by user_id
),
likes as (
  select 
      post_actions.user_id as given_by_user_id, 
      posts.user_id as received_by_user_id
  from t, post_actions
  left join posts
  on post_actions.post_id = posts.id
  where post_actions.created_at > t.start
  and post_actions.created_at < t.end
  and post_action_type_id = 2

),
lg as (
  select given_by_user_id as user_id, 
      count(1) as likes_given
  from likes
  group by user_id
),
lr as (
  select received_by_user_id as user_id, 
      count(1) as likes_received
  from likes
  group by user_id
),
e as (
  select email, user_id
  from user_emails u
  where u.primary = true
)
select pr.user_id,
       username,
       name,
       email,
       visits, 
       coalesce(topics_viewed,0) as topics_viewed,
       coalesce(posts_read,0) as posts_read, 
       coalesce(posts_created,0) as posts_created,
       coalesce(topics_created,0) as topics_created,
       coalesce(topics_with_replies,0) as topics_with_replies,
       coalesce(likes_given,0) as likes_given,
       coalesce(likes_received,0) as likes_received
from pr
left join tv using (user_id)
left join pc using (user_id)
left join tc using (user_id)
left join twr using (user_id)
left join lg using (user_id)
left join lr using (user_id)
left join e using (user_id)
left join users on pr.user_id = users.id
left join group_users on pr.user_id = group_users.user_id
WHERE group_users.group_id = :group_id
order by 
  visits desc, 
  posts_read desc, 
  posts_created desc
LIMIT :limit
  1. Please try it., and tell me your opinion :blush:

(Charles) #123

Thanks! This is getting a lot closer to what I was looking for. Is there a way to go by primary group? We have some users who are in multiple groups and I would want to narrow it down by their primary group. Also, where can I see group_Id? In /groups I can see the names of all of our groups, but not their number.


#124

I don’t know your forum URL, but you can check the group’s id here:
https://your-forum/groups.json

You will see a JSON file with all the groups, just search the group, and check the “id”. :+1:


(Charles) #125

That worked perfectly! What would it take to add the following values to that query?

  • post score
  • count of their posts that have been marked as solved by someone else
  • count of their topics or posts that have been bookmarked by someone else
  • count of flags given to others
  • count of flags received against them
  • total read time

I gave adding read time a shot myself…

-- [params]
-- int :from_days_ago = 0
-- int :duration_days = 30
-- int :limit = 30
-- int :group_id = 18

with
t as (
  select 
    CURRENT_TIMESTAMP - ((:from_days_ago + :duration_days) * (INTERVAL '1 days')) as start,
    CURRENT_TIMESTAMP - (:from_days_ago * (INTERVAL '1 days')) as end
),
pr as (
    select user_id, 
        count(1) as visits,
        sum(posts_read) as posts_read
    from user_visits, t
    where posts_read > 0
    and visited_at > t.start
    and visited_at < t.end
    group by user_id
),
tr as (
    select user_id, 
        count(1) as time_read
    from user_visits, t
    where visited_at > t.start
    and visited_at < t.end
    group by user_id
),
pc as (
  select user_id, 
      count(1) as posts_created
  from posts, t
  where created_at > t.start
  and created_at < t.end
  group by user_id
),
ttopics as (
 select user_id, posts_count
  from topics, t
  where created_at > t.start
  and created_at < t.end
),
tc as (
  select user_id, 
      count(1) as topics_created
  from ttopics
  group by user_id
),
twr as (
  select user_id, 
      count(1) as topics_with_replies
  from ttopics
  where posts_count > 1
  group by user_id
),
tv as (
 select user_id, 
        count(distinct(topic_id)) as topics_viewed
  from topic_views, t
  where viewed_at > t.start
  and viewed_at < t.end
  group by user_id
),
likes as (
  select 
      post_actions.user_id as given_by_user_id, 
      posts.user_id as received_by_user_id
  from t, post_actions
  left join posts
  on post_actions.post_id = posts.id
  where post_actions.created_at > t.start
  and post_actions.created_at < t.end
  and post_action_type_id = 2

),
lg as (
  select given_by_user_id as user_id, 
      count(1) as likes_given
  from likes
  group by user_id
),
lr as (
  select received_by_user_id as user_id, 
      count(1) as likes_received
  from likes
  group by user_id
),
e as (
  select email, user_id
  from user_emails u
  where u.primary = true
)
select pr.user_id,
       visits, 
       coalesce(time_read,0) as time_read,
       coalesce(topics_viewed,0) as topics_viewed,
       coalesce(posts_read,0) as posts_read, 
       coalesce(posts_created,0) as posts_created,
       coalesce(topics_created,0) as topics_created,
       coalesce(topics_with_replies,0) as topics_with_replies,
       coalesce(likes_given,0) as likes_given,
       coalesce(likes_received,0) as likes_received
from pr
left join tv using (user_id)
left join tr using (user_id)
left join pc using (user_id)
left join tc using (user_id)
left join twr using (user_id)
left join lg using (user_id)
left join lr using (user_id)
left join e using (user_id)
left join users on pr.user_id = users.id
left join group_users on pr.user_id = group_users.user_id
WHERE group_users.group_id = :group_id
order by 
  visits desc, 
  posts_read desc, 
  posts_created desc
LIMIT :limit

The results for read time seem to be off, but I’m sure its because of:

tr as (
    select user_id, 
        count(1) as time_read
    from user_visits, t
    where visited_at > t.start
    and visited_at < t.end
    group by user_id
),

So I took out this and got something more realistic. I’m not sure if by removing this I am also not including read time based on the time frame.

    where visited_at > t.start
    and visited_at < t.end


(Charles) #126

Is there a way to look at trending tags over a specified period of time?


(Tobias Eigen) #127

Would anyone be so kind as to share a query to display the number of messages in a PM group inbox? PM Group archive?