PAID Data explorer challenge: chronological list of user topics started, replied and liked

data-explorer

(Tobias Eigen) #1

Update: moving to marketplace and offering to pay to get this query done within the next week or two. I need it for my site and will contribute it back to the community for the benefit of others. PM me please if interested.


Google analytics helps us to understand active user stats, but that overlooks the many email-only participants we have. An extremely handy data explorer query, methinks, would generate a chronological list of the following, with a filter by category preferably to ignore private categories:

  • date
  • username
  • # topics started
  • # topics replied
  • # topics liked
  • # messages started
  • # messages replied
  • # messages liked

@DeanMarkTaylor can I interest you in helping me with this data query? I really appreciated your great ideas last time.


(Mittineague) #2

I imagine you have a fairly clear idea about what you’re after, but I have a few questions regarding specifications.

“date” - of what exactly? Of the tables I have looked at, there are 106 fields that are timestamps.

“liked” - as in member is the liker or the likee?


(Tobias Eigen) #3

Whoa. Managed to miss this reply - sorry. I am still very eager to get this done.

I am trying to get at active contributor data. So the date would reflect the timestamp of when the contribution occurs. Contributions include topics started, topics replied to, topics liked, messages started, messages replied to, messages liked.

“Liked” would be the liker, not the likee. :slight_smile:


(Tobias Eigen) #4

In the meantime, is there any way to get a list of all users who have posted in a given month or timeframe? Seems like this should be available but I haven’t been able to easily find it. Sorry if it’s obvious and I am missing it.


(Tobias Eigen) #5

@LeoMcA has offered to look into this task and asked for some more info. What I am trying to do is collect user-level data on member activity. We have an annual goal to have at least 750 active members by the end of 2016. We’ve defined this as members who meet one of the 4 following criteria:

  • logged into the website on at least 5 separate days so far this year
  • used the resource library
  • contributed to community discussions
  • applied to a learning event

We are tracking data on all of the above to salesforce so we can see directly what people have been doing on a daily basis when we view their contact details in salesforce. We can also run reports that let us drill down to see for example how many people in Kenya are logging in, applying to events etc.

This particular task is to help us get data on who is contributing to discussions on a daily basis. @DeanMarkTaylor created a great query to get post data by month which is really useful for another annual goal we have, which I would like to replicate for users by day. Dean’s query is in this topic:

https://meta.discourse.org/t/help-finding-some-monthly-site-stats-about-topics-and-user-trust-levels/42560/16?u=tobiaseigen

How I collect that data is in this google sheet. I added a second worksheet there as an example of what I am looking for here.


Another thought: it occurs to me that if it were possible to generate a list of all posts, likes and messages in a given timeframe I would have the data I need and would have lots more useful data besides. Has anyone done such a query? E.g. date, new topic/reply/like, username, topic title, topic URL


(Leo McArdle) #6

With my first exciting foray into the world of slightly complicated SQL queries, I present the query for this query! :tada:

Parameters

  • date_from
    • date to start the query from
  • date_to
    • date to query up to (but not including)
  • guess_domain
    • if true (the default) guesses the domain name of the Discourse instance for use in the url column
    • if false uses the value of the domain parameter
  • domain (optional)
    • domain and protocol of the Discourse instance

Columns

  • created_at
    • date the action took place
  • action
    • type of action (currently: New Topic, Topic Reply, Message, Like)
  • username
    • username of user performing the action
  • topic_title
    • title of topic or message created or in which the action was performed
  • category
    • category in which the action was performed (NULL if a message)
  • parent_category
    • parent category in which the action was performed (NULL if a message, or performed in a top level category)
  • url
    • url to topic, post or message which was created or liked

Query

-- [params]
-- date :date_from = 1970-01-01
-- date :date_to = 2038-01-19
-- boolean :guess_domain = true
-- string :domain = https://example.com

WITH ss AS (
  SELECT CASE
    WHEN :guess_domain = true THEN concat('https://', split_part(value, '@', 2))
    ELSE :domain
  END AS domain
  FROM site_settings
  WHERE name = 'notification_email'
)
SELECT
  ua.created_at,
  CASE
    WHEN ua.action_type = 1 THEN 'Like'
    WHEN ua.action_type = 4 THEN 'New Topic'
    WHEN ua.action_type = 5 THEN 'Topic Reply'
    WHEN ua.action_type = 12 THEN 'Message'
  END AS action,
  u.username,
  t.title AS topic_title,
  c.name AS category,
  pc.name AS parent_category,
  concat((SELECT domain from ss), '/t/', t.id, '/', (CASE WHEN p.post_number IS NOT NULL then p.post_number ELSE 1 END)) AS url
FROM user_actions AS ua
JOIN users AS u ON ua.user_id = u.id
JOIN topics AS t ON ua.target_topic_id = t.id
LEFT JOIN posts AS p ON ua.target_post_id = p.id
LEFT JOIN categories AS c ON t.category_id = c.id
LEFT JOIN categories AS pc ON c.parent_category_id = pc.id
WHERE
  ua.user_id != -1
  AND
  ua.action_type IN (1, 4, 5, 12)
  AND
  ua.created_at BETWEEN :date_from AND :date_to
ORDER BY created_at DESC

Displaying user custom fields in data explorer query results
(Tobias Eigen) #7

Awesome. This has been a huge help. Many thanks! :rocket:


(Jeff Atwood) #8