User stats wrong/not up to date

I noticed something’s off with the stats of the Users page at my small forum (it’s really small, yes):

https://discorsi.openarchaeology.eu/users/

If I’m not mistaken, various numbers shown for the default view (last week) and other views (last month) are wrong or not up to date, showing actions (likes given and received, messages) that do not reflect the actual activity of those users.

The view for last year seems correct though. I’m not sure there is a way to force a reset of the stats?

2 Likes

See /sidekiq/scheduler

And search for “stats”.

Mhh, I see just 3 items:

Jobs::DashboardStats 	11 minutes ago 	OK 	189ms 		in 18 minutes 	
Jobs::AboutStats 	5 minutes ago 	OK 	72ms 		in 23 minutes 	
...
Jobs::CategoryStats 	17 hours ago 	OK 	258ms 		in 7 hours 	

All seem to be working correctly.

I’m currently digging around a Users pages post count issue,

A lot of it is over my head and I’ve run into to some road blocks, but I am getting things narrowed down some.

If I see anything that could be a problem with data in addition to the code related to post count I’ll be sure to take note of those too.

1 Like

One thing interesting I noticed is that directory_item.rb has

all: 1,
yearly: 2,
monthly: 3,
weekly: 4,
daily: 5,
quarterly: 6

Damn enums! Aarrgghh!

I would intuitively think that quarterly would be between yearly and monthly.

I know if i was writing code involving period_types I would be prone to making a mistake because of that.

I don’t know if that has anything to do with your problem but I suspect it very well may.
… back to the code …

3 Likes

I would guess that quarterly got added after daily and there was no way to re number everything.

4 Likes

If you have Data Explorer this query might help you to analyse results.
It’s dirty and not easy to interpret, but it “works”

Note - the directory items “all” uses the user stats table

--[params]
-- string :user_name = eviltrout
-- int :results_limit = 60

WITH us AS ( SELECT user_stats.user_id 
        , user_stats.likes_received AS us_lr 
        , user_stats.likes_given AS us_lg 
        , user_stats.topics_entered AS us_te 
        , user_stats.topic_count AS us_tc 
        , user_stats.post_count AS us_pc 
        , user_stats.days_visited AS us_dv 
        , user_stats.posts_read_count AS us_pr
      FROM user_stats 
      )
, di AS ( SELECT directory_items.user_id 
        , directory_items.likes_received AS di_lr 
        , directory_items.likes_given AS di_lg 
        , directory_items.topics_entered AS di_te 
        , directory_items.topic_count AS di_tc 
        , directory_items.post_count AS di_pc 
        , directory_items.days_visited AS di_dv 
        , directory_items.posts_read AS di_pr
        , directory_items.period_type AS di_pt 
      FROM directory_items 
      )
, u AS ( SELECT users.id
        , users.username AS u_u 
        , users.last_seen_at AS u_ls 
      FROM users 
      )
SELECT 
    us.user_id 
    , us.us_lr 
    , di.di_lr 
    , us.us_lg 
    , di.di_lg 
    , us.us_te 
    , di.di_te 
    , us.us_tc 
    , di.di_tc 
    , us.us_pc 
    , di.di_pc 
    , us.us_dv 
    , di.di_dv 
    , us.us_pr 
    , di.di_pr 
    , CASE WHEN di.di_pt = 1 THEN CURRENT_DATE - interval '1000 days' 
           WHEN di.di_pt = 2 THEN CURRENT_DATE - interval '365 days' 
           WHEN di.di_pt = 3 THEN CURRENT_DATE - interval '30 days' 
           WHEN di.di_pt = 4 THEN CURRENT_DATE - interval '7 days' 
           WHEN di.di_pt = 5 THEN CURRENT_DATE - interval '1 day' 
           WHEN di.di_pt = 6 THEN CURRENT_DATE - interval '90 days' 
           ELSE CURRENT_DATE END AS di_di_pt 
    , CASE WHEN (CASE WHEN di.di_pt = 1 THEN CURRENT_DATE - interval '1000 days' 
                WHEN di.di_pt = 2 THEN CURRENT_DATE - interval '365 days' 
                WHEN di.di_pt = 3 THEN CURRENT_DATE - interval '30 days' 
                WHEN di.di_pt = 4 THEN CURRENT_DATE - interval '7 days' 
                WHEN di.di_pt = 5 THEN CURRENT_DATE - interval '1 day' 
                WHEN di.di_pt = 6 THEN CURRENT_DATE - interval '90 days' 
                ELSE CURRENT_DATE END) < u.u_ls THEN '<' 
           WHEN (CASE WHEN di.di_pt = 1 THEN CURRENT_DATE - interval '1000 days' 
                WHEN di.di_pt = 2 THEN CURRENT_DATE - interval '365 days' 
                WHEN di.di_pt = 3 THEN CURRENT_DATE - interval '30 days' 
                WHEN di.di_pt = 4 THEN CURRENT_DATE - interval '7 days' 
                WHEN di.di_pt = 5 THEN CURRENT_DATE - interval '1 day' 
                WHEN di.di_pt = 6 THEN CURRENT_DATE - interval '90 days' 
                ELSE CURRENT_DATE END) > u.u_ls THEN '>' 
           ELSE '=' END AS di_u 
    , date_trunc('day', u.u_ls) AS u_u_ls 
FROM us, di, u  
WHERE u.u_u ILIKE CONCAT('%', :user_name, '%') 
AND di.user_id = us.user_id 
AND u.id = di.user_id 
ORDER BY us.user_id, di_di_pt 
LIMIT :results_limit 

the member name can’t be empty or it won’t run the query
“limit” should be in multiples of 6

EDIT
If you would rather import than copy-paste

users-page-stats.dcquery.json (3.3 KB)

4 Likes

Thanks a lot for providing this query. As you wrote it’s not easy to interpret but I’m confident that the results returned by the query match those seen in the users page, especially looking at a few users that have been inactive for many months now.

Is it possible that the case when u_u_ls > di_di_pt results in wrong stats?

Except for the Users “all” that uses the user_stats table, the other periods pull from other tables.

Because the Users page gets updated by a sidekiq job that AFAIK runs once daily, there could be a difference between when the query is run and whether or not the sidekiq job has run.

I don’t test for either on my localhost, so it might be that members that visit via the API or email get or not get counted as having been “seen”

As best as I can tell I don’t think it is a problem if the users last seen is greater than the directory item period, but it could be when stats (other than Likes received) are present in directory_items (the Users page) in a period more recent than when the member was last seen.

That is, the di_di_pt > u_u_ls is more likely to indicate a discrepancy than a di_di_pt < u_u_ls is

I’m wondering if there’s just a delay or what, but I visited our users list just now, and “All Time” shows 1 less total users than “Today”. There’s a new user who just joined, and they show in “Today” but not in any other time-frame of the user list view… seems buggy, but maybe it just needs time to update…

UPDATE: seems so far to have just resolved itself with some delay

My monthly stats show almost no activity for the past two days, which is wrong. The other periods (week, quarter, and year) work properly. I’ve seen this behavior a few times, and it resolves itself eventually.

1 Like