'bundle exec rake db:migrate' taking a long time due to calendar migration

Lads - really appreciate the assistance here and @pfaffman - please don’t work late on my account. Forum goers will simply have to wait if you need to log off for the evening.

1 Like
discourse=# SELECT pid, age(clock_timestamp(), query_start), usename, query 
FROM pg_stat_activity 
WHERE query != '<IDLE>' AND query NOT ILIKE '%pg_stat_activity%' 
ORDER BY query_start desc;
 pid  |       age       |  usename  |                                                    query                                                     
------+-----------------+-----------+--------------------------------------------------------------------------------------------------------------
   47 |                 |           | 
   50 |                 | postgres  | 
   48 |                 |           | 
   46 |                 |           | 
   45 |                 |           | 
 3231 | 00:07:57.332835 | discourse | DELETE                                                                                                      +
      |                 |           |   FROM calendar_events ce                                                                                   +
      |                 |           | WHERE                                                                                                       +
      |                 |           |   ce.id IN (SELECT DISTINCT(ce3.id) FROM calendar_events ce2                                                +
      |                 |           |             LEFT JOIN calendar_events ce3 ON ce3.user_id = ce2.user_id AND ce3.description = ce2.description+
      |                 |           |             WHERE ce2.start_date >= (ce3.start_date - INTERVAL '1 days')                                    +
      |                 |           |               AND ce2.start_date <= (ce3.start_date + INTERVAL '1 days')                                    +
      |                 |           |               AND ce2.timezone IS NOT NULL                                                                  +
      |                 |           |               AND ce3.timezone IS NULL                                                                      +
      |                 |           |               AND ce3.id != ce2.id                                                                          +
      |                 |           |               AND ce2.post_id IS NULL                                                                       +
      |                 |           |               AND ce3.post_id IS NULL                                                                       +
      |                 |           |   )                                                                                                         +
      |                 |           | 
 3232 | 00:07:57.347747 | discourse | SELECT pg_try_advisory_lock(2859260972035668690)
(7 rows)

OMG

discourse=# select count(*) from calendar_events;
  count   
----------
 69724384
(1 row)

Whoa what is that?

That’s the calendar plugin

Is that running for 7 hours or 7 minutes?

It ran a good deal longer than 7 minutes the last time, but not 7 hours, at least for me.

Calendar plugin is barely used - it can be removed, if that helps in any regard.

1 Like

Cool.

Falco–Maybe Just delete the plugin?

And then delete the plugin tables?

EDIT: with calendar removed, the database migrated and assets are precompiling. The site should be up in a few more minutes.

I am beyond curious what the hell happened to the calendar to cause that kind of issue?!

It LIVES. Absolute legends.

1 Like

I’ll ping you in the morning about how to clean things up.

2 Likes

Thank you for going the extra mile <3

      DELETE
        FROM calendar_events ce
      WHERE
        ce.id IN (SELECT DISTINCT(ce3.id) FROM calendar_events ce2
                  LEFT JOIN calendar_events ce3 ON ce3.user_id = ce2.user_id AND ce3.description = ce2.description
                  WHERE ce2.start_date >= (ce3.start_date - INTERVAL '1 days')
                    AND ce2.start_date <= (ce3.start_date + INTERVAL '1 days')
                    AND ce2.timezone IS NOT NULL
                    AND ce3.timezone IS NULL
                    AND ce3.id != ce2.id
                    AND ce2.post_id IS NULL
                    AND ce3.post_id IS NULL
        )
    SQL

So that looks like at least an O(n^2) operation, right? There are 69,724,384 rows in that table, so that doesn’t look like a very good idea.

@Sikamikanico. If you want the calednar plugin back, I think the thing to do is to just delete all calendar events and you can start over. The other thing we could do would be to run that query on the active server and see if it ever finishes.

The root cause, I think, is a bug in the plugin that duplicated events when a timezone changed.

The calendar_events table is 11GB:

discourse=# SELECT pg_size_pretty( pg_total_relation_size('calendar_events') );
 pg_size_pretty 
----------------
 11 GB
(1 row)

Here are the biggest tables:

       relation       | total_size 
----------------------+------------
 calendar_events      | 11 GB
 post_timings         | 6884 MB
 posts                | 2292 MB
 user_auth_token_logs | 1240 MB
 user_actions         | 1055 MB
(5 rows)

Only calendar_events is clearly bizarre.

1 Like

We need to understand if this may still be a bug in the plugin or just something that happened in the past.

Can you confirm if the bulk of the duplicated entries is from the past or if there are still too many rows with a recent created_at?

Thank you!

1 Like
discourse=# select count(*) from calendar_events where created_at > NOW()-INTERVAL '200 days';                                                                         
  count                                                                                                                                                                
----------                                                                                                                                                             
 25970368                                                                                                                                                              
(1 row)                                                                                                                                                                
                                                                                                                                                                       
discourse=# select count(*) from calendar_events where created_at > NOW()-INTERVAL '100 days';                                                                         
  count                                                                                                                                                                
----------                                                                                                                                                             
 14377700                                                                                                                                                              
(1 row)                                                                                                                                                                
                                                                                                                                                                       
discourse=# select count(*) from calendar_events where created_at > NOW()-INTERVAL '50 days';                                                                          
  count                                                                                                                                                                
---------                                                                                                                                                              
 7207939                                                                                                                                                               
(1 row)                                                                                                                                                                
                                                                                                                                                                       
discourse=# select count(*) from calendar_events where created_at > NOW()-INTERVAL '5 days';                                                                           
 count                                                                                                                                                                 
--------                                                                                                                                                               
 589938                                  
(1 row)                                                                                                 
iscourse=# select count(*) from calendar_events where created_at < NOW()-INTERVAL '200 days';
  count   
----------
 43754016
(1 row)