`rake posts:rebake` crashed halfway through

Hi, I ran `rake posts:rebake` on my test site, an operation that normally takes over 24 hours to complete. This morning, when I logged back into the session, I found that it had aborted about 50% of the way through. I don't think it ran out of memory, since the VPS has 8 GB of RAM and 15 GB of swap. Also, a few days ago my custom Drupal import script crashed once, several days into the process, because of a Postgres error. I have run the same import script successfully several times, as well as at least one complete run of `rake posts:rebake`. Back then I put it down to coincidence, but this time there seems to be a random Postgres problem again:

Rebaking post markdown for 'default'                                                                                                                                                          
  1328000 / 2625793 ( 50.6%)rake aborted!                                                                                                                                                     
ActiveRecord::StatementInvalid: PG::DataCorrupted: ERROR:  invalid page in block 181250 of relation base/16384/16846                                                                          
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/rack-mini-profiler-3.0.0/lib/patches/db/pg.rb:69:in `exec_params'                                                                            
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/rack-mini-profiler-3.0.0/lib/patches/db/pg.rb:69:in `exec_params'                                                                            
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/postgresql_adapter.rb:768:in `block (2 levels) in exec_no_cache'                  
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/concurrency/share_lock.rb:187:in `yield_shares'                                                     
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/dependencies/interlock.rb:41:in `permit_concurrent_loads'                                           
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/postgresql_adapter.rb:767:in `block in exec_no_cache'                             
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `handle_interrupt'                                
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `block in synchronize'                            
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `handle_interrupt'                                
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `synchronize'                                     
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/abstract_adapter.rb:765:in `block in log'                                         
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/notifications/instrumenter.rb:24:in `instrument'                                                    
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/abstract_adapter.rb:756:in `log'                                                  
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/postgresql_adapter.rb:766:in `exec_no_cache'                                      
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/postgresql_adapter.rb:745:in `execute_and_clear'                                  
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:54:in `exec_query'                              
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/abstract/database_statements.rb:560:in `select'                                   
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/abstract/database_statements.rb:66:in `select_all'                                
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/abstract/query_cache.rb:110:in `select_all'                                       
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/querying.rb:54:in `_query_by_sql'                                                                     
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:942:in `block in exec_main_query'                                                         
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:962:in `skip_query_cache_if_necessary'                                                    
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:928:in `exec_main_query'                                                                  
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:914:in `block in exec_queries'                                                            
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:962:in `skip_query_cache_if_necessary' 
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:908:in `exec_queries'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:695:in `load'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:250:in `records'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation/delegation.rb:88:in `each'
/var/www/discourse/lib/tasks/posts.rake:128:in `block in rebake_posts'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/core_ext/range/each.rb:14:in `step'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/core_ext/range/each.rb:14:in `step'
/var/www/discourse/lib/tasks/posts.rake:123:in `rebake_posts'
/var/www/discourse/lib/tasks/posts.rake:108:in `block in rebake_posts_all_sites'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/rails_multisite-4.0.1/lib/rails_multisite/connection_management.rb:80:in `with_connection'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/rails_multisite-4.0.1/lib/rails_multisite/connection_management.rb:90:in `each_connection'
/var/www/discourse/lib/tasks/posts.rake:108:in `rebake_posts_all_sites'
/var/www/discourse/lib/tasks/posts.rake:7:in `block in <main>'
/usr/local/bin/bundle:25:in `load'
/usr/local/bin/bundle:25:in `<main>'
Caused by:
PG::DataCorrupted: ERROR:  invalid page in block 181250 of relation base/16384/16846
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/rack-mini-profiler-3.0.0/lib/patches/db/pg.rb:69:in `exec_params'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/rack-mini-profiler-3.0.0/lib/patches/db/pg.rb:69:in `exec_params'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/postgresql_adapter.rb:768:in `block (2 levels) in exec_no_cache'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/concurrency/share_lock.rb:187:in `yield_shares'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/dependencies/interlock.rb:41:in `permit_concurrent_loads'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/postgresql_adapter.rb:767:in `block in exec_no_cache'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `handle_interrupt'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `block in synchronize'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `handle_interrupt'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `synchronize'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/abstract_adapter.rb:765:in `block in log'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/notifications/instrumenter.rb:24:in `instrument'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/abstract_adapter.rb:756:in `log'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/postgresql_adapter.rb:766:in `exec_no_cache'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/postgresql_adapter.rb:745:in `execute_and_clear'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:54:in `exec_query'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/abstract/database_statements.rb:560:in `select'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/abstract/database_statements.rb:66:in `select_all'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/connection_adapters/abstract/query_cache.rb:110:in `select_all'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/querying.rb:54:in `_query_by_sql'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:942:in `block in exec_main_query'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:962:in `skip_query_cache_if_necessary'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:928:in `exec_main_query'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:914:in `block in exec_queries'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:962:in `skip_query_cache_if_necessary'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:908:in `exec_queries'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:695:in `load'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation.rb:250:in `records'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activerecord-7.0.4.1/lib/active_record/relation/delegation.rb:88:in `each'
/var/www/discourse/lib/tasks/posts.rake:128:in `block in rebake_posts'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/core_ext/range/each.rb:14:in `step'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/activesupport-7.0.4.1/lib/active_support/core_ext/range/each.rb:14:in `step'
/var/www/discourse/lib/tasks/posts.rake:123:in `rebake_posts'
/var/www/discourse/lib/tasks/posts.rake:108:in `block in rebake_posts_all_sites'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/rails_multisite-4.0.1/lib/rails_multisite/connection_management.rb:80:in `with_connection'
/var/www/discourse/vendor/bundle/ruby/3.1.0/gems/rails_multisite-4.0.1/lib/rails_multisite/connection_management.rb:90:in `each_connection'
/var/www/discourse/lib/tasks/posts.rake:108:in `rebake_posts_all_sites'
/var/www/discourse/lib/tasks/posts.rake:7:in `block in <main>'
/usr/local/bin/bundle:25:in `load'
/usr/local/bin/bundle:25:in `<main>'
Tasks: TOP => posts:rebake
(See full trace by running task with --trace)

Hi!

It looks like Postgres's on-disk state is damaged, either from a one-off corruption event or from an ongoing problem such as filesystem corruption or an underlying disk or memory hardware fault in your VPS.

To start, you should run fsck on your filesystem.

Next, if this is a test environment and you can rebuild the data, try starting over with PG by completely removing its data directory and creating a new database. Then stress it by importing and rebaking again to see whether the problem persists.
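To see which file on disk the error message points at, the path and block number can be decoded with a little arithmetic. The sketch below is only an illustration: the function name `locate_bad_page` is made up here, and it assumes a stock PostgreSQL build with 8 kB blocks and 1 GB relation segments, which a custom build could change.

```python
import re

# Assumptions: default PostgreSQL build settings (8 kB pages, 1 GB segment files).
BLOCK_SIZE = 8192
SEGMENT_SIZE = 1024 ** 3
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE  # 131072 blocks per segment file

_PATTERN = re.compile(
    r"invalid page in block (\d+) of relation (base/(\d+)/(\d+))"
)


def locate_bad_page(message):
    """Decode an 'invalid page' error into file, segment, and byte offset.

    Hypothetical helper, not part of Discourse or PostgreSQL.
    """
    match = _PATTERN.search(message)
    if match is None:
        raise ValueError("not an 'invalid page' error message")

    block = int(match.group(1))
    relation_path = match.group(2)

    # Relations larger than one segment are split into files named
    # <relfilenode>, <relfilenode>.1, <relfilenode>.2, and so on.
    segment, block_in_segment = divmod(block, BLOCKS_PER_SEGMENT)
    if segment == 0:
        segment_file = relation_path
    else:
        segment_file = f"{relation_path}.{segment}"

    return {
        "database_oid": int(match.group(3)),
        "relfilenode": int(match.group(4)),
        "block": block,
        "segment_file": segment_file,
        "byte_offset": block_in_segment * BLOCK_SIZE,
    }


if __name__ == "__main__":
    error = (
        "PG::DataCorrupted: ERROR:  invalid page in block 181250 "
        "of relation base/16384/16846"
    )
    print(locate_bad_page(error))
```

Under those assumptions, block 181250 falls in the second segment file, `base/16384/16846.1`, relative to the Postgres data directory. To find out which table or index that is, running `SELECT pg_filenode_relation(0, 16846);` while connected to the affected database should return its name, assuming the relation lives in the default tablespace.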

Thanks a lot @leonardo for the suggestions.

Thinking back, I remember that the earlier crash I had last week during the import was specifically due to a Postgres duplicate key error, so this time it's a different error.

I ran xfs_repair with the -e option, and as far as I can tell there doesn't seem to be any corruption:

xfs_repair -e /dev/sda2
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 1
        - agno = 2
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
clearing reflink flag on inodes when possible
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

EDIT: After rebooting (I used the rescue system to run fsck on the unmounted filesystem), the Discourse app started and there were no errors in the logs, but all I got was the white screen of death. I had to rebuild the app to get the website to load again. I'm not really sure what went wrong with all this.
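One caveat on the clean xfs_repair run: it checks filesystem metadata, not the contents of files, so a damaged page inside a Postgres relation file can go unnoticed. As a rough sketch of a content-level check, the code below applies a page-header sanity test modeled on the one PostgreSQL performs when it reads a page. The function names are hypothetical, and it assumes 8 kB pages, 8-byte alignment, and a little-endian machine such as x86-64. It does not verify data checksums, so a page that passes is not proven intact.

```python
import struct

BLOCK_SIZE = 8192            # assumption: default PostgreSQL page size
MAX_ALIGN = 8                # assumption: 8-byte alignment (typical 64-bit build)
PD_VALID_FLAG_BITS = 0x0007  # header flag bits PostgreSQL currently defines


def page_header_looks_sane(page):
    """Return True if an 8 kB page passes basic header sanity checks."""
    if len(page) != BLOCK_SIZE:
        return False
    # PostgreSQL treats an all-zero page as a valid, never-initialized page.
    if page.count(0) == BLOCK_SIZE:
        return True
    # Header layout: pd_lsn (8 bytes), pd_checksum (2 bytes), then
    # pd_flags, pd_lower, pd_upper, pd_special as 16-bit integers.
    pd_flags, pd_lower, pd_upper, pd_special = struct.unpack_from(
        "<HHHH", page, 10
    )
    return (
        (pd_flags & ~PD_VALID_FLAG_BITS) == 0
        and pd_lower <= pd_upper <= pd_special <= BLOCK_SIZE
        and pd_special % MAX_ALIGN == 0
    )


def find_suspect_blocks(path):
    """Return block numbers (within this one file) that fail the check."""
    suspects = []
    with open(path, "rb") as handle:
        block_no = 0
        while True:
            page = handle.read(BLOCK_SIZE)
            if not page:
                break
            if not page_header_looks_sane(page):
                suspects.append(block_no)
            block_no += 1
    return suspects
```

This only reads the file, but it is best run against a copy or with Postgres stopped, since a page that is being written at that moment can look broken. The block numbers it reports are relative to the file it scanned, not to the whole relation.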