Got a lot of "Failed to backfill 'Reader' badge" errors

We’ve got a lot of these errors in the log. Is there something we can/should do?

Info:

Job exception: Failed to backfill ‘Reader’ badge: {:revoked_callback=>#<Proc:0x00007867ef8d9620 /var/www/discourse/app/jobs/regular/backfill_badge.rb:20 (lambda)>, :granted_callback=>#<Proc:0x00007867ef8d95f8 /var/www/discourse/app/jobs/regular/backfill_badge.rb:21 (lambda)>}. Reason: ERROR: canceling statement due to statement timeout

Trace:

/var/www/discourse/app/services/badge_granter.rb:505:in `rescue in backfill' 
/var/www/discourse/app/services/badge_granter.rb:385:in `backfill' 
/var/www/discourse/app/jobs/regular/backfill_badge.rb:18:in `execute' 
/var/www/discourse/app/jobs/base.rb:316:in `block (2 levels) in perform' 
rails_multisite-6.1.0/lib/rails_multisite/connection_management/null_instance.rb:49:in `with_connection'
rails_multisite-6.1.0/lib/rails_multisite/connection_management.rb:21:in `with_connection'
/var/www/discourse/app/jobs/base.rb:303:in `block in perform' 
/var/www/discourse/app/jobs/base.rb:299:in `each' 
/var/www/discourse/app/jobs/base.rb:299:in `perform' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:220:in `execute_job' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:185:in `block (4 levels) in process' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:180:in `traverse' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:183:in `block in traverse' 
/var/www/discourse/lib/sidekiq/discourse_event.rb:6:in `call' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:182:in `traverse' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:183:in `block in traverse' 
/var/www/discourse/lib/sidekiq/pausable.rb:131:in `call' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:182:in `traverse' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:183:in `block in traverse' 
sidekiq-7.3.9/lib/sidekiq/job/interrupt_handler.rb:9:in `call' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:182:in `traverse' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:183:in `block in traverse' 
sidekiq-7.3.9/lib/sidekiq/metrics/tracking.rb:26:in `track' 
sidekiq-7.3.9/lib/sidekiq/metrics/tracking.rb:134:in `call' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:182:in `traverse' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:173:in `invoke' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:184:in `block (3 levels) in process' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:145:in `block (6 levels) in dispatch' 
sidekiq-7.3.9/lib/sidekiq/job_retry.rb:118:in `local' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:144:in `block (5 levels) in dispatch' 
sidekiq-7.3.9/lib/sidekiq/config.rb:39:in `block in <class:Config>' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:139:in `block (4 levels) in dispatch' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:281:in `stats' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:134:in `block (3 levels) in dispatch' 
sidekiq-7.3.9/lib/sidekiq/job_logger.rb:15:in `call' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:133:in `block (2 levels) in dispatch' 
sidekiq-7.3.9/lib/sidekiq/job_retry.rb:85:in `global' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:132:in `block in dispatch' 
sidekiq-7.3.9/lib/sidekiq/job_logger.rb:40:in `prepare' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:131:in `dispatch' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:183:in `block (2 levels) in process' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:182:in `handle_interrupt' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:182:in `block in process' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:181:in `handle_interrupt' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:181:in `process' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:86:in `process_one' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:76:in `run' 
sidekiq-7.3.9/lib/sidekiq/component.rb:10:in `watchdog' 
sidekiq-7.3.9/lib/sidekiq/component.rb:19:in `block in safe_thread' 

Could this be an issue with a particularly large topic? Do you have any megatopics that might be a burden for it?

If the badge grant job is being blocked entirely because of it, it could be an idea to temporarily disable the Reader badge and see if that helps.


I did find this on what seems to be a similar, though quite old, issue:

So it could be that the query is too much for your spec?

We don’t have a megatopic, at least not that I am aware of. If there is a SQL command I can run to check, that would be great.
We have a 32-core CPU and 128 GB of RAM, so I am not sure whether this is a limitation. If there is something I need to change in the database, please let me know.


I think if you had one you’d likely know, but you can double-check by sorting your /latest list by reply count, either by clicking the Replies column title or by visiting YourSite/latest?order=posts.

But something like this in the data explorer should also show you the top 10:

-- Ten largest topics by post count
SELECT id AS topic_id, posts_count
FROM topics
ORDER BY posts_count DESC
LIMIT 10

The SQL for the Reader badge is here:

The Reader badge isn’t necessarily one of the most exciting ones, so if you can live with disabling it, and that fixes everything, that might be the easy way out. But if you want to explore it further, I think you may want to look at your post_timings table to see how big it has become.
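
One way to check the size of post_timings is with PostgreSQL's built-in size functions. This is only a sketch: it assumes the query is run from psql or from a Data Explorer install that permits these functions, and the row count comes from planner statistics, so it is an estimate rather than an exact count.

```sql
-- Size of post_timings split into table data and indexes,
-- plus the planner's row estimate (not an exact count).
SELECT
  pg_size_pretty(pg_relation_size('post_timings'))       AS table_size,
  pg_size_pretty(pg_indexes_size('post_timings'))        AS index_size,
  pg_size_pretty(pg_total_relation_size('post_timings')) AS total_size,
  (SELECT reltuples::bigint
     FROM pg_class
    WHERE relname = 'post_timings')                      AS row_estimate
```

The same numbers should also appear in the output of rake db:stats, which may be easier to run on a standard install.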


I found a couple of megatopics, but we cap topics at 10k replies and split them once they reach that.

I will disable it for now. But here is the output of rake db:stats:

table_name                                       | row_estimate | table_size | index_size | total_size
-----------------------------------------------------------------------------------------------
post_timings                                     | 1707169280   | 70 GB      | 61 GB      | 132 GB
topic_views                                      | 243936880    | 11 GB      | 15 GB      | 26 GB
user_auth_token_logs                             | 98783264     | 23 GB      | 2775 MB    | 25 GB

That does seem like a beefy one. :slight_smile:

On the off-chance you didn’t do this at the time (depending on how old your forum is), there was this advice given in the PostgreSQL 13 update:

Unfortunately I don’t have any direct experience of this so we may need to wait for someone smart to show up for a deeper dive. :nerd_face:

Yeah. I did this back when I upgraded to PostgreSQL 13. I also ran it yesterday. But the db size didn’t change.
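
An unchanged size would be expected if the table is simply full of live rows rather than bloated. A rough sanity check, using the post_timings figures from the db:stats output above, is to divide the sizes by the row estimate. The expected value assumes that post_timings has four 4-byte integer columns (an assumption about the schema, not something confirmed in this thread) plus PostgreSQL's usual per-row overhead of a 24-byte tuple header and a 4-byte line pointer.

```sql
-- Back-of-the-envelope bytes per row from the db:stats figures.
-- 70 GB table, 61 GB indexes, 1707169280 estimated rows.
SELECT
  round(70.0 * 1024 * 1024 * 1024 / 1707169280, 1) AS table_bytes_per_row,
  round(61.0 * 1024 * 1024 * 1024 / 1707169280, 1) AS index_bytes_per_row,
  -- Assumed layout: 4 int columns + 24-byte header + 4-byte line pointer
  4 * 4 + 24 + 4                                   AS expected_table_bytes_per_row
```

The table works out to roughly 44 bytes per row, which matches the assumed layout. If that assumption holds, there is little dead space for a vacuum or reindex to reclaim, and the size can only come down by having fewer rows.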

Hopefully somebody else can chime in on how to reduce the size.

Thank you though!
