Failed VersionCheck Jobs behavior

Sorry if this is not the correct category for this.

I’m evaluating Discourse, and the VersionCheck job is failing in my environment.

I’ve noticed that failed jobs are piling up inside Sidekiq and will probably get moved to the “dead” section after the default 25 retries (as per Class: Sidekiq::JobRetry — Documentation for sidekiq (6.3.1)).
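To get a feel for how long a failed job can linger before it reaches the “dead” section, here is a rough sketch. It assumes the backoff described in Sidekiq’s error-handling documentation, where the delay before each retry grows with the fourth power of the retry count plus 15 seconds. The random jitter Sidekiq adds on top is left out (it differs between versions), so the result is a lower bound, and the method names are made up for illustration.

```ruby
# Assumption: Sidekiq's documented backoff, without the random jitter term.
DEFAULT_RETRIES = 25

# Deterministic part of the delay (in seconds) before retry number `retry_count`.
def base_delay(retry_count)
  (retry_count**4) + 15
end

# Minimum time (in seconds) a job spends in the retry set before it is declared dead.
def minimum_retry_window(retries = DEFAULT_RETRIES)
  (0...retries).sum { |count| base_delay(count) }
end

total = minimum_retry_window
days = (total / 86_400.0).round(1)
puts "#{total} seconds (~#{days} days)"
# prints: 1763395 seconds (~20.4 days)
```

So with the defaults, every failed run can sit in the retry set for roughly three weeks, which would explain why the jobs accumulate faster than they expire.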

I know that I need to investigate what is causing it to fail, but the point here is: does it make sense to keep these jobs there? Wouldn’t it be better to simply discard failed version checks and wait for the next scheduled run?

At the moment I have more than 80 VersionCheck jobs waiting for retry, and to me it looks like a waste of resources (probably a small one, but still a waste)…

From what I’ve checked, adding sidekiq_options retry: false to app/jobs/scheduled/version_check.rb would solve this.

Am I missing something?

How did you install? Is there reason to believe you have network issues? RAM?

You may be right, but since you’re the only person to report this (at least so far), it hasn’t made it onto the list of optimizations. It does make sense to just let it fail after one try, I’d think.

When was the last time you upgraded?

There was an issue with the version check job a few weeks ago (around the end of October); it is fixed now. If you upgrade in the terminal (./launcher rebuild app), it should be OK.

I’m using the standard Docker install inside an EC2 instance.

I’m in a corporate environment, so there are lots of firewalls, proxies and security scanners between the instance and the internet. In the logs I see a “Job exception: Connection reset by peer - SSL_connect (Errno::ECONNRESET)” error, so some firewall is probably denying the request at some point… I’m still working out how Discourse does these version checks so I can reproduce them by hand and get more details.
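For reproducing this kind of failure by hand, a small Ruby probe along these lines might help. It is only a sketch using the standard library: the `diagnose` and `probe` helpers are hypothetical names, and the host in the usage comment is a placeholder, since the thread does not name the URL the version check actually contacts.

```ruby
require "net/http"
require "openssl"
require "socket"

# Map a low-level exception to a short diagnosis symbol.
def diagnose(error)
  case error
  when Errno::ECONNRESET      then :connection_reset   # the error seen in the job log
  when Errno::ECONNREFUSED    then :connection_refused
  when OpenSSL::SSL::SSLError then :tls_failure
  when SocketError            then :dns_failure
  when Net::OpenTimeout, Net::ReadTimeout then :timeout
  else :other
  end
end

# Attempt an HTTPS HEAD request and report :ok or a diagnosis symbol.
def probe(host, port = 443, timeout: 5)
  http = Net::HTTP.new(host, port)
  http.use_ssl = true
  http.open_timeout = timeout
  http.read_timeout = timeout
  http.start { |conn| conn.head("/") }
  :ok
rescue StandardError => e
  diagnose(e)
end

# Usage (placeholder host, replace with the endpoint you want to test):
#   puts probe("version-check.example.invalid")
```

Run from inside the container, a `:connection_reset` result would match the log message above and point at something on the network path dropping the TLS handshake, rather than at Discourse itself.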

Totally understand this. In the past I’ve worked with GitLab and seen lots of issues where full Sidekiq queues caused performance degradation and other weird behaviours, so every time I see something like this my alarms ring. :smile:

I’m on 2.8.0.beta9 (959923d3cf)

Yeah… The upgrade in the terminal or via the GUI is working OK (I’m running it on a weekly basis). The only issue in this case is that the main administrator screen doesn’t show the latest version and always says that I’m running an outdated version.

Then you should definitely run a command line upgrade.

Discourse will reach out to the internet for tasks like checking version upgrades, fetching user avatars, downloading remote images to local storage, and general oneboxing. If the instance is severed from the internet, there will be some breakage indeed.

Yes! I’m doing it every week until I find a solution.

I had to give up on oneboxing for exactly this reason. For now I can’t allow full internet access for this server.* is already allowed, but the version check job probably uses a different URL.

What I’d do is just turn off SiteSetting.version_checks, remove the discourse_docker plugin and do command line upgrades.

But, here, if you can open up, then you’re probably good.

Thank you for the info! It worked when I allowed access to
