Can't alloc thread after updating to 2026.4.1

I’m seeing these errors in the logs after recently updating to 2026.4.1 (e404d9aabc):

Message (151769 copies reported)

Job exception: can't alloc thread

Backtrace

/usr/local/lib/ruby/3.4.0/socket.rb:712:in 'Thread.new'
/usr/local/lib/ruby/3.4.0/socket.rb:712:in 'block in Socket.tcp_with_fast_fallback'
/usr/local/lib/ruby/3.4.0/socket.rb:710:in 'Array#map'
/usr/local/lib/ruby/3.4.0/socket.rb:710:in 'Socket.tcp_with_fast_fallback'
/usr/local/lib/ruby/3.4.0/socket.rb:661:in 'Socket.tcp'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client/ruby_connection.rb:122:in 'RedisClient::RubyConnection#connect'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client/ruby_connection.rb:48:in 'RedisClient::RubyConnection#initialize'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client.rb:849:in 'Class#new'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client.rb:849:in 'block in RedisClient#connect'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client/middlewares.rb:12:in 'RedisClient::BasicMiddleware#connect'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client.rb:848:in 'RedisClient#connect'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client.rb:824:in 'RedisClient#raw_connection'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client.rb:779:in 'RedisClient#ensure_connected'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client.rb:372:in 'RedisClient#call_v'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-5.4.0/lib/redis/client.rb:90:in 'Redis::Client#call_v'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/rack-mini-profiler-4.0.1/lib/mini_profiler/profiling_methods.rb:90:in 'block in Redis::Client#profile_method'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-5.4.0/lib/redis.rb:152:in 'block in Redis#send_command'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-5.4.0/lib/redis.rb:151:in 'Monitor#synchronize'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-5.4.0/lib/redis.rb:151:in 'Redis#send_command'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-5.4.0/lib/redis/commands/keys.rb:256:in 'Redis::Commands::Keys#del'
/var/www/discourse/lib/discourse_redis.rb:168:in 'block in DiscourseRedis#del'
/var/www/discourse/lib/discourse_redis.rb:29:in 'DiscourseRedis.ignore_readonly'
/var/www/discourse/lib/discourse_redis.rb:165:in 'DiscourseRedis#del'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/mini_scheduler-0.18.0/lib/mini_scheduler/distributed_mutex.rb:48:in 'MiniScheduler::DistributedMutex#synchronize'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/mini_scheduler-0.18.0/lib/mini_scheduler/distributed_mutex.rb:15:in 'MiniScheduler::DistributedMutex.synchronize'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/mini_scheduler-0.18.0/lib/mini_scheduler/manager.rb:365:in 'MiniScheduler::Manager#lock'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/mini_scheduler-0.18.0/lib/mini_scheduler/manager.rb:316:in 'MiniScheduler::Manager#tick'
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/mini_scheduler-0.18.0/lib/mini_scheduler.rb:74:in 'block (2 levels) in MiniScheduler.start'

hostname	ip-172-x-x-x-app
process_id	286281
application_version	3532c825824ee1259628545c7bd5311ecb918009
current_db	default
current_hostname	discussion.mcebuddy2x.com
message	While ticking scheduling manager
time	7:57 pm

This is likely a resource constraint on your server. What is RAM usage doing at the moment? Do you have any details about the droplet you are running?

It’s a dedicated machine running only one instance of Discourse. There doesn’t appear to be a shortage of RAM or swap.

When did you last upgrade Docker, the OS, and the Discourse image? Are all three running the latest versions?

I last ran a CLI/Docker upgrade and OS update last week. Will try it again.


It was running fine in the beta release from a few weeks ago. This started after updating from beta to release via the browser upgrade option.

Could you run ./launcher enter app and then:

ulimit -u

This shows the maximum number of user processes/threads allowed.

ulimit -a

This shows all resource limits.

cat /sys/fs/cgroup/pids.max

This checks the maximum number of processes (PIDs) allowed for the container or system cgroup.

Now use logout to return to the host:

systemctl show docker | grep TasksMax

This checks whether systemd has imposed a task/thread limit on the Docker service.

systemctl show containerd | grep TasksMax

This does the same kind of check, but for the containerd service instead of Docker directly.

docker inspect app | grep -i pid

This checks the process/PID limits and settings of your Discourse container. The grep -i pid part filters the output to lines containing “pid” (case-insensitive).

If you keep getting errors, could you please paste the output of these commands here? That would be helpful.
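To make pasting easier, the in-container checks above could be gathered in one go. This is only a sketch: it assumes bash (run it after ./launcher enter app), and the function name collect_limits and the PIDS_MAX_FILE variable are made up for illustration. The path is the same one used in the cat command above and may differ on other cgroup layouts.

```shell
#!/usr/bin/env bash
# Sketch: gather the in-container limits into one report for pasting.
# PIDS_MAX_FILE is a hypothetical override; it defaults to the path used above.
PIDS_MAX_FILE="${PIDS_MAX_FILE:-/sys/fs/cgroup/pids.max}"

# Prints each check under a header so the output is easy to read in a post.
collect_limits() {
    echo "== ulimit -u =="
    ulimit -u
    echo "== ulimit -a =="
    ulimit -a
    echo "== $PIDS_MAX_FILE =="
    if [ -r "$PIDS_MAX_FILE" ]; then
        cat "$PIDS_MAX_FILE"
    else
        echo "(not readable)"
    fi
}

collect_limits
```

The host-side checks (systemctl show and docker inspect) still need to be run separately after logging out of the container.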

./launcher enter app


Here’s the information; pretty much everything is unlimited:

real-time non-blocking time  (microseconds, -R)  unlimited
core file size              (blocks, -c)  unlimited
data seg size               (kbytes, -d)  unlimited
scheduling priority         (-e)  0
file size                   (blocks, -f)  unlimited
pending signals             (-i)  7617
max locked memory           (kbytes, -l)  8192
max memory size             (kbytes, -m)  unlimited
open files                  (-n)  1048576
pipe size                   (512 bytes, -p)  8
POSIX message queues        (bytes, -q)  819200
real-time priority          (-r)  0
stack size                  (kbytes, -s)  8192
cpu time                    (seconds, -t)  unlimited
max user processes          (-u)  unlimited
virtual memory              (kbytes, -v)  unlimited
file locks                  (-x)  unlimited

root@ip-__________-app:/var/www/discourse# cat /sys/fs/cgroup/pids.max

2285
docker | grep TasksMax
TasksMax=infinity
pect app | grep -i pid

"Pid": 640806,
"PidMode": "",
"PidsLimit": null,

edit by moderator per @Ethsim2 request to include content in code blocks

Doing a rebuild from the CLI appears to have fixed it. Will keep an eye on it. Something about doing a browser update from the beta to stable in the last week triggered this.

Should there be limits on the browser upgrade? Can the browser upgrade detect potential issues and flag them, or prevent the upgrade from being triggered?


The rebuild likely reset the container’s cgroup placement, which would explain why it’s stable again.

Given the original can’t alloc thread errors and the fact that everything else (ulimits, TasksMax, Docker PIDs) is unlimited, the remaining suspect is PID cgroup pressure.

Could you check during normal load:

cat /sys/fs/cgroup/pids.current

[1]

cat /sys/fs/cgroup/pids.max

[2]

If pids.current is approaching 2000 or more against a max of 2285, that would confirm the container was hitting the cgroup PID ceiling during the scheduler / Redis reconnect bursts.

That would also explain why the issue only appeared after the upgrade (higher thread churn), and why the rebuild temporarily cleared it.
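As a rough sketch, the comparison described above could be scripted like this. The function name pids_usage_percent and the CGROUP_DIR variable are hypothetical, and the default path assumes the same cgroup layout as the cat commands above. A value of max in pids.max means no limit is set.

```shell
#!/bin/sh
# Sketch: report how close a cgroup is to its PID limit.
# CGROUP_DIR is an assumption; other cgroup layouts keep these files elsewhere.
CGROUP_DIR="${CGROUP_DIR:-/sys/fs/cgroup}"

# pids_usage_percent CURRENT MAX
# Prints the integer percentage used, or "unlimited" when MAX is "max".
pids_usage_percent() {
    if [ "$2" = "max" ]; then
        echo "unlimited"
    else
        echo $(( $1 * 100 / $2 ))
    fi
}

if [ -r "$CGROUP_DIR/pids.current" ] && [ -r "$CGROUP_DIR/pids.max" ]; then
    current=$(cat "$CGROUP_DIR/pids.current")
    limit=$(cat "$CGROUP_DIR/pids.max")
    usage=$(pids_usage_percent "$current" "$limit")
    if [ "$usage" = "unlimited" ]; then
        echo "pids.current=$current, no PID limit set"
    else
        echo "pids.current=$current pids.max=$limit (${usage}% used)"
    fi
else
    echo "pids.current or pids.max not readable under $CGROUP_DIR" >&2
fi
```

With the numbers from this thread, 2000 against 2285 works out to 87% used. Running it a few times during normal load gives a better picture than a single reading.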


  1. How many processes (PIDs/threads) are currently running inside the container/cgroup

  2. The maximum number of processes (PIDs/threads) allowed in that cgroup (your container)
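Because threads count toward the PID total as well as processes, a per-process breakdown can help show where the count is going. The following is a minimal sketch, not an official tool: the function name threads_by_process and the PROC_DIR variable are invented for illustration, and it assumes a standard Linux /proc with awk, sort, and head available inside the container.

```shell
#!/bin/sh
# Sketch: list the processes with the most threads, read from /proc.
# PROC_DIR is a hypothetical override; it defaults to the standard /proc mount.
PROC_DIR="${PROC_DIR:-/proc}"

# Prints "THREADS PID NAME" lines, highest thread count first (top 10).
# NAME is the first word of the Name: field in each status file.
threads_by_process() {
    for status in "$PROC_DIR"/[0-9]*/status; do
        # Skip entries that vanished or cannot be read.
        [ -r "$status" ] || continue
        awk '
            /^Name:/    { name = $2 }
            /^Pid:/     { pid = $2 }
            /^Threads:/ { threads = $2 }
            END { if (threads != "") print threads, pid, name }
        ' "$status" 2>/dev/null
    done | sort -rn | head -n 10
}

threads_by_process
```

If one process dominates the list while pids.current is close to pids.max, that is a reasonable place to start looking.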
