Ruby may be getting a generational GC: what does this mean to you?


(Sam Saffron) #1

Earlier this week Koichi Sasada announced a new and very interesting generational GC for Ruby, aka RGENGC.

There is a great overview of the mechanics here: https://bugs.ruby-lang.org/attachments/3686/gc-strategy-en.pdf

I decided to do some basic testing comparing Discourse performance between Ruby 1.9.3 p392 / Ruby 2.0 p0 / Ruby Head / Ruby Head rgengc.

I am testing 2 particular setups:

  1. A vanilla install of Ruby, unoptimised.
  2. A GC optimised install of Ruby (using recommendations by Aman Gupta from GitHub), including tcmalloc.

All tests were performed on my Ubuntu 13.04 x64 VM, which is housed on a dedicated SSD. My CPU is an i960.

Keep in mind, I strongly encourage people reading this to attempt the patch themselves. I generated a patch file by diffing ko1's branch with master and then installed it as a named ruby in rvm (using the --patch param).

###Vanilla install

##Test 1: time rake middleware

Ruby 1.9.3:

real 0m16.092s
user 0m15.304s
sys 0m0.676s

Ruby 2.0

real 0m6.341s
user 0m5.616s
sys 0m0.624s

Ruby Head

real 0m5.631s
user 0m5.000s
sys 0m0.560s

Ruby RGENGC

real 0m5.913s
user 0m5.272s
sys 0m0.556s

##Test 2: ab -n 100 http://localhost/

(run three times, slowest run discarded - running in production mode)

Ruby 1.9.3

Time per request:       87.942 [ms] (mean)

Percentage of the requests served within a certain time (ms)
  50%     80
  66%     81
  75%     82
  80%     82
  90%    126
  95%    127
  98%    129
  99%    129
 100%    129 (longest request)

Ruby 2.0

Time per request:       89.985 [ms] (mean)

Percentage of the requests served within a certain time (ms)
  50%     68
  66%     71
  75%    149
  80%    153
  90%    155
  95%    156
  98%    159
  99%    185
 100%    185 (longest request)

Ruby Head

Time per request:       89.082 [ms] (mean)

Percentage of the requests served within a certain time (ms)
  50%     69
  66%     71
  75%    141
  80%    145
  90%    147
  95%    148
  98%    152
  99%    166
 100%    166 (longest request)

Ruby Head rgengc


Time per request:       89.876 [ms] (mean)

Percentage of the requests served within a certain time (ms)
  50%     72
  66%     73
  75%    148
  80%    148
  90%    149
  95%    150
  98%    151
  99%    153
 100%    153 (longest request)
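
For anyone replicating these runs, here is a minimal Ruby sketch of how a percentile table like the ones above can be built from raw per-request timings. It assumes the nearest-rank definition of a percentile (ab's exact method may differ), and the sample timings are made up purely for illustration: 70 fast requests and 30 that pay for a GC pause, which gives the same kind of jump around the 75% mark. The names `percentile` and `summarize` are hypothetical helpers, not part of ab or Ruby.

```ruby
# Nearest-rank percentile: the smallest sample such that at least
# p percent of all samples are less than or equal to it.
# `sorted` must already be sorted ascending; `p` is an Integer 1..100.
def percentile(sorted, p)
  rank = (p * sorted.size + 99) / 100 # integer ceiling of p * n / 100
  sorted[[rank, 1].max - 1]
end

# Builds a small summary (mean plus percentile table) from raw timings.
def summarize(timings_ms, points = [50, 66, 75, 80, 90, 95, 98, 99, 100])
  sorted = timings_ms.sort
  mean = sorted.sum / sorted.size.to_f
  { mean: mean, percentiles: points.map { |p| [p, percentile(sorted, p)] }.to_h }
end

# Hypothetical data, NOT measured: 70 requests at 70 ms, 30 at 150 ms.
timings = Array.new(70, 70) + Array.new(30, 150)
summary = summarize(timings)

puts format("Time per request: %.3f [ms] (mean)", summary[:mean])
summary[:percentiles].each { |p, ms| puts format("%4d%% %6d", p, ms) }
```

The point of the sketch is that the mean hides the shape: a run where a quarter or so of the requests hit a GC can have nearly the same mean as a run where almost none do, so the percentile columns are the ones worth comparing.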

###Optimised Install

export RUBY_GC_MALLOC_LIMIT=1000000000
export RUBY_HEAP_SLOTS_GROWTH_FACTOR=1.25
export RUBY_HEAP_MIN_SLOTS=800000
export RUBY_FREE_MIN=600000
export LD_PRELOAD=/usr/lib/libtcmalloc_minimal.so
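
When replicating the optimised runs it is easy to forget to export one of these in the shell that actually starts the server. A minimal sketch, assuming you just want to print what the current Ruby process sees: the variable names are the ones listed above, while `TUNING_VARS` and `tuning_report` are hypothetical names made up for this example. Other Ruby versions may use different variable names, so treat the list as specific to this test.

```ruby
# Environment variables used for the optimised install in this post.
TUNING_VARS = %w[
  RUBY_GC_MALLOC_LIMIT
  RUBY_HEAP_SLOTS_GROWTH_FACTOR
  RUBY_HEAP_MIN_SLOTS
  RUBY_FREE_MIN
  LD_PRELOAD
].freeze

# Returns a Hash of variable name => value, or "(not set)" when missing.
# `env` defaults to the real environment; pass a Hash to try it out.
def tuning_report(env = ENV)
  TUNING_VARS.map { |name| [name, env.fetch(name, "(not set)")] }.to_h
end

tuning_report.each { |name, value| puts "#{name}=#{value}" }
puts "GC runs so far in this process: #{GC.count}"
```

This only confirms that the variables reached the process; it does not prove the interpreter honoured them.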

##Test 1: time rake middleware

Ruby 1.9.3:

real 0m6.766s
user 0m5.328s
sys 0m1.352s

Ruby 2.0

real 0m3.994s
user 0m3.236s
sys 0m0.676s

Ruby Head

real 0m3.841s
user 0m3.116s
sys 0m0.628s

Ruby RGENGC

real 0m4.016s
user 0m3.360s
sys 0m0.580s

##Test 2: ab -n 100 http://localhost/

Ruby 1.9.3:

Time per request:       76.205 [ms] (mean)

Percentage of the requests served within a certain time (ms)
  50%     74
  66%     74
  75%     75
  80%     75
  90%     77
  95%    110
  98%    113
  99%    113
 100%    113 (longest request)

Ruby 2.0

Time per request:       76.850 [ms] (mean)

Percentage of the requests served within a certain time (ms)
  50%     73
  66%     73
  75%     74
  80%     74
  90%     75
  95%    133
  98%    133
  99%    135
 100%    135 (longest request)

Ruby Head

Time per request:       75.800 [ms] (mean)

Percentage of the requests served within a certain time (ms)
  50%     72
  66%     72
  75%     72
  80%     72
  90%     77
  95%    121
  98%    122
  99%    126
 100%    126 (longest request)

Ruby RGENGC

Time per request:       74.955 [ms] (mean)

Percentage of the requests served within a certain time (ms)
  50%     66
  66%     67
  75%     68
  80%     68
  90%     78
  95%    169
  98%    184
  99%    186
 100%    186 (longest request)


My observations:

  • Interestingly, with a tuned stack, GC-less requests seem to be served faster (the median request does not trigger a GC)
  • There is a slight impact on Rails boot time
  • Untuned stacks can perform worse under RGENGC
  • Ruby Head is getting faster :thumbsup:
  • A LOT more testing is needed, somebody needs to replicate my results :smile:
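
One way to check the claim that the median request does not trigger a GC is to count collections around a unit of work. A minimal sketch, assuming `GC.count` (the number of GC runs since the process started) is a good enough signal; `gc_runs_during` is a hypothetical helper and the string-allocating block merely stands in for a real request.

```ruby
# Runs the block and returns [block_result, number_of_GC_runs_during_block].
def gc_runs_during
  before = GC.count
  result = yield
  [result, GC.count - before]
end

# Stand-in workload: allocate a pile of short-lived strings.
_, runs = gc_runs_during { 200_000.times.map { |i| "str#{i}" } }
puts "GC runs during workload: #{runs}"
```

Wrapping each request like this (for example in a Rack middleware) and logging the count next to the response time would show directly whether the slow tail lines up with requests that ran a GC.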

Overall, I am excited about this change, especially since it appears to be advantageous to people running out-of-band GCs with heavily tuned stacks.
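
For readers who have not seen out-of-band GC before, the idea is to keep the collector from running while a request is being served and to pay for it between requests instead. A deliberately naive sketch, assuming a single-threaded worker; `with_out_of_band_gc` is a hypothetical helper, and real setups usually collect only every N requests rather than after each one.

```ruby
# Serves one unit of work with GC switched off, then collects afterwards.
# Memory can grow without bound inside the block, so this only makes
# sense when each request allocates a bounded amount.
def with_out_of_band_gc
  GC.disable   # no collections while the request is in flight
  yield
ensure
  GC.enable    # always restore normal behaviour, even on exceptions
  GC.start     # pay the GC cost here, outside the timed request
end

response = with_out_of_band_gc { "rendered page" }
puts response
```

With a setup like this the cost of each collection still exists; it is just moved to a point where no user is waiting on it.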


(Jeff Atwood) #2

More info on this

https://www.omniref.com/blog/blog/2014/11/18/ko1-at-rubyconf-2014-massive-garbage-collection-speedup-in-ruby-2-dot-2/?hn=1