YJIT Benchmarks

Details for Benchmarks at 2023-06-03 06:08:56 GMT

YJIT metrics from the yjit-bench suite using Ruby commit 0402193723.

Overall, YJIT is 48.0% faster than interpreted CRuby!
On Railsbench specifically, YJIT is 52.0% faster than CRuby!
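
The two headline figures can be cross-checked against the per-benchmark timings in the speed table on this page. The sketch below assumes the overall number is the geometric mean of the ten headline speedups. The page does not state how it aggregates, so that choice is an assumption, though it comes out at about 1.48x with this data. The names `headline_ms`, `geomean`, `speedups` and `overall` are illustrative only.

```ruby
# Headline benchmark times in ms, copied from the speed table on this page.
# Each entry is name => [No JIT time, YJIT time].
headline_ms = {
  "activerecord"   => [79.7, 43.1],
  "hexapdf"        => [3266.9, 2036.5],
  "liquid-c"       => [81.6, 59.4],
  "liquid-compile" => [73.0, 55.7],
  "liquid-render"  => [192.0, 104.6],
  "mail"           => [170.2, 132.2],
  "psych-load"     => [2608.6, 1849.2],
  "railsbench"     => [2755.5, 1812.8],
  "ruby-lsp"       => [79.6, 54.8],
  "sequel"         => [88.8, 69.3],
}

# Geometric mean, computed through logarithms to avoid a huge product.
def geomean(values)
  Math.exp(values.sum { |v| Math.log(v) } / values.size)
end

# Speedup = No JIT time / YJIT time (less time per iteration means faster).
speedups = headline_ms.transform_values { |no_jit, yjit| no_jit / yjit }

# Assumption: "overall" means the geometric mean over the headline set.
overall = geomean(speedups.values)

puts format("overall: %.2fx (%.1f%% faster)", overall, (overall - 1) * 100)
puts format("railsbench: %.2fx", speedups["railsbench"])
```

A geometric mean is the usual choice for averaging ratios, because a 2x speedup and a 0.5x slowdown then cancel out to 1x.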

Performance on Headline Benchmarks

[Bar chart: speed of No JIT and YJIT on activerecord, hexapdf, liquid-c, liquid-compile, liquid-render, mail, psych-load, railsbench, ruby-lsp and sequel.]
Speed of each Ruby implementation relative to the baseline CRuby measurement. Higher is better.

Memory Usage on Headline Benchmarks

[Bar chart: memory usage of CRuby 3.3.0dev and YJIT 3.3.0dev on activerecord, hexapdf, liquid-c, liquid-compile, liquid-render, mail, psych-load, railsbench, ruby-lsp and sequel, plus geomean*.]
Memory usage of each Ruby implementation relative to the baseline CRuby measurement. Lower is better.

Performance on Other Benchmarks

[Bar chart: speed of No JIT and YJIT on binarytrees, chunky_png, erubi, erubi_rails, etanni, fannkuchredux, lee, nbody, optcarrot, ruby-json and rubykon.]
Speed of each Ruby implementation relative to the baseline CRuby measurement. Higher is better.

Memory Usage on Other Benchmarks

[Bar chart: memory usage of CRuby 3.3.0dev and YJIT 3.3.0dev on binarytrees, chunky_png, erubi, erubi_rails, etanni, fannkuchredux, lee, nbody, optcarrot, ruby-json and rubykon, plus geomean*.]
Memory usage of each Ruby implementation relative to the baseline CRuby measurement. Lower is better.

Performance on MicroBenchmarks

[Bar chart: speed of No JIT and YJIT on 30k_ifelse, 30k_methods, cfunc_itself, fib, getivar, keyword_args, respond_to, setivar, setivar_object, setivar_young, str_concat and throw.]
Speed of each Ruby implementation relative to the baseline CRuby measurement. Higher is better.

Memory Usage on MicroBenchmarks

[Bar chart: memory usage of CRuby 3.3.0dev and YJIT 3.3.0dev on 30k_ifelse, 30k_methods, cfunc_itself, fib, getivar, keyword_args, respond_to, setivar, setivar_object, setivar_young, str_concat and throw, plus geomean*.]
Memory usage of each Ruby implementation relative to the baseline CRuby measurement. Lower is better.

Benchmarks Speed Details

bench | No JIT (ms) | No JIT RSD | YJIT (ms) | YJIT RSD | YJIT spd | YJIT spd RSD | % in YJIT
activerecord | 79.7 | 1.81% | 43.1 | 3.74% | 1.85x | 4.16% | 94.75%
hexapdf | 3266.9 | 0.70% | 2036.5 | 0.98% | 1.60x | 1.21% | 91.83%
liquid-c | 81.6 | 0.49% | 59.4 | 0.64% | 1.37x | 0.81% | 93.41%
liquid-compile | 73.0 | 1.55% | 55.7 | 3.74% | 1.31x | 4.05% | 93.96%
liquid-render | 192.0 | 0.28% | 104.6 | 0.43% | 1.83x | 0.52% | 87.95%
mail | 170.2 | 0.10% | 132.2 | 0.14% | 1.29x | 0.17% | 99.19%
psych-load | 2608.6 | 0.07% | 1849.2 | 0.10% | 1.41x | 0.13% | 99.99%
railsbench | 2755.5 | 0.63% | 1812.8 | 1.17% | 1.52x | 1.33% | 93.75%
ruby-lsp | 79.6 | 2.78% | 54.8 | 7.20% | 1.45x | 7.71% | 79.09%
sequel | 88.8 | 0.77% | 69.3 | 1.30% | 1.28x | 1.51% | 95.29%
binarytrees | 457.7 | 0.04% | 230.3 | 0.06% | 1.99x | 0.07% | 100.00%
chunky_png | 1010.6 | 0.25% | 654.2 | 0.38% | 1.54x | 0.46% | 100.00%
erubi | 304.2 | 0.20% | 256.6 | 0.07% | 1.19x | 0.21% | 100.00%
erubi_rails | 26.8 | 10.90% | 15.8 | 19.26% | 1.70x | 22.13% | 91.83%
etanni | 409.7 | 0.03% | 412.8 | 0.04% | 0.99x | 0.05% | 7.03%
fannkuchredux | 2157.8 | 0.23% | 778.8 | 0.19% | 2.77x | 0.30% | 91.09%
lee | 1304.9 | 0.34% | 944.3 | 0.38% | 1.38x | 0.51% | 99.97%
nbody | 119.4 | 0.06% | 71.8 | 0.06% | 1.66x | 0.08% | 100.00%
optcarrot | 6285.3 | 0.65% | 2297.8 | 0.60% | 2.74x | 0.88% | 96.78%
ruby-json | 3724.3 | 0.07% | 3230.4 | 0.18% | 1.15x | 0.19% | 99.75%
rubykon | 12596.3 | 0.39% | 6714.9 | 0.36% | 1.88x | 0.53% | 99.78%
30k_ifelse | 2337.9 | 0.02% | 359.7 | 0.06% | 6.50x | 0.06% | 99.99%
30k_methods | 6405.1 | 0.01% | 855.8 | 0.03% | 7.48x | 0.03% | 100.00%
cfunc_itself | 101.8 | 0.31% | 38.7 | 0.30% | 2.63x | 0.43% | 100.00%
fib | 229.6 | 0.05% | 44.6 | 0.07% | 5.15x | 0.09% | 100.00%
getivar | 118.7 | 0.14% | 21.1 | 0.18% | 5.62x | 0.23% | 97.88%
keyword_args | 284.0 | 0.11% | 48.8 | 0.13% | 5.82x | 0.17% | 100.00%
respond_to | 276.2 | 1.46% | 25.4 | 0.42% | 10.86x | 1.52% | 100.00%
setivar | 72.3 | 0.35% | 12.0 | 0.09% | 6.03x | 0.36% | 98.74%
setivar_object | 102.1 | 0.14% | 42.1 | 0.05% | 2.42x | 0.15% | 95.69%
setivar_young | 102.1 | 0.42% | 42.3 | 1.79% | 2.41x | 1.83% | 95.69%
str_concat | 79.7 | 0.18% | 45.8 | 0.42% | 1.74x | 0.46% | 99.97%
throw | 30.6 | 0.54% | 25.2 | 0.34% | 1.21x | 0.64% | 64.81%

RSD is relative standard deviation - the standard deviation divided by the mean, expressed as a percentage.
% in YJIT is the percentage of instructions that complete in YJIT rather than exiting to the non-JITted interpreter. YJIT performs better when this is higher.
Speedup ("YJIT spd") is relative to interpreted CRuby, so a YJIT speedup of 1.21x means YJIT runs at 1.21 times the iters/second of CRuby with JIT disabled.
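
The definitions above can be sketched in Ruby. Two points here are assumptions and not something the page states: `rsd_percent` uses the sample standard deviation (n - 1 denominator), and `ratio_rsd_percent` combines the two RSDs in quadrature, which is the standard approximation for the ratio of two independent measurements. The published "YJIT spd RSD" values appear consistent with that rule to within rounding, but the yjit-metrics code is the authority. All method names are illustrative.

```ruby
# Relative standard deviation of a list of samples, as a percentage.
# Assumption: sample standard deviation (n - 1 in the denominator).
def rsd_percent(samples)
  mean = samples.sum.to_f / samples.size
  variance = samples.sum { |x| (x - mean)**2 } / (samples.size - 1)
  100.0 * Math.sqrt(variance) / mean
end

# Speedup in iterations/second equals the inverse ratio of the times.
def speedup(no_jit_ms, yjit_ms)
  no_jit_ms / yjit_ms.to_f
end

# Assumption: the RSD of a ratio of two independent measurements is
# approximated by adding the two RSDs in quadrature.
def ratio_rsd_percent(rsd_a, rsd_b)
  Math.sqrt(rsd_a**2 + rsd_b**2)
end

# Values from the "throw" row of the speed table.
puts format("speedup: %.2fx", speedup(30.6, 25.2))
puts format("speedup RSD: %.2f%%", ratio_rsd_percent(0.54, 0.34))
```

For the throw row this gives a speedup of 1.21x and a combined RSD of 0.64%, matching the table.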

You can find our benchmark code in the yjit-bench GitHub repo and the yjit-extra-benchmarks GitHub repo.
Our benchmark-runner and reporting code is in the yjit-metrics GitHub repo.

Tested Ruby version for YJIT and No-JIT: ruby 3.3.0dev (2023-06-03T03:41:36Z :detached: 0402193723) +YJIT [x86_64-linux]

Benchmark Memory Usage Details

bench | CRuby 3.3.0dev mem (MiB) | YJIT 3.3.0dev mem (MiB) | Inline Code | Outlined Code | YJIT Mem overhead
activerecord | 58 | 63 | 1 | 1 | 8.8%
hexapdf | 220 | 293 | 2 | 2 | 32.9%
liquid-c | 39 | 43 | 1 | 1 | 9.6%
liquid-compile | 36 | 43 | 1 | 1 | 18.1%
liquid-render | 38 | 42 | 1 | 1 | 10.6%
mail | 48 | 51 | 1 | 1 | 6.9%
psych-load | 37 | 39 | 1 | 1 | 5.7%
railsbench | 110 | 124 | 3 | 2 | 12.6%
ruby-lsp | 89 | 130 | 5 | 5 | 45.4%
sequel | 40 | 43 | 1 | 1 | 8.5%
binarytrees | 29 | 30 | 1 | 1 | 1.4%
chunky_png | 64 | 66 | 1 | 1 | 3.7%
erubi | 32 | 34 | 1 | 1 | 6.0%
erubi_rails | 91 | 99 | 2 | 2 | 8.4%
etanni | 27 | 27 | 1 | 1 | 1.6%
fannkuchredux | 24 | 24 | 1 | 1 | 2.5%
lee | 34 | 37 | 1 | 1 | 9.7%
nbody | 23 | 24 | 1 | 1 | 2.2%
optcarrot | 62 | 66 | 1 | 1 | 7.1%
ruby-json | 25 | 25 | 1 | 1 | 2.1%
rubykon | 251 | 255 | 1 | 1 | 1.8%
30k_ifelse | 65 | 102 | 6 | 5 | 56.1%
30k_methods | 56 | 67 | 2 | 2 | 21.0%
cfunc_itself | 23 | 24 | 1 | 1 | 1.7%
fib | 23 | 24 | 1 | 1 | 1.6%
getivar | 23 | 24 | 1 | 1 | 1.7%
keyword_args | 23 | 24 | 1 | 1 | 1.7%
respond_to | 23 | 24 | 1 | 1 | 1.8%
setivar | 23 | 24 | 1 | 1 | 1.7%
setivar_object | 23 | 24 | 1 | 1 | 1.6%
setivar_young | 23 | 24 | 1 | 1 | 1.7%
str_concat | 50 | 81 | 1 | 1 | 62.5%
throw | 23 | 24 | 1 | 1 | 1.6%

Memory is shown in mebibytes (1024 * 1024 bytes).
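
As a rough illustration of the units and the overhead column, the sketch below converts bytes to MiB and computes a relative overhead. It assumes "YJIT Mem overhead" is the relative increase of YJIT memory over CRuby memory. The page does not define the column, so that formula and the method names are assumptions.

```ruby
# Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes).
def bytes_to_mib(bytes)
  bytes / (1024.0 * 1024.0)
end

# Assumption: overhead is the relative increase over the CRuby baseline,
# expressed as a percentage.
def overhead_percent(cruby_mem, yjit_mem)
  100.0 * (yjit_mem - cruby_mem) / cruby_mem
end

# 58 MiB expressed in bytes converts back to 58.0 MiB.
puts bytes_to_mib(58 * 1024 * 1024)

# Rounded MiB values from the str_concat row of the memory table.
puts format("%.1f%%", overhead_percent(50, 81))
```

Feeding in the rounded MiB values only approximates the published percentages (62.0% here against 62.5% in the table), presumably because the page computes the overhead from unrounded measurements.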

Older YJIT allocated an additional 256MiB for generated code. Current YJIT allocates executable memory on demand, so this overhead should no longer be present.

Number of Iterations and Warmups Tested

Benchmark YJIT Stats

Note: currently, all stats are collected on x86_64, not ARM.

Raw JSON data files

All graphs and table data on this page come from processing these data files, which are produced by benchmark runs.