One of the fun exercises in The Complete Guide to Rails Performance is to profile
the dalli
gem with ruby-prof
.
After cloning the repository and running ruby-prof test/benchmark_test.rb
, we get the following summary:
%self total self wait child calls name
13.56 9.181 9.181 0.000 0.000 193917 <Class::IO>#select
8.12 9.604 5.498 0.000 4.106 275009 Dalli::Ring#binary_search
5.01 3.390 3.390 0.000 0.000 272536 Kgio::SocketMethods#kgio_write
...
That sparked my curiosity. Why is dalli
spending so much time in binary_search
and can it be improved?
As it turns out, dalli
already takes care of it in a pretty interesting way. Here’s a simplified version of
binary_search
method definition (full source here):
begin
require 'inline'
inline do |builder|
builder.c <<-EOM
int binary_search(VALUE ary, unsigned int r) {
// C implementation
}
EOM
end
rescue LoadError
def binary_search(ary, value)
# Ruby implementation
end
end
We’re making use of rubyinline
gem which lets you embed small snippets
of C into your code. They get automatically compiled and loaded into the class/module that
defines them. Definitely handy if all you need is a few lines of C to speed things up.
Why not go the more common native extension route, you might ask? This comment from the source provides an indication:
optional for performance and only necessary if you are using multiple
memcached
servers.
Clearly, the typical user won’t need the native extension. More often than not, they’re a source of frustration.
I like the flexibility of this simple-by-default design. It allows you to optimize only if your metrics tell you so.
If you need more speed, all you need to do is include rubyinline
in your bundle, without
further configuration in your code. Here’s the kind of performance you’ll get:
%self total self wait child calls name
13.57 7.796 7.796 0.000 0.000 197184 <Class::IO>#select
6.39 3.672 3.672 0.000 0.000 272536 Kgio::SocketMethods#kgio_write
...
0.77 0.440 0.440 0.000 0.000 275009 Dalli::Ring#binary_search
Out of the box, fallback to Ruby implementation by means of LoadError
has you covered.