Yehuda Katz is a member of the Ember.js, Ruby on Rails and jQuery Core Teams; he spends his daytime hours at the startup he founded, Tilde Inc.. Yehuda is co-author of best-selling jQuery in Action and Rails 3 in Action. He spends most of his time hacking on open source—his main projects, like Thor, Handlebars and Janus—or traveling the world doing evangelism work. He can be found on Twitter as @wycats and on Github.
Status Update — A Fresh Look at Callbacks
January 16th, 2009
After I finished the first stage of the ActionView refactor I was working on (I’m currently working on some small issues with Josh and getting ready to merge it into the official Rails3 branch), I decided to take a fresh look at the performance numbers. As expected, by initial refactor did not really improve performance (it wasn’t intended to — I mainly just moved things around and cleaned up the code path), but one thing that stuck out at me was how expensive the callback system is.
This is the case in both Rails and Merb, although Merb’s smaller callback feature set makes it somewhat faster than Rails’ at present (but not significantly so; both systems use the same basic approach). So I decided to spend some time today trying to make the callbacks faster. The first thing I noticed was that the ActiveSupport callbacks are fairly limited (for instance, missing around filters), and that ActionPack uses a small part of the ActiveSupport system and then layers on scads of additional functionality.
After spending a few hours trying to improve the performance of the callbacks, I remembered that Carl had considered an interesting approach for the Merb callbacks that involved dispensing with iteration in favor of a single compiled method that inlined the filters. In his original experiments (which supported before, after, and around filters, but not conditions), he was able to make filters run about an order of magnitude faster than Merb’s system.
Effectively, the approach boils down to:
before_filter :foo after_filter :bar around_filter :baz def _run_hookz foo baz do yield baz end end
This completely removes the need for iteration at runtime, and compiles down the hooks into a single, speedy method. What complicated matters a bit was:
- Rails’ support for conditions (also supported in Merb, but not in Carl’s original prototype)
- Rails’ support for various filter types, including strings (which get evalled), procs, blocks, and arbitrary objects
After playing around a bit, I managed to support conditional filters as well as all the supported Rails filter types using the speedier callback system.I compared the ActionPack filtering system to the old ActiveSupport callbacks to the new callbacks. According to my benchmarks, the new system is 15-20x faster than the old one. Here are the actual benchmarks for 100,000 iterations (2 before filters, 1 after filter with a proc conditions).
JRuby Results | --------------------- actionpack 2.406 | old 1.798 | new 0.089 |
MRI (1.8) Results | --------------------- actionpack 4.190 | old 3.063 | new 0.276 |
MRI (1.9) Results | --------------------- actionpack 2.617 | old 2.137 | new 0.157 |
Keep in mind that I haven’t retrofitted the ActionPack filters to use the new callback system yet; I had to make ActiveSupport::Callbacks support around_filters first, but that there shouldn’t be any reason it won’t work. Also, adding around_filter support to ActiveSupport::Callbacks means that Test::Unit will now have around filters for free as part of the retrofit (this should affect ActiveRecord as well, for anyone who needs around filters in AR’s hooks).
Also, and this is a big caveat: this is most certainly not an optimization that is likely to significantly help most apps where the overhead is in database access or rendering. However, one of my personal goals for the work I’m doing is to reduce the need for things like Rack Metal by reducing the overhead of rendering a request through the full stack. Filters are used throughout the render path and we shockingly using 10-20% of the 4ms a render :text => “Hello world” was taking up. I was surprised that the action itself was only about 7% of the default render path.
So again, shaving off under 1ms is not likely to help many apps, but it is a step in the direction of making it possible to build high-performance APIs and service tiers on top of the full Rails request cycle. You can follow along this callback work on my optimize_callbacks branch.
P.S. Merb 1.0.8 should be released tomorrow. It’ll include a bunch of pending fixes (including a fix for PIDs being overwritten in production when using merb -i)