Yehuda Katz is a member of the Ember.js, Ruby on Rails and jQuery Core Teams; his 9-to-5 home is at the startup he founded, Tilde Inc. There he works on Skylight, the smart profiler for Rails, and does Ember.js consulting. He is best known for his open source work, which also includes Thor and Handlebars. He travels the world doing open source evangelism and web standards work.

Archive for the ‘Ruby’ Category

Better Ruby Idioms

Carl and I have been working on the plugins system over the past few days. As part of that process, we read through the Rails Plugin Guide. While reading through the guide, we noticed a number of idioms presented in the guide that are serious overkill for the task at hand.

I don’t blame the author of the guide; the idioms presented are roughly the same that have been used since the early days of Rails. However, looking at them brought back memories of my early days using Rails, when the code made me feel as though Ruby was full of magic incantations and ceremony to accomplish relatively simple things.

Here’s an example:

module Yaffle
  def self.included(base)
    base.send :extend, ClassMethods
  end
 
  module ClassMethods
    # any method placed here will apply to classes, like Hickwall
    def acts_as_something
      send :include, InstanceMethods
    end
  end
 
  module InstanceMethods
    # any method placed here will apply to instances, like @hickwall
  end
end

To begin with, the send is completely unneeded. The acts_as_something method will be run on the Class itself, giving the method access to include, a private method.

This code is intended to be used as follows:

class ActiveRecord::Base
  include Yaffle
end
 
class Article < ActiveRecord::Base
  acts_as_something
end

What the code does is:

  1. Register a hook so that when the module is included, the ClassMethods are extended onto the class
  2. In that module, define a method that includes the InstanceMethods
  3. So that you can say acts_as_something in your code

The crazy thing about all of this is that it’s completely reinventing the module system that Ruby already has. This would be exactly identical:

module Yaffle
  # any method placed here will apply to classes, like Hickwall
  def acts_as_something
    include InstanceMethods
  end
 
  module InstanceMethods
    # any method placed here will apply to instances, like @hickwall
  end
end

To be used via:

class ActiveRecord::Base
  extend Yaffle
end
 
class Article < ActiveRecord::Base
  acts_as_something
end

In a nutshell, there’s no point in overriding include to behave like extend when Ruby provides both!
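To see the equivalence outside of Rails, here is a minimal sketch in plain Ruby. The names Greeter, PluginBase, Story, and Note are made up for illustration and stand in for Yaffle, ActiveRecord::Base, and its subclasses.

```ruby
# Hypothetical module; stands in for Yaffle in the text above.
module Greeter
  # Because the module is extended onto a class, this is a class-level method.
  def acts_as_greeter
    include InstanceMethods
  end

  module InstanceMethods
    def greet
      "hello from #{self.class.name}"
    end
  end
end

# Hypothetical stand-in for ActiveRecord::Base.
class PluginBase
  extend Greeter
end

class Story < PluginBase
  acts_as_greeter # opts in, so instances gain #greet
end

class Note < PluginBase
  # never calls acts_as_greeter, so instances do not gain #greet
end

puts Story.new.greet
puts Note.new.respond_to?(:greet)
```

No included hook and no ClassMethods module are involved; extend alone puts acts_as_greeter on the class.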

To take this a bit further, you could do:

module Yaffle
  # any method placed here will apply to instances, like @hickwall, 
  # because that's how modules work!
end

To be used via:

class Article < ActiveRecord::Base
  include Yaffle
end

In effect, the initial code (overriding the included hook to extend the class with a method, which in turn includes a module) is two layers of abstraction around a simple Ruby include!

Let’s look at a few more examples:

module Yaffle
  def self.included(base)
    base.send :extend, ClassMethods
  end
 
  module ClassMethods
    def acts_as_yaffle(options = {})
      cattr_accessor :yaffle_text_field
      self.yaffle_text_field = (options[:yaffle_text_field] || :last_squawk).to_s
    end
  end
end
 
ActiveRecord::Base.send :include, Yaffle

Again, we have the idiom of overriding include to behave like extend (instead of just using extend!).

A better solution:

module Yaffle
  def acts_as_yaffle(options = {})
    cattr_accessor :yaffle_text_field
    self.yaffle_text_field = (options[:yaffle_text_field] || :last_squawk).to_s
  end
end
 
ActiveRecord::Base.extend Yaffle

In this case, it’s appropriate to use an acts_as_yaffle, since you’re providing additional options which could not be encapsulated using the normal Ruby extend.
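As a rough illustration of an options-taking class method added via extend, here is a sketch that runs without Rails. Squawker, Creature, Bird, and Parrot are hypothetical names, and the class << self accessor is only a plain-Ruby stand-in for cattr_accessor, not its actual implementation.

```ruby
# Hypothetical plugin module, modeled on the acts_as_yaffle example.
module Squawker
  def acts_as_squawker(options = {})
    # Plain-Ruby stand-in for cattr_accessor: a per-class accessor.
    class << self
      attr_accessor :squawk_field
    end
    # Apply the default before to_s: nil.to_s is "", which is truthy in Ruby,
    # so `options[:squawk_field].to_s || "last_squawk"` would never fall back.
    self.squawk_field = (options[:squawk_field] || :last_squawk).to_s
  end
end

# Hypothetical stand-in for ActiveRecord::Base.
class Creature
  extend Squawker
end

class Bird < Creature
  acts_as_squawker
end

class Parrot < Creature
  acts_as_squawker :squawk_field => :catchphrase
end

puts Bird.squawk_field
puts Parrot.squawk_field
```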

Another “more advanced” case:

module Yaffle
  def self.included(base)
    base.send :extend, ClassMethods
  end
 
  module ClassMethods
    def acts_as_yaffle(options = {})
      cattr_accessor :yaffle_text_field
      self.yaffle_text_field = (options[:yaffle_text_field] || :last_squawk).to_s
      send :include, InstanceMethods
    end
  end
 
  module InstanceMethods
    def squawk(string)
      write_attribute(self.class.yaffle_text_field, string.to_squawk)
    end
  end
end
 
ActiveRecord::Base.send :include, Yaffle

Again, we have the idiom of overriding include to pretend to be an extend, and a send where it is not needed. Identical functionality:

module Yaffle
  def acts_as_yaffle(options = {})
    cattr_accessor :yaffle_text_field
    self.yaffle_text_field = (options[:yaffle_text_field] || :last_squawk).to_s
    include InstanceMethods
  end
 
  module InstanceMethods
    def squawk(string)
      write_attribute(self.class.yaffle_text_field, string.to_squawk)
    end
  end
end
 
ActiveRecord::Base.extend Yaffle

Of course, it is also possible to do:

module Yaffle
  def squawk(string)
    write_attribute(self.class.yaffle_text_field, string.to_squawk)
  end
end
 
class ActiveRecord::Base
  def self.acts_as_yaffle(options = {})
    cattr_accessor :yaffle_text_field
    self.yaffle_text_field = (options[:yaffle_text_field] || :last_squawk).to_s
    include Yaffle
  end
end

Since the module is always included in ActiveRecord::Base, there is no reason that the earlier code, with its additional modules and use of extend, is superior to simply reopening the class and adding the acts_as_yaffle method directly. Now we can put the squawk method directly inside the Yaffle module, where it can be included cleanly.

It may not seem like a huge deal, but it significantly reduces the amount of apparent magic in the plugin pattern, making it more accessible for new users. Additionally, it exposes the new user to include and extend quickly, instead of making them feel as though they were magic incantations requiring the use of send and special modules named ClassMethods in order to get them to work.

To be clear, I’m not saying that these idioms aren’t sometimes needed in special, advanced cases. However, I am saying that in the most common cases, they’re huge overkill that obscures the real functionality and confuses users.

Using the New Gem Bundler Today

As you might have heard, Carl and I released a new project that allows you to bundle your gems (both pure-ruby and native) with your application. Before I get into the process for using the bundler today, I’d like to go into the design goals of the project.

  • The bundler should allow the specification of all dependencies in a separate place from the application itself. In other words, it should be possible to determine the dependencies for an application without needing to start up the application.
  • The bundler should have a built-in dependency resolving mechanism, so it can determine the required gems for an entire set of dependencies.
  • Once the dependencies are resolved, it should be possible to get the application up and running on a new system without needing to check Rubyforge (or gemcutter) again. This is especially important for compiled gems (it should be possible to get the list of required gems once and compile on remote systems as desired).
  • Above all else, the bundler should provide a reproducible installation of Ruby applications. New gem releases or down remote servers should not be able to impact the successful installation of an application. In most cases, git clone; gem bundle should be all that is needed to get an application on a new system and up and running.
  • Finally, the bundler should not assume anything about Rails applications. While it should work flawlessly in the context of a Rails application, this should be because a Rails application is a Ruby application.

Using the Bundler Today

To use the gem bundler today in a non-Rails application, follow these steps:

  1. gem install bundler
  2. Create a Gemfile in the root of your application
  3. Add dependencies to your Gemfile. See below for more details on the sorts of things you can do in the Gemfile. At the simplest level, gem "gem_name", "version" will add a dependency of the gem and version to your application
  4. At the root, run gem bundle. The bundler should tell you that it is resolving dependencies, then downloading and installing the gems.
  5. Add vendor/gems/gems, vendor/gems/specifications, vendor/gems/doc, and vendor/gems/environment.rb to your .gitignore file
  6. Inside your application, require vendor/gems/environment.rb to add the bundled dependencies to your load paths.
  7. Use Bundler.require_env :optional_environment to actually require the files.
  8. After committing, run gem bundle in a fresh clone to re-expand the gems. Since you left the vendor/gems/cache in your source control, new machines will be guaranteed to use the same files as the original machine, requiring no remote dependencies

The bundler will also install binaries into the app’s bin directory. You can, therefore, run bin/rackup for instance, which will ensure that the local bundle, rather than the system, is used. You can also run gem exec rackup, which runs any command using the local bundle. This allows things like gem exec ruby -e "puts Nokogiri::VERSION" or the even more adventurous gem exec bash, which will open a new shell in the context of the bundle.

Gemfile

You can do any of the following in the Gemfile:

  • gem "name", "version": version may be a strict version or a version requirement like >= 1.0.6. The version is optional.
  • gem "name", "version", :require_as => "file": the require_as option allows you to specify which file should be required when require_env is called. By default, it is the gem’s name
  • gem "name", "version", :only => :testing: The environment name can be anything. It is used later in your require_env call. You may specify either :only or :except constraints
  • gem "name", "version", :git => "git://github.com/wycats/thor": Specify a git repository to be used to satisfy the dependency. You must use a hard dependency (“1.0.6”) rather than a soft dependency (“>= 1.0.6”). If a .gemspec is found in the repository, it is used for further dependency lookup. If the repository has multiple .gemspecs, each directory with a .gemspec will be considered a gem.
  • gem "name", "version", :git => "git://github.com/wycats/thor", :branch => "experimental": Further specify a branch, tag, or ref to use. All of :branch, :tag, and :ref are valid options
  • gem "name", "version", :vendored_at => "vendor/nokogiri": In the next version of bundler, this option will be changing to :path. This specifies that the dependency can be found in the local file system, rather than remotely. It is resolved relative to the location of the Gemfile
  • clear_sources: Empties the list of gem sources to search inside of.
  • source "http://gems.github.com": Adds a gem source to the list of available gem sources.
  • bundle_path "vendor/my_gems": Changes the default location of bundled gems from vendor/gems
  • bin_path "my_executables": Changes the default location of the installed executables
  • disable_system_gems: Without this command, both bundled gems and system gems will be used. You can therefore have things like ruby-debug in your system and use it. However, it also means that you may be using something in development mode that is installed on your system but not available in production. For this reason, it is best to disable_system_gems
  • disable_rubygems: This completely disables rubygems, reducing startup times considerably. However, it often doesn’t work if libraries you are using depend on features of Rubygems. In this mode, the bundler shims the features of Rubygems that we know people are using, but it’s possible that someone is using a feature we’re unaware of. You are free to try disable_rubygems first, then remove it if it doesn’t work. Note that Rails 2.3 cannot be made to work in this mode
  • only :environment { gem "rails" }: You can use only or except in block mode to specify a number of gems at once

Bundler process

When you run gem bundle, a few things happen. First, the bundler attempts to resolve your list of dependencies against the gems you have already bundled. If they don’t resolve, the metadata for each specified source is fetched and the gems are downloaded. Next (either way), the bundler checks to see whether the downloaded gems are expanded. For any gem that is not yet expanded, the bundler expands it. Finally, the bundler creates the environment.rb file with the new settings. This means that running gem bundle over and over again will be extremely fast, because after the first time, all gems are downloaded and expanded. If you change settings, like disable_rubygems, running gem bundle again will simply regenerate the environment.rb.

Rails 2.3

To get this working with Rails 2.3, you need to create a preinitializer.rb and insert the following:

require "#{File.dirname(__FILE__)}/../vendor/bundler_gems/environment"
 
class Rails::Boot
  def run
    load_initializer
    extend_environment
    Rails::Initializer.run(:set_load_path)
  end
 
  def extend_environment
    Rails::Initializer.class_eval do
      old_load = instance_method(:load_environment)
      define_method(:load_environment) do
        Bundler.require_env RAILS_ENV
        old_load.bind(self).call
      end
    end
  end
end

It’s a bit ugly, but you can copy and paste that code and forget it. Astute readers will notice that we’re using vendor/bundler_gems/environment.rb. This is because Rails 2.3 attaches special, irrevocable meaning to vendor/gems. As a result, make sure to do the following in your Gemfile: bundle_path "vendor/bundler_gems".

Gemcutter uses this setup and it’s working great for them.

Bundler 0.7

We’re going to be releasing Bundler 0.7 tomorrow. It has some new features:

  • List outdated gems by passing --outdated-gems. Bundler conservatively does not update your gems simply because a new version came out that satisfies the requirement. This is so that you can be sure that the versions running on your local machine will make it safely to production. This will allow you to check for outdated gems so you can decide whether to update your gems with --update. Hat tip to manfred, who submitted this patch as part of his submission to the Rumble at Ruby en Rails
  • Specify the build requirements for gems in a YAML file that you specify with --build-options. The file looks something like this:
    mysql:
      config: /path/to/mysql_config

    This is equivalent to --with-mysql-config=/path/to/mysql_config

  • Specify :bundle => false to indicate that you want to use system gems for a particular dependency. This will ensure that it gets resolved correctly during dependency resolution but does not need to be included in the bundle
  • Support for multiple bundles containing multiple platforms. This is especially useful for people who move back and forth between Ruby 1.8 and 1.9 and don’t want to constantly nuke and rebundle
  • Deprecate :vendored_at and replace with :path
  • A new directory DSL method in the Gemfile:
    directory "vendor/rails" do
      gem "activesupport", :path => "activesupport" # :path is optional if it's identical to the gem name
                                                    # the version is optional if it can be determined from
                                                    # a gemspec
    end
  • You can do the same with the git DSL method
    git "git://github.com/rails/rails.git" do # :branch, :tag, or :ref work here
      gem "activesupport", :path => "activesupport" # same rules as directory, except that the files are
                                                    # first downloaded from git.
    end
  • Fix some bugs in resolving prerelease dependencies

Simplifying Rails Block Helpers (With a Side of Rubinius)

We all know that <%= string %> emits a String in ERB. And <% string %> runs Ruby code, but does not emit a String. When you first start working with Rails, you almost expect the syntax for block helpers to be:

<%= content_tag(:div) do %>
  The content
<% end %>

Why doesn’t it work that way?

It has to do with how the ERB parser works, looking at each line individually. When it sees <% %>, it evaluates the code as a line of Ruby. When it sees <%= %>, it evaluates the inside of the ERB tag, and calls to_s on it.

This:

<% form_for(@object) do %>
Stuff
<% end %>

gets effectively converted to:

form_for(@object) do
_buf << ("Stuff").to_s
end

On the other hand, this:

<%= form_for(@object) do %>
Stuff
<% end %>

gets converted to:

_buf << (form_for(@object) do).to_s
_buf << ("Stuff").to_s
end

which isn’t valid Ruby. So we use the first approach, and then let the helper itself, rather than ERB, be responsible for concatenating to the buffer. Sadly, it leads to significantly more complex helpers.
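The difference is easy to reproduce with the ERB class from Ruby's standard library (plain ERB here, not the Rails or Erubis compilers). In this sketch, wrap and render_or_error are hypothetical helpers written for the demonstration.

```ruby
require "erb"

# Hypothetical block helper: returns a String instead of touching the buffer.
def wrap(tag)
  "<#{tag}>#{yield}</#{tag}>"
end

# Renders a template, reporting a SyntaxError instead of letting it propagate.
def render_or_error(template)
  ERB.new(template).result(binding)
rescue SyntaxError
  :syntax_error
end

# Statement tag: the block is valid Ruby, but the helper's return value is
# discarded, so only the block's own output reaches the buffer.
puts render_or_error("<% wrap(:div) do %>Stuff<% end %>").inspect

# Expression tag: ERB wraps the tag's contents in (...).to_s, leaving an
# unterminated `do` inside the parentheses.
puts render_or_error("<%= wrap(:div) do %>Stuff<% end %>").inspect
```

The first call prints "Stuff" without the surrounding div, which is exactly why a helper in this style has to concatenate to the buffer itself.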

Let’s take a look at the implementation of content_tag.

def content_tag(name, content_or_options_with_block = nil, options = nil, escape = true, &block)
  if block_given?
    options = content_or_options_with_block if content_or_options_with_block.is_a?(Hash)
    content_tag = content_tag_string(name, capture(&block), options, escape)
 
    if block_called_from_erb?(block)
      concat(content_tag)
    else
      content_tag
    end
  else
    content_tag_string(name, content_or_options_with_block, options, escape)
  end
end

The important chunk here is the middle, inside of the if block_given? section. The first few lines just get the actual contents, using the capture helper to pull out the contents of the block. But then you get this:

if block_called_from_erb?(block)
  concat(content_tag)
else
  content_tag
end

This is actually a requirement for writing a block helper of any kind in Rails. First, Rails checks to see if the block is being called from ERB. If so, it takes care of concatenating to the buffer. Otherwise, the caller simply wants a String back, so it returns it.

Worse, here’s the implementation of block_called_from_erb?:

BLOCK_CALLED_FROM_ERB = 'defined? __in_erb_template'
 
# Check whether we're called from an erb template.
# We'd return a string in any other case, but erb <%= ... %>
# can't take an <% end %> later on, so we have to use <% ... %>
# and implicitly concat.
def block_called_from_erb?(block)
  block && eval(BLOCK_CALLED_FROM_ERB, block)
end

So every time you use a block helper in Rails, or use a helper which uses a block helper, Rails is forced to eval into the block to determine what the context is.
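The mechanism can be sketched in isolation. The example below is an approximation, not the Rails source: the marker name __in_template and the helper names are hypothetical, and it passes block.binding to eval, since current Ruby versions expect a Binding rather than the block itself.

```ruby
# Hypothetical marker check, modeled on BLOCK_CALLED_FROM_ERB above.
MARKER_CHECK = "defined?(__in_template)"

# True when the block was created in a scope that has the marker local.
def block_called_from_template?(block)
  !block.nil? && !eval(MARKER_CHECK, block.binding).nil?
end

# A helper that has to decide between concatenating and returning.
def helper(&block)
  block_called_from_template?(block) ? :concat : :return_string
end

# Stands in for a compiled template, which sets a marker local variable.
def fake_template
  __in_template = true
  helper { "content" }
end

# Stands in for ordinary Ruby code calling the same helper.
def plain_ruby_caller
  helper { "content" }
end

puts fake_template      # the block's binding can see the marker
puts plain_ruby_caller  # no marker in this scope
```

Every helper call pays for a string eval against the block's binding just to learn who the caller is.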

In Merb, we solved this problem by using this syntax:

<%= form_for(@object) do %>
Stuff
<% end =%>

And while everyone agrees that the opening <%= is a reasonable change, the closing =%> is a bit grating. However, it allows us to compile the above code into:

_buf << (form_for(@object) do
_buf << ("Stuff").to_s
end).to_s

That’s because we tag the end with a special ERB tag that allows us to attach a ).to_s to the end. We use Erubis, which lets us control the compilation process more finely, to hook into this process.

Rails 3 will use Erubis regardless of this problem to implement on-by-default XSS protection, but I needed a solution that didn’t require the closing =%> (ideally).

Evan (lead on Rubinius) hit upon a rather ingenious idea: use Ruby operator precedence to get around the need to know where the end was. Effectively, compile into the following:

_buf << capture_obj << form_for(@object) do
_buf << ("Stuff").to_s
end

where capture_obj is:

class CaptureObject
  def <<(obj)
    @object = obj
    self
  end
 
  def to_str
    @object.to_s
  end
 
  def to_s
    @object.to_s
  end
end

Unfortunately, with one hand Ruby operator precedence giveth, and with the other it taketh away. In order to test this, I tried using a helper that returned an object, rather than a String (valid in ERB). In ERB, this would call to_s on the object. When I tried to run this code with the CaptureObject, I got:

template template:1:in `<<': can't convert Object into String (TypeError)
   from template template:1:in `template'
   from helper_spike.rb:48

Evan and I were both a bit baffled by this (although in retrospect we probably shouldn’t have been), and we hit on the idea to try running the code through Rubinius and look at its backtrace:

An exception occurred running helper_spike.rb
    Coercion error: #<Object:0x60a>.to_str => String failed:
(No method 'to_str' on an instance of Object.) (TypeError)
 
Backtrace:
                       Type.coerce_to at kernel/common/type.rb:22
           Kernel(String)#StringValue at kernel/common/kernel.rb:82
                            String#<< at kernel/common/string.rb:93
                   MyContext#template at template template:1
                      main.__script__ at helper_spike.rb:48

By looking at Rubinius’ backtrace, we quickly realized that the order of operations was wrong, and to_str was getting called on the return value from the helper, rather than the CaptureObject. As I tweeted immediately thereafter, the information available in Rubinius’ backtrace is just phenomenal, exposing enough information to really see what’s going on. Because the internals of Rubinius are written in Ruby, the Ruby backtrace goes all the way through to the Type.coerce_to method.

After realizing that, we changed the implementation of CaptureObject to take the buffer in its initializer, and have it handle concatenating to the buffer. The compiled code now looks like:

capture_obj << form_for(@object) do
_buf << ("Stuff").to_s
end

and the CaptureObject looks like:

class CaptureObject
  def initialize(buf)
    @buf = buf
  end
 
  def <<(obj)
    @buf << obj.to_s
  end
end

Now, Ruby’s operator precedence will bind the do to the form_for, and the return value of form_for will be converted with to_s and concatenated to the buffer.
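Here is a small standalone sketch of that binding behavior. CaptureObject is copied from above; wrap, total, and render_template are hypothetical helpers, one returning a String and one returning an Integer, to show that non-String return values are handled too.

```ruby
class CaptureObject
  def initialize(buf)
    @buf = buf
  end

  def <<(obj)
    @buf << obj.to_s
  end
end

# Hypothetical block helper that returns a String.
def wrap(tag)
  "<#{tag}>#{yield}</#{tag}>"
end

# Hypothetical block helper that returns an Integer rather than a String.
def total(items)
  items.inject(0) { |sum, item| sum + yield(item) }
end

def render_template
  buf = String.new
  capture_obj = CaptureObject.new(buf)

  # The do...end block attaches to wrap(:div), not to <<.
  capture_obj << wrap(:div) do
    "Stuff"
  end

  # The Integer result is converted with to_s inside CaptureObject#<<.
  capture_obj << total([1, 2, 3]) do |n|
    n * 10
  end

  buf
end

puts render_template
```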

And the best thing is the implementation of content_tag once that’s done:

def content_tag(name, content = nil, options = nil, escape = true, &block)
  if block_given?
    options = content if content.is_a?(Hash)
    content = capture(&block)
  end
  content_tag_string(name, content, options, escape)
end

We can simply return a String and ERB handles the concatenation work. That’s the important part: helper writers should be able to think of block helpers the same way they think about traditional helpers. Somewhat less importantly, we’ll be able to eliminate evaling into untold numbers of blocks at runtime.

This was only an experiment, and the specific details still need to be worked out (how do we do this without breaking untold numbers of existing applications?). Even so, I’m very happy with this solution, which provides the simplicity and performance enhancement of the Merb solution without the ugly =%>.

How to Build Sinatra on Rails 3

In Ruby, we have the great fortune to have one major framework (Rails) and a number of minor frameworks that drive innovation forward. One of the great minor frameworks which has been getting a lot of traction recently is Sinatra, primarily because it exposes a great DSL for writing small, single-purpose apps.

Here’s an example of a simple Sinatra application.

class MyApp < Sinatra::Base
  set :views, File.dirname(__FILE__)
  enable :sessions
 
  before do
    halt if session[:fail] == true
  end
 
  get "/hello" do
    "Hello world"
  end
 
  get "/world" do
    @name = "Carl"
    erb :awesomesauce
  end
 
  get "/fail" do
    session[:fail] = true
    "You failed"
  end
end

There’s a lot of functionality packed into this little package. You can declare some code to be run before all actions, declare actions and the URL they should be routed from, use rendering semantics, and even use sessions.

We’ve been saying that Rails 3 is flexible enough to use as a framework toolkit–let’s prove it by using Rails to build the subset of the Sinatra DSL described above.

Let’s start with a very tiny subset of the DSL:

class MyApp < Sinatra::Base
  get "/hello" do
    "HELLO World"
  end
 
  post "/world" do
    "Hello WORLD"
  end
end

The first step is to declare the Sinatra base class:

module Sinatra
  class Base < ActionController::Metal
    include ActionController::RackConvenience
  end
end

We start off by making Sinatra::Base a subclass of the bare metal ActionController implementation, which provides just enough infrastructure to get going. We also include the RackConvenience module, which provides request and response and handles some basic Rack tasks for us.

Next, let’s add support for the GET and POST methods:

class Sinatra::Base
  def self.inherited(klass)
    klass.class_eval { @_routes = [] }
  end
 
  class << self
    def get(uri, options = {}, &block)  route(:get,  uri, options, &block) end
    def post(uri, options = {}, &block) route(:post, uri, options, &block) end
 
    def route(http_method, uri, options, &block)
      action_name = "[#{http_method}] #{uri}"
      @_routes << {:method => http_method.to_s.upcase, :uri => uri,
                   :action => action_name, :options => options}
      define_method(action_name, &block)
    end
  end
end

We’ve simply defined some class methods on Sinatra::Base that store off routing details for the get and post methods and create a new method named [get] /hello. This is a bit of an interesting Ruby trick; while the def keyword has strict semantics for method names, define_method allows any string.
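The define_method trick works in plain Ruby, independent of Rails. In this sketch, TinyController is a hypothetical class that keeps only the method-naming part of route.

```ruby
# Hypothetical class; only the method-naming part of `route` is reproduced.
class TinyController
  def self.route(http_method, uri, &block)
    action_name = "[#{http_method}] #{uri}"
    define_method(action_name, &block) # spaces and brackets are allowed here
    action_name
  end

  route(:get, "/hello") { "Hello world" }
end

# Such a method cannot be called with ordinary dot syntax, only dynamically.
puts TinyController.new.public_send("[get] /hello")
puts TinyController.instance_methods(false).inspect
```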

Now we need to wire up the actual routing. There are a number of options, including the Rails router (rack-mount, rack-router, and usher are all new, working Rails-like routers). We’ll use Usher, a fast Rails-like router written by Josh Hull.

class << Sinatra::Base
  def to_app
    routes, controller = @_routes, self
 
    Usher::Interface.for(:rack) do
      routes.each do |route|
        add(route[:uri], :conditions => {:method => route[:method]}.merge(route[:options])).
          to(controller.action(route[:action]))
      end
    end
  end
end

Here, we define to_app, which is used by Rack to convert a parameter to run into a valid Rack application. We create a new Usher interface, and add a route for each route created by Sinatra. Because Usher::Interface.for uses instance_eval for its DSL, we store off the routes and controller in local variables that will still be available in the closure.
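The reason for copying into local variables can be shown with a toy DSL. RouteSet and MiniApp below are hypothetical; RouteSet only imitates the instance_eval style described above and is not Usher's API.

```ruby
# Hypothetical DSL object that evaluates its block with instance_eval.
class RouteSet
  attr_reader :added

  def initialize(&block)
    @added = []
    instance_eval(&block)
  end

  def add(uri)
    @added << uri
  end
end

class MiniApp
  def initialize
    @routes = ["/hello", "/world"]
  end

  # Inside instance_eval, self is the RouteSet, so @routes is *its* ivar (nil).
  def broken
    RouteSet.new { (@routes || []).each { |uri| add(uri) } }
  end

  # Local variables are captured by the closure and survive the change of self.
  def working
    routes = @routes
    RouteSet.new { routes.each { |uri| add(uri) } }
  end
end

puts MiniApp.new.broken.added.inspect
puts MiniApp.new.working.added.inspect
```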

One little detail here: In Rails 3, each action in a controller is a valid rack endpoint. You get the endpoint by doing ControllerName.action(method_name). Here, we’re simply pulling out the action named “[get] /hello” that we created in route.

The final piece of the puzzle is covering the action processing in the controller itself. For this, we will mostly reuse the default action processing, with a small change:

class Sinatra::Base
  def process_action(*)
    self.response_body = super
  end
end

What’s happening here is that Rails does not treat the return value of the action as significant, instead expecting it to be set using render, but Sinatra treats the returned string as significant. As a result, we set the response_body to the return value of the action.

Next, let’s add session support.

class << Sinatra::Base
  def set(name, value)
    send("_set_#{name}", value)
  end
 
  def enable(name)
    set(name, true)
  end
 
  def _set_sessions(value)
    @_sessions = value
    include ActionController::Session if value
  end
 
  def to_app
    routes, controller = @_routes, self
 
    app = Usher::Interface.for(:rack) do
      routes.each do |route|
        add(route[:uri], :conditions => {:method => route[:method]}.merge(route[:options])).
          to(controller.action(route[:action]))
      end
    end
 
    if @_sessions
      app = ActionDispatch::Session::CookieStore.new(app, {:key => "_secret_key",
        :secret => Digest::SHA2.hexdigest(Time.now.to_s + rand(100).to_s)})
    end
 
    app
  end
end

There are a few things going on here. First, Sinatra provides an API for setting options: set :option, :value. In Sinatra, enable :option is equivalent to set :option, true. To simplify adding new options, we just delegate set :whatever, value to a call to _set_whatever(value).
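Stripped of the Rails pieces, the delegation looks roughly like this. MiniBase, DemoApp, and the reader methods sessions_enabled? and views are hypothetical additions so that the result can be inspected.

```ruby
# Hypothetical base class showing only the set/enable delegation.
class MiniBase
  class << self
    def set(name, value)
      send("_set_#{name}", value) # set :views, x  ->  _set_views(x)
    end

    def enable(name)
      set(name, true)
    end

    # Hypothetical readers, added only to observe the stored options.
    def sessions_enabled?
      @_sessions == true
    end

    def views
      @_views
    end

    def _set_sessions(value)
      @_sessions = value
    end

    def _set_views(path)
      @_views = path
    end
  end
end

class DemoApp < MiniBase
  set :views, "templates"
  enable :sessions
end

puts DemoApp.views
puts DemoApp.sessions_enabled?
```

One consequence of this design is that an unsupported option fails loudly: set :bogus, 1 raises NoMethodError because there is no _set_bogus.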

We then implement _set_sessions(value) to include ActionController::Session, which provides the session helper. In to_app, we wrap the original application in an ActionDispatch::Session::CookieStore if sessions were set.

Next, we want to add in support for callbacks (before do). It’s only a few lines:

class Sinatra::Base
  include AbstractController::Callbacks
end
 
class << Sinatra::Base
  alias before before_filter
end

Basically, we pull in the normal Rails callback code, and then rename before_filter to before and we’re good to go.

Finally, let’s dig into rendering.

class Sinatra::Base
  include ActionController::RenderingController
 
  def sinatra_render_file(name)
    render :template => name.to_s
  end
 
  def sinatra_render_inline(string, type)
    render :inline => string, :type => type
  end
 
  %w(haml erb builder).each do |type|
    define_method(type) do |thing|
      return sinatra_render_inline(thing, type) if thing.is_a?(String)
      return sinatra_render_file(thing)
    end
  end
end
 
class << Sinatra::Base
  alias _set_views append_view_path
end

We include the RenderingController module, which provides rendering support. Sinatra supports a few different syntaxes for rendering. It supports erb :template_name, which renders the ERB template named template_name. It also supports erb "Some String", which renders the string using the ERB engine.

Rails supports both of those via render :template and render :inline, so we simply defer to that functionality in each case. We also handle Sinatra’s set :views, view_path by delegating to append_view_path.

You can check out the full repository at https://github.com/wycats/railsnatra/

So there you have it, a large subset of the Sinatra DSL written in Rails in under 100 lines of code. And if you want to add in more advanced Rails features, like layouts, flash, respond_to, file streaming, or conditional get support, it’s just a simple module inclusion away.

My 10 Favorite Things About the Ruby Language

I work with Ruby every single day, and over time have come to really enjoy using it. Here’s a list of some specific things that I really like about Ruby. Some of them are obvious, and some are shared with other languages. The purpose is to share things I like about Ruby, not to compare and contrast with any specific language.

1. Dynamic Typing

There are very good things about statically typed languages, such as compile-time verifiability and IDE support. However, in my experience, dynamic typing really helps get projects bootstrapped and smooths along changes, especially in the early to middle stages of a project.

I’m very happy that I don’t need to create a formal interface for my new objects to implement simply to enable me to easily swap out that class for another later on.

2. Duck Typing

This is effectively just an extension of Dynamic Typing. In Ruby, methods that expect to be able to operate on String objects don’t do checks for is_a?(String). They check whether the object respond_to?(:to_str) and then call to_str on the Object if it does. Similarly, objects that represent Paths in Ruby can implement a to_path method to provide the path representation.
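A short sketch of both conversions in core Ruby follows. Nickname and ConfigLocation are hypothetical classes, and greet and file_name are thin wrappers around String#+ and File.basename.

```ruby
# Hypothetical String-like object: it is not a String, but it has to_str.
class Nickname
  def initialize(name)
    @name = name
  end

  def to_str
    @name
  end
end

# Hypothetical path-like object with a to_path method.
class ConfigLocation
  def to_path
    "/etc/app/settings.yml"
  end
end

def greet(name)
  "Hello, " + name # String#+ converts its argument via to_str
end

def file_name(location)
  File.basename(location) # File methods convert their argument via to_path
end

puts greet(Nickname.new("wycats"))
puts file_name(ConfigLocation.new)

begin
  greet(Object.new) # no to_str, so the implicit conversion fails
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```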

In Rails, we use this technique for objects that have “model” characteristics by expecting them to respond_to?(:to_model). This allows us to support any object in relevant contexts, provided those objects can supply us with a “model” representation of themselves.

3. Awesome Modules

Ruby provides a language feature similar to “traits” in Scala, Squeak, and Perl. Effectively, Ruby modules allow the dynamic addition of new elements of the class hierarchy at runtime. The use of super is dynamically evaluated at runtime to take into consideration any modules that might have been added, making it easy to extend functionality on a superclass as many times as desired, without being forced to decide where super will land at class declaration time.

Additionally, Ruby modules provide the lifecycle hooks append_features and included, which make it possible to use modules robustly to isolate extensions from one another and to dynamically extend classes on the basis of feature inclusion.
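Here is a minimal sketch of both ideas, using made-up names (Base, Polite, Loud, Greeter). It shows super walking through each included module in turn, and the included hook recording which classes pulled the module in.

```ruby
class Base
  def greet
    "hello"
  end
end

module Polite
  # Lifecycle hook: Ruby calls this each time the module is included
  def self.included(base)
    included_in << base
  end

  def self.included_in
    @included_in ||= []
  end

  def greet
    "#{super}, please"
  end
end

module Loud
  def greet
    super.upcase
  end
end

class Greeter < Base
  include Polite
  include Loud
end

puts Greeter.ancestors.take(4).inspect
puts Greeter.new.greet
puts Polite.included_in.inspect
```

Neither module had to know where super would land. Loud ends up in front of Polite simply because it was included later.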

4. Class Bodies Aren’t Special

In Ruby, class bodies aren’t a special context. They’re simply a context where self points at the class object. If you’ve used Rails, you’ve probably seen code like this:

class Comment < ActiveRecord::Base
  validates_presence_of :post_id
end

It may look like validates_presence_of is a language feature, but it’s actually a method being called on Comment that is provided by ActiveRecord::Base.

That method can execute arbitrary code, also in the context of the class, including creating new methods, executing other pieces of code, or updating a class instance variable. Unlike Java annotations, which must be run at compile-time, Ruby class bodies can take runtime information into consideration, such as dynamically supplied options or the results of evaluating other code.
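To show how little is involved, here is a rough sketch of such a class-level method in plain Ruby. TinyModel and SETTINGS are hypothetical names for this example, and this is not how ActiveRecord implements validates_presence_of; it only demonstrates a method called in a class body that defines methods, updates a class instance variable, and reacts to runtime information.

```ruby
# Hypothetical runtime configuration consulted while the class body runs
SETTINGS = { :require_author => false }

class TinyModel
  # Class-level "macro": an ordinary method called on the class object
  def self.validates_presence_of(*names)
    required_attributes.concat(names)
    attr_accessor(*names)
  end

  # Stored in a class instance variable, one list per class
  def self.required_attributes
    @required_attributes ||= []
  end

  def valid?
    self.class.required_attributes.none? { |name| send(name).nil? }
  end
end

class Comment < TinyModel
  validates_presence_of :post_id
  validates_presence_of :author if SETTINGS[:require_author]
end

comment = Comment.new
puts comment.valid?
comment.post_id = 42
puts comment.valid?
```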

5. String Eval

This is likely a heresy. I’m not referring to arbitrary runtime String eval here, but rather String eval that is used to create methods early in the boot process of a Ruby application.

This can make it possible to take Ruby-defined structures, like Rails routes or AOP-definitions, and compile them into Ruby methods. Of course, it is possible to implement these things as add-ons to other languages, but Ruby makes it possible to implement these sorts of things in pure Ruby. It is, to a large degree, a self-hosting language.
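As a toy version of that idea, the sketch below compiles a Ruby-defined table into real methods at load time. ROUTES and UrlHelpers are invented for this example and have nothing to do with how the Rails router is actually implemented.

```ruby
# A Ruby-defined structure, known when the application boots
ROUTES = { :home => "/", :about => "/about", :posts => "/posts" }

class UrlHelpers
  ROUTES.each do |name, path|
    # __FILE__ and __LINE__ keep backtraces pointing at this source file
    class_eval <<-RUBY, __FILE__, __LINE__ + 1
      def #{name}_path
        #{path.inspect}
      end
    RUBY
  end
end

helpers = UrlHelpers.new
puts helpers.home_path
puts helpers.about_path
```

Once the loop has run, home_path is an ordinary method with a literal String body; no lookup in ROUTES happens at call time.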

6. Blocks and Lambdas

I’ve said this a few times and I’ll repeat myself: I don’t consider languages without anonymous lambdas to be powerful enough for me to use day-to-day. These constructs are actually extremely common, and found in languages as diverse as Ruby, JavaScript, Scala, Clojure, and of course Lisp.

They make it possible to implement block-scoped constructs that look like language features. The most common example usage is for File operations. In languages without lambdas, users are forced to use an inline “ensure” block every time, in the same lexical scope where they originally opened the file, to ensure that the resource is closed.

In Java:

static void run(String in)
throws FileNotFoundException {
  File input = new File(in);
  Scanner reader = null;
  try {
    reader = new Scanner(input);
    while(reader.hasNextLine()) {
      System.out.println(reader.nextLine());
    }
  } finally {
    // reader is still null if the Scanner constructor threw
    if (reader != null) reader.close();
  }
}

Among other things, the Java version needs to wrap the creation of the Scanner in a try block so it can be guaranteed to be closed. In contrast, the Ruby version:

def run(input)
  File.open(input, "r") do |f|
    f.each_line {|line| puts line }
  end
end

Because of the existence of blocks, it is possible to abstract away the need to close the File in a single location, minimizing programmer error and reducing duplication.
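The same pattern is easy to build for your own resources. Here is a minimal sketch, using a hypothetical Connection class, of a File.open-style method that keeps the cleanup in one place:

```ruby
class Connection
  attr_reader :log

  def initialize
    @log = []
  end

  # Yields a connection and guarantees it is closed afterwards,
  # even when the block raises
  def self.open
    connection = new
    connection.log << :opened
    yield connection
  ensure
    connection.log << :closed if connection
  end

  def query(sql)
    raise ArgumentError, "empty query" if sql.empty?
    @log << sql
  end
end

seen = nil
Connection.open do |c|
  seen = c
  c.query("SELECT 1")
end

failed = nil
begin
  Connection.open do |c|
    failed = c
    c.query("")
  end
rescue ArgumentError
  # the error still propagates; the connection was closed on the way out
end

puts seen.log.inspect
puts failed.log.inspect
```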

7. Combo Attack: Self-Hosting Language

The combination of several of the above features produces real-life examples of ways that we can “extend” the Ruby language in Rails. Consider the following:

  respond_to do |format|
    if @user.save
      flash[:notice] = 'User was successfully created.'
      format.html { redirect_to(@user) }
      format.xml { render :xml => @user, :status => :created, :location => @user }
    else
      format.html { render :action => "new" }
      format.xml { render :xml => @user.errors, :status => :unprocessable_entity }
    end
  end

In this case, we’re able to seamlessly mix methods (respond_to) with normal Ruby code (if and else) to produce a new block-scoped construct. Ruby’s semantics for blocks allow us to return or yield from inside the block, further blending the boundaries of the code-block and language constructs like if or while.

In Rails 3, we introduced:

class PeopleController < ApplicationController
  respond_to :html, :xml, :json
 
  def index
    @people = Person.find(:all)
    respond_with(@people)
  end
end

Here, respond_to is provided at the class-level. It tells Rails that respond_with (in index) should accept HTML, XML, or JSON as response formats. If the user asked for a different format, we automatically return a 406 error (Not Acceptable).

If you dig in a bit deeper, you can see that the respond_to method is defined as:

def respond_to(*mimes)
  options = mimes.extract_options!
 
  only_actions   = Array(options.delete(:only))
  except_actions = Array(options.delete(:except))
 
  mimes.each do |mime|
    mime = mime.to_sym
    mimes_for_respond_to[mime]          = {}
    mimes_for_respond_to[mime][:only]   = only_actions   unless only_actions.empty?
    mimes_for_respond_to[mime][:except] = except_actions unless except_actions.empty?
  end
end

This method is defined on the ActionController::MimeResponds::ClassMethods module, which is pulled into ActionController::Base. Additionally, mimes_for_respond_to is defined using class_inheritable_reader in the included lifecycle hook for the module. The class_inheritable_reader method (macro?) uses class_eval to add methods onto the class in question to emulate the built-in attr_accessor functionality.

Understanding all of the details isn’t important. What’s important is that using the Ruby features we described above, it’s possible to create layers of abstraction that can appear to add features to the Ruby language.

A developer looking at ActionController::MimeResponds need not understand how class_inheritable_reader works–he just needs to understand the basic functionality. And a developer looking at the API documentation need not understand how the class-level respond_to is implemented–she just needs to understand the provided functionality. With that said, peeling back each layer leads to simple abstractions that build on other abstractions. There’s no need to peel back the whole curtain at once.

8. Nice Literals

I often forget about this when programming in Ruby, only to crash back down to earth when using a language with fewer, less expressive literals.

Ruby has literals for just about everything:

  • Strings: single-line, double-line, interpolated
  • Numbers: binary, octal, decimal, hex
  • Null: nil
  • Boolean: true, false
  • Arrays: [1,2], %w(each word is element)
  • Hashes: {key => value} and in Ruby 1.9 {key: value}
  • Regular expressions: /hello/, %r{hello/path}, %r{hello#{interpolated}}
  • Symbols: :name and :"weird string"
  • Block: { block literal }

And I think I’m missing some. While it may seem academic, good, readable literals can increase the programmer’s ability to write short but extremely expressive code. It’s of course possible to achieve the same sorts of things as you can with literal Hashes by instantiating a new Hash object and pushing the keys and values on one at a time, but it reduces their utility as method parameters, for instance.

The terseness of the Hash literal has allowed Ruby programmers to effectively add a limited keyword argument feature to the language without having to get approval by the language designers. Yet another small example of self-hosting.
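For anyone who hasn’t seen the idiom, here is a minimal sketch. The link_to below is a stand-in written for this example, not the Rails helper of the same name.

```ruby
# A trailing Hash parameter acts as a set of optional, named arguments
def link_to(text, url, options = {})
  options = { :class => "link" }.merge(options)
  attributes = options.map { |key, value| %(#{key}="#{value}") }.sort.join(" ")
  %(<a href="#{url}" #{attributes}>#{text}</a>)
end

plain  = link_to("Home", "/")
styled = link_to("Home", "/", :class => "nav", :rel => "home")

puts plain
puts styled
```

Because the braces can be dropped on the final Hash argument, the call site reads as if Ruby had keyword arguments.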

9. Everything is an Object, and All Code is Executed and Has a self

I showed this to some degree earlier, but a lot of the reason that Class bodies work the way they do is a consequence of the unfailing object orientation of the Ruby language. Inside of a class body, Ruby is simply executing code with a self pointing at the class. Additionally, nothing is special about the class context; it is possible to evaluate code in a class’ context from any location. Consider:

module Util
  def self.evaluate(klass)
    klass.class_eval do
      def hello
        puts "#{self} says Hello!" 
      end
    end
  end
end
 
class PersonName < String
  Util.evaluate(self)
end

This is exactly equivalent to:

class PersonName < String
  def hello
    puts "#{self} says Hello!" 
  end
end

By removing the artificial boundaries between code in different locations, Ruby reduces the conceptual overhead of creating abstractions. And this is the result of a strong, consistent object model.

One more example on this topic. This idiom is quite common in Ruby: possibly_nil && possibly_nil.method_name. Since nil is just an object in Ruby, sending it a message it does not understand will result in a NoMethodError. Some developers suggested the following syntax: possibly_nil.try(:method_name). This can be implemented in Ruby as follows:

class Object
  alias_method :try, :__send__
end
 
class NilClass
  # Accept (and ignore) the method name and any arguments
  def try(*)
    nil
  end
end

Essentially, this adds the method try to every Object. When the Object is nil, try simply returns nil. When the object is not nil, try just calls the method in question.

Using targeted application of Ruby’s open classes, combined with the fact that everything in Ruby, including nil, is an object, we were able to create a new Ruby feature. Again, this isn’t such a big deal, but it’s another case where the right choices in the language can allow us to create useful abstractions.

10. Rack

I’m going to cheat a little bit since Rack isn’t part of the Ruby language, but it does demonstrate some useful things about it. First of all, the Rack library only hit 1.0 earlier this year, and already every single Ruby web framework is Rack compliant. If you use a Ruby framework, you can be guaranteed that it uses Rack, and any standard Rack middleware will work.

This was all done without any backward compatibility sacrifices, a tribute to the flexibility of the Ruby language.

Rack itself also leverages Ruby features to do its work. The Rack API looks like this:

Rack::Builder.new do
  use Some::Middleware, param
  use Some::Other::Middleware
  run Application
end

In this brief code snippet, a number of things are at work. First, a block is passed to Rack::Builder. Second, that block is evaluated in the context of a new instance of Rack::Builder, which gives it access to the use and run methods. Third, the parameter passed to use and run is a class literal, which in Ruby is a simple object. This allows Rack to call passed_in_middleware.new(app, param), where new is just a method call on the Class object Some::Middleware.

And in case you think the code to implement that would be heinous, here it is:

class Rack::Builder
  def initialize(&block)
    @ins = []
    instance_eval(&block) if block_given?
  end
 
  def use(middleware, *args, &block)
    @ins << lambda { |app| middleware.new(app, *args, &block) }
  end
 
  def run(app)
    @ins << app #lambda { |nothing| app }
  end
end

This is all that’s needed to implement the code I showed above that creates a new Rack application. Instantiating the middleware chain is a simple affair as well:

def to_app
  inner_app = @ins.last
  @ins[0...-1].reverse_each { |app| inner_app = app.call(inner_app) }
  inner_app
end
 
def call(env)
  to_app.call(env)
end

First, we take the last element from the chain, which is our endpoint. We then loop over the remaining elements in reverse, instantiating each middleware with the next element in the chain, and return the resulting object.

Finally, we define a call method on the Builder, which is required by the Rack specification, that calls to_app and passes the environment through, kicking off the chain.

Through the use of a number of the techniques described in the post, we were able to create a Rack-compliant application that supports Rack middleware in under two dozen lines of code.
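If you want to see the pieces run together without installing Rack, here is a self-contained sketch. MiniBuilder is a hypothetical name that simply combines the methods shown above, and AddHeader is a made-up middleware for the example.

```ruby
class MiniBuilder
  def initialize(&block)
    @ins = []
    instance_eval(&block) if block_given?
  end

  def use(middleware, *args, &block)
    @ins << lambda { |app| middleware.new(app, *args, &block) }
  end

  def run(app)
    @ins << app
  end

  def to_app
    inner_app = @ins.last
    @ins[0...-1].reverse_each { |app| inner_app = app.call(inner_app) }
    inner_app
  end

  def call(env)
    to_app.call(env)
  end
end

# Made-up middleware: adds one header to whatever the inner app returns
class AddHeader
  def initialize(app, name, value)
    @app, @name, @value = app, name, value
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers.merge(@name => @value), body]
  end
end

hello = lambda { |env| [200, { "Content-Type" => "text/plain" }, ["Hello"]] }

app = MiniBuilder.new do
  use AddHeader, "X-Served-By", "mini-builder"
  run hello
end

status, headers, body = app.call({})
puts status
puts headers.inspect
puts body.inspect
```

Note that the block passed to MiniBuilder.new can still see the local variable hello: instance_eval changes self, not the surrounding local scope.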

Rubygems Good Practice

Rubygems provides two things for the Ruby community.

  1. A remote repository/packaging format and installer
  2. A runtime dependency manager

The key to good rubygems practice is to treat these two elements of Rubygems as separate from each other. Someone might use the Rubygems packaging format and the Rubygems distribution but not want to use the Rubygems runtime.

And why should they? The Rubygems runtime is mainly responsible for setting up the appropriate load-paths, and if you are able to get the load paths set up correctly, why should you care about Rubygems at all?

In other words, you should write your libraries so that their only requirement is being in the load path. Users might then use Rubygems to get your library in the load path, or they might check it out of git and add it themselves.

It sounds pretty straight-forward but there are a few common pitfalls:

Using gem inside your gems

It’s reasonably common to see code like this inside of a gem:

gem "extlib", ">= 1.0.8"
require "extlib"

This should be entirely unnecessary. While using Kernel.gem in an application makes perfect sense, gems themselves should use their gem specification to provide dependent versions. When used with Rubygems, Rubygems will automatically add the appropriate dependencies to the load path. When not using Rubygems, the users can add the dependencies themselves.

Keep in mind that whether or not you use Rubygems, you can use require and it will do the right thing. If the file is in the load path (because you put it there or because Rubygems put it there), it will just work. If it’s not in the loadpath, Rubygems will look for a matching gem to add to the load path (by overriding require).
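For reference, declaring the version constraint in the gem specification looks roughly like this. The gem name and summary are placeholders for this sketch; only the extlib dependency comes from the example above.

```ruby
require "rubygems"

spec = Gem::Specification.new do |s|
  s.name    = "my_library" # placeholder name
  s.version = "0.1.0"
  s.summary = "Example library that depends on extlib"

  # The version constraint lives here, not in a Kernel#gem call
  s.add_dependency "extlib", ">= 1.0.8"
end

dependency = spec.dependencies.first
puts "#{dependency.name} #{dependency.requirement}"
```

The library code itself then only needs a plain require "extlib".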

Rescuing from Gem::LoadError

This idiom is also reasonably common:

begin
  gem "my_gem", ">= 1.0.6"
  require "my_gem"
rescue Gem::LoadError
  # handle the error somehow
end

The right solution here is to avoid the gem call, as I said above, and rescue from plain LoadError. The Rubygems runtime sometimes raises Gem::LoadError, but that inherits from regular LoadError, so you’re free to rescue from that and catch cases with and without the rubygems runtime.
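A minimal sketch of the recommended shape follows. optional_require is a hypothetical helper name, and the missing library name is deliberately made up.

```ruby
require "rubygems"

# Returns true if the library could be loaded, false otherwise.
# Rescuing LoadError also covers Gem::LoadError, its subclass.
def optional_require(name)
  require name
  true
rescue LoadError
  false
end

have_set     = optional_require("set")
have_missing = optional_require("a_library_that_should_not_exist_anywhere")

puts have_set
puts have_missing
puts Gem::LoadError.ancestors.include?(LoadError)
```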

Conclusion

Declare your gem version dependencies in your gem specification and use simple requires in your library. If you need to catch the case where the dependency could not be found, rescue from LoadError.

And that’s all there is to it. Your library will work fine with or without the Rubygems runtime :)

Rails 3: The Great Decoupling

In working on Rails 3 over the past 6 months, I have focused rather extensively on decoupling components from each other.

Why should ActionController care whether it’s talking to ActionView or just something that duck-types like ActionView? Of course, the key to making this work well is to keep the interfaces between components as small as possible, so that implementing an ActionView lookalike is a matter of implementing just a few methods, not dozens.

While I was preparing for my talk at RubyKaigi, I was trying to find the smallest possible examples that demonstrate some of this stuff. It went really well, but I noticed a few areas that could be improved even further, producing an even more compelling demonstration.

This weekend, I focused on cleaning up those interfaces, so we have small and clearly documented mechanisms for interfacing with Rails components. I want to focus on ActionView in this post, which I’ll demonstrate with an example.

$:.push "rails/activesupport/lib"
$:.push "rails/actionpack/lib"
 
require "action_controller"
 
class Kaigi < ActionController::Http
  include AbstractController::Callbacks
  include ActionController::RackConvenience
  include ActionController::Renderer
  include ActionController::Layouts
  include ActionView::Context
 
  before_filter :set_name
  append_view_path "views"
 
  def _action_view
    self
  end
 
  def controller
    self
  end
 
  DEFAULT_LAYOUT = Object.new.tap {|l| def l.render(*) yield end }
 
  def _render_template_from_controller(template, layout = DEFAULT_LAYOUT, options = {}, partial = false)
    ret = template.render(self, {})
    layout.render(self, {}) { ret }
  end
 
  def index
    render :template => "template"
  end
 
  def alt
    render :template => "template", :layout => "alt"
  end
 
  private
  def set_name
    @name = params[:name]
  end
end
 
app = Rack::Builder.new do
  map("/kaigi") {  run Kaigi.action(:index) }
  map("/kaigi/alt") { run Kaigi.action(:alt) }
end.to_app
 
Rack::Handler::Mongrel.run app, :Port => 3000

There’s a bunch going on here, but the important thing is that you can run this file with just ruby, and it’ll serve up /kaigi and /kaigi/alt. It will serve templates from the local “/views” directory and handle before filters correctly.

Let’s look at this a piece at a time:

$:.push "rails/activesupport/lib"
$:.push "rails/actionpack/lib"
 
require "action_controller"

This is just boilerplate. I symlinked rails to a directory under this file and required action_controller. Note that simply requiring ActionController is extremely cheap: no features have been used yet.

class Kaigi < ActionController::Http
  include AbstractController::Callbacks
  include ActionController::RackConvenience
  include ActionController::Renderer
  include ActionController::Layouts
  include ActionView::Context
end

I inherited my class from ActionController::Http. I then included a number of features, including Rack convenience methods (request/response), the Renderer, and Layouts. I also made the controller itself the view context. I will discuss this more in just a moment.

  before_filter :set_name

This is the normal Rails before_filter. I didn’t need to do anything else to get this functionality other than include AbstractController::Callbacks.

  append_view_path "views"

Because we’re not in a Rails app, our view paths haven’t been pre-populated. No problem: it’s just a one-liner to set them ourselves.

The next part is the interesting part. In Rails 3, while ActionView::Base remains the default view context, the interface between ActionController and ActionView is extremely well defined. Specifically:

  • A view context must include ActionView::Context. This just adds the compiled templates, so they can be called from the context
  • A view context must provide a _render_template_from_controller method, which takes a template object, a layout, and additional options
  • A view context may optionally also provide a _render_partial_from_controller, to handle render :partial => @some_object
  • In order to use ActionView::Helpers, a view context must have a pointer back to its original controller

That’s it! That’s the entire ActionController<=>ActionView interface.

  def _action_view
    self
  end
 
  def controller
    self
  end

Here, we specify that the view context is just self, and define controller, required by view contexts. Effectively, we have merged the controller and view context (mainly just to see if it could be done ;) )

  DEFAULT_LAYOUT = Object.new.tap {|l| def l.render(*) yield end }

Next, we make a default layout. This is just a simple object that provides a render method that yields to the block. It will simplify the next method:

  def _render_template_from_controller(template, layout = DEFAULT_LAYOUT, options = {}, partial = false)
    ret = template.render(self, {})
    layout.render(self, {}) { ret }
  end

Here, we supply the required _render_template_from_controller. The template object that is passed in is a standard Rails Template which has a render method on it. That method takes the view context and any locals. For this example, we pass in self as the view context, and do not provide any locals. Next, we call render on the layout, passing in the return value of template.render. The reason we created a default is to make the case of a layout identical to the case without.
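The default-layout trick is independent of Rails, so here it is in isolation. FakeTemplate, WrappingLayout and render_with are stand-ins made up for this sketch; only DEFAULT_LAYOUT and the two-line render body mirror the code above.

```ruby
# A layout that contributes nothing: it just yields to the template output
DEFAULT_LAYOUT = Object.new.tap { |l| def l.render(*) yield end }

# Stand-in for a compiled template
class FakeTemplate
  def render(context, locals)
    "Hello"
  end
end

# Stand-in for a real layout that wraps the template output
class WrappingLayout
  def render(context, locals)
    "<main>#{yield}</main>"
  end
end

# Same shape as _render_template_from_controller: no branch on "is there a layout?"
def render_with(template, layout = DEFAULT_LAYOUT)
  ret = template.render(self, {})
  layout.render(self, {}) { ret }
end

bare    = render_with(FakeTemplate.new)
wrapped = render_with(FakeTemplate.new, WrappingLayout.new)

puts bare
puts wrapped
```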

  def index
    render :template => "template"
  end
 
  def alt
    render :template => "template", :layout => "alt"
  end
 
  private
  def set_name
    @name = params[:name]
  end

This is a standard Rails controller.

app = Rack::Builder.new do
  map("/kaigi") {  run Kaigi.action(:index) }
  map("/kaigi/alt") { run Kaigi.action(:alt) }
end.to_app
 
Rack::Handler::Mongrel.run app, :Port => 3000

Finally, rather than use the Rails router, we just wire the controller up directly using Rack. In Rails 3, ControllerName.action(:action_name) returns a rack-compatible endpoint, so we can wire them up directly.

And that’s all there is to it!

Note: I’m not sure if I still need to say this, but stuff like this is purely a demonstration of the power of the internals, and does not reflect changes to the public API or the way people use Rails by default. Everyone on the Rails team is strongly committed to retaining the same excellent startup experience and set of good conventional defaults. That will not be changing in 3.0.

What do we need to get on Ruby 1.9?

A year ago, I was very skeptical of Ruby 1.9. There were a lot of changes in it, and it seemed like it was going to be a mammoth job to get things running on it. The benefits did not seem to outweigh the costs of switching, especially since Ruby 1.9 was not yet adequately stable to justify the big switch.

At this point, however, it seems as though Ruby 1.9 has stabilized (with 1.9.2 on the horizon), and there are some benefits that seem to obviously justify a switch (such as fast, integrated I18n, better performance in general, blocks that can have default arguments and take blocks, etc.).

Perhaps more importantly though, Ruby’s language implementors have shifted their focus to Ruby 1.9. It has become increasingly difficult to get enhancements in Ruby 1.8, because it is no longer trunk Ruby. Getting community momentum behind Ruby 1.9 would enable us to make productive suggestions to Matz and the other language implementors. Instead, we seem to get a new monthly patch fixing Ruby 1.8.

So my question is: what do we as a community need to shift momentum to 1.9? I don’t want a generic answer, like “we need to feel good about it”. I’m asking you what is stopping you today from using Ruby 1.9 for your next project. Is there a library that doesn’t work? Is there a new language feature that causes so much disruption to your existing programming patterns as to make a switch untenable?

I suspect that we are all just comfortable in Ruby 1.8, but would actually be mostly fine upgrading to Ruby 1.9. I also suspect that there are small issues I’m not personally aware of, but which have blocked some of you from upgrading. Rails 2.3 and 3.0 (edge) work fine on Ruby 1.9, and I’d like to see what we can do to make Ruby 1.9 a good recommended option for new projects.

Thoughts?

Rails Bundling — Revisited

One of the things I spent quite a bit of time on in Merb was trying to get a working gem bundler that we could be really proud of. Merb had particular problems because of the sheer number of gems with webbed dependencies, so we hit some limitations of Rubygems quite early. Eventually, we settled on a solution with the following characteristics:

  • A single dependencies file that listed out the required dependencies for the application
  • Only required that the gems you cared about were listed. All dependencies of those gems were resolved automatically
  • A task to go through the dependencies, get the appropriate gems, and install them (merb:gem:install)
  • Gems were installed in a standard rubygems structure inside the application, so normal Rubygems could activate and run them
  • Only the .gem files were stored in source control. These files were expanded out to their full structures on each new machine (via the merb:gem:redeploy task). This allowed us to support native gems without any additional trouble
  • When gems were re-expanded, they took into consideration gems that were already present, meaning that running the deployment task when there were no new gems added to the repo took no time at all (so it could just be added to the normal cap task).

Most importantly, the Merb bundling system relied on a mandatory one-version-per-gem rule that was enforced by keeping the dependencies file in sync with the .gem files in gems/cache. In other words, it would be impossible to have leftover gems or gem activation problems with this system.

There were, however, some flaws. First of all, it was a first pass, before we knew Rubygems all that well. As a result, the code is more clumsy than it needed to be to achieve the task in question. Second, it was coupled to Merb’s dependencies DSL and runtime loader (as well as thor), making it somewhat difficult to port to Rails cleanly. It has since been ported, but it is not really possible to maintain the underlying bundling bits independent of the Rails/Merb parts.

Most importantly, while we did solve the problem of conflicting gems to a reasonable extent, it was still somewhat possible to get into a conflicting state at installation time, even if a possible configuration could be found.

For Rails, we’ve discussed hoisting as much of this stuff as possible into Rubygems itself or a standard library that Rails could interact with, that could also be used by others who wished to bundle gems in with an application. And we have a number of projects at Engine Yard that could benefit from a standard bundler that was not directly coupled with Rails or Merb.

It’s too early to really use it for anything, but Carl and I have made a lot of progress on a gem bundler along these lines. A big part of the reason this is possible is a project I worked on with Tim Carey-Smith a while back (he really did most of the work) called GemResolver. GemResolver takes a set of dependencies and a gem source index and returns back a list of all of the gems, including their dependencies, that need to be installed to satisfy the original list. It does a search of all options, so even if the simple solution would have resulted in the dreaded activation error, it will still be able to find a solution if one exists.
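To give a feel for what “a search of all options” means, here is a toy backtracking resolver. It is only a sketch and not GemResolver’s actual algorithm or API: INDEX is a hypothetical in-memory source index with invented gem names, and resolve is a made-up function. The only real APIs used are Gem::Version and Gem::Requirement.

```ruby
require "rubygems"

# Hypothetical index: name => { version => { dependency name => requirement } }
INDEX = {
  "app_server" => { "2.0" => { "rack_like" => ">= 1.1" },
                    "1.0" => { "rack_like" => "< 1.1" } },
  "framework"  => { "3.0" => { "rack_like" => "~> 1.0.0" } },
  "rack_like"  => { "1.1" => {}, "1.0.1" => {}, "1.0" => {} }
}

# pending: array of [name, requirement] pairs still to satisfy
# chosen:  name => version picked so far
# Returns a complete name => version Hash, or nil if no combination works.
def resolve(pending, chosen = {})
  return chosen if pending.empty?

  (name, requirement), *rest = pending
  req = Gem::Requirement.new(requirement)

  if chosen.key?(name)
    ok = req.satisfied_by?(Gem::Version.new(chosen[name]))
    return ok ? resolve(rest, chosen) : nil
  end

  # Try the newest version first, backing up to older ones on conflict
  candidates = INDEX.fetch(name, {}).keys.sort_by { |v| Gem::Version.new(v) }.reverse
  candidates.each do |version|
    next unless req.satisfied_by?(Gem::Version.new(version))
    solution = resolve(rest + INDEX[name][version].to_a, chosen.merge(name => version))
    return solution if solution
  end
  nil
end

solution = resolve([["app_server", ">= 0"], ["framework", ">= 0"]])
puts solution.inspect
```

Picking the newest app_server first leads to a conflict over rack_like, which is the situation that produces an activation error at runtime. The search backs up and settles on the older app_server instead.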

Unlike the Merb bundler, the new bundler does not assume a particular DSL for specifying dependencies, making it suitable for use with Rails, Merb or other projects that wish to have their own DSL for interacting with the bundler. It works as follows:

  • A Manifest object that receives a list of Rubygems sources and dependencies for the application
  • The bundler then fetches the full gem list from each of the sources and resolves the dependencies using GemResolver (which we have merged into the bundler)
  • Once the list is determined, each of the .gem files is retrieved from their sources and stashed
  • Next, each gem is installed, without the need to download their dependencies, since the resolution process has already occurred. This guarantees a single version per gem and a working environment that will not produce activation errors in any circumstance
  • This second step can be run in isolation from the first, so it is possible to expand the gems on remote machines. This means that you can store just the necessary .gem files in version control, and be entirely isolated from network dependencies for deployments
  • Both the fetching and installation steps will not clobber existing .gem files or installed gems, so if there are no new gems, those steps take no time
  • After installation is complete, environment-specific load-path files are created, which means the bundler will be able to work with or without Rubygems, even though the installed gems are still inside a normal Rubygems structure.

I am providing all this detail for the curious. In the end, as a user, your experience will be quite simple:

  1. List out your dependencies, including what environments those dependencies should be used in
  2. Run rake gem:install
  3. Run your Rails app

In other words, quite similar to the existing gem bundling solution, with fewer warts, and a standard system that you can use outside of Rails if you want to.

New Rails Isolation Testing

A little while ago, Carl and I started digging into Rails’ initializer. We already made a number of improvements, such as adding the ability to add a new initializer at any step in the process, and to make it possible to have multiple initializers in a single process. The second improvement is the first step toward running multiple Rails apps in a single process, which requires moving all global Rails state into instances of objects, so each application can have its own contained configuration in its own object. More on this in the next few weeks.

As I detailed on the Engine Yard blog this week, when moving into a new area to refactor, it’s important to make sure that there are very good tests. Although the Rails initializer tests covered a fair amount of area, successfully getting the tests to pass did not guarantee that Rails booted. Thankfully, Sam Ruby’s tests were comprehensive enough to get us through the initial hump.

After making the initial change, we went back to see what we could do to improve the test suite. The biggest problem was a problem we’d already encountered in Merb: you can’t uninitialize Rails. Once you’ve run through the initialization process, many of the things that happen are permanent.

Our solution, which we committed to master today, is to create a new test mixin that runs each test case in its own process. Getting it working on OSX wasn’t trivial, but it was pretty elegant once we got down to it. All we did was override the run method on TestCase to fork before actually running the test. The child then runs the test (and makes whatever invasive changes it needs to), and communicates any methods that were called on the Test::Unit result object back to the parent.

The parent then replays those methods, which means that as far as the parent is concerned, all of the cases are part of a single suite, even though they are being run in a separate process. Figuring out what parts of Test::Unit to hook into took all of yesterday afternoon, but once we were done, it was only about 40 lines of code.
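The core of the forking approach can be sketched without Test::Unit at all. The run_in_isolation helper below is a made-up name, and it sends back a single marshaled result rather than replaying calls on a result object as the real module does. It assumes a platform where fork is available.

```ruby
# Runs the block in a forked child and returns [status, value] to the parent
def run_in_isolation
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    result = begin
      [:ok, yield]
    rescue Exception => e
      [:error, e.message]
    end
    writer.write(Marshal.dump(result))
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end

  writer.close
  payload = reader.read
  reader.close
  Process.wait(pid)
  Marshal.load(payload)
end

$booted = false

# The child makes an "invasive" global change; the parent never sees it
status, value = run_in_isolation do
  $booted = true
  $booted
end

puts status
puts value
puts $booted
```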

Today, we tackled getting the same module to work in environments that don’t support forking, like JRuby and Windows. Unfortunately, these environments are going to run these tests significantly more slowly, because they have to boot up a full process for each test case, where the forking version can simply use the setup already done in the parent process (which makes it almost as fast as running all the tests in a single process).

The solution was to emulate forking by shelling out to a new process that was identical to the one that was just launched, but with an extra constraint on the test name (basically, booting up the test suite multiple times, but each run only runs a single test). The subprocess then communicates back to the launching process using the same protocol as in the forking model, which means that we only had to change the code that ran the tests in isolation; everything else remains the same.
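In sketch form, the subprocess variant looks something like the following. The child script here is a trivial stand-in for “boot the suite and run one test”; only the ISOLATION_TEST variable name comes from the real code, and the marshaled [name, status] pair is an assumption made for this example.

```ruby
require "rbconfig"

# Stand-in for the relaunched test process: it reads the extra constraint
# from the environment and reports back over stdout
child_script = <<-'RUBY'
  result = [ENV["ISOLATION_TEST"], :passed]
  $stdout.binmode
  $stdout.write(Marshal.dump(result))
RUBY

env     = { "ISOLATION_TEST" => "test_boot" }
command = [env, RbConfig.ruby, "-e", child_script]

# Launch a fresh interpreter and read its marshaled result
payload = IO.popen(command, "rb") { |io| io.read }
name, status = Marshal.load(payload)

puts name
puts status
```

Since both variants hand the parent the same kind of marshaled payload, the code that consumes the result does not need to know which one produced it.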

There was one final caveat, however. It turns out that in Test::Unit, using a combination of -t to specify the test case class and -n to specify the test case name doesn’t work. Test::Unit’s semantics are to include any test for which ANY of the appropriate filters match. I’m not proud of this, but what we did was a small monkey-patch of the Test::Unit collector in the subprocess only which does the right thing:

# Only in subprocess for windows / jruby.
if ENV['ISOLATION_TEST']
  require "test/unit/collector/objectspace"
  class Test::Unit::Collector::ObjectSpace
    def include?(test)
      super && test.method_name == ENV['ISOLATION_TEST']
    end
  end
end

Not great, but all in all, not all that much code (the entire module, including both forking and subprocess methods, is just 98 lines of code).

A crazy couple of days yielding a pretty epic hack, but it works!
