Yehuda Katz is a member of the Ember.js, Ruby on Rails and jQuery Core Teams; he spends his daytime hours at the startup he founded, Tilde Inc. Yehuda is co-author of the best-selling jQuery in Action and Rails 3 in Action. He spends most of his time hacking on open source (his main projects include Thor, Handlebars and Janus) or traveling the world doing evangelism work. He can be found on Twitter as @wycats and on GitHub.

Ruby 2.0 Refinements in Practice

First Shugo announced them at RubyKaigi. Then Matz showed some improved syntax at RubyConf. But what are refinements all about, and what would they be used for?

The first thing you need to understand is that the purpose of refinements in Ruby 2.0 is to make monkey-patching safer. Specifically, the goal is to make it possible to extend core classes, but to limit the effect of those extensions to a particular area of code. Since the purpose of this feature is to make monkey-patching safer, let’s take a look at a dangerous case of monkey-patching and see how this new feature would improve the situation.

A few months ago, I encountered a problem where some accidental monkey-patches in Right::AWS conflicted with Rails’ own monkey-patches. In particular, here is their code:

unless defined? ActiveSupport::CoreExtensions
  class String #:nodoc:
    def camelize()
      self.dup.split(/_/).map{ |word| word.capitalize }.join('')
    end
  end
end

Essentially, Right::AWS is trying to make a few extensions available to itself, but only if they were not defined by Rails. In that case, they assume that the Rails version of the extension will suffice. They did this quite some time ago, so these extensions represent a pretty old version of Rails. They assume (without any real basis) that every future version of ActiveSupport will return an expected value from camelize.
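To make the guard concrete, here is a minimal sketch of how a `defined?` check on a constant behaves. `FakeSupport` is a hypothetical stand-in used only for illustration; it is not part of ActiveSupport or Right::AWS.

```ruby
# Hypothetical stand-in for a framework that ships string helpers but
# has no CoreExtensions constant.
module FakeSupport
  module Inflector; end
end

# defined? returns nil (it does not raise) when the constant is missing,
# and the string "constant" when it exists.
missing = defined?(FakeSupport::CoreExtensions)
present = defined?(FakeSupport::Inflector)

# The guard looks only at one constant name. It says nothing about
# whether String already has a camelize method from somewhere else.
guard_fires = missing.nil?

puts guard_fires      # prints true
puts present          # prints constant
```

A guard like this is only as stable as the constant it tests for.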

Unfortunately, Rails 3 changed the internal organization of ActiveSupport, and removed the constant name ActiveSupport::CoreExtensions. As a result, these monkey-patches got activated. Let’s take a look at what the Rails 3 version of the camelize helper looks like:

class String
  def camelize(first_letter = :upper)
    case first_letter
      when :upper then ActiveSupport::Inflector.camelize(self, true)
      when :lower then ActiveSupport::Inflector.camelize(self, false)
    end
  end
end
 
module ActiveSupport
  module Inflector
    extend self
 
    def camelize(lower_case_and_underscored_word, first_letter_in_uppercase = true)
      if first_letter_in_uppercase
        lower_case_and_underscored_word.to_s.gsub(/\/(.?)/) { "::#{$1.upcase}" }.gsub(/(?:^|_)(.)/) { $1.upcase }
      else
        lower_case_and_underscored_word.to_s[0].chr.downcase + camelize(lower_case_and_underscored_word)[1..-1]
      end
    end
  end
end

There are a few differences here, but the most important one is that in Rails, “foo/bar” becomes “Foo::Bar”. The Right::AWS version converts that same input into “Foo/bar”.
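The difference is easy to reproduce outside of Rails. The sketch below copies the two transformations into standalone lambdas (the lambda names are mine, and only the upper-case branch of the Rails version is included):

```ruby
# Right::AWS-style: split on underscores only, capitalize each piece.
right_aws_camelize = lambda do |string|
  string.dup.split(/_/).map { |word| word.capitalize }.join("")
end

# Rails 3-style, upper-case branch: first turn "/x" into "::X",
# then upcase the first letter and any letter following an underscore.
rails_camelize = lambda do |string|
  string.to_s.gsub(/\/(.?)/) { "::#{$1.upcase}" }.gsub(/(?:^|_)(.)/) { $1.upcase }
end

puts right_aws_camelize.call("admin/posts")  # prints Admin/posts
puts rails_camelize.call("admin/posts")      # prints Admin::Posts
puts right_aws_camelize.call("foo_bar")      # prints FooBar
puts rails_camelize.call("foo_bar")          # prints FooBar
```

The two agree on plain underscored words and only diverge once a slash appears, which is why the conflict is easy to miss.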

Now here’s the wrinkle. The Rails router uses camelize to convert controller paths like “admin/posts” to “Admin::Posts”. Because Right::AWS overrides camelize with this (slightly) incompatible implementation, the Rails router ends up trying to find an “Admin/posts” constant, which Ruby correctly complains isn’t a valid constant name. While situations like this are rare, it’s mostly because of an extremely diligent library community, and a general eschewing of applying these kinds of monkey-patches in library code. In general, Right::AWS should have done something like Right::Utils.camelize in their code to avoid this problem.
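As a rough sketch of what such a utility could look like (the body is my assumption, not code from Right::AWS), the helper takes the string as an argument instead of living on String:

```ruby
module Right
  module Utils
    # Same logic as the String patch, but namespaced under the library,
    # so nothing is added to String itself.
    def self.camelize(string)
      string.split(/_/).map { |word| word.capitalize }.join
    end
  end
end

puts Right::Utils.camelize("access_key_id")  # prints AccessKeyId
puts "".respond_to?(:camelize)               # prints false (String is untouched)
```

The call site is a little less pretty than `"access_key_id".camelize`, which is exactly the trade-off refinements aim to remove.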

Refinements allow us to make these kinds of aesthetically pleasing extensions for our own code with the guarantee that they will not affect any other Ruby code.

First, instead of directly reopening the String class, we would create a refinement in the ActiveSupport module:

module ActiveSupport
  refine String do
    def camelize(first_letter = :upper)
      case first_letter
        when :upper then ActiveSupport::Inflector.camelize(self, true)
        when :lower then ActiveSupport::Inflector.camelize(self, false)
      end
    end
  end
end

What we have done here is define a String refinement that we can activate elsewhere with the using method. Let’s use the refinement in the router:

module ActionDispatch
  module Routing
    class RouteSet
      using ActiveSupport
 
      def controller_reference(controller_param)
        unless controller = @controllers[controller_param]
          controller_name = "#{controller_param.camelize}Controller"
          controller = @controllers[controller_param] =
            ActiveSupport::Dependencies.ref(controller_name)
        end
        controller.get
      end      
    end
  end
end
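For readers who want to try this without Rails, here is a self-contained sketch of the same shape. `CamelizeRefinement` and `Router` are hypothetical names, and it assumes a Ruby release in which `using` may be called inside a class or module body; the feature changed in several ways after this post was written.

```ruby
module CamelizeRefinement
  refine String do
    # Path-aware camelize, visible only where the refinement is activated.
    def camelize
      gsub(/\/(.?)/) { "::#{$1.upcase}" }.gsub(/(?:^|_)(.)/) { $1.upcase }
    end
  end
end

class Router
  using CamelizeRefinement   # active until the end of this class body

  def controller_name(param)
    "#{param.camelize}Controller"
  end
end

puts Router.new.controller_name("admin/posts")  # prints Admin::PostsController
puts "admin/posts".respond_to?(:camelize)       # prints false
```

Outside the `Router` body, String has no `camelize` method at all.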

It’s important to note that the refinement only applies to methods physically inside the same block. It will not apply to other methods in ActionDispatch::Routing::RouteSet defined in a different block. This means that we can use different refinements for different groups of methods in the same class, by defining the methods in different class blocks, each with their own refinements. So if I reopened the RouteSet class somewhere else:

module ActionDispatch
  module Routing
    class RouteSet
      using RouterExtensions
 
      # I can define a special version of camelize that will be used
      # only in methods defined in this physical block
      def route_name(name)
        name.camelize
      end
    end
  end
end
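A runnable sketch of that idea, using hypothetical `UpperCamel`, `LowerCamel` and `Namer` names and again assuming a Ruby release that permits `using` inside a class body:

```ruby
module UpperCamel
  refine String do
    def camelize
      split("_").map(&:capitalize).join
    end
  end
end

module LowerCamel
  refine String do
    def camelize
      first, *rest = split("_")
      first + rest.map(&:capitalize).join
    end
  end
end

class Namer
  using UpperCamel           # applies to methods in this block only

  def class_name(name)
    name.camelize
  end
end

class Namer                  # reopened: a separate physical block
  using LowerCamel

  def method_name(name)
    name.camelize
  end
end

namer = Namer.new
puts namer.class_name("route_set")   # prints RouteSet
puts namer.method_name("route_set")  # prints routeSet
```

Both methods call `camelize` on the same kind of object, yet each sees the version activated in its own block.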

Getting back to the real-life example, even though Right::AWS created a global version of camelize, the ActiveSupport version (applied via using ActiveSupport) will be used. This means that we are guaranteed that our code (and only our code) uses the special version of camelize.
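Here is a small sketch of that precedence. The module names are hypothetical, and the global patch is a simplified copy of the Right::AWS one:

```ruby
# A library's global monkey-patch, visible to every caller.
class String
  def camelize
    split("_").map(&:capitalize).join
  end
end

module PathAwareCamelize
  refine String do
    def camelize
      gsub(/\/(.?)/) { "::#{$1.upcase}" }.gsub(/(?:^|_)(.)/) { $1.upcase }
    end
  end
end

module RouterCode
  using PathAwareCamelize    # the refinement wins inside this body

  def self.constant_name(path)
    path.camelize
  end
end

module OtherLibrary          # no refinement active here
  def self.constant_name(path)
    path.camelize
  end
end

puts RouterCode.constant_name("admin/posts")    # prints Admin::Posts
puts OtherLibrary.constant_name("admin/posts")  # prints Admin/posts
```

The global patch still exists, but code that activates the refinement no longer depends on it.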

It’s also important to note that only explicit calls to camelize in the physical block will use the special version. For example, let’s imagine that some library defines a global method called constantize, and uses a camelize refinement:

module Protection
  refine String do
    def camelize()
      self.dup.split(/_/).map{ |word| word.capitalize }.join('')
    end
  end
end
 
class String #:nodoc:
  using Protection
 
  def constantize
    Object.module_eval("::#{camelize}", __FILE__, __LINE__)
  end
end

Calling String#constantize anywhere will internally call the String#camelize from the Protection refinement to do some of its work. Now let’s say we create a String refinement with an unusual camelize method:

module Wycats
  refine String do
    def camelize
      result = dup.split(/_/).map(&:capitalize).join
      "_#{result}_"
    end
  end
end
 
module Factory
  using Wycats
 
  def self.create(class_name, string)
    klass = class_name.constantize
    klass.new(string.camelize)
  end
end
 
class Person
  def initialize(string)
    @string = string
  end
end
 
Factory.create("Person", "wycats")

Here, the Wycats refinement should not leak into the call to constantize. If it did, it would mean that any call into any method could leak a refinement into that method, which is the opposite of the purpose of the feature. Once you realize that refinements apply lexically, you can see that they provide a very orderly, easy-to-understand way to apply targeted monkey-patches to an area of code.
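The following is a simplified, runnable variant of the example above. It assumes a plain global `camelize` in place of the Protection refinement, uses `Object.const_get` for the lookup, and names the refinement module `Shout`; all of these are my substitutions for illustration.

```ruby
module Shout
  refine String do
    def camelize
      "_#{split('_').map(&:capitalize).join}_"
    end
  end
end

class String
  # Global versions, defined outside any `using` scope.
  def camelize
    split("_").map(&:capitalize).join
  end

  def constantize
    Object.const_get(camelize)   # resolves to the global camelize
  end
end

class Person
  attr_reader :label

  def initialize(label)
    @label = label
  end
end

module Factory
  using Shout

  def self.create(class_name, string)
    klass = class_name.constantize   # the refinement does not leak in here
    klass.new(string.camelize)       # the refinement applies to this call
  end
end

person = Factory.create("person", "wycats")
puts person.class   # prints Person
puts person.label   # prints _Wycats_
```

If the refinement leaked into `constantize`, the lookup would be for `_Person_` and would raise a NameError.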

In my opinion, the most important feature of refinements is that you can see the refinements that apply to a chunk of code (delineated by a physical class body). This allows you to be sure that the changes you are making only apply where you want them to apply, and makes refinements a real solution to the general problem of wanting aesthetically pleasing extensions with the guarantee that you can’t break other code. In addition, refinements protect diligent library authors even when other library authors (or app developers) make global changes, which makes it possible to use the feature without system-wide adoption. I, for one, am looking forward to it.

Postscript

There is one exception to the lexical rule, which is that refinements are inherited from the calling scope when using instance_eval. This actually gives rise to some really nice possibilities, which I will explore in my next post.

10 Responses to “Ruby 2.0 Refinements in Practice”

I agree the refinements patch looks interesting, but this comment caught my eye:

“In general, Right::AWS should have done something like Right::Utils.camelize in their code to avoid this problem.”

Why was the onus on Right::AWS to use Right::Utils.camelize and not on ActiveSupport to do something similar?

Excellent idea! This will be very useful. Reminds me of Scala’s implicits, except that instead of using “using”, one would simply import the module that defined the refinement.

In my mind, in Ruby < 2.0, there's a category of library which is "provide a number of useful core extensions." The three major ones are ActiveSupport, Extlib and Facets. In general, applications need to choose one, and no more than one, of these libraries to avoid conflicts. These libraries are designed to be used in applications, and things get a bit murky when they are used as dependencies of other standalone libraries. For instance, using DataMapper with Rails was a problem for a while because of conflicts between Extlib and ActiveSupport. They eventually removed the dependency on Extlib, but it definitely illustrates the larger systemic issue that refinements are trying to solve.

On the other hand, Right::AWS is not a library that provides core extensions, nor does it depend on such a library (if it did, it would end up in the same place as DataMapper and ActiveMerchant, which is an unsatisfying but somewhat more grey area). Since it only creates the helpers for its own use, it should use utility methods to avoid the problem described in this post.

OK, I believe in the use case now. I think the syntax would make more sense if #using took a block (`using Foo do … end`) and the contents of the block were what got refined (is that the right way to say it?). Or if it were a keyword construct like `if` and the scope would be like `using Foo … end`. I’m thinking more about implementation than syntactic sugar, so I’ll defer to Charlie and Evan.

It seems like this will solve one of the uglier things in Rails – the leakage of internal AssociationProxy methods that makes them visible outside the proxy. charity.campaign.target should be the target attribute of the campaign, not the associated model of the campaign association proxy.

I posted my concerns about refinements to the ruby-core list:

http://groups.google.com/group/ruby-core-google/msg/7ccc375905dda23c

In short, I think they need to be truly lexically scoped, or there will be intractable performance and/or concurrency issues that result. It should not be possible for a given scope to become “refined” long after it has been parsed, as in the instance_eval cases.

“It’s important to note that the refinement only applies to methods physically inside the same block.”

Actually no, it applies to all methods of the class or its subclasses that are defined after the “using” is evaluated, which may be at least partly a bug.

I suspect that when the dust settles, it will end up working like you described. Anything else is too chaotic.

@James Healy I think it is more that people expect a certain behavior when they run activesupport. As you can see, Right::AWS was expecting that the person would probably be running activesupport and would have that method available. If it did not detect that, it opened String to put camelize in for its own usage (and thereby added it to String globally), which is not what someone running this gem would expect. When people run activesupport they expect the class to be opened and methods to be added.

If Right::AWS depended on camelize for its own usage it should either depend on activesupport or implement it itself in a util module. (as it was looking for activesupport) I personally think activesupport should not have a monopoly on opening classes for itself either. When I use activesupport I am very careful to require only what I need because I do not like how it opens a lot of classes.

I like the article and discussion, but it may be misleading since it is redefining what ActiveSupport really is.

“A toolkit of support libraries and Ruby core extensions extracted from the Rails framework.”

Having to “use” a bunch of AS modules in everything would be really annoying, and introduce a lot of duplication in code (and maybe slow things down). Liking or disliking that idea is outside of the scope of this (probably).

But for the proposed changes to ruby core by adding “refinements”, I like that idea. I find the syntax strange, since what you are doing isn’t really “refining” a class, it’s like extending it in most cases. I really don’t care too much about what it’s called, just a thought. I like the #used callback and the #using syntax since it will clearly show that a module is being added that changes other classes.

I would love to see refine work on singleton classes; as far as I can tell, right now it does not.

class FooController < ApplicationController
  before_filter do
    singleton_class.class_eval %{
      refine Foo do
        def bar
          "beer"
        end
      end
    }
  end

  def index
    render text: Foo.new.bar # returns "baz", but should return "beer"
  end
end

This is a cool feature. Objective-C has also long suffered from the monkey patching issue. Hopefully it can be implemented efficiently. I’m also looking forward to named arguments. That will make MacRuby and RubyMotion feel more natural.

Long term, I hope Ruby finds some ways to increase performance for lower level programming. I’d like to use Ruby for everything. It is getting faster, but still not ideal for some types of algorithms. I would also love to see compile time (or load time when interpreted) eval expressions as another option when implementing DSLs. That could improve performance significantly for many DSLs.

My pipe dream is to see Ruby become more of a multi-paradigm language. Smalltalk style programming should always be the core, but adding let expressions (and guards) would be cool. I could envision them defined in classes, modules, and methods. Those expressions could have a Haskell-style static type system. Static typing at the expression level would not interfere with standard duck typing. A Haskell-style type system for expressions could improve performance and abstraction at the expression level. For performance reasons, there would probably need to be limitations on re-definition. I think it would be the best of both worlds. Kind of like OCaml in reverse.
