Yehuda Katz is a member of the Ember.js, Ruby on Rails and jQuery Core Teams; he spends his daytime hours at the startup he founded, Tilde Inc. Yehuda is co-author of the best-selling jQuery in Action and Rails 3 in Action. He spends most of his time hacking on open source—mainly his projects, like Thor, Handlebars and Janus—or traveling the world doing evangelism work. He can be found on Twitter as @wycats and on GitHub.

Archive for the ‘Ruby on Rails’ Category

Rubygems Good Practice

Rubygems provides two things for the Ruby community.

  1. A remote repository/packaging format and installer
  2. A runtime dependency manager

The key to good rubygems practice is to treat these two elements of Rubygems as separate from each other. Someone might use the Rubygems packaging format and the Rubygems distribution but not want to use the Rubygems runtime.

And why should they? The Rubygems runtime is mainly responsible for setting up the appropriate load-paths, and if you are able to get the load paths set up correctly, why should you care about Rubygems at all?

In other words, you should write your libraries so that their only requirement is being in the load path. Users might then use Rubygems to get your library in the load path, or they might check it out of git and add it themselves.
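For example, a user who checks your library out of git can make it usable with nothing more than a load-path tweak (a minimal sketch; the checkout location here is hypothetical):

$LOAD_PATH.unshift File.expand_path("vendor/extlib/lib", File.dirname(__FILE__))
require "extlib"

Your library cannot tell the difference, and it shouldn’t have to.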

It sounds pretty straightforward, but there are a few common pitfalls:

Using gem inside your gems

It’s reasonably common to see code like this inside of a gem:

gem "extlib", ">= 1.0.8"
require "extlib"

This should be entirely unnecessary. While using Kernel.gem in an application makes perfect sense, gems themselves should declare their version requirements in their gem specification. When a gem is installed via Rubygems, Rubygems will automatically add the appropriate dependencies to the load path. When Rubygems is not in use, users can add the dependencies themselves.
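In practice, that means the dependency lives in the gemspec, not in your library code. A minimal sketch (the gem name and version numbers are illustrative):

Gem::Specification.new do |s|
  s.name    = "my_gem"
  s.version = "1.0.0"

  # Rubygems resolves this at install and activation time, so the
  # library itself only ever needs a plain require "extlib"
  s.add_dependency "extlib", ">= 1.0.8"
end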

Keep in mind that whether or not you use Rubygems, you can use require and it will do the right thing. If the file is in the load path (because you put it there or because Rubygems put it there), it will just work. If it’s not in the load path, Rubygems will look for a matching gem to add to the load path (by overriding require).

Rescuing from Gem::LoadError

This idiom is also reasonably common:

begin
  gem "my_gem", ">= 1.0.6"
  require "my_gem"
rescue Gem::LoadError
  # handle the error somehow
end

The right solution here is to avoid the gem call, as I said above, and to rescue from plain LoadError. The Rubygems runtime sometimes raises Gem::LoadError, but that inherits from the regular LoadError, so rescuing from LoadError catches the failure both with and without the Rubygems runtime.
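With that change, the idiom above collapses to a plain require (a minimal sketch):

begin
  require "my_gem"
rescue LoadError
  # handle the missing dependency somehow; this catches the failure
  # whether or not the Rubygems runtime is loaded
end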

Conclusion

Declare your gem version dependencies in your gem specification and use simple requires in your library. If you need to catch the case where the dependency could not be found, rescue from LoadError.

And that’s all there is to it. Your library will work fine with or without the Rubygems runtime :)

Rails 3: The Great Decoupling

In working on Rails 3 over the past six months, I have focused rather extensively on decoupling components from each other.

Why should ActionController care whether it’s talking to ActionView or just something that duck-types like ActionView? Of course, the key to making this work well is to keep the interfaces between components as small as possible, so that implementing an ActionView lookalike is a matter of implementing just a few methods, not dozens.

While I was preparing for my talk at RubyKaigi, I was trying to find the smallest possible examples that demonstrate some of this stuff. It went really well, but I noticed a few areas that could be improved even further, producing an even more compelling demonstration.

This weekend, I focused on cleaning up those interfaces, so we have small and clearly documented mechanisms for interfacing with Rails components. I want to focus on ActionView in this post, which I’ll demonstrate with an example.

$:.push "rails/activesupport/lib"
$:.push "rails/actionpack/lib"
 
require "action_controller"
 
class Kaigi < ActionController::Http
  include AbstractController::Callbacks
  include ActionController::RackConvenience
  include ActionController::Renderer
  include ActionController::Layouts
  include ActionView::Context
 
  before_filter :set_name
  append_view_path "views"
 
  def _action_view
    self
  end
 
  def controller
    self
  end
 
  DEFAULT_LAYOUT = Object.new.tap {|l| def l.render(*) yield end }
 
  def _render_template_from_controller(template, layout = DEFAULT_LAYOUT, options = {}, partial = false)
    ret = template.render(self, {})
    layout.render(self, {}) { ret }
  end
 
  def index
    render :template => "template"
  end
 
  def alt
    render :template => "template", :layout => "alt"
  end
 
  private
  def set_name
    @name = params[:name]
  end
end
 
app = Rack::Builder.new do
  map("/kaigi") {  run Kaigi.action(:index) }
  map("/kaigi/alt") { run Kaigi.action(:alt) }
end.to_app
 
Rack::Handler::Mongrel.run app, :Port => 3000

There’s a bunch going on here, but the important thing is that you can run this file with just ruby, and it’ll serve up /kaigi and /kaigi/alt. It will serve templates from the local "views" directory, and it handles before filters just fine.

Let’s look at this a piece at a time:

$:.push "rails/activesupport/lib"
$:.push "rails/actionpack/lib"
 
require "action_controller"

This is just boilerplate. I symlinked rails to a directory under this file and required action_controller. Note that simply requiring ActionController is extremely cheap — no features have been used yet.

class Kaigi < ActionController::Http
  include AbstractController::Callbacks
  include ActionController::RackConvenience
  include ActionController::Renderer
  include ActionController::Layouts
  include ActionView::Context
end

I inherited my class from ActionController::Http. I then included a number of features, including the Rack convenience methods (request/response), the Renderer, and Layouts. I also made the controller itself the view context. I will discuss this more in just a moment.

  before_filter :set_name

This is the normal Rails before_filter. I didn’t need to do anything else to get this functionality other than include AbstractController::Callbacks.

  append_view_path "views"

Because we’re not in a Rails app, our view paths haven’t been pre-populated. No problem: it’s just a one-liner to set them ourselves.

The next part is the interesting part. In Rails 3, while ActionView::Base remains the default view context, the interface between ActionController and ActionView is extremely well defined. Specifically:

  • A view context must include ActionView::Context. This just adds the compiled templates, so they can be called from the context
  • A view context must provide a _render_template_from_controller method, which takes a template object, a layout, and additional options
  • A view context may optionally also provide a _render_partial_from_controller, to handle render :partial => @some_object
  • In order to use ActionView::Helpers, a view context must have a pointer back to its original controller

That’s it! That’s the entire ActionController<=>ActionView interface.
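To make that concrete, here is a minimal sketch of a standalone view context satisfying those rules (the class name is illustrative, and the optional partial hook is omitted):

class TinyViewContext
  include ActionView::Context

  # A pointer back to the original controller, for ActionView::Helpers
  attr_reader :controller

  def initialize(controller)
    @controller = controller
  end

  # template is a standard ActionView::Template object
  def _render_template_from_controller(template, layout = nil, options = {}, partial = false)
    body = template.render(self, {})
    return body unless layout
    layout.render(self, {}) { body }
  end
end

In the Kaigi example, though, the controller itself plays both roles: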

  def _action_view
    self
  end
 
  def controller
    self
  end

Here, we specify that the view context is just self, and define controller, required by view contexts. Effectively, we have merged the controller and view context (mainly just to see if it could be done ;) )

  DEFAULT_LAYOUT = Object.new.tap {|l| def l.render(*) yield end }

Next, we make a default layout. This is just a simple object whose render method yields to the block it receives. It will simplify:

  def _render_template_from_controller(template, layout = DEFAULT_LAYOUT, options = {}, partial = false)
    ret = template.render(self, {})
    layout.render(self, {}) { ret }
  end

Here, we supply the required _render_template_from_controller. The template object that is passed in is a standard Rails Template, which has a render method on it. That method takes the view context and any locals. For this example, we pass in self as the view context, and do not provide any locals. Next, we call render on the layout, with a block that returns the result of template.render. The reason we created a default layout is to make the case with a layout identical to the case without one.

  def index
    render :template => "template"
  end
 
  def alt
    render :template => "template", :layout => "alt"
  end
 
  private
  def set_name
    @name = params[:name]
  end

This is a standard Rails controller.

app = Rack::Builder.new do
  map("/kaigi") {  run Kaigi.action(:index) }
  map("/kaigi/alt") { run Kaigi.action(:alt) }
end.to_app
 
Rack::Handler::Mongrel.run app, :Port => 3000

Finally, rather than use the Rails router, we just wire the controller up directly using Rack. In Rails 3, ControllerName.action(:action_name) returns a rack-compatible endpoint, so we can wire them up directly.

And that’s all there is to it!

Note: I’m not sure if I still need to say this, but stuff like this is purely a demonstration of the power of the internals, and does not reflect changes to the public API or the way people use Rails by default. Everyone on the Rails team is strongly committed to retaining the same excellent startup experience and set of good conventional defaults. That will not be changing in 3.0.

What do we need to get on Ruby 1.9?

A year ago, I was very skeptical of Ruby 1.9. There were a lot of changes in it, and it seemed like it was going to be a mammoth job to get things running on it. The benefits did not seem to outweigh the costs of switching, especially since Ruby 1.9 was not yet adequately stable to justify the big switch.

At this point, however, it seems as though Ruby 1.9 has stabilized (with 1.9.2 on the horizon), and there are some benefits that seem to obviously justify a switch (such as fast, integrated I18n, better performance in general, blocks that can have default arguments and take blocks, etc.).

Perhaps more importantly though, Ruby’s language implementors have shifted their focus to Ruby 1.9. It has become increasingly difficult to get enhancements in Ruby 1.8, because it is no longer trunk Ruby. Getting community momentum behind Ruby 1.9 would enable us to make productive suggestions to Matz and the other language implementors. Instead, we seem to get a new monthly patch fixing Ruby 1.8.

So my question is: what do we as a community need to shift momentum to 1.9? I don’t want a generic answer, like “we need to feel good about it”. I’m asking what is stopping you today from using Ruby 1.9 for your next project. Is there a library that doesn’t work? Is there a new language feature that causes so much disruption to your existing programming patterns that a switch is untenable?

I suspect that we are all just comfortable in Ruby 1.8, but would actually be mostly fine upgrading to Ruby 1.9. I also suspect that there are small issues I’m not personally aware of, but which have blocked some of you from upgrading. Rails 2.3 and 3.0 (edge) work fine on Ruby 1.9, and I’d like to see what we can do to make Ruby 1.9 a good recommended option for new projects.

Thoughts?

Rails Bundling — Revisited

One of the things I spent quite a bit of time on in Merb was trying to get a working gem bundler that we could be really proud of. Merb had particular problems because of the sheer number of gems with webbed dependencies, so we hit some limitations of Rubygems quite early. Eventually, we settled on a solution with the following characteristics:

  • A single dependencies file that listed out the required dependencies for the application
  • Only required that the gems you cared about were listed. All dependencies of those gems were resolved automatically
  • A task to go through the dependencies, get the appropriate gems, and install them (merb:gem:install)
  • Gems were installed in a standard rubygems structure inside the application, so normal Rubygems could activate and run them
  • Only the .gem files were stored in source control. These files were expanded out to their full structures on each new machine (via the merb:gem:redeploy task). This allowed us to support native gems without any additional trouble
  • When gems were re-expanded, they took into consideration gems that were already present, meaning that running the deployment task when there were no new gems added to the repo took no time at all (so it could just be added to the normal cap task).

Most importantly, the Merb bundling system relied on a mandatory one-version-per-gem rule that was enforced by keeping the dependencies file in sync with the .gem files in gems/cache. In other words, it would be impossible to have leftover gems or gem activation problems with this system.

There were, however, some flaws. First of all, it was a first pass, before we knew Rubygems all that well. As a result, the code is more clumsy than it needed to be to achieve the task in question. Second, it was coupled to Merb’s dependencies DSL and runtime loader (as well as thor), making it somewhat difficult to port to Rails cleanly. It has since been ported, but it is not really possible to maintain the underlying bundling bits independent of the Rails/Merb parts.

Most importantly, while we did solve the problem of conflicting gems to a reasonable extent, it was still somewhat possible to get into a conflicting state at installation time, even if a possible configuration could be found.

For Rails, we’ve discussed hoisting as much of this stuff as possible into Rubygems itself, or into a standard library that Rails could interact with and that could also be used by anyone else wishing to bundle gems with an application. And we have a number of projects at Engine Yard that could benefit from a standard bundler that is not directly coupled to Rails or Merb.

It’s too early to really use it for anything, but Carl and I have made a lot of progress on a gem bundler along these lines. A big part of the reason this is possible is a project I worked on with Tim Carey-Smith a while back (he really did most of the work) called GemResolver. GemResolver takes a set of dependencies and a gem source index and returns a list of all of the gems, including their dependencies, that need to be installed to satisfy the original list. It does a search of all options, so even if the simple solution would have resulted in the dreaded activation error, it will still be able to find a solution if one exists.

Unlike the Merb bundler, the new bundler does not assume a particular DSL for specifying dependencies, making it suitable for use with Rails, Merb or other projects that wish to have their own DSL for interacting with the bundler. It works as follows:

  • A Manifest object that receives a list of Rubygems sources and dependencies for the application
  • The bundler then fetches the full gem list from each of the sources and resolves the dependencies using GemResolver (which we have merged into the bundler)
  • Once the list is determined, each of the .gem files is retrieved from its source and stashed
  • Next, each gem is installed, without the need to download its dependencies, since the resolution process has already occurred. This guarantees a single version per gem and a working environment that will not produce activation errors under any circumstances
  • This second step can be run in isolation from the first, so it is possible to expand the gems on remote machines. This means that you can store just the necessary .gem files in version control, and be entirely isolated from network dependencies for deployments
  • Both the fetching and installation steps will not clobber existing .gem files or installed gems, so if there are no new gems, those steps take no time
  • After installation is complete, environment-specific load-path files are created, which means the bundler will work with or without Rubygems, even though the installed gems are still inside a normal Rubygems structure.

I am providing all this detail for the curious. In the end, as a user, your experience will be quite simple:

  1. List out your dependencies, including what environments those dependencies should be used in
  2. Run rake gem:install
  3. Run your Rails app

In other words, quite similar to the existing gem bundling solution, with fewer warts, and a standard system that you can use outside of Rails if you want to.
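To give a flavor of step 1, a host framework’s dependency file might look something like this (purely hypothetical syntax; as noted above, the bundler itself does not dictate a DSL):

source "http://gems.rubyforge.org"

gem "rails", "2.3.3"
gem "rack-test", "0.4.1", :only => :test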

New Rails Isolation Testing

A little while ago, Carl and I started digging into Rails’ initializer. We’ve already made a number of improvements, such as adding the ability to add a new initializer at any step in the process, and making it possible to have multiple initializers in a single process. The second improvement is the first step toward running multiple Rails apps in a single process, which requires moving all global Rails state into instances of objects, so each application can have its own contained configuration in its own object. More on this in the next few weeks.

As I detailed on the Engine Yard blog this week, when moving into a new area to refactor, it’s important to make sure that there are very good tests. Although the Rails initializer tests covered a fair amount of area, successfully getting the tests to pass did not guarantee that Rails booted. Thankfully, Sam Ruby’s tests were comprehensive enough to get us through the initial hump.

After making the initial change, we went back to see what we could do to improve the test suite. The biggest problem was a problem we’d already encountered in Merb: you can’t uninitialize Rails. Once you’ve run through the initialization process, many of the things that happen are permanent.

Our solution, which we committed to master today, is to create a new test mixin that runs each test case in its own process. Getting it working on OSX wasn’t trivial, but it was pretty elegant once we got down to it. All we did was override the run method on TestCase to fork before actually running the test. The child then runs the test (and makes whatever invasive changes it needs to), and communicates any methods that were called on the Test::Unit result object back to the parent.

The parent then replays those methods, which means that as far as the parent is concerned, all of the cases are part of a single suite, even though they are being run in a separate process. Figuring out what parts of Test::Unit to hook into took all of yesterday afternoon, but once we were done, it was only about 40 lines of code.
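In sketch form, the forking version works roughly like this (a simplified illustration of the approach, not the actual Rails module):

module ForkingIsolation
  # Records every method called on it, so the parent process can
  # replay those calls against the real Test::Unit result object
  class ProxyTestResult
    attr_reader :calls

    def initialize
      @calls = []
    end

    def method_missing(name, *args)
      @calls << [name, args]
    end
  end

  def run(result)
    read, write = IO.pipe

    pid = fork do
      read.close
      proxy = ProxyTestResult.new
      super(proxy) { } # run the test for real, against the proxy
      write.write Marshal.dump(proxy.calls)
      exit!
    end

    write.close
    calls = Marshal.load(read.read)
    Process.wait(pid)

    # Replay the child's calls, so the suite sees one ordinary run
    calls.each { |name, args| result.send(name, *args) }
  end
end

Because the module is mixed into the test case class, it sits ahead of Test::Unit::TestCase#run in the method lookup chain, and super dispatches to the real test runner inside the child.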

Today, we tackled getting the same module to work in environments that don’t support forking, like JRuby and Windows. Unfortunately, these environments are going to run these tests significantly more slowly, because they have to boot up a full process for each test case, where the forking version can simply use the setup already done in the parent process (which makes it almost as fast as running all the tests in a single process).

The solution was to emulate forking by shelling out to a new process that was identical to the one that was just launched, but with an extra constraint on the test name (basically, booting up the test suite multiple times, but each run only runs a single test). The subprocess then communicates back to the launching process using the same protocol as in the forking model, which means that we only had to change the code that ran the tests in isolation; everything else remains the same.

There was one final caveat, however. It turns out that in Test::Unit, using a combination of -t to specify the test case class and -n to specify the test case name doesn’t work. Test::Unit’s semantics are to include any test for which ANY of the appropriate filters match. I’m not proud of this, but what we did was a small monkey-patch of the Test::Unit collector, applied only in the subprocess, which does the right thing:

# Only in subprocess for windows / jruby.
if ENV['ISOLATION_TEST']
  require "test/unit/collector/objectspace"
  class Test::Unit::Collector::ObjectSpace
    def include?(test)
      super && test.method_name == ENV['ISOLATION_TEST']
    end
  end
end

Not great, but all in all, not that much code (the entire module, including both the forking and subprocess methods, is just 98 lines of code).

A crazy couple of days yielding a pretty epic hack, but it works!

Merb Vulnerability Fix (1.0.12)

Over the weekend, it was discovered that the json_pure gem is subject to a DoS attack when parsing specially formed JSON objects. This vulnerability does not affect the json gem, which uses a C extension, or ActiveSupport::JSON, which is used in Rails.

By default, Merb uses the json gem, which is not vulnerable, but falls back to json_pure if json is not available. As a result, if you have json_pure but not json on your system, you may be vulnerable. Additionally, Ruby 1.9.1 (but not Ruby 1.9 trunk) ships with json_pure, which remains vulnerable.

The easiest way to immunize yourself from this problem, no matter what Ruby version you are on, is to upgrade to the latest version of json_pure, which resolves the vulnerability. Additionally, Merb 1.0.12 has been released, which monkey-patches json_pure to remove the vulnerability, but prints a warning encouraging you to upgrade to the latest. Merb 1.0.12 only adds this patch on top of 1.0.11, so it should be perfectly safe to upgrade.
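With the standard gem command, that upgrade is a one-liner:

gem install json_pure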

Guest Blogging

I’ll occasionally be posting on the Engine Yard blog in addition to my posts here. Union Station has really stepped up lately in terms of technical content. We’ve had posts on pairing, TDD with Cucumber, Rack and a slew of other things, and I know there’s more in the pipe.

I posted today on how we’ve been refactoring Rails, and how you can apply what we’ve learned to your own projects: let me know what you think!

On Rails Testing

One of the things that has both pleasantly surprised and frustrated me over the past six months is the state of Rails’ internal tests. While the tests often cover the most obscure, important bugs, they can also be heavily mock-based or tightly coupled to the internal implementation.

In large part, this is because of the requirement that patches come with tests, so the test suite has accumulated a lot of knowledge over time, but isn’t always the clearest when it comes time for a major refactor. Just to be clear, without the test suite, I don’t think the work we’ve been doing would have been possible, so I’m not complaining much.

However, I recently became aware that Sam Ruby, one of the authors of Agile Web Development on Rails, has released a test suite that tests each step in the depot application, as well as each of the advanced chapters.

Last week, Carl and I started digging into the Rails initializer, and the tests in the initializer (railties) are more mock-based and less reliable than the tests in ActionPack (which we’ve been working with so far). They’re pretty reasonable unit tests for individual components, but getting all of the tests to pass did not come anywhere close to guaranteeing a bootable Rails app. Thanks to Sam’s tests, we were able to work through getting a booting Rails app by the end of last week.

Over the past week or two, Carl and I have been running the tests a few times a day, and while they definitely take a while to get through, they’ve added a new sense of confidence to the work we’re doing.

Rails Edge Architecture

I’ve talked a bunch about the Rails 3 architecture in various talks, and it’s finally come together enough to do a full blog post on it. With that said, there are a few things to keep in mind.

We’ve done a lot of work on ActionController, but have only mildly refactored ActionView. We’re going to tackle that next, but I don’t expect nearly as many internal changes as there were for ActionController. Also, we’ve been working on the ActionController changes for a few months, and have focused quite a bit on maintaining backward compatibility (for the user-facing API) with Rails 2.3. If you try edge and find that we’ve broken something, definitely let us know.

AbstractController::Base

One of the biggest internal changes is creating a base superclass that is separated from the notions of HTTP. This AbstractController handles the basic notion of controllers, actions, and action dispatching, and not much else.

At that level of abstraction, there are also a number of modules that can be added in, such as Renderer, Layouts, and Callbacks. The API in the AbstractController is very normalized, and it’s not intended to be used directly by end users.

For instance, the template renderer takes an ActionView::Template object and renders it. Developers working with Rails shouldn’t have to know about Template objects; this is simply an internal API to make it easier to work with the Rails internals and build things on top of Rails.

Another example is related to action dispatching. At its core, action dispatching is simply taking an action name, calling the method if it exists, or raising an exception if it doesn’t. However, other parts of Rails have different semantics. For instance, method_missing is used if the action name is not a public method, and the action is dispatched if no action is present but a template with the same name is present.

In order to make it easy to implement these features at different levels of abstraction, AbstractController::Base calls method_for_action to determine what method to call for an action. By default, that method simply checks to see if the action is present. In ActionController::HideActions, that method is overridden to return nil if actions are explicitly removed using hide_action. ActionController::Base further overrides it to return "default_render", which handles the rendering.
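In sketch form, the layering looks something like this (the module names are illustrative stand-ins and the bodies are simplified):

module BaseSketch
  # Dispatch the action if a matching method exists
  def method_for_action(action_name)
    action_name if respond_to?(action_name)
  end
end

module HideActionsSketch
  # Return nil for actions explicitly removed with hide_action
  def method_for_action(action_name)
    super unless hidden_actions.include?(action_name)
  end
end

module DefaultRenderSketch
  # Fall back to rendering the template that matches the action name
  def method_for_action(action_name)
    super || "default_render"
  end
end

Each layer only needs to call super, so new semantics can be mixed in without touching the layers below.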

That’s, in a nutshell, how the new architecture works: simple abstractions that represent a bunch of functionality that you can layer on if you need it.

ActionController::Http

ActionController::Http layers a Rack abstraction on top of the core controller abstraction so that actions can be called as Rack endpoints. On top of the core rendering functionality available in AbstractController::Base, ActionController provides rendering functionality that builds on top of HTTP constructs. For instance, the ActionController renderer knows how to set the content type based on the template it rendered. Additionally, ActionController implements more developer-friendly APIs. Instead of having to know about the ActionView::Template object, developers can just use the name of the template (pretty much the Rails 2.3 rendering API).

ActionController normalizes the developer’s inputs into the inputs that AbstractController is expecting, before calling super (ActionController::Http inherits from AbstractController::Base). Additional functionality is added via the ActionController::Rails2Compatibility module, which provides support for things like stripping a leading "/" off of template names, and ActionController::Base, which provides a final layer of normalization (for instance, normalizing render :action => "foo" and render "foo"). As a result, ActionController::Base ends up being identical to the fully-featured ActionController::Base that you used in Rails 2.3, and none of your app code needs to change.

As an example, let’s look at rendering. If you call render :template => "hello", the first thing that happens is the normalization pass in ActionController::Base. Since we used a relatively normalized form, not much happens, and the options hash is passed up into the compatibility module, which checks to see if there’s a leading "/" in the template name. Since there isn’t, it passes the options hash up again into the AbstractController::Renderer module.

There, Renderer checks to see if we’ve already set the response body. If we have, a DoubleRenderError is raised. If not, render_to_body is called with the options hash. The first place in the chain we find render_to_body is in ActionController::Renderer, where it processes options like :content_type or :status.

It then calls super, passing control back into AbstractController, which promptly calls _determine_template, again with the options. The job of _determine_template is to take in some options and return the template to render. This hook point is provided so that other modules, like the Layouts module, can use the template that was looked up to control something they care about. In this case, the Layouts module wants to limit the search for a layout to the formats and locale of the template that was actually rendered.

This solves a long-standing problem in Rails where templates and layouts were looked up separately, so it was possible for an HTML layout to be implicitly wrapped around an XML template. No longer :)

The Layouts module gets called first with _determine_template. It calls super, allowing the default template lookup to occur, which populates options[:_template]. It then uses options[:_template] to look up the layout to use, populating options[:_layout]. I hadn’t mentioned it before, but AbstractController uses options with leading underscores, leaving undecorated options for subclasses like ActionController::Base.
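Stripped down, that override follows the same super-chaining pattern as everything else (a rough sketch; _layout_for_template is a hypothetical helper name):

module LayoutsSketch
  def _determine_template(options)
    super # the default lookup populates options[:_template]

    # Constrain the layout search to the formats and locale of the
    # template that was actually found
    options[:_layout] = _layout_for_template(options[:_template])
  end
end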

After the template to render is determined, Renderer calls _render_template, which makes the call into ActionView.

I know it sounds rather complicated, but a graphic showing the relationships is forthcoming, which should clean things up. The nice thing is that there are several layers of abstraction, and while the final system is complex (mostly because the functionality is complex), it’s reasonably easy to understand the higher levels of abstraction on their own. It’s also easy to put various kinds of normalization into standard places, so the code you’re reading at any given point is code that expects normalized inputs. Since the normalization that Rails does can sometimes be quite gnarly (in the service of making the user experience really pleasant), separating that stuff out can reduce the amount of gnarliness in the internals themselves.

Finally, an important part of an architecture like this is making sure that there is great internal documentation (which we’ve already started with), and some visualizations that show what’s going on. If you were forced to track down the control flow above on your own for the first time, it would probably be non-trivial. So a key part of this architecture is making sure you never have to do that. I would also note that I’m not particularly good at expressing code reading in blog posts, so the process definitely sounds a lot more complex than it actually is ;)

Rails Dispatch

It’s been a busy week or two. I just got back from acts_as_conference, which was a blast. I got to talk about “The Merge”; the talk was 50% prepared slides and 50% Q&A. I’m glad I did it that way, since the questions were (as usual) great. DHH did Q&A in the morning of the first day, and fielded some questions about the merge as well. He also took questions on a bunch of other topics (“Do you Test All the Fucking Time?” “No.”).

On the Rails front, I continue to wrestle with callbacks. On the bright side, once it’s all done, there will be a single, uniform callbacks interface for every callback use in the Rails codebase. That includes ActiveRecord callbacks, Dispatch callbacks, Test::Unit callbacks, and of course, before/after/around filters. Once I’m ready, all of the above will have around filters, the biggest obvious change, and they’ll also all be much faster in the face of a number of callbacks.

I’ve completed integrating the new callbacks into Test::Unit and ActionController::Dispatcher, and am in the process of merging them into AP. However, I’m going to take a break for a few days from it and see about extracting AbstractController (I’m stuck in a rut with the integration and need to take a step back for a bit).

On another topic, we had a discussion today about how to handle HTTP parameters when there are multiple parameters with the same name (but without "[]"). I should have a patch to handle this later tonight. Stay tuned!
