Yehuda Katz is a member of the Ember.js, Ruby on Rails and jQuery Core Teams; his 9-to-5 home is at the startup he founded, Tilde Inc. There he works on Skylight, the smart profiler for Rails, and does Ember.js consulting. He is best known for his open source work, which also includes Thor and Handlebars. He travels the world doing open source evangelism and web standards work.

Archive for the ‘Rails 3’ Category

What’s Up With All These Changes in Rails?

Yesterday, there was a blog post entitled “What the Hell is Happening to Rails” that stayed at the number one spot on Hacker News for quite a while. The post and many (but not most) of the comments on the post reflect deep-seated concern about the recent direction of Rails. Others have addressed the core question about change in the framework, but I’d like to address questions about specific changes that came up in the post and comments.

The intent of this post is not to nitpick the specific arguments that were made, or to address the larger question of how much change is appropriate in the framework, but rather to provide some background on some of the changes that have been made since Rails 2.3 and to explain why we made them.

Block Helpers

I too get a feeling of “change for the sake of change” from Rails at times. That’s obviously not something they’re doing, as all the changes have some motivation, but at times it feels a bit like churn.
At one point in time, you did forms like <%= form …. and then they switched to <% form …. do and now they’ve switched back to <%= form … do again.
Also, the upgrade to Rails 3 is not an easy one. Yeah, you get some nice stuff, but because it’s so painful, it’s not happening for a lot of people, which is causing more problems.

Prior to Rails 3.0, Rails never used <%= form_for because it was technically very difficult to make it work. I wrote a post about it in Summer 2009 that walked through the technical problems. The short version is that every ERB template is compiled into a Ruby method, and reliably converting <%= with blocks proved to be extremely complicated.
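To make the "compiled into a Ruby method" point concrete, here is a minimal sketch using the ERB class from Ruby's standard library. It is not the pipeline Rails itself uses, which differs in detail, and the template text and the `name` variable are made up for illustration. It prints the Ruby source that a tiny template turns into, then renders it.

```ruby
require "erb"

# A tiny template with a single output tag. The variable `name` is hypothetical.
template = ERB.new("Hello, <%= name %>!")

# ERB#src returns the Ruby source the template was compiled into.
# The exact text varies between Ruby versions, so inspect it rather than
# relying on its precise shape.
compiled_source = template.src
puts compiled_source

# Evaluating the compiled code against a binding that defines `name`
# produces the rendered string.
name = "Rails"
rendered = template.result(binding)
puts rendered # prints "Hello, Rails!"
```

Any block attached to an output tag has to survive this kind of source-to-source translation, which gives a rough idea of why the block form was hard to support.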

However, knowing when to use <% and when to use <%= caused major issues for new developers, and it was a major 3.0 priority to make this work. In addition, because of the backwards compatibility issue, we went to extreme Ruby-exploiting lengths to enable deprecation warnings about the change, so that we could fix the issue for new apps, but also not break old apps.

The night José and I figured out how to do this (the Engine Yard party at Mountain West RubyConf 2009), we were pretty close to coming to the conclusion that it couldn’t be done short of using a full language parser in ERB, which should give you some sense of how vexing a problem it was for us.

Performance

The general disregard for improving performance, inherited from Ruby, is also something endemic from a while back.

Yes, there have been some performance regressions in Rails 3.0. However, the idea that neither the Rails core team nor the Ruby team cares about performance doesn’t pass the smell test.

Aaron Patterson, the newest core team member, worked full time for almost a year to get the new ActiveRecord backend into decent performance shape. Totally new code often comes with some performance regressions, and the changes to ActiveRecord were important and a long time coming. Many of us didn’t see the magnitude of the initial problem until it was too late for 3.0, but we take the problem extremely seriously.

Ruby core (“MRI”) itself has sunk an enormous amount of time into performance improvements in Ruby 1.9, going so far as to completely rewrite the core VM from scratch. This resulted in significant performance improvements, and the Ruby team continues to work on improvements to performance and to core systems like the garbage collector.

The Ruby C API poses some long-term problems for the upper bound of Ruby performance improvements, but the JRuby and Rubinius projects are showing how you can use Ruby C extensions inside a state-of-the-art virtual machine. Indeed, the JRuby and Rubinius projects show that the Ruby community both cares about, and is willing to invest significant energy into, improving the performance of Ruby.

Heroku and the Asset Pipeline

The assets pipeline feels horrible, it’s really slow. I upgraded to Rails 3.1rc, realized it fights with Heroku unless upgrading to the Cedar stack

The problem with Heroku is that the default Gemfile that comes with Rails 3.1 currently requires a JavaScript runtime to boot, even in production. This is clearly wrong and will be fixed posthaste. Requiring Rails apps to have node or a compiled v8 in production is an unacceptable burden.

On the flip side, the execjs gem, which manages compiling CoffeeScript (and more importantly minifying your JavaScript), is actually a pretty smart piece of work. It turns out that both Mac OS X and Windows ship with usable JavaScript binaries, so in development, most Rails users will already have a JavaScript runtime ready to use.

It’s worth noting that a JavaScript engine is needed to run “uglify.js”, the most popular and most effective JavaScript minifier. It is best practice to minify your JavaScript before deployment, so you can feel free to format and comment your code as you like without affecting the payload. You can learn more about minification in this excellent post by Steve Souders from a few years back. Rails adding minification by default is an unambiguous improvement in the workflow of Rails applications, because it makes it easy (almost invisible) to do something that everyone should be doing, but which has previously been something of a pain.

Again, making node a dependency in production is clearly wrong, and will be removed before the final release.

Change for Change’s Sake

The problem with Rails is not the pace of change so much as the wild changes of direction it takes, sometimes introducing serious performance degradations into official releases. Sometimes it’s hard to see a guiding core philosophy other than the fact they want to be on the shiny edge

When Rails shipped, it came with a number of defaults:

  • ActiveRecord
  • Test::Unit
  • ERB
  • Prototype

Since 2004, alternatives to all of those options have been created, and in many cases (Rspec, jQuery, Haml) have become somewhat popular. Rails 3.0 made no changes to the defaults, despite much clamoring for a change from Prototype to jQuery. Rails 3.1 changed to jQuery only when it became overwhelmingly clear that jQuery was the de facto standard on the web.

As someone who has sat in on many discussions about changing defaults, I can tell you that defaults in Rails are not changed lightly, and certainly not “to be on the shiny edge.” In fact, I think that Rails is more conservative than most would expect in changing defaults.

Right, but the problem is that there doesn’t seem to be a ‘right’ way, That’s the problem.

We were all prototype a few years ago, now it jquery … we (well I) hadn’t heard of coffeescript till a few months ago and now its a default option in rails, The way we were constructing ActiveRecord finders had been set all through Rails 2, now we’ve changed it, the way we dealt with gems was set all through rails 2 now its changed completely in Rails 3.

I like change, I like staying on the cutting edge of web technologies, but I don’t want to learn something, only to discard it and re-do it completely to bring it up to date with a new way of doing things all the time.

First, jQuery has become the de facto standard on the web. As I said earlier, Rails resisted making this change in 3.0, despite a lot of popular demand, and the change to jQuery is actually an example of the stability of Rails’ default choices over time, rather than the opposite.

Changes to gem handling evolved as the Rails community evolved to use gems more. In Rails 1, all extensions were implemented as plugins that got pulled out of svn and dumped into your project. This didn’t allow for versioning or dependencies, so Rails 2 introduced first-class support for Rails plugins as gems.

During the Rails 2 series, Rails added a dependency on Rack, which caused serious problems when Rails was used with other gems that rely on Rack, due to the way that raw Rubygems handles dependency activation. Because Rails 3 uses gem dependencies more extensively, we spent a year building bundler, which adds per-application dependency resolution to Rubygems. This was simply the natural evolution of the way that Rails has used dependencies over time.

The addition of CoffeeScript is interesting, because it’s a pretty young technology, but it’s also not really different from shipping a new template handler. When you create a new Rails app, the JavaScript file is a regular JS file, and the asset compilation and dependency support does not require CoffeeScript. Shipping CoffeeScript is basically like shipping Builder: should you want it, it’s there for you. Since we want to support minification out of the box anyway, CoffeeScript doesn’t add any new requirements. And since it’s just there in the default Gemfile, as opposed to included in Rails proper like Builder, turning it off (if you really want to) is as simple as removing a line in your Gemfile. Nothing scary here.

ActiveRecord Changes

The way we were constructing ActiveRecord finders had been set all through Rails 2, now we’ve changed it

The Rails core team does seem to treat the project as if it’s a personal playground

One of the biggest problems with ActiveRecord was the way it internally used Strings to represent queries. This meant that even simple changes to queries often required gsubing Strings. Internally, it was a mess, and it expressed itself in the public API in how conditions were generated, and more importantly how named scopes were created.

One goal of the improvements in Rails 3 was to get rid of the ad-hoc query generation in Rails 2 and replace it with something better. The library that does this, ActiveRelation, was literally multiple years in the making, and a large amount of the energy in the Rails 3.0 process was spent on integrating it.

From a user-facing perspective, we wanted to unify all of the different ways that queries were made. This means that scopes, class methods, and one-time-use queries all use the same API. As in the <%= case, a significant amount of effort was spent on backwards compatibility with Rails 2.3. In fact, we decided to hold onto support for the old API at least as long as Rails 3.2, in order to soften the community transition to the new API.
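As a rough illustration of the difference between editing SQL strings and composing a query from structured parts, here is a toy sketch in plain Ruby. It is not ActiveRelation or the ActiveRecord API; `ToyRelation`, the `posts` table, and the column names are all hypothetical. The point is only that a reusable "scope" and a one-off query can be the same kind of chainable object.

```ruby
# Toy illustration only: NOT ActiveRelation, just the general idea of
# composing a query from structured parts instead of editing SQL strings.
class ToyRelation
  attr_reader :table, :conditions, :ordering, :limit_value

  def initialize(table, conditions = [], ordering = nil, limit_value = nil)
    @table = table
    @conditions = conditions
    @ordering = ordering
    @limit_value = limit_value
  end

  # Each method returns a new relation, so relations can be chained and reused.
  def where(condition)
    ToyRelation.new(table, conditions + [condition], ordering, limit_value)
  end

  def order(column)
    ToyRelation.new(table, conditions, column, limit_value)
  end

  def limit(count)
    ToyRelation.new(table, conditions, ordering, count)
  end

  # SQL is generated once, at the end, from the accumulated parts.
  def to_sql
    sql = "SELECT * FROM #{table}"
    sql += " WHERE #{conditions.join(' AND ')}" unless conditions.empty?
    sql += " ORDER BY #{ordering}" if ordering
    sql += " LIMIT #{limit_value}" if limit_value
    sql
  end
end

# A reusable "scope" and a one-off query built on top of it.
published = ToyRelation.new("posts").where("published = 1")
recent    = published.order("created_at DESC").limit(5)

puts published.to_sql # prints "SELECT * FROM posts WHERE published = 1"
puts recent.to_sql    # prints "SELECT * FROM posts WHERE published = 1 ORDER BY created_at DESC LIMIT 5"
```

Because every step returns a new object, extending `published` does not modify it, and no string surgery is needed to add an ordering or a limit later.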

In general, we were quite careful about backwards compatibility in Rails 3.0, and while a project as complex as Rails was not going to be perfect in this regard, characterizing the way we handled this as “irresponsible” or “playground” disregards the tremendous amount of work and gymnastics that the core team and community contributors put into supporting Rails 2.3 APIs across the entire codebase when changes were made.

Bundler: As Simple as What You Did Before

One thing we hear a lot from people who start to use bundler is that the workflow is more complicated than it used to be when they first start. Here’s one (anonymized) example: “Trying out Bundler to package my gems. Still of the opinion its over-complicating a relatively simple concept, I just want to install gems.”

Bundler has a lot of advanced features, and it’s definitely possible to model fairly complex workflows. However, we designed the simple case to be extremely simple, and to usually be even less work than what you did before. The problem often comes when trying to handle a slightly off-the-path problem, and using a much more complex solution than you need to. This can make *everything* much more complicated than it needs to be.

In this post, I’ll walk through the bundler happy path, and show some design decisions we made to keep things moving as smoothly as before, but with far fewer snags and problems than the approach you were using before. I should be clear that there are probably bugs in some cases when using a number of advanced features together, and we should fix those bugs. However, they don’t reflect core design decisions of bundler.

In the Beginning

When you create a Rails application for the first time, the Rails installer creates a Gemfile for you. You’ll note that you don’t actually have to run bundle install before starting the Rails server.

$ gem install rails
Successfully installed activesupport-3.0.0
Successfully installed builder-2.1.2
Successfully installed i18n-0.4.1
Successfully installed activemodel-3.0.0
Successfully installed rack-1.2.1
Successfully installed rack-test-0.5.6
Successfully installed rack-mount-0.6.13
Successfully installed tzinfo-0.3.23
Successfully installed abstract-1.0.0
Successfully installed erubis-2.6.6
Successfully installed actionpack-3.0.0
Successfully installed arel-1.0.1
Successfully installed activerecord-3.0.0
Successfully installed activeresource-3.0.0
Successfully installed mime-types-1.16
Successfully installed polyglot-0.3.1
Successfully installed treetop-1.4.8
Successfully installed mail-2.2.6.1
Successfully installed actionmailer-3.0.0
Successfully installed thor-0.14.2
Successfully installed railties-3.0.0
Successfully installed rails-3.0.0
22 gems installed
$ rails s
=> Booting WEBrick
=> Rails 3.0.0 application starting in development on http://0.0.0.0:3000
=> Call with -d to detach
=> Ctrl-C to shutdown server
[2010-09-30 12:25:51] INFO  WEBrick 1.3.1
[2010-09-30 12:25:51] INFO  ruby 1.9.2 (2010-08-18) [x86_64-darwin10.4.0]
[2010-09-30 12:25:51] INFO  WEBrick::HTTPServer#start: pid=26380 port=3000

If you take a look at your directory, you’ll see that Bundler noticed that you didn’t yet have a Gemfile.lock, saw that you had all the gems that you needed already in your system, and created a Gemfile.lock for you. If you didn’t have a necessary gem, the Rails server would error out. For instance, if you were missing Erubis, you’d get Could not find RubyGem erubis (~> 2.6.6). In this case, you’d run bundle install to install the gems.

As you can see, bundler is providing some added features for you. If you take your application to a new development or production machine with the Gemfile.lock, you will always use exactly the same gems. Additionally, bundler will resolve all your dependencies at once, completely eliminating the problem exemplified by this thread on the Rails core mailing list. But you didn’t have to run bundle install or any bundle command unless you were missing some needed gems. This makes the core experience of bundler the same as “I just want to install gems”, while adding some new benefits.

Adding Gems

When you add a gem that is already installed on your system, you just add it to the Gemfile and start the server. Bundler will automatically pick it up and add it to the Gemfile.lock. Here again, you don’t need to run bundle install. If the gem you added is not yet installed, you can run bundle install, or you can even just use gem install to install the missing gems.

Once again, while bundler is handling a lot for you behind the scenes (ensuring compatibility of gems, tracking dependencies across machines), if you have all the gems that you need (and specified in your Gemfile) already installed on your system, no additional commands are required. If you don’t, a simple bundle install will get you up to date.

Deployment

After developing your application, you will probably need to deploy it. With bundler, if you are deploying your application with capistrano, you just add require "bundler/capistrano" to your deploy.rb. This will automatically install the gems in your Gemfile, using deployment-friendly settings. That’s it!

Before bundler, you had two options:

  • Most commonly, “make sure that all the gems I need are on the remote system”. This could involve including the gem install in the capistrano task, or even sshing into the remote system to install the needed gems. This solution would work, but would often leave no trail about what exactly happened. This was especially problematic for applications with a lot of dependencies, or applications with sporadic maintenance
  • Using the GemInstaller gem, which allowed you to list the gems you were using, and then ran gem install on each of the gems. Heroku used a similar approach, with a .gems manifest. While this solved the problem of top-level dependencies, it did not lock in dependencies of dependencies. This caused extremely common problems with rack and activesupport, and a long tail of other issues. One particularly egregious example is DreamHost, which directs users to use a brittle patch to Rails

In both of these cases, there was no guarantee that the gems you were using in development (including dependencies of dependencies like Rack and Active Support) remained the same after a deploy. One recent example that I saw was that carrierwave added an updated dependency on the newest activesupport, which depends on Ruby 1.8.7. Even though the developer made no change to his .gems manifest, Heroku failed to deploy his application on their Ruby 1.8.6 stack.

With bundler, this sort of problem can never happen, because you’re always running exactly the same versions of third-party code in production as you are in development and staging. This should increase your confidence in deployments, especially when you come back to an application months after it was originally developed.

Tweaked Gems

Sometimes, you will need to make small changes to one of your dependencies, and deploy your application with those changes. Of course, in some cases you can make the tweaks via monkey-patching, but in other cases, the only real solution is to change the gem itself.

With bundler, you can simply fork the gem on Github, make your changes, and change the gem line in your Gemfile to include a reference to the git repository, like this:

# change this
gem "devise"
 
# to this
gem "devise", :git => "git://github.com/plataformatec/devise.git"

This behaves exactly the same as a gem version, but uses your tweaks from Github. It participates in dependency resolution just like a regular gem, and can be installed with bundle install just like a regular gem. Once you’ve added a git repository, you run bundle install to install it (since the normal gem command can’t install git repositories).

You can use :branch, :tag, or :ref flags to specify a particular branch, tag or revision. By default, :git gems use the master branch. If you use master or a :branch, you can update the gem by running bundle update gem_name. This will check out the latest revision of the branch in question.
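For illustration, a Gemfile line using one of these flags might look like the fragment below. The fork URL, branch name, tag name, and revision are hypothetical placeholders, and only one line per gem should be active at a time.

```ruby
# Gemfile fragment. The fork URL, branch, tag, and revision below are
# hypothetical placeholders; use only one of these lines for a given gem.
gem "devise", :git => "git://github.com/your-username/devise.git", :branch => "my-fix"
# gem "devise", :git => "git://github.com/your-username/devise.git", :tag => "my-release-tag"
# gem "devise", :git => "git://github.com/your-username/devise.git", :ref => "4aded"
```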

Conclusion

Using bundler for the sorts of things that you would have handled manually before should be easier than before. Bundler will handle almost all of the automatable record-keeping for you. While doing so, it offers the incredible guarantee that you will always run the same versions of third-party code.

There are a lot of advanced features of Bundler, and you can learn a lot more about them at the bundler web site.

Threads (in Ruby): Enough Already

For a while now, the Ruby community has been enamored of the latest new hotness, evented programming and Node.js. It’s gone so far that I’ve heard a number of prominent Rubyists saying that JavaScript and Node.js are the only sane way to handle a number of concurrent users.

I should start by saying that I personally love writing evented JavaScript in the browser, and have been giving talks (for years) about using evented JavaScript to sanely organize client-side code. I think that for the browser environment, events are where it’s at. Further, I don’t have any major problem with Node.js or other ways of writing server-side evented code. For instance, if I needed to write a chat server, I would almost certainly write it using Node.js or EventMachine.

However, I’m pretty tired of hearing that threads (and especially Ruby threads) are completely useless, and that if you don’t use evented code, you may as well be using a single process per concurrent user. To be fair, this was somewhat the party line of the Rails team years ago, but Rails has been threadsafe since Rails 2.2, and Rails users have been taking advantage of it for some time.

Before I start, I should be clear that this post is talking about requests that spend a non-tiny amount of their time utilizing the CPU (normal web requests), even if they do spend a fair amount of time in blocking operations (disk IO, database). I am decidedly not talking about situations, like chat servers, where requests sit idle for huge amounts of time with tiny amounts of intermittent CPU usage.

Threads and IO Blocking

I’ve heard a common misperception that Ruby inherently “blocks” when doing disk IO or making database queries. In reality, Ruby switches to another thread whenever it needs to block for IO. In other words, if a thread needs to wait, but isn’t using any CPU, Ruby’s built-in methods allow another waiting thread to use the CPU while the original thread waits.

If every one of your web requests uses the CPU for 30% of the time, and waits for IO for the rest of the time, you should be able to serve three requests in parallel, coming close to maxing out your CPU.
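The arithmetic is simply 1 / 0.3 ≈ 3.3, so roughly three such requests can overlap before the CPU is saturated. A minimal sketch of the overlap itself, with `sleep` standing in for a blocking database or disk call (the `fake_request` helper and the timings are made up for illustration):

```ruby
# Simulate three "requests" that each wait 0.2s on IO. `sleep` stands in for
# a blocking call during which Ruby can run another thread.
def fake_request(id)
  sleep 0.2 # pretend this is a database or disk wait
  "response #{id}"
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

threads = 3.times.map { |i| Thread.new { fake_request(i) } }
responses = threads.map(&:value) # wait for every thread and collect results

elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts responses.inspect
puts format("elapsed: %.2fs", elapsed)
```

If the waits could not overlap, the three requests would take about 0.6 seconds; with threads, the total should land close to the 0.2 seconds of a single wait.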

Here’s a couple of diagrams. The first shows how people imagine requests work in Ruby, even in threadsafe mode. The second is how an optimal Ruby environment will actually operate. This example is extremely simplified, showing only a few parts of the request, and assuming equal time spent in areas that are not necessarily equal.


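[Diagram 1: how people imagine requests work in Ruby, even in threadsafe mode]

[Diagram 2: how an optimal Ruby environment actually operates]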


I should be clear that Ruby 1.8 spends too much time context-switching between its green threads. However, if you’re not switching between threads extremely often, even Ruby 1.8’s overhead will amount to a small fraction of the total time needed to serve a request. A lot of the threading benchmarks you’ll see are testing pathological cases involving huge numbers of threads, which are not very similar to the profile of a web server.

(if you’re thinking that there are caveats to my “optimal Ruby environment”, keep reading)

“Threads are just HARD”

Another common gripe that pushes people to evented programming is that working with threads is just too hard. Working hard to avoid sharing state and using locks where necessary is just too tricky for the average web developer, the argument goes.

I agree with this argument in the general case. Web development, on the other hand, has an extremely clean concurrency primitive: the request. In a threadsafe Rails application, the framework manages threads and uses an environment hash (one per request) to store state. When you work inside a Rails controller, you’re working inside an object that is inherently unshared. When you instantiate a new instance of an ActiveRecord model inside the controller, it is rooted to that controller, and is therefore not used between live threads.

It is, of course, possible to use global state, but the vast majority of normal, day-to-day Rails programming (and for that matter, programming in any web framework in any language with a request model) is inherently threadsafe. This means that Ruby will transparently handle switching back and forth between active requests when you do something blocking (file, database, or memcache access, for instance), and you don’t need to personally manage the problems that arise when doing concurrent programming.
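A minimal sketch of why the request model needs no locking, under the assumption that each request only touches its own state. Nothing here is Rails API; `handle_request`, the `env` hash layout, and the user names are hypothetical.

```ruby
# Each simulated request gets its own environment hash, loosely mirroring
# the one-per-request hash described above. Nothing here is Rails API.
def handle_request(env)
  user = env[:user] # read only from this request's own state
  sleep 0.01        # stand-in for a blocking call
  "Hello, #{user}!" # result built from unshared data
end

users = %w[alice bob carol]

threads = users.map do |user|
  env = { user: user } # created per request, never shared between threads
  Thread.new(env) { |request_env| handle_request(request_env) }
end

bodies = threads.map(&:value)
puts bodies.inspect # prints ["Hello, alice!", "Hello, bob!", "Hello, carol!"]
```

No mutex appears anywhere because no object is visible to more than one thread. Locks only become necessary once requests start reading and writing shared global state.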

This is significantly less true about applications, like chat servers, that keep open a huge number of requests. In those cases, a lot of the application logic happens outside the individual request, so you need to personally manage shared state.

Historical Ruby Issues

What I’ve been talking about so far is how stock Ruby ought to operate. Unfortunately, a group of things have historically conspired to make Ruby’s concurrency story look much worse than it actually ought to be.

Most obviously, early versions of Rails were not threadsafe. As a result, all Rails users were operating with a mutex around the entire request, forcing Rails to behave like the first “Imagined” diagram above. Annoyingly, Mongrel, the most common Ruby web server for a few years, hardcoded this mutex into its Rails handler. As a result, if you spun up Rails in “threadsafe” mode a year ago using Mongrel, you would have gotten exactly zero concurrency. Also, even in threadsafe mode (when not using the built-in Rails support) Mongrel spins up a new thread for every request, not exactly optimal.
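The effect of a request-wide mutex can be sketched in a few lines. This is not Mongrel's actual code; `REQUEST_LOCK`, `locked_request`, and the timings are made up for illustration. Because each thread waits while holding the lock, the waits can no longer overlap.

```ruby
# A single lock around the whole "request", similar in spirit to the
# request-wide mutex described above (this is not Mongrel's actual code).
REQUEST_LOCK = Mutex.new

def locked_request(id)
  REQUEST_LOCK.synchronize do
    sleep 0.1 # the wait happens while holding the lock
    "response #{id}"
  end
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
locked_responses = 3.times.map { |i| Thread.new { locked_request(i) } }.map(&:value)
locked_elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts locked_responses.inspect
puts format("elapsed with a request-wide lock: %.2fs", locked_elapsed)
```

Three 0.1 second waits take roughly 0.3 seconds in total here, the same as running the requests one after another, even though three threads exist.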

Second, the most common database driver, mysql, is a very poorly behaved C extension. While built-in I/O (file or pipe access) correctly alerts Ruby to switch to another thread when it hits a blocking region, other C extensions don’t always do so. For safety, Ruby does not allow a context switch while in C code unless the C code explicitly tells the VM that it’s ok to do so.

All of the Data Objects drivers, which we built for DataMapper, correctly cause a context switch when entering a blocking area of their C code. The mysqlplus gem, released in March 2009, was designed to be a drop-in replacement for the mysql gem that fixes this problem. The new mysql2 gem, written by Brian Lopez, is a drop-in replacement for the old gem, also correctly handles encodings in Ruby 1.9, and is the new default MySQL driver in Rails.

Because Rails shipped with the (broken) mysql gem by default, even people running on working web servers (i.e. not mongrel) in threadsafe mode would have seen a large amount of their potential concurrency eaten away because their database driver wasn’t alerting Ruby that concurrent operation was possible. With mysql2 as the default, people should see real gains on threadsafe Rails applications.

A lot of people talk about the GIL (global interpreter lock) in Ruby 1.9 as a death knell for concurrency. For the uninitiated, the GIL disallows multiple CPU cores from running Ruby code simultaneously. That does mean that you’ll need one Ruby process (or thereabouts) per CPU core, but it also means that if your multithreaded code is running correctly, you should need only one process per CPU core. I’ve heard tales of six or more processes per core. Since it’s possible to fully utilize a CPU with a single process (even in Ruby 1.8), these applications could get a 4-6x improvement in RAM usage (depending on context-switching overhead) by switching to threadsafe mode and using modern drivers for blocking operations.

JRuby, Ruby 1.9 and Rubinius, and the Future

Finally, JRuby already runs without a global interpreter lock, allowing your code to run in true parallel, and to fully utilize all available CPUs with a single JRuby process. A future version of Rubinius will likely ship without a GIL (the work has already begun), also opening the door to utilizing all CPUs with a single Ruby process.

And all modern Ruby VMs that run Rails (Ruby 1.9’s YARV, Rubinius, and JRuby) use native threads, eliminating the annoying tax that you need to pay for using threads in Ruby 1.8. Again, though, since that tax is small relative to the time for your requests, you’d likely see a non-trivial improvement in latency in applications that spend time in the database layer.

To be honest, a big part of the reason for the poor practical concurrency story in Ruby has been that the Rails project didn’t take it seriously, which made it difficult to get traction for efforts to fix a part of the problem (like the mysql driver).

We took concurrency very seriously in the Merb project, leading to the development of proper database drivers for DataMapper (Merb’s ORM), and a top-to-bottom understanding of parts of the stack that could run in parallel (even on Ruby 1.8), but which weren’t. Rails 3 doesn’t bring anything new to the threadsafety of Rails itself (Rails 2.3 was threadsafe too), but by making the mysql2 driver the default, we have eliminated a large barrier to Rails applications performing well in threadsafe mode without any additional research.

UPDATE: It’s worth pointing to Charlie Nutter’s 2008 threadsafety post, where he talked about how he expected threadsafe Rails would impact the landscape. Unfortunately, the blocking MySQL driver held back some of the promise of the improvement for the vast majority of Rails users.

What’s New in Bundler 1.0.0.rc.1

Taking into consideration the huge amount of feedback we received during the Bundler 0.9 series, we streamlined Bundler 1.0 significantly, and made it fit user expectations better.

Whether you have used bundler before or not, the easiest way to get up to speed is to read the following notes and go to http://gembundler.com/v1.0 for more in-depth information.

(note that gembundler.com is still being updated for the 1.0 changes, and should be ready for the final release).

Starting a new project with bundler

When you generate a new Rails application, Rails will create a Gemfile for you, which has everything needed to boot your application.

Otherwise, you can use bundle init to create a stub Gemfile, ready to go.

First, run bundle install to make sure that you have all the needed dependencies. If you already do, this process will happen instantaneously.

Bundler will automatically create a file called Gemfile.lock. This file is a snapshot of your application’s dependencies at that time.

You SHOULD check both files into version control. This will ensure that all team members (as well as your production server) are working with identical dependencies.

Checking out an existing project using bundler

After checking out an existing project using bundler, check to make sure that the Gemfile.lock snapshot is checked in. If it is not, you may end up using different dependencies than the person who last used and tested the project.

Next, run bundle install. This command will check whether you already have all the required dependencies in your system. If you do not, it will fetch the dependencies and install them.

Updating dependencies

If you modify the dependencies in your Gemfile, first try to run bundle install, as usual. Bundler will attempt to update only the gems you have modified, leaving the rest of the snapshot intact.

This may not be possible, if the changes conflict with other gems in the snapshot (or their dependencies). If this happens, Bundler will instruct you to run bundle update. This will re-resolve all dependencies from scratch.

The bundle update command will update the versions of all gems in your Gemfile, while bundle install will only update the gems that have changed since the last bundle install.

After modifying dependencies, make sure to check in your Gemfile and Gemfile.lock into version control.

By default, gems are installed to your system

If you follow the instructions above, Bundler will install the gems into the same place as gem install.

If necessary, Bundler will prompt you for your sudo password.

You can see the location of a particular gem with bundle show [GEM_NAME]. You can open it in your default editor with bundle open [GEM_NAME].

Bundler will still isolate your application from other gems. Installing your gems into a shared location allows multiple projects to avoid downloading the same gem over and over.

You might want to install your bundled gems to a different location, such as a directory in the application itself. This will ensure that each application has its own copies of the gems, and provides an extra level of isolation.

To do this, run the install command with bundle install /path/to/location. You can use a relative path as well: bundle install vendor.

In RC1, this command will use gems from the system, if they are already there (it only affects new gems). To ensure that all of your gems are located in the path you specified, run bundle install path --disable-shared-gems.

In Bundler 1.0 final, bundle install path will default to --disable-shared-gems.

Deployment

When deploying, we strongly recommend that you isolate your gems into a local path (using bundle install path --disable-shared-gems). The final version of bundler will come with a --production flag, encapsulating all of the best deployment practices.

For now, please follow these recommendations (described using Capistrano concepts):

  • Make sure to always check in a Gemfile.lock that is up to date. This means that after modifying your Gemfile, you should ALWAYS run bundle install.
  • Symlink the vendor/bundle directory into the application’s shared location (symlink release_path/current/vendor/bundle to release_path/shared/bundled_gems)
  • Install your bundle by running bundle install vendor/bundle --disable-shared-gems

Some of the Problems Bundler Solves

This post does not attempt to convince you to use bundler, or compare it to alternatives. Instead, I will try to articulate some of the problems that bundler tries to solve, since people have often asked. To be clear, users of bundler should not need to understand these issues, but some might be curious.

If you’re looking for information on bundler usage, check out the official Bundler site.

Dependency Resolution

This is the problem most associated with bundler. In short, by asking you to list all of your dependencies in a single manifest, bundler can determine, up front, a valid list of all of the gems and versions needed to satisfy that manifest.

Here is a simple example of this problem:

$ gem install thin
Successfully installed rack-1.1.0
Successfully installed eventmachine-0.12.10
Successfully installed daemons-1.0.10
Successfully installed thin-1.2.7
4 gems installed

$ gem install rails
Successfully installed activesupport-2.3.5
Successfully installed activerecord-2.3.5
Successfully installed rack-1.0.1
Successfully installed actionpack-2.3.5
Successfully installed actionmailer-2.3.5
Successfully installed activeresource-2.3.5
Successfully installed rails-2.3.5
7 gems installed

$ gem dependency actionpack -v 2.3.5
Gem actionpack-2.3.5
  activesupport (= 2.3.5, runtime)
  rack (~> 1.0.0, runtime)

$ gem dependency thin
Gem thin-1.2.7
  daemons (>= 1.0.9, runtime)
  eventmachine (>= 0.12.6, runtime)
  rack (>= 1.0.0, runtime)

$ irb
>> require "thin"
=> true
>> require "actionpack"
Gem::LoadError: can't activate rack (~> 1.0.0, runtime) 
for ["actionpack-2.3.5"], already activated rack-1.1.0
for ["thin-1.2.7"]

What happened here?

Thin declares that it can support any version of Rack above 1.0. ActionPack declares that it can support versions 1.0.x of Rack. When we require thin, it looks for the highest version of Rack that thin can support (1.1), and makes it available on the load path. When we require actionpack, it notes that the version of Rack already on the load path (1.1) is incompatible with actionpack (which requires 1.0.x) and throws an exception.

Thankfully, newer versions of Rubygems provide reasonable information about exactly what gem (“thin 1.2.7”) put Rack 1.1.0 on the load path. Unfortunately, there is often nothing the end user can do about it.

Rails could theoretically solve this problem by loosening its Rack requirement, but that would mean that ActionPack declared compatibility with any future version of Rack, a declaration ActionPack is unwilling to make.

The user can solve this problem by carefully ordering requires, but the user is never in control of all requires, so the process of figuring out the right order to require all dependencies can get quite tricky.

It is conceptually possible in this case, but it gets extremely hard when more than a few dependencies are in play (as in Rails 3).
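To make the difference between require-time activation and up-front resolution concrete, here is a minimal sketch. It is not Bundler's resolver; it only uses Rubygems' own Gem::Requirement and Gem::Version classes, and the list of available Rack versions is assumed for illustration.

```ruby
require "rubygems"

# Requirements on rack, taken from the `gem dependency` output above.
requirements = {
  "thin"       => Gem::Requirement.new(">= 1.0.0"),
  "actionpack" => Gem::Requirement.new("~> 1.0.0")
}

# Assumed for illustration: the versions of rack that can be installed.
available = %w(1.0.0 1.0.1 1.1.0).map { |v| Gem::Version.new(v) }

# Require-time behavior: thin is activated first and simply takes the
# highest version that thin itself can use.
thin_pick = available.select { |v| requirements["thin"].satisfied_by?(v) }.max

# With the whole manifest in hand, we can instead take the highest
# version that satisfies every requirement at once.
shared_pick = available.select { |v|
  requirements.values.all? { |req| req.satisfied_by?(v) }
}.max

puts "thin alone picks rack #{thin_pick}"
puts "actionpack accepts that? #{requirements["actionpack"].satisfied_by?(thin_pick)}"
puts "resolved together: rack #{shared_pick}"
```

The version thin picks on its own is one that actionpack rejects, which is the Gem::LoadError shown earlier. Looking at both requirements together finds a version they can share.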

Groups of Dependencies

When writing applications for deployments, developers commonly want to group their dependencies. For instance, you might use SQLite in development but Postgres in production.

For most people, the most important part of the grouping problem is making it possible to install the gems in their Gemfile, except the ones in specific groups. This introduces two additional problems.

First, consider the following Gemfile:

gem "rails", "2.3.5"
 
group :production do
  gem "thin"
end

Bundler allows you to install the gems in a Gemfile minus the gems in a specific group by running bundle install --without production. In this case, since rails depends on Rack, specifying that you don’t want to include thin means no thin, no daemons and no eventmachine but yes rack. In other words, we want to exclude the gems in the group specified, and any dependencies of those gems that are not dependencies of other gems.
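The rule in the last sentence amounts to walking the dependency graph from the gems you keep. The following is only a sketch of that idea, with a hand-written dependency graph and hypothetical names (DEPENDENCIES, GROUPS, gems_without); Bundler's real implementation works on resolved specifications.

```ruby
require "set"

# Assumed for illustration: a tiny dependency graph for the Gemfile above.
DEPENDENCIES = {
  "rails"        => ["rack"],
  "thin"         => ["rack", "daemons", "eventmachine"],
  "rack"         => [],
  "daemons"      => [],
  "eventmachine" => []
}

# Top-level gems by group, mirroring the Gemfile.
GROUPS = {
  :default    => ["rails"],
  :production => ["thin"]
}

# Collect the named gems plus everything they depend on, recursively.
def closure(names, graph, seen = Set.new)
  names.each do |name|
    next if seen.include?(name)
    seen << name
    closure(graph.fetch(name), graph, seen)
  end
  seen
end

# Gems to install: everything reachable from the groups that are kept.
def gems_without(excluded_group)
  roots = GROUPS.reject { |group, _| group == excluded_group }.values.flatten
  closure(roots, DEPENDENCIES).to_a.sort
end

p gems_without(:production)
```

Because rack is still reachable from rails, it survives the exclusion, while daemons and eventmachine are only reachable through thin and drop out.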

Second, consider the following Gemfile:

gem "soap4r", "1.5.8"
 
group :production do
  gem "dm-salesforce", "0.10.3"
end

The soap4r gem depends on httpclient >= 2.1.1, while the dm-salesforce gem depends on httpclient = 2.1.5.2. Initially, when you ran bundle install --without production, we did not include gems in the production group in the dependency resolution process.

Suppose that httpclient 2.1.5.2 and httpclient 2.2 both exist on Rubyforge.org. In development mode, your app will use the latest version (2.2), but in production, when dm-salesforce is included, the older version will be used.

Note that this happened even though you specified only hard versions at the top level, because not all gems use hard versions as their dependencies.

To solve this problem, Bundler downloads (but does not install) all gems, including gems in groups that you exclude (via --without). This allows you to specify gems with C extensions that can only compile in production (or testing requirements that depend on OSX for compilation) while maintaining a coherent list of gems used across all of these environments.
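Here is a small sketch of why excluded groups still have to take part in resolution. It uses the two requirements from the example above and Rubygems' Gem::Requirement. The pick helper is hypothetical and much simpler than a real resolver.

```ruby
require "rubygems"

# Versions of httpclient assumed to be available, as in the example above.
available = %w(2.1.5.2 2.2).map { |v| Gem::Version.new(v) }

soap4r_needs        = Gem::Requirement.new(">= 2.1.1")
dm_salesforce_needs = Gem::Requirement.new("= 2.1.5.2")

# Hypothetical helper: the highest available version that satisfies
# every requirement it is given.
def pick(available, requirements)
  available.select { |v| requirements.all? { |r| r.satisfied_by?(v) } }.max
end

# Resolving while ignoring the production group.
development_only = pick(available, [soap4r_needs])

# Resolving with every group included, even ones that are not installed.
all_groups = pick(available, [soap4r_needs, dm_salesforce_needs])

puts "ignoring production: httpclient #{development_only}"
puts "including production: httpclient #{all_groups}"
```

The two answers differ, so resolving against all groups every time is what keeps development and production on the same httpclient.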

System Gems

In 0.8 and before, bundler installed all gems in the local application. This provided a neat sandbox, but broke the normal path for running a new Rails app:

$ gem install rails
$ rails myapp
$ cd myapp
$ rails server

Instead, in 0.8, you’d have to do:

$ gem install rails
$ rails myapp
$ cd myapp
$ gem bundle
$ rails server

Note that the gem bundle command became bundle install in Bundler 0.9.

In addition, this meant that Bundler needed to download and install commonly used gems over and over again if you were working on multiple apps. Finally, every time you changed the Gemfile, you needed to run gem bundle again, adding a “build step” that broke the flow of early Rails application development.

In Bundler 0.9, we listened to this feedback, making it possible for bundler to use gems installed in the system. This meant that the ideal Rails installation steps could work, and you could share common gems between applications.

However, there were a few complications.

Since we now use gems installed in the system, Bundler resolves the dependencies in your Gemfile against your system sources at runtime, making a list of all of the gems to push onto the load path. Calling Bundler.setup kicks off this process. If you specified some gems not to install, we needed to make sure bundler did not try to find those gems in the system.

In order to solve this problem, we create a .bundle directory inside your application that remembers any settings that need to persist across bundler invocations.

Unfortunately, this meant that we couldn’t simply have people run sudo bundle install because root would own your application’s .bundle directory.

On OSX, root owns all paths that are, by default, in $PATH. It also owns the default GEM_HOME. This has two consequences. First, we could not trivially install executables from bundled gems into a system path. Second, we could not trivially install gems into a place that gem list would see.

In 0.9, we solved this problem by placing gems installed by bundler into BUNDLE_PATH, which defaults to ~/.bundle/#{RUBY_ENGINE}/#{RUBY_VERSION}. rvm, which does not install executables or gems into a path owned by root, helpfully sets BUNDLE_PATH to the same location as GEM_HOME. This means that when using rvm, gems installed via bundle install appear in gem list.

This also means that when not using rvm, you need to use bundle exec to place the executables installed by bundler onto the path and set up the environment.

In 0.10, we plan to bump up the permissions (by shelling out to sudo) when installing gems so we can install to the default GEM_HOME and install executables to a location on the $PATH. This will make executables created by bundle install available without bundle exec and will make gems installed by bundle install available to gem list on OSX without rvm.

Another complication: because gems no longer live in your application, we needed a way to snapshot the list of all versions of all gems used at a particular time, to ensure consistent versions across machines and across deployments.

We solved this problem by introducing a new command, bundle lock, which created a file called Gemfile.lock with a serialized representation of all versions of all gems in use.

However, in order to make Gemfile.lock useful, it would need to work in development, testing, and production, even if you ran bundle install --without production in development and then ran bundle lock. Since we had already decided that we needed to download (but not install) gems even if they were excluded by --without, we could easily include all gems (including those from excluded groups) in the Gemfile.lock.

Initially, we didn’t serialize groups exactly right in the Gemfile.lock, causing inconsistencies between how groups behaved in unlocked and locked mode. Fixing this required a small change in the lock file format, which caused a small amount of frustration for users of early versions of Bundler 0.9.

Git

Very early (0.5 era) we decided that we would support prerelease “gems” that lived in git repositories.

At first, we figured we could just clone the git repositories and add the lib directory to the load path when the user ran Bundler.setup.

We abstracted away the idea of “gem source”, making it possible for gems to be found in system rubygems, remote rubygems, or git repositories. To specify that a gem was located in a git “source”, you could say:

gem "rspec-core", "2.0.0.beta.6", :git => "git://github.com/rspec/rspec-core.git"

This says: “You’ll find version 2.0.0.beta.6 in git://github.com/rspec/rspec-core.git”.

However, there were a number of issues involving git repositories.

First, if a prerelease gem had dependencies, we’d want to include those dependencies in the dependency graph. However, simply trying to run rake build was a nonstarter, as a huge number of prerelease gems have dependencies in their rake file that are only available to a tool like bundler once the gem is built (a chicken and egg problem). On the flip side, if another gem depended on a gem provided by a git repository, we were asking users to supply the version, an error-prone process since the version could change in the git repository and bundler wouldn’t be the wiser.

To solve this, we asked gem authors to put a .gemspec in the root of their repository, which would allow us to see the dependencies. A lot of people were familiar with this process, since github had used it for a while for automatically generating gems from git repositories.

At first, we assumed (like github did) that we could execute the .gemspec standalone, out of the context of its original repository. This allowed us to avoid cloning the full repository simply to resolve dependencies. However, a number of gems required files that were in the repository (most commonly, they required a version file from the gem itself to avoid duplication), so we modified bundler to do a full checkout of the repository so we could execute the gemspec in its original context.

Next, we found that a number of git repositories (notably, Rails) actually contained a number of gems. To support this, we allowed any number of .gemspec files in a repository, and would evaluate each in the context of its root. This meant that a git repository was more analogous to a gem source (like Rubygems.org) than a single .gem file.

Soon enough, people started complaining that they tried to use prerelease gems like nokogiri from git and bundler wasn’t compiling C extensions. This proved tricky, because the process that Rubygems uses to compile C extensions is more than a few lines, and we wanted to reuse the logic if possible.

In most cases, we were able to solve this problem by having bundler run gem build gem_name.gemspec on the gemspec, and using Rubygems’ native C extension process to compile the gem.

In a related problem, we started receiving reports that bundler couldn’t find rake while trying to compile C extensions. It turns out that Rubygems supports a rake compile mode if you use s.extensions = %w(Rakefile) or something containing mkrf. This essentially means that Rubygems itself has an implicit dependency on Rake. Since we sort the installed gems to make sure that dependencies get installed before the gems that depend on them, we needed to make sure that Rake was installed before any gem.
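Ordering installs so that dependencies come first is a topological sort. The sketch below uses Ruby's standard TSort module rather than Bundler's own code, and the InstallOrder class and the dependency data are hypothetical. It models the implicit Rake dependency by treating rake as a child of every other gem.

```ruby
require "tsort"

# Hypothetical helper: orders gems so that each gem's dependencies are
# installed before the gem itself.
class InstallOrder
  include TSort

  def initialize(dependencies)
    @dependencies = dependencies
  end

  # TSort hook: enumerate every gem.
  def tsort_each_node(&block)
    @dependencies.each_key(&block)
  end

  # TSort hook: enumerate a gem's dependencies. Every gem other than
  # rake gets an implicit dependency on rake.
  def tsort_each_child(name, &block)
    children = @dependencies.fetch(name, [])
    children |= ["rake"] unless name == "rake"
    children.each(&block)
  end
end

# Assumed for illustration only.
deps = {
  "nokogiri"      => [],
  "rails"         => ["rack", "activesupport"],
  "rack"          => [],
  "activesupport" => [],
  "rake"          => []
}

order = InstallOrder.new(deps).tsort
puts order.join(", ")
```

Since every other gem lists rake as a child, rake always comes out first, and rack and activesupport always come out before rails.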

For git gems, we needed to make sure that Gemfile.lock remembered the exact revision used when bundler took the snapshot. This required some more abstraction, so sources could provide and load in agnostic information that they could use to reinstall everything identically to when bundler took the snapshot.

If a git gem didn’t supply a .gemspec, we needed to create a fake .gemspec that we could use throughout the process, based on the name and version the user specified for the repository. This would allow it to participate in the dependency resolution process, even if the repository itself didn’t provide a .gemspec.

If a repository did provide a .gemspec, and the user supplied a version or version range, we needed to confirm that the version provided matched the version specified in the .gemspec.
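The check itself can be expressed with Rubygems' requirement classes. This is a sketch under the assumption that the Gemfile value is a plain version or version range; version_matches? is a hypothetical name, not a Bundler method.

```ruby
require "rubygems"

# Returns true when the version found in the repository's .gemspec
# satisfies what the user wrote in the Gemfile.
def version_matches?(gemfile_requirement, gemspec_version)
  requirement = Gem::Requirement.new(gemfile_requirement)
  requirement.satisfied_by?(Gem::Version.new(gemspec_version))
end

puts version_matches?("2.0.0.beta.6", "2.0.0.beta.6")
puts version_matches?("2.0.0.beta.6", "2.0.0.beta.7")
puts version_matches?("~> 2.3.0", "2.3.4")
```

A bare version string is treated as an exact requirement, so a repository that has moved on to a different version no longer matches.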

We checked out the git repositories into BUNDLE_PATH (again, defaulting to ~/.bundle/#{RUBY_ENGINE}/#{RUBY_VERSION} or $GEM_HOME with rvm) using the --bare option. This allows us to share git repositories like the rails repository, and then make local checkouts of specific revisions, branches or tags as specified by individual Gemfiles.

One final problem: suppose your Gemfile looks like this:

source "http://rubygems.org"
 
gem "nokogiri"
gem "rails", :git => "git://github.com/rails/rails.git", :tag => "v2.3.4"

You do not expect bundler to pull in the version from Rubygems.org, even though it’s newer. Because bundler treats the git repository as a gem source, it initially pulled in the latest version of the gem, regardless of the source. To solve this problem, we added the concept of “pinned dependencies” to the dependency resolver, allowing us to ask it to skip traversing paths that got the rails dependencies from other sources.
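A rough sketch of the pinning idea follows. The Candidate struct, the CANDIDATES and PINNED tables, and the version numbers are all assumptions for illustration, and a real resolver would apply this filter while traversing the whole graph rather than one gem at a time.

```ruby
require "rubygems"

Candidate = Struct.new(:name, :version, :source)

# Assumed for illustration: what each source has to offer.
CANDIDATES = [
  Candidate.new("rails", "2.3.5", :rubygems),
  Candidate.new("rails", "2.3.4", :git),
  Candidate.new("nokogiri", "1.4.1", :rubygems)
]

# Gems whose source was named explicitly in the Gemfile.
PINNED = { "rails" => :git }

# Pick the newest candidate for a gem, but only consider the pinned
# source when the gem has one.
def candidate_for(name)
  found = CANDIDATES.select { |c| c.name == name }
  pinned_source = PINNED[name]
  found = found.select { |c| c.source == pinned_source } if pinned_source
  found.max_by { |c| Gem::Version.new(c.version) }
end

rails = candidate_for("rails")
puts "#{rails.name} #{rails.version} from #{rails.source}"
```

Without the pin, the newer copy from Rubygems.org would win. With it, the copy from Rubygems.org is never considered for rails, while unpinned gems like nokogiri are unaffected.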

Paths

Now that we had git repositories working, it was a hop, skip and jump to support any path. We could use all of the same heuristics as we used for git repositories (including using gem build to install C extensions and having multiple versions) on any path in the file system.

With so many sources in the mix, we started seeing cases where people had different gems with the exact same name and version in different sources. Most commonly, people would have created a gem from a local checkout of something (like Rack), and then, when the final version of the gem was released to Rubygems.org, we were still using the version installed locally.

We tried to solve this problem by forcing a lookup in Rubygems.org for a gem, but this conflicted with the needs of people who didn’t want to have to hit a remote repository when they had all the gems locally.

When we first started talking to early adopters, they were incredulous that this could happen. “If you do something stupid like that, f*** you”. One by one, those very same people fell victim to the “bug”. Unfortunately, it manifests itself as can't find active_support/core_ext/something_new, which is extremely confusing and can appear to be a generic “bundler bug”. This is especially problematic if the dependencies change in two copies of the gem with identical names and versions.

To solve this problem, we decided that if you had snapshotted the repository via bundle lock and had all of the required gems on your local machine, we would not try to hit a remote. However, if you run bundle install otherwise, we always check to see if there is a newer version in the remote.

In fact, this class of error (two different copies of a gem with the same name and version) has resulted in a fairly intricate prioritization system, which can be different in different scenarios. Unfortunately, the principle of least surprise requires that we tweak these priorities for different scenarios.

While it seems that we could just say “if you rake install a gem you’re on your own”, it’s very common, and people expect things to mostly work even in this scenario. Small tweaks to these priorities have also resulted in small changes in behavior between versions of 0.9 (but only in cases where gems with the exact same name and version, in different sources, provide different code).

In fact, because of the overall complexity of the problem, and because of different ways that these features can interact, very small tweaks to different parts of the system can result in unexpected changes. We’ve gotten pretty good at seeing the likely outcome of these tweaks, but they can be baffling to users of bundler. A major goal of the lead-in to 1.0 has been to increase determinism, even in cases where we have to arbitrarily pick a “right” answer.

Conclusion

This is just a small smattering of some of the problems we’ve encountered while working on bundler. Because the problem is non-trivial (and parts are NP-complete), adding an apparently simple feature can upset the equilibrium of the entire system. More frustratingly, adding features can sometimes change “undefined” behavior in a way that accidentally breaks a working system as a result of an upgrade.

As we head into 0.10 and 1.0, we hope to add some additional features to smooth out the typical workflows, while stabilizing some of the seeming indeterminism in Bundler today. One example is imposing a standard require order for gems in the Gemfile, which is currently “undefined”.

Thanks for listening, and getting to the end of this very long post.

Spinning up a new Rails app

So people have been attempting to get a Rails app up and running recently. I also have some apps in development on Rails 3, so I’ve been experiencing some of the same problems many others have.

The other night, I worked with sferik to start porting merb-admin over to Rails. Because this involved being on edge Rails, we honed the setup down to a very simple, small, repeatable process.

The Steps

Step 1: Check out Rails

$ git clone git://github.com/rails/rails.git

Step 2: Generate a new app

$ ruby rails/railties/bin/rails new_app
$ cd new_app

Step 3: Edit the app’s Gemfile

# Add to the top
directory "/path/to/rails", :glob => "{*/,}*.gemspec"
git "git://github.com/rails/arel.git"
git "git://github.com/rails/rack.git"

Step 4: Bundle

$ gem bundle

Done

Everything should now work: script/server, script/console, etc.

If you want to check your copy of Rails into your app, you can copy it into the app and then change your Gemfile to point to the relative location.

For instance, if you copy it into vendor/rails, you can make the first line of the Gemfile directory "vendor/rails", :glob => "{*/,}*.gemspec". You’ll want to run gem bundle again after changing the Gemfile, of course.

The Rails 3 Router: Rack it Up

In my previous post about generic actions in Rails 3, I made reference to significant improvements in the router. Some of those have been covered on other blogs, but the full scope of the improvements hasn’t yet been covered.

In this post, I’ll cover a number of the larger design decisions, as well as specific improvements that have been made. Most of these features were in the Merb router, but the Rails DSL is more fully developed, and the fuller emphasis on Rack is a strong improvement from the Merb approach.

Improved DSL

While the old map.connect DSL still works just fine, the new standard DSL is less verbose and more readable.

# old way
ActionController::Routing::Routes.draw do |map|
  map.connect "/main/:id", :controller => "main", :action => "home"
end
 
# new way
Basecamp::Application.routes do
  match "/main/:id", :to => "main#home"
end

First, the routes are attached to your application, which is now its own object and used throughout Railties. Second, we no longer need map, and the new DSL (match/to) is more expressive. Finally, we have a shortcut for controller/action pairs ("main#home" is {:controller => "main", :action => "home"}).

Another useful shortcut allows you to specify the method more simply than before:

Basecamp::Application.routes do
  post "/main/:id", :to => "main#home", :as => :homepage
end

The :as in the above example specifies a named route, and creates the homepage_url et al helpers as in Rails 2.

Rack It Up

When designing the new router, we all agreed that it should be built first as a standalone piece of functionality, with Rails sugar added on top. As a result, we used rack-mount, which was built by Josh Peek as a standalone Rack router.

Internally, the router simply matches requests to a rack endpoint, and knows nothing about controllers or controller semantics. Essentially, the router is designed to work like this:

Basecamp::Application.routes do
  match "/home", :to => HomeApp
end

This will match requests with the /home path, and dispatch them to a valid Rack application at HomeApp. This means that dispatching to a Sinatra app is trivial:

class HomeApp < Sinatra::Base
  get "/" do
    "Hello World!"
  end
end
 
Basecamp::Application.routes do
  match "/home", :to => HomeApp
end

The one small piece of the puzzle that might have you wondering at this point is that in the previous section, I showed the usage of :to => "main#home", and now I say that :to takes a Rack application.

Another improvement in Rails 3 bridges this gap. In Rails 3, PostsController.action(:index) returns a fully valid Rack application pointing at the index action of PostsController. So main#home is simply a shortcut for MainController.action(:home), and it otherwise is identical to providing a Sinatra application.
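To see how a string like "main#home" and a Rack application can be the same kind of thing, here is a plain Ruby sketch with no Rails involved. The MainController below is a stand-in, and endpoint_for is a hypothetical helper; the real router's lookup is more involved.

```ruby
# A stand-in for a controller, for illustration only. Real Rails
# controllers do far more than this.
class MainController
  # Returns a Rack endpoint (anything that responds to call(env))
  # bound to a single action.
  def self.action(name)
    lambda { |env| new.public_send(name, env) }
  end

  def home(env)
    [200, { "Content-Type" => "text/plain" }, ["Welcome home"]]
  end
end

# Hypothetical helper: turn the value given to :to into a Rack endpoint.
def endpoint_for(to)
  return to if to.respond_to?(:call) # already a Rack application

  controller, action = to.split("#")
  # Simplified constant lookup: "main" becomes MainController.
  Object.const_get("#{controller.capitalize}Controller").action(action.to_sym)
end

status, headers, body = endpoint_for("main#home").call({})
puts status
puts body.join
```

Anything that already responds to call, such as a proc or a Sinatra application class, passes through untouched, which is why the router itself never needs to know about controllers.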

As I posted before, this is also the engine behind match "/foo", :to => redirect("/bar").

Expanded Constraints

Probably the most common desired improvement to the Rails 2 router has been support for routing based on subdomains. There is currently a plugin called subdomain_routes that implements this functionality as follows:

ActionController::Routing::Routes.draw do |map|
  map.subdomain :support do |support|
    support.resources :tickets
    ...
  end
end

This solves the most common case, but the reality is that this is just one common case. In truth, it should be possible to constrain routes based not just on path segments, method, and subdomain, but also based on any element of the request.

The Rails 3 router exposes this functionality. Here is how you would constrain requests based on subdomains in Rails 3:

Basecamp::Application.routes do
  match "/foo/bar", :to => "foo#bar", :constraints => {:subdomain => "support"}
end

These constraints can include path segments as well as any method on ActionDispatch::Request. You could use a String or a regular expression, so :constraints => {:subdomain => /support\d/} would be valid as well.

Arbitrary constraints can also be specified in block form, as follows:

Basecamp::Application.routes do
  constraints(:subdomain => "support") do
    match "/foo/bar", :to => "foo#bar"
  end
end

Finally, constraints can be specified as objects:

class SupportSubdomain
  def self.matches?(request)
    request.subdomain == "support"
  end
end
 
Basecamp::Application.routes do
  constraints(SupportSubdomain) do
    match "/foo/bar", :to => "foo#bar"
  end
end
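The three forms of constraint shown above (a String, a regular expression, and an object) can be checked with very little code. The following sketch is an assumption about how such a check could look, not the router's implementation; FakeRequest and constraint_satisfied? are hypothetical names.

```ruby
# Minimal stand-in for a request, for illustration only.
FakeRequest = Struct.new(:subdomain, :path)

class SupportSubdomain
  def self.matches?(request)
    request.subdomain == "support"
  end
end

# Hypothetical check: objects are asked directly via matches?, while
# hashes map request methods to an expected String or Regexp.
def constraint_satisfied?(constraint, request)
  if constraint.respond_to?(:matches?)
    constraint.matches?(request)
  else
    constraint.all? do |method, expected|
      # === compares Strings for equality and tests Regexps for a match
      expected === request.public_send(method)
    end
  end
end

support = FakeRequest.new("support", "/foo/bar")
numbered = FakeRequest.new("support2", "/foo/bar")

puts constraint_satisfied?({ :subdomain => "support" }, support)
puts constraint_satisfied?({ :subdomain => /support\d/ }, numbered)
puts constraint_satisfied?(SupportSubdomain, numbered)
```

Using === is what lets a String and a Regexp share one code path.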

Optional Segments

In Rails 2.3 and earlier, there were some optional segments. Unfortunately, they were hardcoded names and not controllable. Since we’re using a generic router, magical optional segment names and semantics would not do. And having exposed support for optional segments in Merb was pretty nice. So we added them.

# Rails 2.3
ActionController::Routing::Routes.draw do |map|
  # Note that :action and :id are optional, and
  # :format is implicit
  map.connect "/:controller/:action/:id"
end
 
# Rails 3
Basecamp::Application.routes do
  # equivalent
  match "/:controller(/:action(/:id))(.:format)"
end

In Rails 3, we can be explicit about the optional segments, and even nest optional segments. If we want the format to be a prefix path, we can do match "(/:format)/home" and the format is optional. We can use a similar technique to add an optional company ID prefix or a locale.
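One way to picture optional segments is as optional groups in a regular expression. The sketch below compiles a route pattern into a Regexp with named captures. It is an illustration only: compile_route is a hypothetical helper, and the real router (rack-mount) does considerably more.

```ruby
# Hypothetical helper: compile a route pattern into a Regexp.
#   "("     opens an optional group
#   ")"     closes it
#   ":name" captures one segment under that name
def compile_route(pattern)
  source = pattern.gsub(/\(|\)|:\w+|[^():]+/) do |token|
    case token
    when "("   then "(?:"
    when ")"   then ")?"
    when /\A:/ then "(?<#{token[1..-1]}>[^/.?]+)"
    else Regexp.escape(token)
    end
  end
  Regexp.new("\\A#{source}\\z")
end

route = compile_route("/:controller(/:action(/:id))(.:format)")

full = route.match("/posts/show/5.json")
puts full[:controller]
puts full[:action]
puts full[:id]
puts full[:format]

short = route.match("/posts")
puts short[:controller]
puts short[:action].inspect
```

Nested parentheses turn into nested optional groups, so /posts, /posts/show and /posts/show/5.json all match the same route, with the missing segments left as nil.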

Pervasive Blocks

You may have noticed this already, but as a general rule, if you can specify something as an inline condition, you can also specify it as a block constraint.

Basecamp::Application.routes do
  controller :home do
    match "/:action"
  end
end

In the above example, we are not required to specify the controller inline, because we specified it via a block. You can use this for subdomains, controller restrictions, request method (get etc. take a block). There is also a scope method that can be used to scope a block of routes under a top-level path:

Basecamp::Application.routes do
  scope "/home" do
    match "/:action", :to => "homepage"
  end
end

The above route would match /home/foo to homepage#foo.

Closing

There are additional (substantial) improvements around resources, which I will save for another time, assuming someone else doesn’t get to it first.

Generic Actions in Rails 3

So Django has an interesting feature called “generic views”, which essentially allow you to render a template with generic code. In Rails, the same feature would be called “generic actions” (just a terminology difference).

This was possible, but somewhat difficult in Rails 2.x, but it’s a breeze in Rails 3.

Let’s take a look at a simple generic view in Django, the “redirect_to” view:

urlpatterns = patterns('django.views.generic.simple',
    ('^foo/(?P<id>\d+)/$', 'redirect_to', {'url': '/bar/%(id)s/'}),
)

This essentially redirects "/foo/<id>" to "/bar/<id>s/". In Rails 2.3, a way to achieve equivalent behavior was to create a generic controller that handled this:

class GenericController < ApplicationController
  def redirect
    redirect_to(params[:url] % params, params[:options])
  end
end

And then you could use this in your router:

map.connect "/foo/:id", :controller => "generic", :action => "redirect", :url => "/bar/%{id}s"

This uses the new Ruby 1.9 interpolation syntax ("%{first} %{last}" % {:first => "hello", :last => "sir"} == "hello sir") that has been backported to Ruby 1.8 via ActiveSupport.
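As a quick illustration of that syntax on Ruby 1.9 or later, with values chosen only for the example:

```ruby
# Named references are looked up in the hash by symbol key.
greeting = "%{first} %{last}" % { :first => "hello", :last => "sir" }
puts greeting

# %{id} takes no format type, so the trailing "s" is literal text.
url = "/bar/%{id}s" % { :id => 5 }
puts url

# A reference with no matching key raises KeyError.
begin
  "%{first}" % { :last => "sir" }
  missing_key_raised = false
rescue KeyError
  missing_key_raised = true
end
puts missing_key_raised
```

Note that, unlike Python's %(id)s, the "s" after %{id} is not a format type in Ruby and ends up in the output.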

Better With Rails 3

However, this is a bit clumsy, and requires us to have a special controller to handle this (relatively simple) case. It also saddles us with the conceptual overhead of a controller in the router itself.

Here’s how you do the same thing in Rails 3:

match "/foo/:id", :to => redirect("/bar/%{id}s")

This is built into Rails 3’s router, but the way it works is actually pretty cool. The Rails 3 router is conceptually decoupled from Rails itself, and the :to key points at a Rack endpoint. For instance, the following would be a valid route in Rails 3:

match "/foo", :to => proc {|env| [200, {}, ["Hello world"]] }

The redirect method simply returns a rack endpoint that knows how to handle the redirection:

def redirect(*args, &block)
  options = args.last.is_a?(Hash) ? args.pop : {}
 
  path = args.shift || block
  path_proc = path.is_a?(Proc) ? path : proc {|params| path % params }
  status = options[:status] || 301
 
  lambda do |env|
    req = Rack::Request.new(env)
    params = path_proc.call(env["action_dispatch.request.path_parameters"])
    url = req.scheme + '://' + req.host + params
    [status, {'Location' => url, 'Content-Type' => 'text/html'}, ['Moved Permanently']]
  end
end

There are a few things going on here, but the important part is the last few lines, where the redirect method returns a valid Rack endpoint. If you look closely at the code, you can see that the following would be valid as well:

match "/api/v1/:api", :to => 
  redirect {|params| "/api/v2/#{params[:api].pluralize}" }
 
# and
 
match "/api/v1/:api", :to => 
  redirect(:status => 302) {|params| "/api/v2/#{params[:api].pluralize}" }
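Because the result of redirect is just a Rack endpoint, you can call it by hand. The sketch below is a simplified variant of the method above that reads the scheme and host straight from the Rack environment, so that it runs without the rack gem. The env hashes are made up for the example, and a plain "s" is appended in place of pluralize so that ActiveSupport is not needed.

```ruby
# Simplified, dependency-free variant of the redirect helper above.
def redirect(*args, &block)
  options = args.last.is_a?(Hash) ? args.pop : {}

  path = args.shift || block
  path_proc = path.is_a?(Proc) ? path : proc { |params| path % params }
  status = options[:status] || 301

  lambda do |env|
    params = env["action_dispatch.request.path_parameters"] || {}
    # Unlike Rack::Request#host, HTTP_HOST may include a port.
    url = "#{env["rack.url_scheme"]}://#{env["HTTP_HOST"]}#{path_proc.call(params)}"
    [status, { "Location" => url, "Content-Type" => "text/html" }, ["Moved Permanently"]]
  end
end

# A made-up Rack environment for a request to /foo/5.
env = {
  "rack.url_scheme" => "http",
  "HTTP_HOST" => "example.com",
  "action_dispatch.request.path_parameters" => { :id => "5" }
}

status, headers, _body = redirect("/bar/%{id}s").call(env)
puts status
puts headers["Location"]

# The block form, with an explicit status.
api_env = env.merge("action_dispatch.request.path_parameters" => { :api => "post" })
endpoint = redirect(:status => 302) { |params| "/api/v2/#{params[:api]}s" }
api_status, api_headers, _ = endpoint.call(api_env)
puts api_status
puts api_headers["Location"]
```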

Another Generic Action

Another nice generic action that Django provides is allowing you to render a template directly without needing an explicit action. It looks like this:

urlpatterns = patterns('django.views.generic.simple',
    (r'^foo/$',             'direct_to_template', {'template': 'foo_index.html'}),
    (r'^foo/(?P<id>\d+)/$', 'direct_to_template', {'template': 'foo_detail.html'}),
)

This provides a special mechanism for rendering a template directly from the Django router. Again, this could be implemented by creating a special controller in Rails 2 and used as follows:

class GenericController < ApplicationController
  def direct_to_template
    render(params[:options])
  end
end
 
# Router
map.connect "/foo", :controller => "generic", :action => "direct_to_template", :options => {:template => "foo_detail"}

A Prettier API

A nicer way to do this would be something like this:

match "/foo", :to => render("foo")

For the sake of clarity, let’s say that directly rendered templates will come out of app/views/direct unless otherwise specified. Also, let’s say that the render method should work identically to the render method used in Rails controllers themselves, so that render :template => "foo", :status => 201, :content_type => Mime::JSON et al will work as expected.

In order to make this work, we’ll use ActionController::Metal, which exposes a Rack-compatible object with access to all of the powers of a full ActionController::Base object.

class RenderDirectly < ActionController::Metal
  include ActionController::Rendering
  include ActionController::Layouts
 
  append_view_path Rails.root.join("app", "views", "direct")
  append_view_path Rails.root.join("app", "views")
 
  layout "application"
 
  def index
    render *env["generic_views.render_args"]
  end
end
 
module GenericActions
  module Render
    def render(*args)
      app = RenderDirectly.action(:index)
      lambda do |env|
        env["generic_views.render_args"] = args
        app.call(env)
      end
    end
  end
end

The trick here is that we’re subclassing ActionController::Metal and pulling in just Rendering and Layouts, which gives you full access to the normal rendering API without any of the other overhead of normal controllers. We add both the direct directory and the normal view directory to the view path, which means that any templates you place inside app/views/direct will be used first, but it’ll fall back to the normal view directory for layouts or partials. We also specify that the layout is application, which is not the default in Rails 3 in this case since our metal controller does not inherit from ApplicationController.

Note for the Curious

In all normal application cases, Rails will look up the inheritance chain for a named layout matching the controller name. This means that the Rails 2 behavior, which allows you to provide a layout named after the controller, still works exactly the same as before, and that ApplicationController is just another controller name, and application.html.erb is its default layout.

And then, the actual use in your application:

Rails.application.routes do
  extend GenericActions
 
  match "/foo", :to => render("foo_index")
  # match "/foo" => render("foo_index") is a valid shortcut for the simple case
  match "/foo/:id", :constraints => {:id => /\d+/}, :to => render("foo_detail")
end

Of course, because we’re using a real controller shell, you’ll be able to use any other options available on the render (like :status, :content_type, :location, :action, :layout, etc.).

Better Ruby Idioms

Carl and I have been working on the plugins system over the past few days. As part of that process, we read through the Rails Plugin Guide. While reading through the guide, we noticed a number of idioms presented in the guide that are serious overkill for the task at hand.

I don’t blame the author of the guide; the idioms presented are roughly the same that have been used since the early days of Rails. However, looking at them brought back memories of my early days using Rails, when the code made me feel as though Ruby was full of magic incantations and ceremony to accomplish relatively simple things.

Here’s an example:

module Yaffle
  def self.included(base)
    base.send :extend, ClassMethods
  end
 
  module ClassMethods
    # any method placed here will apply to classes, like Hickwall
    def acts_as_something
      send :include, InstanceMethods
    end
  end
 
  module InstanceMethods
    # any method placed here will apply to instances, like @hickwall
  end
end

To begin with, the send is completely unneeded. The acts_as_something method will be run on the Class itself, giving the method access to include, a private method.

This code is intended to be used as follows:

class ActiveRecord::Base
  include Yaffle
end
 
class Article < ActiveRecord::Base
  acts_as_something
end

What the code does is:

  1. Register a hook so that when the module is included, the ClassMethods are extended onto the class
  2. In that module, define a method that includes the InstanceMethods
  3. So that you can say acts_as_something in your code

The crazy thing about all of this is that it’s completely reinventing the module system that Ruby already has. This would be exactly identical:

module Yaffle
  # any method placed here will apply to classes, like Hickwall
  def acts_as_something
    send :include, InstanceMethods
  end
 
  module InstanceMethods
    # any method placed here will apply to instances, like @hickwall
  end
end

To be used via:

class ActiveRecord::Base
  extend Yaffle
end
 
class Article < ActiveRecord::Base
  acts_as_something
end

In a nutshell, there’s no point in overriding include to behave like extend when Ruby provides both!
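To see the equivalence concretely, here is a small plain-Ruby sketch that runs without Rails. The module and class names (HookStyle, PlainStyle, BaseA, and so on) are made up for the illustration:

```ruby
# Pattern 1: an included hook that extends ClassMethods onto the class.
module HookStyle
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def acts_as_something
      :configured
    end
  end
end

# Pattern 2: a plain module that is simply extended.
module PlainStyle
  def acts_as_something
    :configured
  end
end

class BaseA
  include HookStyle
end

class BaseB
  extend PlainStyle
end

# Subclasses inherit the class-level method in both cases.
class ArticleA < BaseA; end
class ArticleB < BaseB; end

puts ArticleA.acts_as_something == ArticleB.acts_as_something # prints "true"
```

Both base classes end up with the same class-level method; the second version just gets there without the hook.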

To take this a bit further, you could do:

module Yaffle
  # any method placed here will apply to instances, like @hickwall, 
  # because that's how modules work!
end

To be used via:

class Article < ActiveRecord::Base
  include Yaffle
end

In effect, the initial code (overriding the included hook to extend a module onto the class, which adds a method that in turn includes another module) is two layers of abstraction around a simple Ruby include!

Let’s look at a few more examples:

module Yaffle
  def self.included(base)
    base.send :extend, ClassMethods
  end
 
  module ClassMethods
    def acts_as_yaffle(options = {})
      cattr_accessor :yaffle_text_field
      self.yaffle_text_field = (options[:yaffle_text_field] || :last_squawk).to_s
    end
  end
end
 
ActiveRecord::Base.send :include, Yaffle

Again, we have the idiom of overriding include to behave like extend (instead of just using extend!).

A better solution:

module Yaffle
  def acts_as_yaffle(options = {})
    cattr_accessor :yaffle_text_field
    self.yaffle_text_field = (options[:yaffle_text_field] || :last_squawk).to_s
  end
end
 
ActiveRecord::Base.extend Yaffle

In this case, it’s appropriate to use an acts_as_yaffle, since you’re providing additional options which could not be encapsulated using the normal Ruby extend.

Another “more advanced” case:

module Yaffle
  def self.included(base)
    base.send :extend, ClassMethods
  end
 
  module ClassMethods
    def acts_as_yaffle(options = {})
      cattr_accessor :yaffle_text_field
      self.yaffle_text_field = (options[:yaffle_text_field] || :last_squawk).to_s
      send :include, InstanceMethods
    end
  end
 
  module InstanceMethods
    def squawk(string)
      write_attribute(self.class.yaffle_text_field, string.to_squawk)
    end
  end
end
 
ActiveRecord::Base.send :include, Yaffle

Again, we have the idiom of overriding include to pretend to be an extend, and a send where it is not needed. Identical functionality:

module Yaffle
  def acts_as_yaffle(options = {})
    cattr_accessor :yaffle_text_field
    self.yaffle_text_field = (options[:yaffle_text_field] || :last_squawk).to_s
    include InstanceMethods
  end
 
  module InstanceMethods
    def squawk(string)
      write_attribute(self.class.yaffle_text_field, string.to_squawk)
    end
  end
end
 
ActiveRecord::Base.extend Yaffle

Of course, it is also possible to do:

module Yaffle
  def squawk(string)
    write_attribute(self.class.yaffle_text_field, string.to_squawk)
  end
end
 
class ActiveRecord::Base
  def self.acts_as_yaffle(options = {})
    cattr_accessor :yaffle_text_field
    self.yaffle_text_field = (options[:yaffle_text_field] || :last_squawk).to_s
    include Yaffle
  end
end

Since the module is always included in ActiveRecord::Base, there is no reason that the earlier code, with its additional modules and use of extend, is superior to simply reopening the class and adding the acts_as_yaffle method directly. Now we can put the squawk method directly inside the Yaffle module, where it can be included cleanly.
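If you want to try this shape outside of Rails, the following sketch runs in plain Ruby. FakeRecord is a hypothetical stand-in for ActiveRecord::Base with just enough behavior to run the example, a singleton reader replaces cattr_accessor, and the "squawk!" prefix stands in for the guide's to_squawk string helper:

```ruby
# FakeRecord is a hypothetical stand-in for ActiveRecord::Base.
class FakeRecord
  def attributes
    @attributes ||= {}
  end

  def write_attribute(name, value)
    attributes[name.to_s] = value
  end

  def read_attribute(name)
    attributes[name.to_s]
  end
end

# Instance methods live directly in the module, ready for a plain include.
module Yaffle
  def squawk(string)
    # "squawk! ..." is a stand-in for string.to_squawk
    write_attribute(self.class.yaffle_text_field, "squawk! #{string}")
  end
end

# Reopen the base class and add the class-level macro directly.
class FakeRecord
  def self.acts_as_yaffle(options = {})
    # Plain-Ruby replacement for cattr_accessor (an assumption of this sketch):
    # a reader defined on the class that called acts_as_yaffle.
    field = (options[:yaffle_text_field] || :last_squawk).to_s
    define_singleton_method(:yaffle_text_field) { field }
    include Yaffle
  end
end

class Article < FakeRecord
  acts_as_yaffle :yaffle_text_field => :last_tweet
end

class Comment < FakeRecord
  acts_as_yaffle
end

article = Article.new
article.squawk("hello")
puts article.read_attribute(:last_tweet)
puts Comment.yaffle_text_field
```

There is no included hook, no ClassMethods module and no send anywhere; the only Ruby features involved are reopening a class and include.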

It may not seem like a huge deal, but it significantly reduces the amount of apparent magic in the plugin pattern, making it more accessible for new users. Additionally, it exposes the new user to include and extend quickly, instead of making them feel as though they were magic incantations requiring the use of send and special modules named ClassMethods in order to get them to work.

To be clear, I’m not saying that these idioms aren’t sometimes needed in special, advanced cases. However, I am saying that in the most common cases, they’re huge overkill that obscures the real functionality and confuses users.

Using the New Gem Bundler Today

As you might have heard, Carl and I released a new project that allows you to bundle your gems (both pure-ruby and native) with your application. Before I get into the process for using the bundler today, I’d like to go into the design goals of the project.

  • The bundler should allow the specification of all dependencies in a separate place from the application itself. In other words, it should be possible to determine the dependencies for an application without needing to start up the application.
  • The bundler should have a built-in dependency resolving mechanism, so it can determine the required gems for an entire set of dependencies.
  • Once the dependencies are resolved, it should be possible to get the application up and running on a new system without needing to check Rubyforge (or gemcutter) again. This is especially important for compiled gems (it should be possible to get the list of required gems once and compile on remote systems as desired).
  • Above all else, the bundler should provide a reproducible installation of Ruby applications. New gem releases or down remote servers should not be able to impact the successful installation of an application. In most cases, git clone; gem bundle should be all that is needed to get an application on a new system and up and running.
  • Finally, the bundler should not assume anything about Rails applications. While it should work flawlessly in the context of a Rails application, this should be because a Rails application is a Ruby application.

Using the Bundler Today

To use the gem bundler today in a non-Rails application, follow these steps:

  1. gem install bundler
  2. Create a Gemfile in the root of your application
  3. Add dependencies to your Gemfile. See below for more details on the sorts of things you can do in the Gemfile. At the simplest level, gem "gem_name", "version" will add a dependency of the gem and version to your application
  4. At the root, run gem bundle. The bundler should tell you that it is resolving dependencies, then downloading and installing the gems.
  5. Add vendor/gems/gems, vendor/gems/specifications, vendor/gems/doc, and vendor/gems/environment.rb to your .gitignore file
  6. Inside your application, require vendor/gems/environment.rb to add the bundled dependencies to your load paths.
  7. Use Bundler.require_env :optional_environment to actually require the files.
  8. After committing, run gem bundle in a fresh clone to re-expand the gems. Since you left the vendor/gems/cache in your source control, new machines will be guaranteed to use the same files as the original machine, requiring no remote dependencies

The bundler will also install binaries into the app’s bin directory. You can, therefore, run bin/rackup for instance, which will ensure that the local bundle, rather than the system, is used. You can also run gem exec rackup, which runs any command using the local bundle. This allows things like gem exec ruby -e "puts Nokogiri::VERSION" or the even more adventurous gem exec bash, which will open a new shell in the context of the bundle.

Gemfile

You can do any of the following in the Gemfile:

  • gem "name", "version": version may be a strict version or a version requirement like >= 1.0.6. The version is optional.
  • gem "name", "version", :require_as => "file": the require_as allows you to specify which file should be required when the require_env is called. By default, it is the gem’s name
  • gem "name", "version", :only => :testing: The environment name can be anything. It is used later in your require_env call. You may specify either :only, or :except constraints
  • gem "name", "version", :git => "git://github.com/wycats/thor": Specify a git repository to be used to satisfy the dependency. You must use a hard dependency ("1.0.6") rather than a soft dependency (">= 1.0.6"). If a .gemspec is found in the repository, it is used for further dependency lookup. If the repository has multiple .gemspecs, each directory with a .gemspec will be considered a gem.
  • gem "name", "version", :git => "git://github.com/wycats/thor", :branch => "experimental": Further specify a branch, tag, or ref to use. All of :branch, :tag, and :ref are valid options
  • gem "name", "version", :vendored_at => "vendor/nokogiri": In the next version of bundler, this option will be changing to :path. This specifies that the dependency can be found in the local file system, rather than remotely. It is resolved relative to the location of the Gemfile
  • clear_sources: Empties the list of gem sources to search inside of.
  • source "http://gems.github.com": Adds a gem source to the list of available gem sources.
  • bundle_path "vendor/my_gems": Changes the default location of bundled gems from vendor/gems
  • bin_path "my_executables": Changes the default location of the installed executables
  • disable_system_gems: Without this command, both bundled gems and system gems will be used. You can therefore have things like ruby-debug in your system and use it. However, it also means that you may be using something in development mode that is installed on your system but not available in production. For this reason, it is best to disable_system_gems
  • disable_rubygems: This completely disables rubygems, reducing startup times considerably. However, it often doesn’t work if libraries you are using depend on features of Rubygems. In this mode, the bundler shims the features of Rubygems that we know people are using, but it’s possible that someone is using a feature we’re unaware of. You are free to try disable_rubygems first, then remove it if it doesn’t work. Note that Rails 2.3 cannot be made to work in this mode
  • only :environment { gem "rails" }: You can use only or except in block mode to specify a number of gems at once
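Putting a few of these together, a Gemfile might look something like the sketch below. The gem names and version numbers are placeholders picked for illustration, and the block form of only is written with do/end here:

```ruby
# Gemfile (sketch; gem names and versions are illustrative placeholders)
source "http://gems.github.com"
bundle_path "vendor/bundler_gems"
disable_system_gems

gem "rack", "1.0.1"
gem "nokogiri", ">= 1.0.6"
gem "rspec", "1.2.9", :only => :testing
gem "thor", "0.11.7", :git => "git://github.com/wycats/thor"

only :testing do
  gem "webrat"
end
```

The :testing name is arbitrary; it is the same name you would later pass to Bundler.require_env.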

Bundler process

When you run gem bundle, a few things happen. First, the bundler attempts to resolve your list of dependencies against the gems you have already bundled. If they don’t resolve, the metadata for each specified source is fetched and the gems are downloaded. Next (either way), the bundler checks to see whether the downloaded gems are expanded. For any gem that is not yet expanded, the bundler expands it. Finally, the bundler creates the environment.rb file with the new settings. This means that running gem bundle over and over again will be extremely fast, because after the first time, all gems are downloaded and expanded. If you change settings, like disable_rubygems, running gem bundle again will simply regenerate the environment.rb.

Rails 2.3

To get this working with Rails 2.3, you need to create a preinitializer.rb and insert the following:

require "#{File.dirname(__FILE__)}/../vendor/bundler_gems/environment"
 
class Rails::Boot
  def run
    load_initializer
    extend_environment
    Rails::Initializer.run(:set_load_path)
  end
 
  def extend_environment
    Rails::Initializer.class_eval do
      old_load = instance_method(:load_environment)
      define_method(:load_environment) do
        Bundler.require_env RAILS_ENV
        old_load.bind(self).call
      end
    end
  end
end

It’s a bit ugly, but you can copy and paste that code and forget it. Astute readers will notice that we’re using vendor/bundler_gems/environment.rb. This is because Rails 2.3 attaches special, irrevocable meaning to vendor/gems. As a result, make sure to do the following in your Gemfile: bundle_path "vendor/bundler_gems".
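If the extend_environment method looks mysterious, it relies on a general Ruby technique: grab the existing method as an UnboundMethod, redefine it, and call the original through bind. Here is a plain-Ruby sketch of just that technique; FakeInitializer is a hypothetical class standing in for Rails::Initializer:

```ruby
# FakeInitializer is a hypothetical stand-in for Rails::Initializer.
class FakeInitializer
  def log
    @log ||= []
  end

  def load_environment
    log << "original load_environment"
  end
end

FakeInitializer.class_eval do
  # Capture the current implementation as an UnboundMethod...
  old_load = instance_method(:load_environment)

  # ...then redefine the method. The block closes over old_load, so the
  # original stays reachable even though its name now points elsewhere.
  define_method(:load_environment) do
    log << "extra step before the original"
    old_load.bind(self).call
  end
end

init = FakeInitializer.new
init.load_environment
puts init.log
```

In the preinitializer, the extra step is the Bundler.require_env call, which therefore runs just before Rails loads the environment file.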

Gemcutter uses this setup and it’s working great for them.

Bundler 0.7

We’re going to be releasing Bundler 0.7 tomorrow. It has some new features:

  • List outdated gems by passing --outdated-gems. Bundler conservatively does not update your gems simply because a new version came out that satisfies the requirement. This is so that you can be sure that the versions running on your local machine will make it safely to production. This will allow you to check for outdated gems so you can decide whether to update your gems with --update. Hat tip to manfred, who submitted this patch as part of his submission to the Rumble at Ruby en Rails
  • Specify the build requirements for gems in a YAML file that you specify with --build-options. The file looks something like this:
    mysql:
      config: /path/to/mysql_config

    This is equivalent to --with-mysql-config=/path/to/mysql_config

  • Specify :bundle => false to indicate that you want to use system gems for a particular dependency. This will ensure that it gets resolved correctly during dependency resolution but does not need to be included in the bundle
  • Support for multiple bundles containing multiple platforms. This is especially useful for people who move back and forth between Ruby 1.8 and 1.9 and don’t want to constantly have to nuke and rebundle
  • Deprecate :vendored_at and replace with :path
  • A new directory DSL method in the Gemfile:
    directory "vendor/rails" do
      gem "activesupport", :path => "activesupport" # :path is optional if it's identical to the gem name
                                                    # the version is optional if it can be determined from
                                                    # a gemspec
    end
  • You can do the same with the git DSL method
    git "git://github.com/rails/rails.git" do # :branch, :tag, or :ref work here
      gem "activesupport", :path => "activesupport" # same rules as directory, except that the files are
                                                    # first downloaded from git.
    end
  • Fix some bugs in resolving prerelease dependencies
