Yehuda Katz is a member of the Ruby on Rails core team, and lead developer of the Merb project. He is a member of the jQuery Core Team, and a core contributor to DataMapper. He contributes to many open source projects, like Rubinius and Johnson, and works on some he created himself, like Thor.

Archive for November, 2008

Stop Watching Sophie’s Choice (And Get Some Work Done)

I woke up this morning to Kirin Dave’s cantankerous rant about how Ruby’s going down the tubes. The whole post was a giant whine-fest, with the exception of the beginning, where he heaps faux past praise on Ruby to justify his complaints.

The entire post read kind of like Joe Lieberman supporting McCain: “I used to be a Democrat, but now I think Obama’s in love with terrorists.” Dave didn’t even really attempt to be even-handed in his critique (if I can even call it a critique); he just goes after the Ruby language, interpreter, and community with the full force of his giant… rhetoric.

Let’s take a look at Dave’s claims:

Claim 1. Ruby’s interpreter is so outdated, it’s impossible to write code without it leaking memory. As evidence, he points to a case encountered by Tom Preston-Werner in his God monitor library. What Dave fails to point out is that despite his claims that Ruby 1.9 is basically useless, the last post in the thread he references informs the readers that the bug is fixed in Ruby 1.9.

He also doesn’t inform the reader that many large web applications, including YellowPages.com, scribd.com, and hulu.com, not exactly tiny web sites, are running on infrastructures that depend on this “outdated” interpreter. While I’m not going to claim that the Ruby interpreter is all rainbows, it’s hardly the problem that Dave claims it is.

Dave also dismisses JRuby with a wave of his hand, using the argument that it’s not useful for web applications because it can’t be used for scripting (which Ruby is frequently used for). This is a classic example of a red herring fallacy. The existing “terrible” Ruby interpreter works just fine for scripting, where the supposed memory leaks aren’t an issue. As a result, dismissing JRuby, which solves all of his other deployment concerns, is just pure malarky.

Claim 2. Ruby is the slowest thing imaginable. I’ve tackled this argument with some vigor before, but suffice it to say that real-life Ruby applications must be compared against real-life PHP or Django applications, and they perform quite well. Even Rails, not exactly the fastest Ruby web framework, beats out CodeIgniter in Hello World benchmarks, and is dead even with CodeIgniter on more robust benchmarks. Comparing it with CakePHP, which is a closer feature-for-feature comparison, Rails completely destroys Cake in all benchmarks.

Merb, which takes more effort to avoid being slow, does significantly better than Rails, and beats CodeIgniter on hello world benches by around 5x. The reason for this is that despite Dave’s claims that the Ruby community is stagnant, some of the worst speed offenders have been handled by native extensions that provide speedups for the community without the community having to drop into C all the time. The most recent example: ThirdBase, a library to make Ruby’s date facilities an order of magnitude faster than the built-in Date class.

Because the Ruby interpreter has a fairly good C API, developers without commit access have been able to improve the speed of everything from web servers (who uses the stdlib’s webrick!?) to XML parsing (why use REXML when you can use Nokogiri and gain orders of magnitude in speed?). Real-life applications are simply not as slow as the benchmarks (which even Dave admits are not dispositive) would seem to imply. Real apps simply spend more time IO-waiting on things like databases or grabbing strings out of cache than in calculating the Fibonacci sequence.

Claim 3. Rubinius didn’t happen and JRuby… these are not the droids you’re looking for… Rubinius didn’t happen. Somehow, in all his bloviating, Dave didn’t really address JRuby at all. Let’s take a look at his hand-wave real quick:

jRuby is great, but I kind of liked the illusion of a small memory footprint–a lot of Ruby use is in scripting and starting up a “java -server” instance is not really desirable for that. 

So let’s take a look: JRuby has too large a memory footprint. Actually, not so much. When you take into consideration that you only need a single JVM instead of multiple processes, JRuby starts looking very competitive. In fact, when you start factoring in the impact of real threads on IO-wait, JRuby starts killing MRI.

I already addressed the other claim: Ruby is used in scripting and JRuby is thus unsuitable for it. For those cases, MRI works just fine. Even if we were to grant Dave his claim of unbeatable memory problems, scripts can handle a slow 200k leak every 10,000 seconds (the big gotcha memory leak). As a result of all the effort around alternative implementations (and spearheaded by Rubinius), JRuby and MRI are remarkably compatible. Developing on MRI (for fast startup times) and deploying on JRuby (for a rock-solid memory footprint and great concurrency handling) is viable by design.

In essence “There are no good choices for a Ruby interpreter” is straight-up not true.

Claim 4. Ruby’s future is… yawn. Dave takes a look at the new features in Ruby, and based on the future of the Ruby language, decides that Ruby itself is booooring. He makes one tiny mistake when he claims that better FFI was only in alternative implementations: the aforementioned C API made it possible for wmeissner of the JRuby team to roll out a 100% compatible version of Rubinius’ FFI (which they have now adopted) for Ruby 1.8.

Probably the most misleading part of this claim is ignoring all the work being done on Ruby libraries. I can be competitive with Rails from time to time, but even I’m not going to claim that Rails is no longer producing good work. And Merb is fairly well acknowledged for innovation around these parts. Dave says:

And what do I have to look forward to in the Ruby world? Essentially the same promise that I heard in late 2005, “Christmas will bring a faster ruby.” 

This assumes that the only interesting things happening around a language are related to its core interpreter or the core language. While Ruby 1.9 does address some of the complaints that he raised, and JRuby flat-out resolves them, Dave’s fundamental mistake is here. By assuming that the only way the community around a language can get better is by improving its core, he is ignoring reality: there are tons of exciting developments around Ruby that are being pushed by its users.

Dave closes by saying:

People will leave trying to stake out The Next Big Thing™, bleeding the Ruby talent pool dry and making it even less capable of recovering from these bad fortunes.

I predict the opposite. The Ruby talent pool will continue to grow, and we will continue to produce new, innovative, and exciting projects. JRuby will continue to improve and Rubinius will be released, sparking a new Ruby speed war that will lead to the faster Ruby that so dominates Dave’s dissatisfaction. And the Ruby community will continue to learn from other languages, like Python and Erlang. It’s only going to get better from here.

Merb 1.0.2

We’re releasing Merb 1.0.2 today, which is just a small number of patches that address some of the most urgent issues:

  • Some of the built-in rake tasks were deleting some gems that were in the specs as fixtures. This only affected people building Merb from the git sources, and is now resolved.
  • The Merb source code hardcoded certain versions in multiple places. We have removed another such case, and created Merb::DM_VERSION to specify what version of DataMapper the merb stack shipped with. We now use that constant when generating the merb gemspec.
  • Merb was truncating the log file with each start. It now opens the file in append mode.
  • Many people reported that they accidentally overrode a core Merb::Controller method in their controllers, and received only an obscure warning. Merb now raises an error when you attempt to override a non-overridable method in your controller. If you want to override something anyway, call override! :your_method before doing so.
  • merb -i was silently failing for users who did not have webrat installed. We now print a warning and allow merb -i to load. Webrat is required for some advanced features that allow you to emulate the browser in the console.
  • An issue with numeric routes (ticket 1036) is now resolved.
  • An issue with cookies (ticket 1022) is now resolved.

In case people were wondering, we are working on 1.0.x in parallel with 1.1. The 1.0.x series will continue to provide bugfixes for the 1.0 release, while 1.1 will add additional features in the march toward 2.0. You can get the nightlies of both the latest 1.0.x release and the 1.1 release at the edge.merbivore.com gem server.

Merb 1.0 Spec change

When we released 1.0, we also copied the existing 1.0 specs into a directory marked spec10 and added a rake task called spec:oneoh. This allows us to make sure that new versions of Merb still run against the same API as Merb 1.0, and that we’re not accidentally breaking working 1.0 APIs (to the extent that our 1.0 specs can make such an assertion).

The rule is that those specs must not be changed, with one exception. Specifically, we’re allowed to modify the 1.0 specs if the spec itself broke for reasons unrelated to the thing it was attempting to spec, and the breakage does not indicate a breakage in the 1.0 API. Additionally, any such change requires a public announcement, to minimize the number of such changes.

As a result, I’m announcing a small change in spec10, and the rationale for that change.

Many people have accidentally overridden core Merb::Controller methods in their controllers, with unpredictable (and confusing) results. From my Rails days, I remember this happening as well. As a result, Merb, starting with 1.0.2 and 1.1, will raise an error if you attempt to override a Merb::Controller method.
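
As a rough illustration of how a guard like this can work, here is a minimal sketch built on Ruby's method_added hook. It is not Merb's actual implementation: GuardedController, ReservedMethodError, and the bookkeeping behind override! are hypothetical names chosen for the example.

```ruby
# Hypothetical sketch of an override guard; not Merb's real code.
class GuardedController
  class ReservedMethodError < StandardError; end

  # Names a subclass has explicitly allowed itself to override.
  def self.overridable
    @overridable ||= []
  end

  def self.override!(*names)
    overridable.concat(names.map(&:to_sym))
  end

  # Ruby calls this hook each time an instance method is defined.
  def self.method_added(name)
    super
    return if equal?(GuardedController)   # the base class may define anything
    return if overridable.include?(name)  # the subclass opted in explicitly
    return unless GuardedController.method_defined?(name)

    raise ReservedMethodError,
          "#{name} is a core method; call override! :#{name} first"
  end

  # Stand-in for a core controller method.
  def render
    "core render"
  end
end

# Opting in first makes the override legal.
class Posts < GuardedController
  override! :render

  def render
    "custom render"
  end
end
```

Because the check compares against every public method the base class responds to, it also catches names inherited from Object, which is why an action called "method" would trip a guard of this kind.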

One of our tests had a controller which used the “method” action. Of course, “method” is already a method on all objects, so our new code caused it to raise an error. However, the test was not asserting that it was legal to override “method”; it was the same sort of accident that caused us to add this feature in the first place. As a result, we have gone into the spec10 directory and modified the action name to “request_method” instead of “method”, allowing 1.0.2 to pass the 1.0 specs.

And that requires this public post.

I hope that these posts will be few and far between, but at the very least, they show a commitment to the public API from the Merb core team that I hope you find reassuring. Thanks!

MythBusting — We Agree! Ruby is Awesome!

With all the disagreement over the past few days, it might seem like Merb and Rails are worlds apart. But David’s latest post demonstrates what I’ve been saying all along: we’re more alike than different.

In the latest installment, he takes on the myth that “Rails is hard because of Ruby”. Effectively, a bunch of people are comfortable in their language of choice (PHP, Java, Perl) and would rather switch over to an MVC clone in that language than learn the big bad scary Ruby.

As David succinctly argued, Ruby is just not that hard a language to learn. It’s better organized than PHP (with its absurdly large global namespace) and less ceremonious than Java (no IStrategyBuilderFactories here).

And what’s fantastic about Ruby is how quickly and simply new programmers are exposed to advanced concepts like lambdas. Because iteration is accomplished in Ruby almost exclusively with blocks, it’s near impossible to spend even a day in Ruby without learning what blocks are. Spend a little more time with Ruby, and the power of the closures that come along with lambdas becomes obvious. All without the need for an extensive study of the CS benefits of the construct.
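
For readers who have not seen it, a small sketch of what that looks like in practice. The variable names are made up for the example; everything used is plain Ruby.

```ruby
# Iteration with a block: the block can read and update `total`,
# a local variable from the surrounding scope.
total = 0
[1, 2, 3].each { |n| total += n }

# A lambda that returns another lambda. The inner lambda keeps its own
# `count` alive between calls, which is the closure at work.
make_counter = lambda do
  count = 0
  lambda { count += 1 }
end

counter = make_counter.call
first  = counter.call
second = counter.call

puts total   # prints 6
puts first   # prints 1
puts second  # prints 2
```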

Ruby is so easy, in fact, that the slim Learn to Program tome is written using Ruby as a base. It’s so straightforward that my wife, who’s hardly a programmer, was writing a program to count down “99 bottles of beer on the wall” within a few hours of picking up the book (and was even embellishing the program with a few interactive touches :-D )

Finally, implicit in the “I don’t want Rails because it’s Ruby” claim is the oft-heard myth that Ruby is slow. As I explained in my keynote at MerbCamp, Ruby does very well even compared with raw PHP. But when you compare Ruby frameworks (even Rails) against CakePHP, Rails destroys the competition. And Merb does even better. That’s because PHP’s fundamental architecture does not play very well with large frameworks, while Ruby deployment options can manage a fairly large framework with very small runtime impact.

What I showed at MerbCamp was that it was possible to squeeze up to 4,000 requests per second through the Merb stack, or around 1,500 requests per second when going through a controller and template, but that frameworks like CakePHP got only about 100 requests per second, even with a code accelerator.

Bottom line: don’t let anyone tell you that Ruby web applications need be slow. The language itself is certainly slow, but I don’t see a ton of Fibonacci web applications being built, so the real question is about where the bottlenecks are, and Ruby acquits itself very well.

MythBusting — Rails is not a monolith

Continuing his interesting train of thought, DHH posted on his blog yesterday that Rails is, in fact, not monolithic. In reality, he says, it is quite modular. In the post, he finally fully articulated the rationale behind the excessive use of alias_method_chain in Rails, which he says is to keep the code even more “modular”.

Let’s take a look at some of the claims:

Rails is not actually that large

They count all lines including comments and whitespace in Ruby files, thus punishing well-documented and formatted code

As I said yesterday, I did not actually include comments and whitespace. I specifically provided the command that I used, which removed whitespace-only lines and comment-only lines. In fact, the comment by bitsweat (a member of the Rails Core team) which started this back-and-forth erroneously included comments in Merb’s count, when merb-core has about 1 line of comment per line of code. This is why Jeremy incorrectly thought that merb-core was over 15,000 lines of code.

They count tests, thus punishing well-tested code

Neither Jeremy nor I counted tests. I’m not sure which LOC-count he’s referring to.

They count bundled dependencies, thus punishing dependency-free code

I did in fact make this mistake, but I disagree with the assertion that bundling dependencies makes code more “modular”. For example, bundling xml-simple in Rails causes a conflict with gems that require a newer version of xml-simple than the one bundled with Rails. Conversely, merb-haml has a haml dependency, which means that users can use newer versions of Haml that are released between Merb releases.

This statement by David actually teases out a fundamental difference of opinion between Rails and Merb. In effect, Rails prefers to bundle everything to reduce dependencies, while Merb prefers to use the existing Rubygems system so that applications can use different versions of the “bundled” dependencies.
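
To make the contrast concrete, a gem that declares a dependency rather than bundling it does so in its gemspec. Below is a minimal sketch using the standard RubyGems specification API. The gem name, author, version numbers, and version constraint are invented for illustration and are not merb-haml's actual values.

```ruby
require "rubygems"

# Hypothetical plugin that depends on Haml instead of shipping a copy of it.
spec = Gem::Specification.new do |s|
  s.name    = "example-haml-plugin"
  s.version = "0.1.0"
  s.summary = "Illustrative plugin that depends on Haml rather than bundling it"
  s.authors = ["Example Author"]

  # Any Haml release at or above this (assumed) version satisfies the plugin,
  # so users can pick up newer Haml releases without a new plugin release.
  s.add_dependency "haml", ">= 2.0.0"
end

dep = spec.dependencies.first
newer_ok = dep.requirement.satisfied_by?(Gem::Version.new("2.0.4"))
older_ok = dep.requirement.satisfied_by?(Gem::Version.new("1.8.0"))
```

With a declaration like this, RubyGems can report a version conflict when the gems are activated instead of letting a stale bundled copy shadow the newer one.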

Additionally, we don’t want to be maintaining bitrotted versions of things like tmail. We’d prefer to rely on gems created by experts in their niche who can maintain, and more importantly, fix bugs in, their code. We fully admit that our approach pushes the limits of Rubygems, but that has forced us to work with rubygems to improve a core piece of Ruby infrastructure.

Rails is actually pretty modular

The arguments made here almost defy reason, but let’s go through some of them.

First, Rails can include almost as much or as little of the six major pieces as you prefer.

Absolutely. I’ve referred to this in the past as the Lego vs. Duplo philosophy. Rails has added in a feature to allow you to remove entire blocks of functionality, but isn’t built on an architecture that lets you granularly opt-out. Granular opt-out allows you to reuse foundational code without buying into the full set of opinionated defaults.

One example of this is our auth system, which allows you to reuse the base auth code, which simply allows you to define strategies inside of a framework, even if you don’t want to use our built-in strategies or login views.

Granular opt-out builds a community around chunks of code that can be swapped in; having an auth core makes it easy to share small, simple authentication strategies between users of Merb.
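
The shape of that idea can be sketched in a few lines. This is a generic illustration only: AuthCore, Strategy, run!, and TokenStrategy are hypothetical names, and none of this is merb-auth's real API.

```ruby
# Hypothetical sketch of an auth core with pluggable strategies.
module AuthCore
  # Base class: a strategy inspects request params and returns a user or nil.
  class Strategy
    attr_reader :params

    def initialize(params)
      @params = params
    end

    def run!
      raise NotImplementedError, "strategies must implement run!"
    end
  end

  def self.strategies
    @strategies ||= []
  end

  def self.register(strategy_class)
    strategies << strategy_class
  end

  # Try each registered strategy in order; the first non-nil result wins.
  def self.authenticate(params)
    strategies.each do |strategy_class|
      user = strategy_class.new(params).run!
      return user if user
    end
    nil
  end
end

# A small strategy someone could share without touching the core.
class TokenStrategy < AuthCore::Strategy
  TOKENS = { "s3cret" => "alice" }.freeze

  def run!
    TOKENS[params[:token]]
  end
end

AuthCore.register(TokenStrategy)
```

The core knows nothing about tokens, passwords, or login views; an application that rejects every built-in strategy can still reuse the registry and the lookup loop.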

The next part is the part that makes me incredulous. According to David, because Rails is spread across many files, it is “modular”. He describes how you would go about granularly removing certain features:

All these optional parts can actually very easily be turned off as well, if you so please. If you look at actionpack/lib/action_controller.rb, you’ll see something like the following:

ActionController::Base.class_eval do

  include ActionController::Flash
  include ActionController::Benchmarking
  include ActionController::Caching
  ...

This is where all the optional bits are being mixed into Action Pack. But they didn’t need to be. If you really wanted to, you could just edit this 1 file and remove the optional bits you didn’t need and you’d have some 3,500 lines of optional goodies to pick from.

Read that carefully. If you want to opt out of certain parts of Rails, you need to:

  1. Read the source, and figure out which files contain the features you want
  2. Figure out where the modules in question have been mixed in
  3. Fork Rails
  4. Modify your own personal version of Rails to remove modules that have been mixed in
  5. Never upgrade Rails again (or upgrade Rails and hope they haven’t made any changes to the part of the file you’ve modified)

Fundamentally, Rails’ “modular” architecture is fine for core committers, but it’s not particularly useful for consumers of the framework. Which makes sense, since 37signals sees Rails primarily as a library that they use to write their apps. So thinking about modularity in terms of the code of the framework itself makes perfect sense from that perspective.

On the other hand, Merb looks at modularity from the perspective of the developer using the framework.

alias_method_chain makes Rails modular

As I’ve said many times before, I don’t like alias_method_chain. But Rails’ philosophy around it is a perfect example of its problems. Superficially, it seems like it divides up responsibilities neatly into their own modules. And again, this is perfectly true from the perspective of developers working on the framework itself.

For consumers of the framework, it is simply maddening. Here’s a code snippet David provided in his post:

module Benchmarking
  def self.included(base)
    base.extend(ClassMethods)

    base.class_eval do
      alias_method_chain :perform_action, :benchmark
      alias_method_chain :render, :benchmark
    end
  end

This is an extremely common idiom in Rails. Unfortunately, it makes Rails methods extremely opaque. It is nearly impossible, without reading through a dozen files and putting together the puzzle, to figure out what the perform_action method actually does. This is evident in a Rails stack trace, which includes close to 10 frames for different parts of perform_action.

And this alias_method_chain isn’t even greppable. Effectively, it’s up to the user to divine that including a module modifies methods in the class, since it’s done via metaprogramming in an included hook. Again, this all makes perfect sense for developers on the Rails framework. But for consumers of the framework, it leads to many frustrating days trying to track down all the pieces of a particular method.
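
To see why the result is hard to trace, it helps to spell out roughly what one alias_method_chain call does. The sketch below uses only plain Ruby's alias_method so that it runs without ActiveSupport. The Controller class and the elapsed reader are stand-ins invented for the example, not Rails code.

```ruby
# Stand-in for a framework class with a method to be wrapped.
class Controller
  def perform_action
    "action"
  end
end

module Benchmarking
  attr_reader :elapsed

  def self.included(base)
    base.class_eval do
      # Roughly what `alias_method_chain :perform_action, :benchmark` does:
      # keep the old body under a new name, then point the original name
      # at the wrapped version.
      alias_method :perform_action_without_benchmark, :perform_action
      alias_method :perform_action, :perform_action_with_benchmark
    end
  end

  def perform_action_with_benchmark
    started = Time.now
    result = perform_action_without_benchmark
    @elapsed = Time.now - started
    result
  end
end

class Controller
  include Benchmarking
end

controller = Controller.new
result = controller.perform_action
```

Nothing in the Controller source mentions benchmarking, yet perform_action now runs the wrapper first. Stack several modules this way and each one adds another _with_x and _without_x pair to the class, and another frame to the stack trace.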

But why bother?

Again, this question makes perfect sense coming from the perspective of using Rails as an internal library. Don’t need something, like pagination? Simply move it out of the core framework. From the perspective of trying to develop something that can be used for many different purposes, in many different situations, we have different priorities.

Merb has already been used as the base for SproutCore, which generates static HTML out of a Merb base. It has been used for sinatra-like web services. And it’s being used by several large companies who don’t want to use DataMapper (and are using Sequel instead).

As people use Merb for more kinds of things, decisions like hardcoding features of Merb to a particular ORM (like Rails has done), requiring a certain directory structure, or bundling in dependencies become much harder to justify. People using Merb actually do use other gems that conflict with Rails’ bundled dependencies. People using Merb want to be able to upgrade their versions of Haml, ParseTree, or MailFactory without expensive surgery to the framework itself.

For the moment, these differences are the reason that Rails will continue to dominate amongst developers seeking to build apps similar in scope to apps built by 37signals. I suspect that Merb will pick up steam amongst developers looking to build innovative apps leveraging the latest and greatest Ruby techniques and libraries.

I, for one, spent several years building large Rails applications, and am happy to leave the “just keep your own frozen, patched copy of Rails” philosophy behind.

UPDATE: Just to be clear, I know that Rails tries to use more recent versions of gems if they’re available on the system. However, that is a very naïve approach to dependencies that, in our experience, produced issues. It’s much better to be warned that you have dependency conflicts up front than to have them manifest in running production code.

MythBusting — Merb Style

The Rails team has lately been out in full force trying to bust some myths that seem to have come from the Merb community. In a reddit comment, in attempting to rebut a claim that Merb was significantly smaller than Rails, bitsweat made this “myth-busting” claim:

The merb stack is about 30KLOC; merb-core is half that. Rails minus Active Record is 46KLOC.

That seemed wrong to me, since I’ve recently clocked in merb-core at just 7,000 lines of code, and I know that merb-more plugins are relatively small, so I decided to do some counting.

My tool of choice: find lib -name "*.rb" | xargs cat |grep -v "^[[:space:]]*$" |grep -v "^[[:space:]]*\#" |wc -l

It may be relatively coarse, but it should be close enough to the actual figure to get us some real numbers. In case you need help there, I’m getting all Ruby files in the lib directory, catting them, then removing lines that are just whitespace or begin with “#” (comments).
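
For anyone who would rather not parse the shell pipeline, the same count can be sketched in Ruby. The method name count_loc is made up for this example, and like the pipeline it is a coarse measure: a line with code followed by a trailing comment still counts as code.

```ruby
# Count lines in every .rb file under `dir`, skipping lines that are
# blank or contain only a comment. Mirrors the two `grep -v` filters.
def count_loc(dir)
  Dir.glob(File.join(dir, "**", "*.rb")).sum do |path|
    File.foreach(path).count do |line|
      line !~ /\A\s*\z/ && line !~ /\A\s*#/
    end
  end
end

# Usage, run from the root of a gem checkout:
#   puts count_loc("lib")
```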

Drumroll please.

merb-core clocks in at just 6,944. If you add in all of merb-more but auth, exceptions, and slices (features that Rails doesn’t have), that “bloats” us up to 13,322. If you add in those plugins (which clock in at a “massive” 2,282 LOC) we’re up to 15,604. Finally, throw in extlib (our “clone” of ActiveSupport) and we’re up to 17,291.

Next stop: Rails.

As per bitsweat, we did not count ActiveRecord. We also didn’t count ActiveResource or ActiveModel, which don’t have analogues in Merb.

Starting with humble action-pack, it clocks in at a “tiny” 12,126 LOC. action-mailer clocks in at a “minuscule” 6,405 LOC (take that, merb-mailer, with your bloated 216 LOC). ActiveSupport clocks in at 23,217 LOC. I wonder what they had to leave out to keep it so small compared to the extlib “clone”.

And finally, railties, a veritable smorgasbord of Rails extras, clocks in at 5,447 LOC.

merb-core + merb-more + extlib clocks in at 17,291. Rails clocks in at 47,195.

As you can see, it’s a tie.

Oh, and I forgot to mention that Merb contains rack handlers for mongrel, thin, ebb, fcgi, cgi, webrick, etc. which clock in at over 700 LOC. In Rails-land, that code is contained in the individual gems (such as the mongrel or thin gem).

UPDATE: When you remove the LOC that’s the result of bundled gems, it’s still a tie, with Rails clocking in at 23,923 and Merb clocking in at 15,009. See my Nov. 15 post for a more exhaustive treatment of modularity.

Merb RC5 (Final RC!)

Today, we’re releasing the final RC before 1.0 final ships at RubyConf. Again, the vast majority of the changes were bugfixes, and there were some concrete improvements as well:

  • The Merb spec suite was modified to run correctly on JRuby and Windows. All specs pass :)
  • Merb action-args now has support for JRuby. Unfortunately, it doesn’t work with actions defined via define_method (this will be reported to JRuby for a fix).
  • Webrat support is included out of the box with Merb. It currently requires the HEAD version of webrat, but we’re told they’ll be releasing a new version at RubyConf. See below for some webrat examples.
  • We now have a nightly gem server, hosted at http://edge.merbivore.com. At the moment, we release the next version every night (for instance, last night we released 0.9.13), which means you need to manually remove the merb gems before updating. Soon, we’ll be releasing 0.9.13.xxxx, so you’ll be able to run gem cleanup to remove old nightlies.
  • The goal with the nightly server is for people to use nightlies for riding edge as opposed to trying to install out of git itself. Trying to install out of git has its share of issues compared with installing from rubygems. To install from the remote gem server, either do gem install merb --source http://edge.merbivore.com --source http://gems.rubyforge.org or add edge.merbivore.com to your gem sources via gem source --add.
  • We put some work into the bundling system (as I previewed last week). To use bundling, simply make sure you have a gems directory in your application, then run thor merb:gem:install merb. This will install merb and all its dependencies, as well as binary scripts to work with the gems in Merb.root/gems. To run Merb, you run bin/merb. Note that you’ll probably need to install mongrel as well, via thor merb:gem:install mongrel if you want to use the bare merb command.
  • The view helpers were converted to use nokogiri and fallback on rexml when nokogiri is not available. have_selector now will take a CSS3 compatible selector, as opposed to the somewhat buggy and non-compliant hpricot selector.
  • have_tag and match_tag convert the attributes to a CSS selector in the background. We left have_tag and match_tag in for backward compatibility, although it is advisable to use have_selector over have_tag.
  • It is possible that have_tag / match_tag do not behave exactly the same way as before, since a few bugs in the logic were fixed by moving over to a nokogiri back end.
  • As a result of this work, we have now marked all test methods with their API level, which should be locked as of 1.0.

We’re still on track for 1.0 final at RubyConf. The remaining major tasks are:
  • fully vet the test suite so that it can be used moving forward for all 1.x.x releases
  • work on the merb server to handle graceful shutdown of worker threads, as well as --restart to encapsulate the idiom of startup=>graceful kill.
  • go through the remaining tickets on LightHouse and fix them or defer to 1.x