Yehuda Katz is a member of the Ember.js, Ruby on Rails and jQuery Core Teams; he spends his daytime hours at the startup he founded, Tilde Inc. Yehuda is co-author of the best-selling jQuery in Action and Rails 3 in Action. He spends most of his time hacking on open source (his main projects include Thor, Handlebars and Janus) or traveling the world doing evangelism work. He can be found on Twitter as @wycats and on GitHub.

The Building Blocks of Ruby

When showing off cool features of Ruby to the uninitiated (or to a language sparring partner), the excited Rubyist often shows off Ruby’s “powerful block syntax”. Unfortunately, the Rubyist uses “powerful block syntax” as shorthand for a number of features that the Pythonista or Javaist simply has no context for.

To start, we usually point at Rake, RSpec or Sinatra as examples of awesome usage of block syntax:

get "/hello" do
  "Hello World"
end

In response, Pythonistas usually point to these syntaxes as roughly equivalent:

@get('/hi')
def hello():
  return "Hello World"
 
def hello() -> "/hi":
  return "Hello World"

While the Python versions may not be quite as pretty, nothing about them screams “Ruby has much stronger capabilities here”. Instead, by using examples like Sinatra, Rubyists trade in an argument about great semantic power for one about superficial beauty.

Rubyists, Pythonistas and others working on web development share a common language in JavaScript. When describing blocks to “outsiders” who share a common knowledge of JavaScript, we tend to point at JavaScript functions as a close analogue. Unfortunately, this only furthers the confusion.

On the Ruby side, when PHP or Java announces that they’re “adding closures”, many of us don’t stop to ask “what kind of closures?”

Cut to the Chase

Let’s cut to the chase and use a better example of the utility of Ruby blocks.

def append(location, data)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?
 
  File.open(path, "a") do |file|
    file.puts YAML.dump(data)
  end
 
  return data
end

Here, the File.open method takes a block. It then opens a new file (in “append” mode), and yields the open file to the block. When the block completes, Ruby closes the file. Except that Ruby doesn’t just close the file when the block completes; it guarantees that the File will be closed, even if executing the block results in a raise. Let’s take a look at the implementation of File.open in Rubinius:

def self.open(*args)
  io = new *args
 
  return io unless block_given?
 
  begin
    yield io
  ensure
    begin
      io.close unless io.closed?
    rescue StandardError
      # nothing, just swallow them.
    end
  end
end

This means that you can wrap up idioms like pervasive try/catch/finally in methods.

# Without blocks
def append(location, data)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?
 
  # Open before the begin so that ensure never sees an unassigned file
  file = File.open(path, "a")
  begin
    file.puts YAML.dump(data)
  ensure
    file.close
  end
 
  return data
end

Because Ruby runs ensure clauses even when the exception happened in a block, programmers can reliably ensure that Ruby executes teardown logic hidden away in abstractions.
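
As a minimal sketch of that guarantee, here is a method that hides its own teardown behind a block. The Connection class and the with_connection helper are hypothetical names invented for illustration; they are not part of Ruby or any library.

```ruby
# Hypothetical resource with an explicit teardown step.
class Connection
  def initialize
    @closed = false
  end

  def close
    @closed = true
  end

  def closed?
    @closed
  end
end

# Hypothetical helper: creates the resource, yields it to the block,
# and closes it in an ensure clause whether or not the block raises.
def with_connection
  connection = Connection.new
  begin
    yield connection
  ensure
    connection.close
  end
end

leaked = nil
error_message = nil

begin
  with_connection do |connection|
    leaked = connection   # keep a reference so we can inspect it afterwards
    raise "boom"          # simulate a failure in the middle of the block
  end
rescue RuntimeError => e
  error_message = e.message
end
```

After this runs, the exception has still propagated to the caller, yet leaked.closed? is true: the caller never had to write the ensure clause itself.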

This example only demonstrates the power of well-designed lambdas. With the addition of one small additional feature, Ruby’s blocks become something altogether different.

def write(location, data)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?
 
  File.open(path, "w") do |file|
    return false if Digest::MD5.hexdigest(file.read) == data.hash
    file.puts YAML.dump(data)
  end
 
  return true
end

In the above case, imagine that writing the data to disk is quite expensive, and we can skip writing if the MD5 hash of the file’s contents matches the value returned by a hash method on the data. Here, we’ll return false if the method did not write to disk, and true if the method did.

Ruby’s blocks support non-local-return, which means that a return from the block behaves identically to returning from the block’s original context. In this case, returning from inside the block returns from the write method, but Ruby will still run the ensure block closing the file.
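
A small self-contained sketch may make this concrete. The with_cleanup and first_even methods are hypothetical names used only for this illustration; the log array exists just to record that the ensure clause ran.

```ruby
# Hypothetical helper: yields to its block and records that cleanup ran.
def with_cleanup(log)
  yield
ensure
  log << :cleanup
end

def first_even(numbers, log)
  with_cleanup(log) do
    numbers.each do |n|
      return n if n.even?   # returns from first_even, not just from the block
    end
  end
  nil                       # reached only when no block returned early
end

log = []
found   = first_even([1, 3, 4, 5], log)
missing = first_even([1, 3], log)
```

The return inside the nested blocks unwinds through both each and with_cleanup, and the ensure clause in with_cleanup runs on the way out in both calls.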

You can think of non-local-return as behaving something like:

def write(location, data)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?
 
  File.open(path, "w") do |file|
    raise Return.new(false) if Digest::MD5.hexdigest(file.read) == data.hash
    file.puts YAML.dump(data)
  end
 
  return true
rescue Return => e
  return e.object
end

where Return is an exception class with a single object attribute, such as a StandardError subclass that stores the value passed to new.
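
To see the emulation actually run, here is a sketch that avoids the file system. Return is written out as a StandardError subclass, and each_item and first_negative are hypothetical methods made up for this example.

```ruby
# Exception used purely to carry a value out of the block.
class Return < StandardError
  attr_reader :object

  def initialize(object)
    super("non-local return")
    @object = object
  end
end

# Hypothetical block-taking method standing in for File.open.
def each_item(items)
  items.each { |item| yield item }
end

def first_negative(items)
  each_item(items) do |item|
    # Emulated non-local return: unwind the stack with an exception...
    raise Return.new(item) if item < 0
  end
  nil
rescue Return => e
  # ...and turn it back into an ordinary return value here.
  e.object
end

hit  = first_negative([3, -2, 5])
miss = first_negative([1, 2])
```

With real non-local return, the raise line becomes a plain return item if item < 0 and the rescue clause disappears.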

Of course, any reasonable lambda implementation will support this, but Ruby’s version has the benefit of feeling just like a normal return, and requiring much less chrome to achieve it. It also behaves well in scenarios that already use rescue or ensure, avoiding mind-warping combinations.

Further, Ruby also supports super inside of blocks. Imagine the write method was defined on a subclass of a simpler class whose write method took the raw data from the file and printed it to a log.

def write(location, data)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?
 
  File.open(path, "w") do |file|
    file_data = file.read
    super(location, file_data)
    return false if Digest::MD5.hexdigest(file_data) == data.hash
    file.puts YAML.dump(data)
  end
 
  return true
end
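
The same behavior can be checked in isolation with a tiny sketch. Greeter and LoudGreeter are hypothetical classes invented for this example.

```ruby
class Greeter
  def greet(name)
    "Hello, #{name}"
  end
end

class LoudGreeter < Greeter
  def greet(name)
    # super is called from inside a block, yet it still dispatches to
    # Greeter#greet, exactly as it would in the method body itself.
    [name].map { |n| super(n).upcase }.first
  end
end

greeting = LoudGreeter.new.greet("ruby")
```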

In a purer lambda scenario, we would need to store a reference to self, then use that reference inside the lambda:

def write(location, data)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?
 
  this = self
  File.open(path, "w") do |file|
    file_data = file.read
 
    # imaginary Ruby construct that would be needed without
    # non-local-super
    this.super.write(location, file_data)
    raise Return.new(false) if Digest::MD5.hexdigest(file_data) == data.hash
    file.puts YAML.dump(data)
  end
 
  return true
rescue Return => e
  return e.object
end

You can also yield to a method’s block inside a block. Imagine that the write method is called with a block that chooses the correct data to use based on whether the file is executable:

def write(location)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?
 
  File.open(path, "w") do |file|
    file_data = file.read
    super(location)
    data = yield file
    return false if Digest::MD5.hexdigest(file_data) == data.hash
    file.puts YAML.dump(data)
  end
 
  return true
end

This would be called via:

write("/path/to/file") do |file|
  if file.executable?
    "#!/usr/bin/env ruby\nputs 'Hello World!'"
  else
    "Hello World!"
  end
end
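
Stripped of the file handling, yielding to the method’s block from inside another block looks like the sketch below. The each_pair_sum method is a hypothetical name used only for illustration.

```ruby
def each_pair_sum(pairs)
  pairs.each do |a, b|
    # This yield runs inside the block passed to each, but it still
    # calls the block that was given to each_pair_sum.
    yield a + b
  end
end

sums = []
each_pair_sum([[1, 2], [3, 4]]) { |sum| sums << sum }
```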

In a pure-lambda language, we would take the block in as a normal argument to the function, then call it inside the closure:

def write(location, block)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?
 
  this = self
  File.open(path, "w") do |file|
    file_data = file.read
 
    # imaginary Ruby construct that would be needed without
    # non-local-super
    this.super.write(location, file_data)
    data = block.call(file)
    raise Return.new(false) if Digest::MD5.hexdigest(file_data) == data.hash
    file.puts YAML.dump(data)
  end
 
  return true
rescue Return => e
  return e.object
end

The real benefit of Ruby’s approach comes from the fact that the code inside the block would be identical if the method did not take a block. Consider the identical method, except taking a File instead of a location:

def write(file)
  file_data = file.read
  super(file)
  data = yield file
  return false if Digest::MD5.hexdigest(file_data) == data.hash
  file.puts YAML.dump(data)
  return true
end

Without the block, the Ruby code looks exactly the same. This means that Ruby programmers can more easily abstract out repeated patterns into methods that take blocks without having to rewrite a bunch of code. It also means that using a block does not interrupt the normal flow of code, and it’s possible to create new “control flow” constructs that behave almost identically to built-in control flow constructs like if and while.
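
As one possible illustration of such a home-grown control flow construct, here is a sketch of a retry loop. The with_retries and fetch_value methods are hypothetical and exist only for this example.

```ruby
# Hypothetical construct: run the block, retrying up to `attempts` times
# when it raises, and re-raise the error once the attempts are used up.
def with_retries(attempts)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StandardError
    retry if tries < attempts
    raise
  end
end

def fetch_value
  with_retries(3) do |attempt|
    raise "flaky" if attempt < 3          # fail on the first two attempts
    return "ok on attempt #{attempt}"     # returns from fetch_value itself
  end
end

result = fetch_value
```

The body of the block reads the same as it would inside a built-in loop, including the early return.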

Rails uses this to good effect with respond_to, which provides convenient syntax for declaring content negotiation:

def index
  @people = Person.find(:all)
 
  respond_to do |format|
    format.html # default action is render
    format.xml { render :xml => @people.to_xml }
  end
end

Because of the way Ruby blocks work, you can also return from any of the format blocks:

def index
  @people = Person.find(:all)
 
  respond_to do |format|
    format.html { redirect_to(person_path(@people.first)) and return }
    format.xml  { render :xml => @people.to_xml }
    format.json { render :json => @people.to_json }
  end
 
  session[:web_service] = true
end

Here, we returned from the HTML format after redirecting, allowing us to take additional action (setting a :web_service key on the session) for other cases (XML and JSON mime types).

Keep in mind that the code above is a demonstration of a number of features of Ruby’s blocks. It’s very rare to see return, yield and super all used in a single block. That said, Ruby programmers commonly use one or more of these constructs inside blocks, because their usage is seamless.

So Why Are Ruby’s Blocks Better?

If you made it this far, let’s take a look at another use of blocks in Ruby: mutex synchronization.

Java supports synchronization via a special synchronized keyword:

class Example {
  final Object lock = new Object();
 
  void example() {
    synchronized(lock) {
      // do dangerous stuff here
    }
  }
}

Essentially, Java provides a special construct for expressing the idea that it should run a block of code in only one thread at a time for a given instance of the synchronization object. Because Java provides a special construct, you can return from inside the synchronization block, and the Java runtime does the appropriate things.

Similarly, Python required the use of try/finally until Python 2.5, when they added a special language feature to handle the try/finally idiom:

import threading

lock = threading.Lock()

class Example:
  # old
  def example(self):
    lock.acquire()
    try:
      pass # ... access shared resource
    finally:
      lock.release() # release lock, no matter what
 
  # new
  def example(self):
    with lock:
      pass # ... access shared resource

In Python 2.5’s case, the object passed to with must implement a special protocol (including __enter__ and __exit__ methods), so the with statement cannot be used like Ruby’s general-purpose, lightweight blocks.

Ruby represents the same concept using a method that takes a block:

class Example
  @@lock = Mutex.new
 
  def example
    @@lock.synchronize do
      # do dangerous stuff here
    end
  end
end

Importantly, synchronize is a normal Ruby method. The original version, written in pure Ruby, looks like this:

def synchronize
  lock
  begin
    yield
  ensure
    unlock
  end
end

It has all the hallmarks of what we’ve discussed so far. It locks, yields to the block, and ensures that the lock will be released. This means that if a Ruby programmer returns from inside the block, synchronize will behave correctly.
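
A short sketch of returning from inside synchronize, using Ruby’s built-in Mutex. The Counter class and its increment_unless_max method are hypothetical names chosen for this example.

```ruby
class Counter
  attr_reader :count

  def initialize
    @lock = Mutex.new
    @count = 0
  end

  # Returns false without incrementing once the maximum is reached.
  def increment_unless_max(max)
    @lock.synchronize do
      # Early return from the method; synchronize still releases the lock.
      return false if @count >= max
      @count += 1
    end
    true
  end

  def locked?
    @lock.locked?
  end
end

counter = Counter.new
first  = counter.increment_unless_max(1)
second = counter.increment_unless_max(1)
```

After the early return in the second call, counter.locked? is false, so later callers are not blocked.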

This example demonstrates the key power of Ruby’s blocks: they can easily replace language constructs. In this case, a Ruby programmer can take unsafe code, plop it inside a synchronization block, and it will continue to work.

Postscript

I’ve historically written my posts without very many links, mostly out of a fear of links going out of date. I’ve received increasing requests for more annotations in my posts, so I’ll start doing that. Let me know if you think my annotations in this post were useful, and feel free to give me any suggestions on that front that you find useful.

22 Responses to “The Building Blocks of Ruby”

Great Article. I thought the source code was great.

Great article. However, I’ve been looking for an article that compares Rails and Django, for the purposes of creating a webapp (i.e. focus on user-generated content, along with some content). Almost every post I’ve come across says “it’s up to religion” and leaves it at that. But for me, I have no preference between Ruby and Python. Ruby has nice things like blocks, but on the other hand, Rails seems to use a lot of “magic” behind the scenes. Anyway, the point is, you’re good at writing comparisons. Can you do an entry on Rails vs. Django?

When you are using Pathname, you can translate:

File.open(path, "a") do |file|
  # …
end

to just:

path.open("a") do |file|
  # …
end

Thanks for the good article.

@wycats I would also love to read an article about “magic” in Rails source. Of course after the article @Idris has asked for ;)

Great stuff!

Something to be aware of with non-local-return: The block must have access to the context you want to return from.

def a
  yield
end
a { return 0 } # => LocalJumpError: unexpected return

def c
  yield
end

def b
  c { return 1 }
end
b # => 1

def d
  lambda { return 2 }.call
end
d # => 2

Great post. Blocks are just one of the great things I enjoy with Ruby. :D

Thanks for the article. My only criticism would be that you use file handling and tear-down logic as the example of blocks providing “much stronger capabilities”, then go on to describe the exact language feature of Python that does the same thing in your synchronization example, namely the “with” statement – still waiting for the compelling argument here. It’s also worth pointing out that this has been around for almost half a decade and isn’t some new fandangled feature as is implied by referencing the version number it was officially released with.

Keep in mind that the different ways to make closures (Proc.new, proc, lambda) aren’t always equivalent:

http://innig.net/software/ruby/closures-in-ruby.rb

Great post, and references are always a good thing. Even if the links fail with time, they’ll be useful until they do.

I love this! Great article, awesome comparisons. I too would love to see a Rails/Django comparison post from you. :)

My point is that in order to achieve the same functionality, other languages hardcode solutions for narrow cases. I didn’t mean to imply it was newfangled — I was using similar verbiage to that found in the official documentation.

Great article Yehuda, the examples are awesome

but python’s with statement is not anywhere near as narrow as you make it out to be. in fact, if you use it with a file object then it behaves almost exactly as your File.open example earlier, complete with try/finally boilerplate removal, non-local returns, and expected super behavior. a great example of the flexibility of the with-protocol is in the standard library decorator contextlib.contextmanager:

@contextlib.contextmanager
def manager(*args, **kwargs):
  # some setup work
  try:
    yield object # a single yield in a contextmanager sends control out
  finally:
    pass # cleanup here - "finally" ensures this will run

# ...

with manager(arg, kwarg=None) as object:
  pass # this block is protected in try/finally so the rest of 'manager' gets run

it is also worth pointing out that the contextlib standard library module is pure-python, so there is no other “hardcoded solutions for narrow cases”, everything you see there is enabled by the simple with-protocol.

Thank you for the great writeup.

Most Rubyists (I guess many of your readers are) are much more familiar with features that exist in Ruby 1.8 than 1.9. Maybe you can show some great usage of new features in 1.9 too?

Great article… I always wanted to write something like this myself about blocks but you’ve done a way better job than I could’ve. The points about Java needing a “synchronized” keyword to do something any Rubyist can accomplish with blocks (passed to their own methods) is a really powerful example of what Ruby affords you that other languages don’t, especially when the syntax is practically identical.

Loving ruby’s blocks. Thanks for the article.

For “Rails vs Django”, there are three things Rails is the winner:

1) Convention over Configuration.
Django does not have whole lot configuration, just one settings.py file. So Django is a framework with “Easy Configuration,” not “Convention over Configuration.”

2) REST
Rails really embraces REST. The seven-action controller is awesome. Django doesn’t have restful resource/route built-in. Once you go REST, you won’t go back.

3) The Rails Eco System
There are Rails plugins for everything. Plus, there is commercial support, books, blogs, screencasts and hosting for Rails. Django really lags behind.

the biggest weakness about Ruby is it is interpreted.

how does someone protect his intellectual property (i.e code) if you are writing commercial applications (for the sake of discussion, let’s pretend a “word” application ) with it? you need to supply the code as part of your application deployment to your client, don’t you ?

now don’t come to us saying we need to encrypt and decrypt and install in a temp folder while running, this is too slow of an approach, and is still open to be broken into even by a junior level person.

your thoughts ?

yehuda,

Ever played with Rebol ?

I used to use Ruby (with fox/fx library) . I tried Rebol in December 2009, it looks cooler than Ruby. I like their philosophy too, “let’s do it in 1Mb not in 200 Mb” , and they fight against software complexity and bloatware

view layout [ label " type your name" field button "click" ]

gives you a screen with a label , an empty field and a button to click.

how many lines of code does it take to do that in Java, .Net or Ruby?

its full environment is under 1 Mb. amazing isn’t it? Compare that to Ruby’s environment.

Here’s a really old blog post – http://t-a-w.blogspot.com/2007/05/syntactic-tradeoffs-in-functional.html

The point of Ruby blocks is that even in highly functional languages which don’t have special syntax for 1-lambda case, virtually all (99.3% in the sample) of functions take 0 or 1 lambda. 2+ lambda functions almost never happen.

Optimizing syntax for the most common case is a huge win.

As mentioned above, you’ve misrepresented Python’s equivalent feature. The contextmanager allows you to turn any function into something that takes a block, similar to Ruby. It is one extra line of code to turn a function into a contextmanager, which seems pretty “lightweight” to me.

I am a Java Developer and I have decided to learn Ruby/Rails to increase my productivity :)
I am really amazed about the power of Ruby and I think it will take quite some time and practice to get me acquainted with it.
Thanks for your article, I will use it as a reference when I will start encountering strange bugs (I hope I will not need it that much :) ).
BTW, I have just written an article on my blog about the features of Ruby: http://a-developer-life.blogspot.com/2011/04/programming-in-ruby-features-you-must.html#more

In your `raise Return new` would it be better to use ruby throw/catch instead of raise/rescue?

There’s an aesthetic argument that exceptions should be used for exceptional circumstances, not flow control; the little-used throw/catch are for flow control if you need that kind of flow control. There’s also apparently a performance argument, in at least some ruby interpreters (like, apparently jruby especially) raise/rescue are ridiculously expensive, and throw/catch are not. (The expense doesn’t matter that much when you’re using them for actual exceptional conditions, but when used for flow control, such that they’re going to be called all the time, possibly in an ‘inner loop’…)

People come to wycats for good best practices (rightfully!), so I am surprised to see raise/rescue recommended for non-exceptional flow control.

Wouldn’t this example actually blow away the file by opening it for writing and then result in some kind of IO error when you try to subsequently read from a file opened for write?
