The Building Blocks of Ruby
When showing off cool features of Ruby to the uninitiated (or to a language sparring partner), the excited Rubyist often shows off Ruby's "powerful block syntax". Unfortunately, the Rubyist uses "powerful block syntax" as shorthand for a number of features that the Pythonista or Javaist simply has no context for.
To start, we usually point at Rake, RSpec or Sinatra as examples of awesome usage of block syntax:
get "/hello" do
  "Hello World"
end
In response, Pythonistas usually point to these syntaxes as roughly equivalent:
@get('/hi')
def hello():
    return "Hello World"

def hello() -> "/hi":
    return "Hello World"
While the Python versions may not be quite as pretty, nothing about them screams "Ruby has much stronger capabilities here". Instead, by using examples like Sinatra, Rubyists trade in an argument about great semantic power for one about superficial beauty.
Rubyists, Pythonistas and others working on web development share a common language in JavaScript. When describing blocks to "outsiders" who share a common knowledge of JavaScript, we tend to point at JavaScript functions as a close analogue. Unfortunately, this only furthers the confusion.
On the Ruby side, when PHP or Java announces that they're "adding closures", many of us don't stop to ask "what kind of closures?"
Cut to the Chase
Let's cut to the chase and use a better example of the utility of Ruby blocks.
require "pathname"
require "yaml"

def append(location, data)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?

  File.open(path, "a") do |file|
    file.puts YAML.dump(data)
  end

  return data
end
Here, the File.open method takes a block. It opens the file (in "append" mode) and yields the open file to the block. When the block completes, Ruby closes the file. More than that, Ruby guarantees that the file will be closed even if executing the block raises an exception. Let's take a look at the implementation of File.open in Rubinius:
def self.open(*args)
  io = new *args

  return io unless block_given?

  begin
    yield io
  ensure
    begin
      io.close unless io.closed?
    rescue StandardError
      # nothing, just swallow them.
    end
  end
end
This means that you can wrap up idioms like pervasive try/catch/finally in methods.
# Without blocks
def append(location, data)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?

  # Open before `begin`: if the open itself fails, there is nothing to close
  file = File.open(path, "a")
  begin
    file.puts YAML.dump(data)
  ensure
    file.close
  end

  return data
end
Because Ruby runs ensure clauses even when the exception happened in a block, programmers can reliably ensure that Ruby executes teardown logic hidden away in abstractions.
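As a minimal sketch of the same idea outside of File, here is a method that hides its teardown behind a block. The Connection class and the with_connection method are hypothetical names invented for this illustration; they are not part of any library.

```ruby
# Hypothetical resource, used only for illustration.
class Connection
  attr_reader :closed

  def initialize
    @closed = false
  end

  def close
    @closed = true
  end
end

# Yields a fresh connection and always closes it afterwards,
# whether the block finishes normally or raises.
def with_connection
  conn = Connection.new
  begin
    yield conn
  ensure
    conn.close
  end
end

seen = nil
begin
  with_connection do |conn|
    seen = conn
    raise "something went wrong"
  end
rescue RuntimeError => e
  puts "rescued: #{e.message}"
end

puts seen.closed # the teardown ran despite the exception
```

The caller never writes begin/ensure; the abstraction owns it.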
So far, this only demonstrates the power of well-designed lambdas. With one small additional feature, Ruby's blocks become something altogether different.
def write(location, data)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?

  # "r+" rather than "w": "w" would truncate the file before we could read it
  File.open(path, "r+") do |file|
    return false if Digest::MD5.hexdigest(file.read) == data.hash
    file.rewind
    file.truncate(0)
    file.puts YAML.dump(data)
  end

  return true
end
In the above case, imagine that writing the data to disk is quite expensive, and we can skip writing if the MD5 hash of the file's contents matches the value returned by a hash method on the data. Here, we'll return false if the method did not write to disk, and true if the method did.
Ruby's blocks support non-local-return, which means that a return from the block behaves identically to returning from the block's original context. In this case, returning from inside the block returns from the write method, but Ruby will still run the ensure block closing the file.
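A small runnable sketch shows both halves of that claim. The with_cleanup and first_even methods are hypothetical names made up for this example; the log array is only there so we can observe that the ensure clause ran.

```ruby
# Runs the block and records that cleanup happened, no matter how
# the block was exited.
def with_cleanup(log)
  yield
ensure
  log << :cleanup
end

def first_even(numbers, log)
  with_cleanup(log) do
    numbers.each do |n|
      # Returns from first_even itself, not just from the block.
      return n if n.even?
    end
  end
  nil
end

log = []
result = first_even([1, 3, 4, 5], log)
puts result.inspect # 4
puts log.inspect    # [:cleanup]
```

The return crosses two block boundaries, and the ensure clause in with_cleanup still runs on the way out.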
You can think of non-local-return as behaving something like:
def write(location, data)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?

  File.open(path, "r+") do |file|
    raise Return.new(false) if Digest::MD5.hexdigest(file.read) == data.hash
    file.rewind
    file.truncate(0)
    file.puts YAML.dump(data)
  end

  return true
rescue Return => e
  return e.object
end
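where Return is a small exception class that carries the return value in an object attribute. (A bare Struct.new(:object) would not be enough, because raise only accepts exception classes and objects.)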
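To make the emulation concrete, here is a minimal runnable sketch using a real lambda, where a plain return would only leave the lambda. The Return class, each_item and first_negative are hypothetical names chosen for this illustration.

```ruby
# Hypothetical signal class: carries the value we want to return.
class Return < StandardError
  attr_reader :object

  def initialize(object)
    super("non-local return")
    @object = object
  end
end

# A lambda-style callee: it only knows how to call the function it is given.
def each_item(items, fn)
  items.each { |item| fn.call(item) }
end

def first_negative(items)
  # Inside a lambda, `return` would only exit the lambda, so we
  # raise a signal and rescue it in the enclosing method instead.
  each_item(items, lambda { |item| raise Return.new(item) if item < 0 })
  nil
rescue Return => e
  e.object
end

puts first_negative([3, -7, 2]).inspect # -7
puts first_negative([3, 2]).inspect     # nil
```

Every method that wants this behavior has to repeat the rescue clause, which is the "chrome" that Ruby's blocks let you skip.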
Of course, any reasonable lambda implementation will support this, but Ruby's version has the benefit of feeling just like a normal return, and requiring much less chrome to achieve it. It also behaves well in scenarios that already use rescue or ensure, avoiding mind-warping combinations.
Further, Ruby also supports super inside of blocks. Imagine the write method was defined on a subclass of a simpler class whose write method took the raw data from the file and printed it to a log.
def write(location, data)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?

  File.open(path, "r+") do |file|
    file_data = file.read
    super(location, file_data)
    return false if Digest::MD5.hexdigest(file_data) == data.hash
    file.rewind
    file.truncate(0)
    file.puts YAML.dump(data)
  end

  return true
end
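For a version you can run without touching the filesystem, here is a minimal sketch of super being called from inside a block. The Store and AuditedStore classes are hypothetical and exist only for this illustration.

```ruby
class Store
  def write(key, value)
    "stored #{key}=#{value}"
  end
end

class AuditedStore < Store
  def write(key, value)
    [:before, :after].map do |phase|
      # Even inside the block, super still means Store#write.
      "#{phase}: #{super(key, value)}"
    end
  end
end

puts AuditedStore.new.write("a", 1).inspect
# ["before: stored a=1", "after: stored a=1"]
```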
In a purer lambda scenario, we would need to store off a reference to self, then use that reference inside the lambda:
def write(location, data)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?

  this = self
  File.open(path, "r+") do |file|
    file_data = file.read

    # imaginary Ruby construct that would be needed without
    # non-local-super
    this.super.write(location, file_data)
    raise Return.new(false) if Digest::MD5.hexdigest(file_data) == data.hash
    file.rewind
    file.truncate(0)
    file.puts YAML.dump(data)
  end

  return true
rescue Return => e
  return e.object
end
You can also yield to a method's block inside a block. Imagine that the write method is called with a block that chooses the correct data to use based on whether the file is executable:
def write(location)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?

  File.open(path, "r+") do |file|
    file_data = file.read
    super(location)
    data = yield file
    return false if Digest::MD5.hexdigest(file_data) == data.hash
    file.rewind
    file.truncate(0)
    file.puts YAML.dump(data)
  end

  return true
end
This would be called via:
write("/path/to/file") do |file|
  if file.executable?
    "#!/usr/bin/env ruby\nputs 'Hello World!'"
  else
    "Hello World!"
  end
end
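Stripped of the file handling, the mechanism looks like this. It is a minimal sketch; with_banner and render_all are hypothetical names for this example, and the log array just records the order of events.

```ruby
# Wraps the block with a start marker and a guaranteed end marker.
def with_banner(log)
  log << "start"
  yield
ensure
  log << "end"
end

def render_all(items, log)
  with_banner(log) do
    # This yield sits inside two nested blocks, yet it still calls
    # the block that was passed to render_all.
    items.each { |item| log << yield(item) }
  end
  log
end

result = render_all([1, 2], []) { |n| "item #{n}" }
puts result.inspect # ["start", "item 1", "item 2", "end"]
```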
In a pure-lambda language, we would take the block in as a normal argument to the function, then call it inside the closure:
def write(location, block)
  path = Pathname.new(location)
  raise "Location does not exist" unless path.exist?

  this = self
  File.open(path, "r+") do |file|
    file_data = file.read

    # imaginary Ruby construct that would be needed without
    # non-local-super
    this.super.write(location, file_data)
    data = block.call(file)
    raise Return.new(false) if Digest::MD5.hexdigest(file_data) == data.hash
    file.rewind
    file.truncate(0)
    file.puts YAML.dump(data)
  end

  return true
rescue Return => e
  return e.object
end
The real benefit of Ruby's approach comes from the fact that the code inside the block would be identical if the method did not take a block. Consider the identical method, except taking a File instead of a location:
def write(file)
  file_data = file.read
  super(file)
  data = yield file
  return false if Digest::MD5.hexdigest(file_data) == data.hash
  file.rewind
  file.truncate(0)
  file.puts YAML.dump(data)
  return true
end
Without the block, the Ruby code looks exactly the same. This means that Ruby programmers can more easily abstract out repeated patterns into methods that take blocks without having to rewrite a bunch of code. It also means that using a block does not interrupt the normal flow of code, and it's possible to create new "control flow" constructs that behave almost identically to built-in control flow constructs like if and while.
Rails uses this to good effect with respond_to, which provides convenient syntax for declaring content negotiation:
def index
  @people = Person.find(:all)

  respond_to do |format|
    format.html # default action is render
    format.xml { render :xml => @people.to_xml }
  end
end
Because of the way Ruby blocks work, you can also return from any of the format blocks:
def index
  @people = Person.find(:all)

  respond_to do |format|
    format.html { redirect_to(person_path(@people.first)) and return }
    format.xml  { render :xml => @people.to_xml }
    format.json { render :json => @people.to_json }
  end

  session[:web_service] = true
end
Here, we returned from the HTML format after redirecting, allowing us to take additional action (setting a :web_service key on the session) for other cases (XML and JSON mime types).
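The same shape can be sketched in a few lines of plain Ruby. To be clear, this is not how Rails implements respond_to; the Responder class and the respond_to and index methods below are hypothetical stand-ins, written only to show that a return inside a format block leaves the whole action.

```ruby
# Hypothetical stand-in for content negotiation.
class Responder
  def initialize(wanted)
    @wanted = wanted
  end

  # Each format method runs its block only if that format was requested.
  def html
    yield if @wanted == :html
  end

  def json
    yield if @wanted == :json
  end
end

def respond_to(wanted)
  yield Responder.new(wanted)
end

def index(wanted, session)
  respond_to(wanted) do |format|
    format.html { return "redirect" }       # leaves index immediately
    format.json { session[:body] = "{}" }
  end

  session[:web_service] = true
  "rendered"
end

html_session = {}
json_session = {}
puts index(:html, html_session) # redirect
puts html_session.inspect       # {} (nothing after the return ran)
puts index(:json, json_session) # rendered
puts json_session[:web_service] # true
```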
Keep in mind that the code above is a demonstration of a number of features of Ruby's blocks. It's very rare to see return, yield and super all used in a single block. That said, Ruby programmers commonly use one or more of these constructs inside blocks, because their usage is seamless.
So Why Are Ruby's Blocks Better?
If you made it this far, let's take a look at another use of blocks in Ruby: mutex synchronization.
Java supports synchronization via a special synchronized keyword:
class Example {
  // Any object can serve as the monitor for a synchronized block
  final Object lock = new Object();

  void example() {
    synchronized (lock) {
      // do dangerous stuff here
    }
  }
}
Essentially, Java provides a special construct for expressing the idea that only one thread at a time should run a block of code for a given instance of the synchronization object. Because Java provides a special construct, you can return from inside the synchronized block, and the Java runtime does the appropriate thing and releases the lock.
Similarly, Python required the use of try/finally until Python 2.5, when they added a special language feature to handle the try/finally idiom:
class Example:
    # old
    def example(self):
        lock.acquire()
        try:
            ...  # access shared resource
        finally:
            lock.release()  # release lock, no matter what

    # new
    def example(self):
        with lock:
            ...  # access shared resource
In Python 2.5's case, the object passed to with must implement a special protocol (including __enter__ and __exit__ methods), so the with statement cannot be used like Ruby's general-purpose, lightweight blocks.
Ruby represents the same concept using a method that takes a block:
class Example
  @@lock = Mutex.new

  def example
    @@lock.synchronize do
      # do dangerous stuff here
    end
  end
end
Importantly, synchronize is a normal Ruby method. The original version, written in pure Ruby, looks like this:
def synchronize
  lock
  begin
    yield
  ensure
    unlock
  end
end
It has all the hallmarks of what we've discussed so far. It locks, yields to the block, and ensures that the lock will be released. This means that if a Ruby programmer returns from inside the block, synchronize will behave correctly.
This example demonstrates the key power of Ruby's blocks: they can easily replace language constructs. In this case, a Ruby programmer can take unsafe code, plop it inside a synchronization block, and it will continue to work.
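Here is a minimal sketch of that claim using Ruby's built-in Mutex. The Counter class and its method names are hypothetical, made up for this illustration; the point is only that an early return from inside synchronize leaves the mutex unlocked.

```ruby
class Counter
  def initialize
    @lock = Mutex.new
    @count = 0
  end

  def increment_unless_full(limit)
    @lock.synchronize do
      # Returns from increment_unless_full; synchronize's ensure
      # clause still releases the lock on the way out.
      return false if @count >= limit
      @count += 1
    end
    true
  end

  def locked?
    @lock.locked?
  end
end

counter = Counter.new
puts counter.increment_unless_full(1) # true
puts counter.increment_unless_full(1) # false (early return inside the block)
puts counter.locked?                  # false (the lock was released)
```

If the early return had leaked the lock, the next call to increment_unless_full from the same thread would raise a ThreadError instead of running.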
Postscript
I've historically written my posts without very many links, mostly out of a fear of links going out of date. I've received increasing requests for more annotations in my posts, so I'll start doing that. Let me know if you think my annotations in this post were useful, and feel free to give me any suggestions on that front that you find useful.