Yehuda Katz is a member of the Ember.js, Ruby on Rails and jQuery Core Teams; he spends his daytime hours at the startup he founded, Tilde Inc. Yehuda is co-author of the best-selling jQuery in Action and Rails 3 in Action. He spends most of his time hacking on open source (his main projects, along with others like Thor, Handlebars and Janus) or traveling the world doing evangelism work. He can be found on Twitter as @wycats.
How to Marshal Procs Using Rubinius
November 19th, 2011
The primary reason I enjoy working with Rubinius is that it exposes, to Ruby, much of the internal machinery that controls the runtime semantics of the language. Further, it exposes that machinery primarily in order to enable user-facing semantics that are typically implemented in the host language (C for MRI, C and C++ for MacRuby, Java for JRuby) to be implemented in Ruby itself.
There is, of course, quite a bit of low-level functionality in Rubinius implemented in C++, but a surprising number of things are implemented in pure Ruby.
One example is the Binding object. To create a new binding in Rubinius, you call Binding.setup:
    def self.setup(variables, code, static_scope, recv=nil)
      bind = allocate()
      bind.self = recv || variables.self
      bind.variables = variables
      bind.code = code
      bind.static_scope = static_scope
      return bind
    end
This method takes a number of more primitive constructs, which I will explain as this article progresses, but we can describe the constructs that make up the high-level Ruby Binding in pure Ruby.
In fact, Rubinius implements Kernel#binding itself in terms of Binding.setup:
    def binding
      return Binding.setup(
        Rubinius::VariableScope.of_sender,
        Rubinius::CompiledMethod.of_sender,
        Rubinius::StaticScope.of_sender,
        self)
    end
Yes, you’re reading that right. Rubinius exposes the ability to extract the constructs that make up a binding, one at a time, from a caller’s scope. And this is not just a hack (like Binding.of_caller for a short time in MRI). It’s core to how Rubinius manages eval, which of course makes heavy use of bindings.
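To make the relationship between bindings and eval concrete, here is a minimal sketch that uses only the standard Kernel#binding and Binding API, so it should run on any recent Ruby, not just Rubinius. The method name make_counter_binding is made up for illustration.

```ruby
# Hypothetical helper: returns the binding of its own method body.
def make_counter_binding
  count = 41
  binding # captures self and the local variables in scope at this point
end

captured = make_counter_binding

# The local variable outlives the method call because the binding holds it.
puts captured.local_variable_get(:count)  # prints 41

# eval runs new code against the captured scope.
puts eval("count + 1", captured)          # prints 42

# The binding also remembers which object was self.
puts captured.receiver.inspect            # prints main
```

The binding bundles exactly the pieces that Binding.setup receives: a receiver, a set of local variables, and the code context they belong to.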
For a while, I have wanted the ability to Marshal.dump a proc in Ruby. MRI has historically disallowed it, but there’s nothing conceptually impossible about it. A proc itself is a blob of executable code, a local variable scope (which is just a bunch of pointers to other objects), and a constant lookup scope. Rubinius exposes each of these constructs to Ruby, so Marshaling a proc simply means figuring out how to Marshal each of these constructs.
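As a quick illustration of the starting point, the sketch below shows a proc carrying its local variable scope around with it, and what happens when you hand that proc to Marshal.dump on a stock MRI. The helper names build_adder and dump_error_class are invented for this example; the TypeError is MRI behavior and would differ on a runtime where procs have been made dumpable.

```ruby
# Hypothetical helper: builds a proc that closes over a local variable.
def build_adder
  offset = 10
  proc { |n| n + offset }
end

# Hypothetical helper: returns the class of the error raised by
# Marshal.dump, or nil if the object dumped successfully.
def dump_error_class(callable)
  Marshal.dump(callable)
  nil
rescue TypeError => e
  e.class
end

adder = build_adder
puts adder.call(5)                              # prints 15
puts adder.binding.local_variable_get(:offset)  # prints 10

# On MRI, procs cannot be dumped.
puts dump_error_class(adder).inspect            # prints TypeError on MRI
```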
Let’s take a quick detour to learn about the constructs in question.
Rubinius represents Ruby’s constant lookup scope as a Rubinius::StaticScope object. Perhaps the easiest way to understand it would be to look at Ruby’s built-in Module.nesting method:
    module Foo
      p Module.nesting

      module Bar
        p Module.nesting
      end
    end

    module Foo::Bar
      p Module.nesting
    end

    # Output:
    # [Foo]
    # [Foo::Bar, Foo]
    # [Foo::Bar]
Every execution context in Rubinius has a Rubinius::StaticScope, which may optionally have a parent scope. In general, the top static scope (the static scope with no parent) in any execution context is Object.
Because Rubinius allows us to get the static scope of a calling method, we can implement Module.nesting in Rubinius:
    def nesting
      scope = Rubinius::StaticScope.of_sender
      nesting = []

      # Walk up the parent chain until we reach the top-level scope
      while scope and scope.module != Object
        nesting << scope.module
        scope = scope.parent
      end

      nesting
    end
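The reason this nesting matters is that Ruby resolves constants lexically, so the same module can see different constants depending on how it was opened. Here is a small sketch in plain Ruby; NestingDemo, SECRET, and the two lookup methods are names I made up for the example.

```ruby
module NestingDemo
  SECRET = :found_lexically

  module Inner
    # Lexical scope here is [NestingDemo::Inner, NestingDemo],
    # so SECRET is found by walking outward to NestingDemo.
    def self.nested_lookup
      SECRET
    end
  end
end

module NestingDemo::Inner
  # Lexical scope here is only [NestingDemo::Inner],
  # so NestingDemo is never consulted and the lookup fails.
  def self.flat_lookup
    SECRET
  rescue NameError => e
    e.class
  end
end

p NestingDemo::Inner.nested_lookup  # prints :found_lexically
p NestingDemo::Inner.flat_lookup    # prints NameError
```

This is why a dumped proc has to carry its constant lookup scope along with it: without that chain, constant references inside the block could resolve differently after reloading.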
A static scope also has an additional property called current_module, which is used during class_eval to define which module the runtime should add new methods to.
Adding Marshal.dump support to a static scope is therefore quite easy:
    class Rubinius::StaticScope
      def marshal_dump
        [@module, @current_module, @parent]
      end

      def marshal_load(array)
        @module, @current_module, @parent = array
      end
    end
These three instance variables are defined as Rubinius slots, which means that they are fully accessible to Ruby as instance variables, but don’t show up in the instance_variables list. As a result, we need to explicitly dump the instance variables that we care about and reload them later.
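The marshal_dump and marshal_load hooks are part of Ruby's standard Marshal protocol, so the same pattern can be tried on any Ruby. The sketch below uses a made-up DemoScope class as a stand-in for a static scope: it chooses explicitly which state to dump, and leaves a transient cache out.

```ruby
# Hypothetical stand-in for a scope object with a parent chain.
class DemoScope
  attr_reader :mod_name, :parent, :cache

  def initialize(mod_name, parent = nil)
    @mod_name = mod_name
    @parent = parent
    @cache = {} # transient state we do not want in the dump
  end

  # Return only the state that should be serialized.
  def marshal_dump
    [@mod_name, @parent]
  end

  # Rebuild the object from the array produced by marshal_dump.
  def marshal_load(array)
    @mod_name, @parent = array
    @cache = {}
  end
end

inner = DemoScope.new("Foo::Bar", DemoScope.new("Foo"))
inner.cache[:hits] = 3

copy = Marshal.load(Marshal.dump(inner))
puts copy.mod_name         # prints Foo::Bar
puts copy.parent.mod_name  # prints Foo
p copy.cache               # prints {}
```

Because the parent is itself a DemoScope, Marshal recurses into it using the same hooks, which is the behavior the StaticScope version relies on for its @parent chain.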
A compiled method holds the information necessary to execute a blob of Ruby code. Some important parts of a compiled method are its instruction sequence (a list of the compiled instructions for the code), a list of any literals it has access to, names of local variables, its method signature, and a number of other important characteristics.
It’s actually quite a complex structure, but Rubinius already knows how to convert an in-memory CompiledMethod into a String, as it writes compiled Ruby code out to compiled files as part of its normal operation. There is one small caveat: the String form that Rubinius uses for a compiled method does not include its static scope, so we will need to include the static scope separately in the marshaled form. Since we already told Rubinius how to marshal a static scope, this is easy.
    class Rubinius::CompiledMethod
      def _dump(depth)
        Marshal.dump([@scope, Rubinius::CompiledFile::Marshal.new.marshal(self)])
      end

      def self._load(string)
        scope, dump = Marshal.load(string)
        cm = Rubinius::CompiledFile::Marshal.new.unmarshal(dump)
        cm.scope = scope
        cm
      end
    end
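The _dump and self._load pair is Ruby's other standard Marshal hook: _dump returns a String and the class-level _load rebuilds the object from it. As a rough sketch of the same shape in plain Ruby, the made-up DemoMethod class below has a serializer (to_blob and from_blob, both hypothetical) that ignores one attribute, so _dump packs that attribute alongside the blob and _load reattaches it.

```ruby
# Hypothetical stand-in for an object with a partial serializer.
class DemoMethod
  attr_reader :name, :scope

  def initialize(name, scope = nil)
    @name = name
    @scope = scope
  end

  # Hypothetical existing serializer: knows the name, ignores the scope.
  def to_blob
    @name.to_s
  end

  def self.from_blob(blob)
    new(blob.to_sym)
  end

  # Marshal hook: must return a String. We nest a second Marshal.dump
  # to carry the scope next to the blob.
  def _dump(depth)
    Marshal.dump([@scope, to_blob])
  end

  # Marshal hook: rebuild from the String produced by _dump.
  def self._load(string)
    scope, blob = Marshal.load(string)
    restored = from_blob(blob)
    restored.instance_variable_set(:@scope, scope)
    restored
  end
end

original = DemoMethod.new(:greet, ["Foo::Bar", "Foo"])
copy = Marshal.load(Marshal.dump(original))
p copy.name   # prints :greet
p copy.scope  # prints ["Foo::Bar", "Foo"]
```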
A variable scope represents the state of the current execution context. It contains all of the local variables in the current scope, the execution context currently in scope, the current self, and several other characteristics.
I wrote about the variable scope before. It’s one of my favorite Rubinius constructs, because it provides a ton of useful runtime information to Ruby that is usually locked away inside the native implementation.
Dumping and loading the VariableScope is also easy:
    class VariableScope
      def _dump(depth)
        Marshal.dump([@method, @module, @parent, @self, nil, locals])
      end

      def self._load(string)
        VariableScope.synthesize(*Marshal.load(string))
      end
    end
The synthesize method is new to Rubinius master; creating a new variable scope previously required synthesizing its locals using class_eval, and the new method is better.
A Proc is basically nothing but a wrapper around a Rubinius::BlockEnvironment, which wraps up all of the objects we’ve been working with so far. Its scope attribute is a VariableScope and its code attribute is a CompiledMethod. Dumping it should be quite familiar by now.
    class BlockEnvironment
      def marshal_dump
        [@scope, @code]
      end

      def marshal_load(array)
        scope, code = *array
        under_context scope, code
      end
    end
The only thing new here is the under_context method, which gives a BlockEnvironment its variable scope and compiled method. Note that we dumped the static scope along with the compiled method above.
Finally, a Proc is just a wrapper around a BlockEnvironment, so dumping it is easy:
    class Proc
      def _dump(depth)
        Marshal.dump(@block)
      end

      def self._load(string)
        block = Marshal.load(string)
        self.__from_block__(block)
      end
    end
The __from_block__ method constructs a new Proc from a BlockEnvironment.
So there you have it: dumping and reloading Proc objects in pure Ruby using Rubinius! (The full source is at https://gist.github.com/1378518.)