Yehuda Katz is a member of the Ruby on Rails core team, and lead developer of the Merb project. He is a member of the jQuery Core Team, and a core contributor to DataMapper. He contributes to many open source projects, like Rubinius and Johnson, and works on some he created himself, like Thor.
New Rails Isolation Testing
July 1st, 2009
A little while ago, Carl and I started digging into Rails’ initializer. We have already made a number of improvements, such as adding the ability to add a new initializer at any step in the process, and making it possible to have multiple initializers in a single process. The second improvement is the first step toward running multiple Rails apps in a single process, which requires moving all global Rails state into instances of objects, so each application can have its own contained configuration in its own object. More on this in the next few weeks.
As I detailed on the Engine Yard blog this week, when moving into a new area to refactor, it’s important to make sure that there are very good tests. Although the Rails initializer tests covered a fair amount of area, successfully getting the tests to pass did not guarantee that Rails booted. Thankfully, Sam Ruby’s tests were comprehensive enough to get us through the initial hump.
After making the initial change, we went back to see what we could do to improve the test suite. The biggest problem was a problem we’d already encountered in Merb: you can’t uninitialize Rails. Once you’ve run through the initialization process, many of the things that happen are permanent.
Our solution, which we committed to master today, is to create a new test mixin that runs each test case in its own process. Getting it working on OSX wasn’t trivial, but it was pretty elegant once we got down to it. All we did was override the run method on TestCase to fork before actually running the test. The child then runs the test (and makes whatever invasive changes it needs to), and communicates any methods that were called on the Test::Unit result object back to the parent.
The parent then replays those methods, which means that as far as the parent is concerned, all of the cases are part of a single suite, even though they are being run in a separate process. Figuring out what parts of Test::Unit to hook into took all of yesterday afternoon, but once we were done, it was only about 40 lines of code.
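The fork-and-replay idea can be sketched roughly as follows. This is a minimal illustration rather than the code committed to Rails: RecordingResult, run_in_isolation and replay are hypothetical names, the two recorded methods stand in for whatever the real result object receives, and it assumes a platform where Kernel#fork is available.

```ruby
# Hypothetical stand-in for the result object: it only records the calls made on it.
class RecordingResult
  attr_reader :calls

  def initialize
    @calls = []
  end

  def add_run
    @calls << [:add_run, []]
  end

  def add_failure(message)
    @calls << [:add_failure, [message]]
  end
end

# Run the block in a forked child and return the calls it recorded.
def run_in_isolation
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode

  pid = fork do
    reader.close
    proxy = RecordingResult.new
    yield proxy
    writer.write(Marshal.dump(proxy.calls))
    writer.close
    exit!(0) # leave without running at_exit hooks inherited from the parent
  end

  writer.close
  calls = Marshal.load(reader.read)
  reader.close
  Process.wait(pid)
  calls
end

# Replay the child's calls on the parent's result object.
def replay(calls, result)
  calls.each { |name, args| result.public_send(name, *args) }
end
```

Whatever invasive changes the block makes stay in the child; the parent only ever sees the list of recorded calls.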
Today, we tackled getting the same module to work in environments that don’t support forking, like JRuby and Windows. Unfortunately, these environments are going to run these tests significantly more slowly, because they have to boot up a full process for each test case, where the forking version can simply use the setup already done in the parent process (which makes it almost as fast as running all the tests in a single process).
The solution was to emulate forking by shelling out to a new process that was identical to the one that was just launched, but with an extra constraint on the test name (basically, booting up the test suite multiple times, but each run only runs a single test). The subprocess then communicates back to the launching process using the same protocol as in the forking model, which means that we only had to change the code that ran the tests in isolation; everything else remains the same.
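A rough sketch of the shell-out variant, under stated assumptions: CHILD_SCRIPT and run_in_subprocess are hypothetical, and the child here is a tiny inline program instead of a relaunch of the original test file. The only parts taken from the description above are the ISOLATION_TEST environment variable and the idea of sending results back to the launching process.

```ruby
require "rbconfig"

# Hypothetical child program. A real suite would relaunch the original test
# file; this one just reports which test it was asked to run.
CHILD_SCRIPT = <<~'RUBY'
  calls = [[:add_run, [ENV["ISOLATION_TEST"]]]]
  $stdout.binmode
  $stdout.write(Marshal.dump(calls))
RUBY

# Boot a fresh Ruby process constrained to one test and read its results back.
def run_in_subprocess(test_name)
  env = { "ISOLATION_TEST" => test_name }
  output = IO.popen([env, RbConfig.ruby, "-e", CHILD_SCRIPT], "rb", &:read)
  Marshal.load(output)
end
```

Because the child speaks the same marshalled-calls format as a forked child would, the parent-side replay code does not need to know which strategy produced the results.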
There was one final caveat, however. It turns out that in Test::Unit, using a combination of -t to specify the test case class and -n to specify the test case name doesn’t work. Test::Unit’s semantics are to include any test for which ANY of the appropriate filters match. I’m not proud of this, but what we did was a small monkey-patch of the Test::Unit collector in the subprocess only which does the right thing:
    # Only in subprocess for windows / jruby.
    if ENV['ISOLATION_TEST']
      require "test/unit/collector/objectspace"
      class Test::Unit::Collector::ObjectSpace
        def include?(test)
          super && test.method_name == ENV['ISOLATION_TEST']
        end
      end
    end
Not great, but all in all, not all that much code (the entire module, including both forking and subprocess methods is just 98 lines of code).
A crazy couple of days yielding a pretty epic hack, but it works!
Merb Vulnerability Fix (1.0.12)
June 30th, 2009
Over the weekend, it was discovered that the json_pure gem is subject to a DoS attack when parsing specially formed JSON objects. This vulnerability does not affect the json gem, which uses a C extension, or ActiveSupport::JSON, which is used in Rails.
By default, Merb uses the json gem, which is not vulnerable, but falls back to json_pure if json is not available. As a result, if you have json_pure but not json on your system, you may be vulnerable. Additionally, Ruby 1.9.1 (but not Ruby 1.9 trunk) ships with json_pure, which remains vulnerable.
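The fallback behavior is easy to picture with a small sketch. This is not Merb's actual loading code; require_first_available is a hypothetical helper that tries each library in order and reports which one loaded.

```ruby
# Try each library in order and return the name of the first one that loads.
def require_first_available(*libraries)
  libraries.each do |library|
    begin
      require library
      return library
    rescue LoadError
      next
    end
  end
  raise LoadError, "none of #{libraries.join(', ')} could be loaded"
end
```

Calling require_first_available("json", "json/pure") would pick the C extension when it is installed and only fall back to the pure-Ruby version otherwise, which is why a system with json_pure but not json ends up on the vulnerable path.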
The easiest way to immunize yourself from this problem, no matter what Ruby version you are on, is to upgrade to the latest version of json_pure, which resolves the vulnerability. Additionally, Merb 1.0.12 has been released, which monkey-patches json_pure to remove the vulnerability, but prints a warning encouraging you to upgrade to the latest. Merb 1.0.12 only adds this patch on top of 1.0.11, so it should be perfectly safe to upgrade.
Guest Blogging
June 29th, 2009
I’ll occasionally be posting on the Engine Yard blog in addition to my posts here. Union Station has really stepped up lately in terms of technical content. We’ve had posts on pairing, TDD with Cucumber, Rack and a slew of other things, and I know there’s more in the pipe.
I posted today on how we’ve been refactoring Rails, and how you can apply what we’ve learned to your own projects: let me know what you think!
On Rails Testing
June 20th, 2009
One of the things that has both pleasantly surprised and frustrated me over the past six months is the state of Rails’ internal tests. While the tests can sometimes cover the most obscure, important bugs, they can sometimes be heavily mock-based or very coupled to internal implementation.
In large part, this is because of the requirement that patches come with tests, so the test suite has accumulated a lot of knowledge over time, but isn’t always the clearest when it comes time for a major refactor. Just to be clear, without the test suite, I don’t think the work we’ve been doing would have been possible, so I’m not complaining much.
However, I recently became aware that Sam Ruby, one of the authors of Agile Web Development on Rails, has released a test suite that tests each step in the depot application, as well as each of the advanced chapters.
Last week, Carl and I started digging into the Rails initializer, and the tests in the initializer (railties) are more mock-based and less reliable than the tests in ActionPack (which we’ve been working with so far). They’re pretty reasonable unit tests for individual components, but getting all of the tests to pass did not result in an (even close) bootable Rails app. Thanks to Sam’s tests, we were able to work through getting a booting Rails app by the end of last week.
Over the past week or two, Carl and I have been running the tests a few times a day, and while they definitely take a while to get through, they’ve added a new sense of confidence to the work we’re doing.
Delicious Food
June 18th, 2009
A couple of months ago, I read The End of Overeating, which got me started on a series of books about food. I worked my way through In Defense of Food, The Omnivore’s Dilemma, and Animal, Vegetable, Miracle.
After reading through these books and doing a bunch of auxiliary research, I came to the fairly disturbing conclusion that the food we eat from the grocery is sorely lacking in required nutrients. That’s especially true for packaged, processed foods, but even the fruits and vegetables purchased in a typical produce section are lacking.
Studies show that fruits and vegetables grown without pesticides or herbicides have significantly higher levels of certain vitamins and antioxidants. And simply replacing the missing compounds assumes that we have a full understanding of what’s missing. We don’t.
Additionally, breeding fruits and vegetables for high yield, uniform appearance, and long travel distance necessarily reduces other more important factors, like taste and nutritional content. Like I said, studies have found this to be true, but in retrospect, it’s fairly self-evident. Evolution is a process of competition among zero-sum ends. If yield and long-distance travel win out, something else loses out.
With that introduction, over the past few months, I’ve slowly started working toward cooking virtually all of my own food. In the past few weeks, that has expanded to include breads and sauces. In short, the rule I try to follow is: “Only purchase items with a single ingredient in their ingredient list.” And at a very minimum, a rule I got from Michael Pollan’s books: “Only purchase items with a few ingredients, all of which are understandable.”
It probably sounds like I’ve absorbed too much of San Francisco’s culture, but I have to tell you, the quality of the food I’ve been eating has increased dramatically. For the most part, food tastes better, and even when it doesn’t, I really enjoy putting together meals.
In order to make this happen, I’ve purchased a few pieces of equipment. First of all, I purchased a bread-maker. It cost only $100, and I’ve already made two loaves of pretty good tasting whole-wheat bread.
I also purchased a rice maker, which makes cooking brown rice myself actually possible. Trying to do it on a stove yourself is basically impossible (try Googling instructions on cooking brown rice).
I dramatically increased my consumption of organic fruits and vegetables, much of it locally grown. After a month or so, I can absolutely confirm that the quality and taste of the fruits and vegetables is significantly higher, and I’ve straight-up stopped worrying about the small differences in fat in things like whole milk and skim milk, since those items are mostly garnishes on fruits and vegetables. As a result, I’ve been able to focus on the taste of my ingredients, instead of trying to squeeze a few more grams of fat out of a meal, and I’ve still been able to lose a bunch of weight since I started (again, in retrospect, it’s not all that surprising that eating a ton more fruits and vegetables, regardless of what else is in the diet, would result in weight loss).
From The End of Overeating, I also completely cut out snacks (snacks are actually a relatively new Western invention), focusing heavily on three meals a day, which increases the quality and enjoyment of actually eating food that I prepare carefully and well.
Finally, I joined a local CSA, which delivers a weekly box of vegetables. I got my first shipment today, which brought me 1.5 pounds of yellow peaches, 1.5 pounds black plums, 6 oz. blueberries, 1 pound summer squash, 1 bunch chard, 1/2 pound gypsy peppers, 1/2 pound lipstick peppers, 1.5 pounds heirloom tomatoes, 1 bunch greenleaf lettuce, 1 bunch red beets, 1 bunch nantes carrots, and 1 pound red onion. All from local farms, all organic, and all for just $30.
In the past, when I heard people talking about stuff like this, they sounded like kooky hippies, so I suspect that’s how I sound to people as well. But if eating tasty, nutritious food that I prepare myself, losing weight and saving money is a kooky hippie thing to do, I’ll take kooky hippie any day.
And the Pending Tests — They Shall Pass!
June 18th, 2009
While we’ve been working on ActionController::Base, there were a few tests for fixes in Rails 2.3 that were mostly hacked in and we were waiting for a cleaner underlying implementation to build them on top of. Today, we finally got all the pending tests passing!
There were basically two categories of pending tests:
Layout Selection
Rails 2 had a feature called :exempt_from_layout, which allowed users to specify that a particular kind of template (defaulting to RJS) should be exempt from layout. The reason for this feature was that people were noticing that their RJS templates were being wrapped in their HTML layouts.
When I first saw this feature, I had a simple question: “Why were RJS templates being wrapped in HTML layouts in the first place?” And also, if you want an RJS layout (application.js.erb), why shouldn’t you be allowed to have one?
As it turns out, the reason for this was that any number of parts of Rails render templates, and the layout lookup is completely separate. So when it came time to look up a layout, Rails didn’t realize that it had just rendered an RJS template (as far as it was concerned, since the available formats allow for HTML and JS, either would do).
The solution: Have template and layout lookup go through a deterministic process, and limit layout lookup to the mime type for the template that was actually rendered (not templates that might have been rendered).
The upshot: exempt_from_layout isn’t needed. If you don’t want a layout around your RJS templates, don’t make an application.js.*. If you do, do. Rails will do the right thing.
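To make the lookup rule concrete, here is a toy model of it in plain Ruby. Nothing here is Rails API: VIEWS, find_template and render_with_layout are hypothetical, and the view store is just a hash of template names to the formats that exist.

```ruby
# Hypothetical in-memory view store: template name => formats that exist.
VIEWS = {
  "posts/update"        => [:js],
  "posts/index"         => [:html],
  "layouts/application" => [:html]
}.freeze

# Return [name, format] for the first acceptable format that exists, or nil.
def find_template(name, formats)
  format = formats.find { |f| VIEWS.fetch(name, []).include?(f) }
  format ? [name, format] : nil
end

def render_with_layout(name, acceptable_formats)
  template = find_template(name, acceptable_formats)
  raise ArgumentError, "missing template #{name}" unless template

  # Only the format of the template that was actually rendered is considered
  # when looking for a layout, not every format the request would accept.
  layout = find_template("layouts/application", [template.last])
  { template: template, layout: layout }
end
```

With this store, rendering "posts/update" for a request that accepts both HTML and JS picks the JS template and finds no layout, because no JS layout exists. Adding a JS entry for "layouts/application" would be the equivalent of creating application.js.*.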
An aside: the hardest part of Rails layouts involves when to raise an exception for a missing layout. Since Rails allows layouts to be missing in certain circumstances, making sure an exception is raised in other circumstances is quite tricky. In master, we raise an exception if you explicitly provided a layout (layout "foo"), and none exist for any MIME type. Implicit layouts or explicit layouts that exist for another MIME are permitted to be absent. The exception to that rule is render :layout => true, which converts an implicit layout to a required, explicit one.
RJS and HTML
Rails 2 has a bunch of hardcoded rules that allow RJS templates to render HTML. This allows page[:foo].replace_html :partial => "some_partial" to render some_partial.html.erb.
Effectively, when you got into an RJS template, the acceptable formats list was hardcoded to [:html]. Again, this blocks the use of RJS partials, and if you supply only an RJS partial, it will not be used (a missing template error would be raised).
When we thought about it, we realized that what is desired is to look for templates matching the mime type of the current template first, and if such a template could not be found, to allow templates matching any mime (with HTML leading the list). If you’re in an RJS template and you call render :partial => "foo", and only a foo.xml.erb exists, we can assume that you mean to render that template.
This handles all of the cases supported by Rails 2, with one small change. If you have both a js and html template, the js template will win inside of RJS. If you didn’t want the js template to win, why did you create it?
That rule now applies to any template. Partials matching the existing mime will be rendered if they exist, but any other mime will work fine as a fallback. So no special rules required for RJS anymore. The same rules apply for other templates (:file etc.) rendered from the context of another template.
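The fallback order can be expressed in a few lines. This is a sketch of the rule as described, not the lookup code in Rails: ALL_FORMATS, lookup_order and pick_partial are hypothetical, and the format list is limited to three entries for illustration.

```ruby
# Hypothetical list of known formats, with HTML leading the fallback order.
ALL_FORMATS = [:html, :js, :xml].freeze

# The current template's format comes first, then everything else.
def lookup_order(current_format)
  [current_format] + (ALL_FORMATS - [current_format])
end

# Pick the format of the partial to render, given the formats that exist.
def pick_partial(existing_formats, current_format)
  lookup_order(current_format).find { |f| existing_formats.include?(f) }
end
```

Inside an RJS template, a partial that exists as both JS and HTML resolves to JS, one that exists only as XML resolves to XML, and one that exists as HTML and XML resolves to HTML.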
One last thing. If you do page[:customer].replace_html :partial => "world", an html template will be required. When we noticed that we could separate out the rules for replace_html from other partials, we realized that we could hardcode more restrictive rules for that specific API than the general API.
And with that, the remaining pending tests pass. I really enjoy being able to solve problems that required hacks in the past by reorganizing the code to produce conceptually nicer solutions.
RubyGems: Problems and (proposed) Solutions
June 15th, 2009
There’s been a fair bit of discussion around RubyGems lately, and some suggestions about what the core problems with RubyGems are.
People have the general sense that there’s something wrong with dependencies, and that it might have something to do with multiple versions being installed in one repository. It also seems (to people) that having require do magical things is Bad(tm). And in general, people like knowing exactly what versions of things are being loaded.
To some degree, all of these concerns are valid, and led to the rather hackish solution that we distributed with Merb called merb.thor. What we did:
- Created a manifest for your application that would describe the gems and versions you wanted to use. That same manifest was used at runtime to load those gems.
- Created a virtual environment just for your application, with the one-version-per-environment rule. This meant that it was always possible to see what versions and gems were being used.
- Made it reasonably easy to update the local environment when the manifest changes. Made such changes *not* require knowledge of the dependencies and versions of either the old or new gems.
What we did not do:
- Put all the gems in a single directory, so normal Ruby require would work.
At first glance, this seems like a very good idea. Instead of relying on magical runtime load-path manipulation, just take, for instance, the merb-core gem, and stick it in a top-level. Then add that top-level to the load path and you don’t need Rubygems at runtime.
The problem with this fabulous idea is that there isn’t a consistent way that people use Rubygems. Consider the following scenario:
A gem called “bad-behavior” that has a lib loadpath, but puts server.rb, initializer.rb, and omg.rb at the top-level. In omg.rb, the gem does Dir["#{File.dirname(__FILE__)}/*"].each {|f| require f }. This works fine when the gem actually owns the entire directory. But if you drop the gem into a larger file structure (similar to how other package managers handle the problem), its top-level is now everyone else’s top-level.
Another scenario: A gem called rack-silliness that puts its files in rack/*, and then calls Dir["#{File.dirname(__FILE__)}/*"].each {|f| require f } from rack/silliness.rb. Again, this works fine if the gem owns the entire directory, but if multiple gems put things in rack/*, moving everything to a shared structure will fail.
With all that said, if we *could* use a shared structure, things would automatically fall into place. We wouldn’t need rubygems at runtime. It would be easy to have separate environments with the one-version rule. It would be easy to have local environments. *All within the existing Rubygems structure*.
The solution I promised
So how do we solve this problem? We need to agree to deprecate everything but the following structure for Rubygems:
Given a gem foo, there should be a foo.rb at the top-level, and optionally, a foo directory underneath. No other files or directories are allowed.
Update: What I meant here was lib/foo.rb and lib/foo/…, which will be the directory that gets added to the load path. As a result, the vast majority of existing gems would not need to change.
Other solutions that work with Rubygems but use a single shared directory structure *assume* well-behaved gems only. If we could enforce well-behaved gems, we would both have an excellent solution in Rubygems proper, and make it easier for people to build additional solutions and plugins around the gem format.
So here’s my proposal: For the next version of Rubygems, print a warning if installing a gem that does not comply. Over the next few months, get the few existing gem authors who have non-complying gems to release new versions that comply.
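As a sketch of what such an install-time check could look like (this is not part of Rubygems; noncompliant_entries and install_warning are hypothetical, and the entries are assumed to be paths relative to the gem's lib directory):

```ruby
# Entries are paths relative to the gem's lib directory.
def noncompliant_entries(gem_name, entries)
  entries.reject do |entry|
    entry == "#{gem_name}.rb" ||
      entry == gem_name ||
      entry.start_with?("#{gem_name}/")
  end
end

# Return a warning string for a misbehaving gem, or nil if it complies.
def install_warning(gem_name, entries)
  offenders = noncompliant_entries(gem_name, entries)
  return nil if offenders.empty?

  "WARNING: #{gem_name} has files outside lib/#{gem_name}.rb and " \
    "lib/#{gem_name}/: #{offenders.join(', ')}"
end
```

Both of the problem gems from the scenarios above would trigger the warning: bad-behavior for its loose server.rb, initializer.rb and omg.rb, and rack-silliness for putting its files under rack/.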
At the same time, I will release a gem plugin that provides virtual environments and local environments for Rubygems (I have already been working on this). It will support the one-version rule, named virtual environments, a gem manifest for applications, and gem resolution (thanks to the hard work by Tim Carey-Smith on gem_resolver).
In the interim, we have a slightly clunky solution that will work well. Instead of putting all gems into a single load-path and using that, we leave the current structure (each gem has its own space). Then, when a gem is installed into an environment, we preresolve all load-paths, and keep a list of them. When you switch into an environment, we add those load-paths to the default set of Ruby load-paths, which will behave exactly the same, but still support misbehaving gems.
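A minimal sketch of that interim approach, assuming a manifest that maps gem names to pinned versions and a gem home laid out as gems/NAME-VERSION/lib. Both helper names are hypothetical, and a real implementation would read each gem's declared require paths instead of assuming lib.

```ruby
# Preresolve one load path per gem in the environment's manifest.
def resolve_load_paths(gem_home, manifest)
  manifest.map do |name, version|
    File.join(gem_home, "gems", "#{name}-#{version}", "lib")
  end
end

# Switch into an environment by putting its load paths ahead of the defaults.
def activate_environment(load_paths, target = $LOAD_PATH)
  load_paths.reverse_each do |path|
    target.unshift(path) unless target.include?(path)
  end
  target
end
```

Each gem keeps its own directory, so a misbehaving gem that requires everything next to its own files still only sees its own files.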
In the long-term, all gems will be able to live side-by-side in a single load-path, which will allow us to create a cleaner version of the virtual environments (and will improve startup times, especially on JRuby and Google App Engine, but won’t have any user-facing implications).
So, are we up for finally getting our gem packaging format under control?
P.S. I am aware that rip was just announced, and is attempting to do a lot of the same things. This blog post has been a long time coming (the ideas were hatched a year ago, and many are available today as part of Merb). What I’d like to do here is take the good ideas that exist in Merb, rip, and the Python community and make them native to Rubygems, addressing the problems I outlined above that are inherent to the transition. It’s perfectly fine for rip to simply require well-formed gems, but a solution that gets us from here to there as a community is important.
Rails Edge Architecture
June 11th, 2009
I’ve talked a bunch about the Rails 3 architecture in various talks, and it’s finally come together enough to do a full blog post on it. With that said, there are a few things to keep in mind.
We’ve done a lot of work on ActionController, but have only mildly refactored ActionView. We’re going to tackle that next, but I don’t expect nearly as many internal changes as there were for ActionController. Also, we’ve been working on the ActionController changes for a few months, and have focused quite a bit on maintaining backward compatibility (for the user-facing API) with Rails 2.3. If you try edge and find that we’ve broken something, definitely let us know.
AbstractController::Base
One of the biggest internal changes is creating a base superclass that is separated from the notions of HTTP. This AbstractController handles the basic notion of controllers, actions, and action dispatching, and not much else.
At that level of abstraction, there are also a number of modules that can be added in, such as Renderer, Layouts, and Callbacks. The API in the AbstractController is very normalized, and it’s not intended to be used directly by end users.
For instance, the template renderer takes an ActionView::Template object and renders it. Developers working with Rails shouldn’t have to know about Template objects; this is simply an internal API to make it easier to work with the Rails internals and build things on top of Rails.
Another example is related to action dispatching. At its core, action dispatching is simply taking an action name, calling the method if it exists, or raising an exception if it doesn’t. However, other parts of Rails have different semantics. For instance, method_missing is used if the action name is not a public method, and the action is dispatched if no action is present but a template with the same name is present.
In order to make it easy to implement these features at different levels of abstraction, AbstractController::Base calls method_for_action to determine what method to call for an action. By default, that method simply checks to see if the action is present. In ActionController::HideActions, that method is overridden to return nil if actions are explicitly removed using hide_action. ActionController::Base further overrides it to return “default_render”, which handles the rendering.
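The layering of method_for_action can be sketched with plain Ruby classes and modules. The Sketch-prefixed names below are hypothetical and heavily simplified; only method_for_action, hide_action and default_render come from the description above.

```ruby
class ActionNotFound < StandardError; end

# Bottom layer: dispatch to whatever method_for_action returns.
class SketchAbstractController
  def process(action)
    method_name = method_for_action(action.to_s)
    raise ActionNotFound, action.to_s unless method_name

    send(method_name)
  end

  private

  def method_for_action(action)
    self.class.public_method_defined?(action) ? action : nil
  end
end

# Middle layer: explicitly hidden actions are treated as absent.
module SketchHideActions
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def hidden_actions
      @hidden_actions ||= []
    end

    def hide_action(*names)
      hidden_actions.concat(names.map(&:to_s))
    end
  end

  private

  def method_for_action(action)
    self.class.hidden_actions.include?(action) ? nil : super
  end
end

# Top layer: fall back to rendering when no action method is available.
class SketchController < SketchAbstractController
  include SketchHideActions

  private

  def method_for_action(action)
    super || "default_render"
  end

  def default_render
    "rendered implicit template"
  end
end

class PostsController < SketchController
  hide_action :helper

  def index
    "index"
  end

  def helper
    "not an action"
  end
end
```

Each layer only has to answer one question and then defer to super, so PostsController.new.process(:index) reaches the action while process(:helper) falls through to default_render.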
That’s, in a nutshell, how the new architecture works: simple abstractions that represent a bunch of functionality that you can layer on if you need it.
ActionController::Http
ActionController::Http layers a Rack abstraction on top of the core controller abstraction so that actions can be called as Rack endpoints. On top of the core rendering functionality available in AbstractController::Base, ActionController provides rendering functionality that builds on top of HTTP constructs. For instance, the ActionController renderer knows how to set the content type based on the template it rendered. Additionally, ActionController implements more developer-friendly APIs. Instead of having to know about the ActionView::Template object, developers can just use the name of the template (pretty much the Rails 2.3 rendering API).
ActionController normalizes the developer’s inputs into the inputs that AbstractController is expecting, before calling super (ActionController::Http inherits from AbstractController::Base). Additional functionality is added via the ActionController::Rails2Compatibility module, which provides support for things like stripping a leading "/" off of template names, and ActionController::Base, which provides a final layer of normalization (for instance, normalizing render :action => "foo" and render "foo"). As a result, ActionController::Base ends up being identical to the fully-featured ActionController::Base that you used in Rails 2.3, and none of your app code needs to change.
As an example, let’s look at rendering. If you call render :template => "hello", the first thing that happens is the normalization pass on ActionController::Base. Since we used a relatively normalized form, not much happens, and the options hash is passed up into the compatibility module, which checks to see if there’s a leading "/" in the template name. Since there isn’t, it passes the options hash up again into the AbstractController::Renderer module.
There, Renderer checks to see if we’ve already set the response body. If we have, a DoubleRenderError is raised. If not, render_to_body is called with the options hash. The first place in the chain we find render_to_body is in ActionController::Renderer, where it processes options like :content_type or :status.
It then calls super, passing control back into AbstractController, which promptly calls _determine_template, again with the options. The job of _determine_template is to take in some options and return the template to render. This hook point is provided so that other modules, like the Layouts module, can use the template that was looked up to control something they care about. In this case, the Layouts module wants to limit the search for a layout to the formats and locale of the template that was actually rendered.
This solves a long-standing problem in Rails where templates and layouts were looked up separately, so it was possible for an HTML layout to be implicitly wrapped around an XML template. No longer.
The Layouts module gets called first with _determine_template. It calls super, allowing the default template lookup to occur, which populates options[:_template]. It then uses options[:_template] to look up the layout to use, populating options[:_layout]. I hadn’t mentioned it before, but AbstractController uses options with leading underscores, leaving undecorated options for subclasses like ActionController::Base.
After the template to render is determined, Renderer calls _render_template, which makes the call into ActionView.
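Until the graphic exists, a stripped-down model of the chain may help. This is a sketch and not the Rails source: the module and class names are hypothetical, template lookup is faked with string concatenation, and only the hook names (render_to_body, _determine_template, _render_template) and the underscore-prefixed options follow the description above.

```ruby
class DoubleRenderError < StandardError; end

module SketchRenderer
  attr_reader :response_body

  def render(options)
    raise DoubleRenderError if response_body

    @response_body = render_to_body(options)
  end

  def render_to_body(options)
    _determine_template(options)
    _render_template(options)
  end

  # Default lookup: pretend every template exists as HTML.
  def _determine_template(options)
    options[:_template] = "#{options[:template]}.html"
  end

  def _render_template(options)
    [options[:_layout], options[:_template]].compact.join(" > ")
  end
end

module SketchLayouts
  # Called first; lets the default lookup run, then picks a layout that
  # matches the format of the template that was actually found.
  def _determine_template(options)
    super
    format = File.extname(options[:_template])
    options[:_layout] = "layouts/application#{format}"
  end
end

class RenderSketchController
  include SketchRenderer
  include SketchLayouts
end
```

Because SketchLayouts is included last, its _determine_template runs before the renderer's version and reaches it through super, which mirrors the order of calls walked through above.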
I know it sounds rather complicated, but a graphic showing the relationships is forthcoming, which should clean things up. The nice thing is that there are several layers of abstraction, and while the final system is complex (mostly because the functionality is complex), it’s reasonably easy to understand the higher levels of abstraction on their own. It’s also easy to put various kinds of normalization into standard places, so the code you’re reading at any given point is code that expects normalized inputs. Since the normalization that Rails does can sometimes be quite gnarly (in the service of making the user experience really pleasant), separating that stuff out can reduce the amount of gnarliness in the internals themselves.
Finally, an important part of an architecture like this is making sure that there is great internal documentation (which we’ve already started with), and some visualizations that show what’s going on. If you were forced to track down the control flow above on your own for the first time, it would probably be non-trivial. So a key part of this architecture is making sure you never have to do that. I would also note that I’m not particularly good at expressing code reading in blog posts, so the process definitely sounds a lot more complex than it actually is.
The Importance of Executable Class Bodies
June 4th, 2009
I spent the past few days at JavaOne, where I gave a well-received talk on Ruby, and got to attend a number of sessions on both Ruby and other related technologies.
Out of curiosity, I went to a session on Groovy, a language that has a syntax that is derived directly from Java, but with semantics that are fairly close to Ruby’s. Groovy is missing a number of features that Ruby has, and is more clunky in a number of cases.
For instance, while Ruby has pure open classes, Groovy allows you to open or reopen the metaclass of a class and insert new methods. Groovy 1.6 (released in February) added the ability to insert a number of methods to a metaclass at once.
But what I want to discuss here is another distinction. Unlike Ruby, Groovy does not allow executable code anywhere. Instead, Groovy classes are compiled, so runtime code execution inside of class bodies cannot work. This means that a large number of the features that make Rails stand out, like declarative callbacks and validations, various forms of accessors, runtime method generation based on introspecting the database, and other per-class mutable structures, cannot be implemented nearly as elegantly in Groovy.
In case you’re not familiar, Ruby doesn’t need annotations, because class bodies in Ruby are simply executable code, with a self that is the class that is being defined.
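A small sketch of what that means in practice. SketchModel and its validates_presence_of are hypothetical stand-ins, not ActiveRecord; the point is only that the declaration inside Car is an ordinary method call executed while the class is being defined.

```ruby
class SketchModel
  # Class-level "macro": callable from a class body because self is the class.
  def self.validates_presence_of(*names)
    validations.concat(names)
  end

  def self.validations
    @validations ||= []
  end

  def initialize(attributes = {})
    @attributes = attributes
  end

  def valid?
    self.class.validations.all? { |name| !@attributes[name].nil? }
  end
end

class Car < SketchModel
  BODY_SELF = self                    # evaluated while the class is defined
  validates_presence_of :make, :model # a plain method call on Car
end
```

Car::BODY_SELF is Car itself, and Car.validations is filled in as a side effect of running the class body, with no compiler support or annotation processing involved.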
Let’s take a simple example. In Ruby, accessors are defined as follows:

    class Car
      attr_accessor :model, :make
    end
In Groovy, accessors are defined as:
    class Car {
      def model
      def make
    }
At first glance, they seem pretty similar. In both cases, getters and setters are added, and new fields (in the respective languages) exist. The difference is that while Groovy needed to add new syntax to support this, Ruby’s version can be implemented in Ruby itself:
    class Class
      def attr_accessor(*names)
        names.each do |name|
          class_eval "def #{name}() @#{name} end"
          class_eval "def #{name}=(val) @#{name} = val end"
        end
      end
    end
Rubinius, a complete implementation of Ruby in Ruby, implements attr_accessor as:
    class Class
      def attr_reader(name)
        meth = Rubinius::AccessVariable.get_ivar name
        @method_table[name] = meth
        return nil
      end

      def attr_writer(name)
        meth = Rubinius::AccessVariable.set_ivar name
        @method_table["#{name}=".to_sym] = meth
        return nil
      end

      def attr_accessor(name)
        attr_reader(name)
        attr_writer(name)
        return true
      end
    end
Here, Rubinius exposes the method table to Ruby, and we store a method representing the instance variable directly into the method table. Obviously, this requires more Ruby infrastructure, but it shows how powerful “everything is executable code” can be.
In an effort to support Ruby’s declarative style, Groovy has added what they call “AST Transformations”, which allow a declarative rule plus some code to be converted, at compile time, into different code to be passed into the compiler.
To make this immediately useful, they shipped a bunch of these annotations with Groovy 1.6, so we can take a look at how this is supposed to work. One example is their “Lazy” annotation, which allows the creation of an accessor that is initialized to something slow, so you want to defer initializing the accessor until it is actually accessed. It works like this (from the Groovy documentation):
    class Person {
      @Lazy pets = ['Cat', 'Dog', 'Bird']
    }
Assuming that creating that Array was slow, this would defer loading the Array until pets was accessed. Pretty nice. Unfortunately, implementing this nice abstraction is a non-trivial operation:
    package org.codehaus.groovy.transform;

    import org.codehaus.groovy.ast.*;
    import org.codehaus.groovy.ast.expr.*;
    import org.codehaus.groovy.ast.stmt.*;
    import org.codehaus.groovy.control.CompilePhase;
    import org.codehaus.groovy.control.SourceUnit;
    import org.codehaus.groovy.runtime.MetaClassHelper;
    import org.codehaus.groovy.syntax.Token;
    import org.objectweb.asm.Opcodes;

    /**
     * Handles generation of code for the @Lazy annotation
     *
     * @author Alex Tkachman
     */
    @GroovyASTTransformation(phase= CompilePhase.CANONICALIZATION)
    public class LazyASTTransformation implements ASTTransformation, Opcodes {
        public void visit(ASTNode[] nodes, SourceUnit source) {
            if (!(nodes[0] instanceof AnnotationNode) || !(nodes[1] instanceof AnnotatedNode)) {
                throw new RuntimeException("Internal error: wrong types: $node.class / $parent.class");
            }

            AnnotatedNode parent = (AnnotatedNode) nodes[1];
            AnnotationNode node = (AnnotationNode) nodes[0];

            if (parent instanceof FieldNode) {
                FieldNode fieldNode = (FieldNode) parent;
                final Expression init = getInitExpr(fieldNode);

                fieldNode.rename("$" + fieldNode.getName());
                fieldNode.setModifiers(ACC_PRIVATE | (fieldNode.getModifiers() & (~(ACC_PUBLIC|ACC_PROTECTED))));

                create(fieldNode, init);
            }
        }

        private void create(FieldNode fieldNode, final Expression initExpr) {
            BlockStatement body = new BlockStatement();
            final FieldExpression fieldExpr = new FieldExpression(fieldNode);
            if ((fieldNode.getModifiers() & ACC_VOLATILE) == 0) {
                body.addStatement(new IfStatement(
                    new BooleanExpression(new BinaryExpression(fieldExpr, Token.newSymbol("!=",-1,-1), ConstantExpression.NULL)),
                    new ExpressionStatement(fieldExpr),
                    new ExpressionStatement(new BinaryExpression(fieldExpr, Token.newSymbol("=",-1,-1), initExpr))
                ));
            }
            else {
                body.addStatement(new IfStatement(
                    new BooleanExpression(new BinaryExpression(fieldExpr, Token.newSymbol("!=",-1,-1), ConstantExpression.NULL)),
                    new ReturnStatement(fieldExpr),
                    new SynchronizedStatement(
                        VariableExpression.THIS_EXPRESSION,
                        new IfStatement(
                            new BooleanExpression(new BinaryExpression(fieldExpr, Token.newSymbol("!=",-1,-1), ConstantExpression.NULL)),
                            new ReturnStatement(fieldExpr),
                            new ReturnStatement(new BinaryExpression(fieldExpr,Token.newSymbol("=",-1,-1), initExpr))
                        )
                    )
                ));
            }
            final String name = "get" + MetaClassHelper.capitalize(fieldNode.getName().substring(1));
            fieldNode.getDeclaringClass().addMethod(name, ACC_PUBLIC, fieldNode.getType(), Parameter.EMPTY_ARRAY, ClassNode.EMPTY_ARRAY, body);
        }

        private Expression getInitExpr(FieldNode fieldNode) {
            Expression initExpr = fieldNode.getInitialValueExpression();
            fieldNode.setInitialValueExpression(null);

            if (initExpr == null)
                initExpr = new ConstructorCallExpression(fieldNode.getType(), new ArgumentListExpression());

            return initExpr;
        }
    }
Pretty ugly. If you look closely at the code, you’ll see that the amount of code necessary to express even simple concepts is huge, because you’re manipulating an actual AST.
A similar feature in Ruby might look like:
    class Person
      lazy(:pets) { ["Cat", "Dog", "Bird"] }
    end
And the implementation:
    class Class
      def lazy(name, &block)
        define_method("_lazy_#{name}", &block)
        class_eval "def #{name}() @#{name} ||= _lazy_#{name} end"
      end
    end
Because lazy is just a method being run on the class at runtime, we can evaluate code live into that class's context. First, we define an instance method called _lazy_pets whose body is the block. Next, we define a method called pets that memoizes the result of calling _lazy_pets into an instance variable. And that’s it.
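To see the memoization at work, here is a minimal sketch of the eval-based version with a few checks after it. The only change from the code above is an assumption of mine: the method lives in a module (the name LazyEval is hypothetical) that Person extends, so the example does not reopen Class globally.

```ruby
# Sketch only: LazyEval is a hypothetical module name, used here so the
# example does not reopen Class for every class in the process.
module LazyEval
  def lazy(name, &block)
    # The block becomes an instance method, so it runs with the instance as self.
    define_method("_lazy_#{name}", &block)
    # The reader method stores the block's result in an instance variable.
    class_eval "def #{name}() @#{name} ||= _lazy_#{name} end"
  end
end

class Person
  extend LazyEval
  lazy(:pets) { ["Cat", "Dog", "Bird"] }
end

person = Person.new
person.instance_variable_defined?(:@pets) # false: nothing has been built yet
person.pets                               # builds the Array and stores it
person.pets.equal?(person.pets)           # true: the stored Array is reused
```

Because the reader uses ||=, a block that returns nil or false would be run again on every call, which is fine for an Array but worth keeping in mind.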
A slightly slower solution in Ruby that doesn’t require eval is:
    class Class
      def lazy(name)
        ivar = "@#{name}"
        define_method(name) do
          instance_variable_get(ivar) || instance_variable_set(ivar, yield)
        end
      end
    end
In this case, since we’re defining the method in a block, we still have access to the block that was passed in to the original lazy method, so we can yield to it inside the new method. Pretty cool, no?
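A small sketch of the closure-based version, with a call log to show that the block runs only once per instance. As assumptions for illustration, the method again lives in a hypothetical module (LazyClosure) rather than in Class, and CALLS is a made-up constant used only to record invocations. One difference from the eval version: here the block is invoked with yield, so it keeps the self it had where it was written (the class body) instead of running as an instance method.

```ruby
# Sketch only: LazyClosure and CALLS are hypothetical names for this example.
module LazyClosure
  def lazy(name)
    ivar = "@#{name}"
    define_method(name) do
      # yield reaches the block originally passed to lazy, because the
      # method body is a closure created inside that call.
      instance_variable_get(ivar) || instance_variable_set(ivar, yield)
    end
  end
end

CALLS = []

class Person
  extend LazyClosure
  lazy(:pets) do
    CALLS << :built # record each time the block actually runs
    ["Cat", "Dog", "Bird"]
  end
end

person = Person.new
person.pets
person.pets
CALLS.length # 1: the second call reused the stored Array
```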
Because all code is executable in Ruby, it’s easy to abstract away repetitive code in around the same number of lines as it took to write the code in the first place. With these simple examples, it would be possible to implement a simpler way to express these transforms. But as these sorts of things are expected to compose well with each other, the flexibility of executable, runtime code starts to really add up, in the same way that languages that are dynamic at runtime can be more flexible and powerful than languages that try to precompute everything at compile time.
My Code Directory
May 30th, 2009
So after a couple of weeks, I’ve managed to remain mostly clean. A couple of observations:
- It’s crucially important to keep the Downloads folder clean. This means quickly finding a more permanent home for downloaded files or throwing them into the trash. The “broken window” impact of having an anything-but-empty Downloads directory is higher than I expected.
- Similarly, a strictly controlled Documents directory is crucial. I have “Presentations”, “Virtual Machines”, and “jQuery Doc Files” (for my book).
- Applications has turned out to be more difficult than I expected. There’s a fine balance between keeping commonly used things at the top-level and keeping the top-level relatively small. Obviously, this is mostly obviated by LaunchBar, but part of this project is about making it a lot easier for me to understand what is on my system, so having a global trash bin isn’t really acceptable, even if it’s easy to rummage in it.
- It’s very hard to control what gets installed in the system. I’ve tried to install as much as possible via MacPorts, just because I know I’ll be able to uninstall it later.
- I’ve been using Adium for IRC, AIM, GTalk, and Twitter. Even though it’s not as good as the special-purpose tools for IRC or Twitter, there’s a lot of value in keeping everything in a single app, and being able to combine IRC users with their GTalk counterparts is a nice side-effect that has started to pay off over time. I definitely don’t expect everyone to do this (especially for Twitter), but I’d recommend playing with it to see if having all of your contacts in a single place is a win for you.
- Incidentally, part of what makes Adium for Twitter doable for me is Cotweet, which sends me a batched email every so often of the @wycats mentions on Twitter.
What about my code directory?
- I divided the top-level into two directories: active and vendor. The active directory is for the projects I’m actively working on and have commit access to (or a fork of).
- That includes: rails, merb, rubygems, gem_resolver, and evented_jquery (private for now).
- My vendor directory includes git or hg repos I’m watching and use frequently. Under vendor, I have a java directory, which includes the davinci project, jruby, jvmscript, ruby2java, and Rhino.
- Right under vendor, I have adium, bespin, jquery, macports, matzruby, and rubymacros.
This may or may not scale over time, but again, it has the pleasing property of being able to determine, at a glance, what’s going on in my code directory for some aspect of my work. So far, organizing things reasonably well has helped me a lot to find what I need when I need it in the appropriate context.
