Yehuda Katz is a member of the Ember.js, Ruby on Rails and jQuery Core Teams; he spends his daytime hours at the startup he founded, Tilde Inc. Yehuda is co-author of the best-selling jQuery in Action and Rails 3 in Action. He spends most of his time hacking on open source—his main projects, like Thor, Handlebars and Janus—or traveling the world doing evangelism work. He can be found on Twitter as @wycats and on GitHub.

Archive for the ‘Other’ Category

RailsConf Wrapup

I had a really great time at RailsConf. I met a ton of really cool people bursting with enthusiasm for great projects. I sort of knew this already, but now I know it viscerally: the diversity of things that people are doing with Rails far outstrips the ability of those most deeply involved in building it day-to-day to anticipate. What was also really great was that so many of the ideas people asked me about could be answered by pointing at new extension points or modular APIs that we’re adding to Rails.

About my talks: let’s just say some of them could have been better. I’ll post the slides inline below with my personal wrap-up of each.

The jQuery Tutorial

The slides for the tutorial are available on this blog; the code is on GitHub. I had high hopes for this tutorial initially, but a few things (mostly my fault) made it worse than I expected.

First of all, I lost my co-presenter around a week before the tutorial. Thanks to Andy Delcambre, this did not catastrophically derail the tutorial, but it did add challenges. Had I been better prepared, it would not have been such a problem, and Andy performed very well under pressure, getting into the flow in a week in which he was also preparing the Flex demo for RailsConf.

Second of all, I didn’t specify the required skill level for the tutorial, which led to a wide disparity in the skill level of the attendees. I expected an intermediate group, so the focus that should have been on the introductory section and the labs was spread out to parts of the tutorial that we never really got to (e.g. evented programming). As a result, almost all of the feedback was either “that was wayyyy too introductory” or “that was wayyyy too advanced.” Again, this is something I should have prepared for, so definitely a mea culpa on my part.

All that said, I thought the content and labs were really good for the portion of the audience in the sweet spot (knew some JavaScript and CSS, but still needed more training in jQuery).

Technical Deep Dive

No code or slides here; the entire session was Carl and I talking through the recent architectural changes. For an evening Birds of a Feather, we got excellent attendance, and almost exclusively positive feedback. I thought it went very well, and really enjoyed the opportunity to share the gory technical details with the folks who took a night off of Vegas to stay for a very low-level technical talk.

Russian Doll Pattern

Again, this didn’t go as well as I’d hoped, but I’m still relatively pleased with the outcome. It was the first talk given jointly by Carl and me, and our relentless focus on getting the Rails code we’re working on ready for RailsConf left us with less time to get our delivery anywhere near as smooth as the Ruby standard-bearers for joint talks, Joe O’Brien and Jim Weirich.

The talk is available from this blog as well.

We also focused heavily on the concept of mountable apps, addressed a lot of the concerns that people have raised in the past, and covered appropriate techniques for building applications that could be reusable. Unfortunately, we’re not yet at a point where we can show a final DSL for the specifics. What I find personally frustrating is that for a lot of the most interesting projects, the hard work is the first 98%, after which there’s nothing “cool” to show. The final 2%, which is usually a day or two of work, produces the whiz-bang demos. For a lot of what we’ve been doing in Rails, we’re at the 90% or so mark, which means that the underlying infrastructure for doing really cool stuff is in place, but we don’t have the whiz-bang demos yet.

Thankfully, there was a great complementary talk by James Adam at RailsConf which covered the state of the art of engines as of Rails 2.3, so I was happy focusing our presentation on the likely future. Still, the “unicorn”, ethereal focus definitely detracted from what was otherwise a good first showing for the pair of Carl and me.

If anyone attended any of the sessions and has questions or constructive feedback, I’d love to hear it. Please feel free to comment below or email me at wycats@gmail.com.

Incentivizing Innovation

One of the things I love the most about the Ruby community is how easy it is to try out small mutations in practices, which leads to very rapid evolution in best practices. Rather than having the community look toward authority to design, plan, and implement “best practices” (a la the JSR model), members of the Ruby community try different things, and have rapidly made refinements to the practices over time.

It is natural to assume, looking from the outside, that the proliferation of practices is dangerous or fracturing. It is not. Instead, it functions more like biological evolution, where small mutations conspire over time to refine and improve the underlying organism. Consider the example of testing. There are a number of testing frameworks used by Rubyists, but they have largely converging feature-sets. As the feature-sets converge on superior solutions (e.g. Rails’ flavor of Test::Unit now comes with Rspec-style declarative tests), another round of differentiation occurs, allowing the community to zoom in on the now smaller differences and allow evolution to take its course.

The analogy isn’t perfect, but the basic idea is sound. It’s tempting to take a snapshot of consolidated practices on May 2, 2009, and shout them from the rooftops in more official form, so that those who haven’t caught up yet will have a way to immediately select the “winner” practice without having to do detailed investigation. Further, some have suggested that we should rank Ruby firms by how well they conform with the most popular practices of the moment. This would allow those who are looking for a firm to hire to determine whether or not their potential hires conform with those practices.

Unfortunately, while that might work for a given slice in time, it provides unwelcome and artificial inertia for the practices of today. Now, in addition to having to contend with the normal inertial forces that resist changes until they are proven (wise forces), firms that want to try out new practices will need to contend with the artificial inertia imposed by being moved down on a list of firms conforming with other practices.

In effect, it creates a chilling effect on experimentation and innovation, and a drag on natural evolution.

It makes perfect sense to create a forum for sharing and aggregating the practices that people are finding useful at the moment. What makes less sense is creating a ranked list of “popular” practices, with no obvious mechanism for mediating differences except pure popularity. And even worse is ranking firms by their aggregate level of conformance.

As Rubyists, we need to push back on artificial attempts to enforce conformance and discourage innovation. Rails shops should find other ways to advertise the quality of their practices without falling back on appeals to the masses, and those in the market for Rails services should do their due diligence. Measuring the popularity of a practice as a replacement for due diligence is frankly a recipe for failure, and once real investigations have been done, hollow measures of popularity won’t add much.

Evented Programming With jQuery

Over the past several years, I’ve been actively using jQuery for a variety of things. Early on, I shared the frustration that people had around using jQuery for more substantial projects. Starting with version 1.2 and continuing with version 1.3, however, jQuery provides a powerful evented model that can be used to build up fairly complex applications using evented, as opposed to traditional object-oriented, programming.

The basic idea is that by leveraging asynchronous events, it is easier to model the fundamentally asynchronous nature of client-side web applications. Also, by writing components of your application to simply emit and handle events, it becomes a lot easier to extend or modify behavior down the road.

To illustrate this point, I’m going to show how you can build a tab widget using evented techniques, and show how it can be extended in several useful ways.

First, a few general principles

We’re going to use basic jQuery plugins to write the widget, but instead of putting all of the startup code directly inside the plugin constructor, we’re going to pass in an Object containing functions to be executed. In order to simplify matters, we’ll create a default set of functions to be executed containing the default behavior, but it will be possible to clone, extend, delete, or otherwise modify that set of functions and pass in an alternative.

The core of the tabs plugin will not handle any DOM manipulation. It will simply generate events that can be bound to the main tab node that will perform the manipulation. This increases the flexibility of the widget.

We’re also going to treat the main tabs node as a stateful container, holding information about the widget that we can retrieve later. We will achieve this using $().data(), an API added in jQuery 1.2 and enhanced several times since then.

Finally, we’re going to use the self-executing anonymous function trick ((function() { ... })()) to create a scratch-pad for helper functions that are used in various parts of the codebase. I use this trick frequently to create simple namespaces for internal methods.

To start (a quick detour)

Let’s start with a quick bit of sugar that I added to simplify setting and getting state out of a node. As I said above, jQuery already supplies the facilities for this using its internal data cache.

  var $$ = function(param) {
    var node = $(param)[0];
    var id = $.data(node); // with no key, $.data returns the node's internal cache id
    $.cache[id] = $.cache[id] || {};
    $.cache[id].node = node;
    return $.cache[id];
  };

Now, instead of doing $("#foo").data("foo") and $("#foo").data("foo", "bar"), you can do $$("#foo").foo and $$("#foo").foo = "bar". The main motivation for this is that you can also do $$("#foo").bar = $$("#foo").bar + 1.

Step 1

The first step is to determine which events we’re going to need. After brainstorming a bit, we can come to:

  • An event each time a tab is activated. Parameter: the tab that was clicked
  • An event at startup time for each panel that is bound to a tab. Parameter: the panel
  • An event once startup is complete. No parameters.

The initial code looks like:

  $.fn.tabs = function(options) {
    options = options || {};
 
    // Initialize
    this.each(function() {
      var tabList = $(this);
      $$(tabList).panels = $();
 
      $("li a", tabList)
        .click(function() {
          tabList.trigger("activated", this);
          return false;
        }).each(function() {
          var panel = $($(this).attr("href"));
          $$(tabList).panels = $$(tabList).panels.add(panel);
          tabList.trigger("setupPanel", [panel]);
        });
 
      tabList.trigger("initialize");
    });
 
    return this;
  };

First, we get a reference to the tab list, since the callbacks we create later will need access to it. Next, we store an empty jQuery object as the “panels” list in the tab widget using $$ (<code>$$(tabList).panels = $();</code>).

Next, we bind a click handler to each <code>"li a"</code> inside the tab list ul. When a tab is clicked, we simply trigger the activated event. We’ll implement the default behavior in a bit.

Finally, for each tab, we collect its associated panel and trigger the setupPanel event. The default behavior for setupPanel will simply be to hide it. We’ll implement that behavior in a bit as well.

Once we’re done, we trigger the tab list’s initialize event.

Step 2

The next step is to declare the default functionality. Before we do that, let’s create a small jQuery helper that applies a set of functions to an object. You’ll see how it’s used in a moment.

jQuery.fn.setupPlugin = function(setup, options) {
  var self = this;
  for(var extra in setup) {
    if(setup[extra] instanceof Array) {
      for(var i=0; i<setup[extra].length; i++)
        setup[extra][i].call(self, options);
    } else {
      setup[extra].call(self, options);
    }
  }
  return this;
};

This method is called on a jQuery object. It takes an object containing a set of setup methods, and the options that were passed into the plugin. For each key in the setup object, this method takes each function attached to it (the value for the key can either be a function or an Array of functions), and calls it with the jQuery object as “this” and the options as the first parameter.

What we’re going to do is call that method from inside the tabs plugin with a default set of methods (which we’ll call $.fn.tabs.base):

this.setupPlugin(options.setup || $.fn.tabs.base, options);

Next, below the implementation (or in a separate file), define the defaults.

  var getPanel = function(selected) {
    return $($(selected).attr("href"));
  };
 
  $.fn.tabs.base = {
    setupPanel: [function(options) {
      this.bind("setupPanel", function(e, selector) {
        $(selector).hide();
      });
    }],
 
    initialize: [function(options) {
      this.bind("initialize", function() {
        var firstTab = $(this).find("li a:first")[0];
        $(this).trigger("activated", firstTab);
      });
    }],
 
    activate: [function(options) {
      this.bind("activated", function(e, selected) {
        var panel = getPanel(selected);
        $$(this).panels.hide();
        $(panel).show();
        $(this).find("li a").removeClass("active");
        $(selected).addClass("active").blur();
      });
    }]
  };

First, we define a small helper method that gets the associated panel for a tab. We could also have used the new $$ technique above to add it directly to the node.

Next, we define a few categories of methods to be run. Each of these methods will be run when we call setupPlugin, and takes the options hash as a parameter (optional). We will make each of the categories an Array, so they can be extended easily by prepending a setup method.

The setupPanel method binds the setupPanel event. If you recall, setupPanel takes the panel as its second parameter. As with all events, the first parameter is the event object, which we will not use now. When setupPanel is triggered, we will simply hide the panel. Note that we have decoupled the display details of the tab list from the events.

The next thing we do is handle initialization. In the simplest case, we’ll just always activate the first tab. To activate the tab, we’ll just trigger another event: “activated”.

Finally, we’ll define the default behavior for activation. When a tab is activated, we will:

  • Get the associated panel
  • Get all of the panels in the tab widget and hide them
  • Show the associated panel
  • Remove the “active” class from all tabs
  • Add the active class to this tab, and blur it

If you run this code (available at <a href="http://github.com/wycats/js_tabs_example/tree/master">my GitHub</a>), you will have a basic tab widget set up and working!

Step 3: Add support for Ajax

Now that we have the basic functionality completed, let’s add support for Ajax. The API we’ll support is an option passed into the main widget like {xhr: {"#nameOfTab": "url_to_load"}}.

The first thing that we’ll do is make a deep clone of the default setup object, so we can modify it without altering the defaults.

var wycats = $.extend(true, {}, $.fn.tabs.base); // deep copy: a plain $.extend would share the Arrays with $.fn.tabs.base

Next, we’ll want to add a new function to run immediately before the default activated function. We still want to run the default activated function, which handles all the logic for displaying the tab, but we want to do some stuff first. Here’s the code:

  wycats.activate.unshift(function(options) {
    var xhr = options.xhr;
    this.bind("activated", function(e, selected) {
      var url = xhr && xhr[$(selected).attr("href")];
      if(url) {
        var panel = getPanel(selected);
        panel.html("<img src='throbber.gif'/>").load(url);
      }
    });
  });

Now you can see why we pass the options into the setup methods. First, we store off the xhr options into a local variable, which will be available to the callbacks. Next, we bind the activated event. Note that since this is an event, all bound handlers (including the default activated handler from above) will get triggered. We’re using unshift here to bind this activated handler first, which will cause it to get triggered before the default behavior later.

When the tab is activated, we check to see whether the tab’s href property is listed inside the xhr option. If it is, we replace the HTML of the panel with a throbber, and load the URL into it.

Step 4: Adding support for history

Finally, let’s add support for modifying the hash tag as we click on tabs, and loading the right tab on startup.

  delete wycats.initialize;
  wycats.hash = [function(options) {
    var tabs = this;
 
    this.bind("initialize", function() {
      var tab = $(this).find("li a");
      if(window.location.hash) tab = tab.filter("a[href='" + window.location.hash + "']");
      $(this).trigger("activated", tab[0]);
    });
 
    this.bind("activated", function(e, selected) {
      window.location.hash = $(selected).attr("href");
    });
  }];

We add a new category of functionality called “hash”. First, we delete the default initialize function, because it forces the first tab to be activated no matter what. We want to activate the tab that is present in the hash value (window.location.hash).

On initialization, we first get all of the tabs. Next, we check to see whether window.location.hash is populated. If it is, we filter the list of tabs to include only the one that matches the hash. Next, we activate the first remaining tab.

When a tab is activated, we update the hash to match the tab’s href.

Finally, because there is no generic “hashchange” event in the browser, we need to emulate one (in case the user presses the back button after clicking a tab):

    var lastHash = window.location.hash;
    setInterval(function() {
      if(lastHash != window.location.hash) {
        var tab = $(tabs).find("li a[href='" + window.location.hash + "']");
        if(!tab.is(".active")) $(tabs).trigger("activated", tab[0]);
        lastHash = window.location.hash;
      }
    }, 500);

Every half-second, we check to see if the hash has changed. If it has, and the tab is not yet active (which would be the case if the user explicitly triggered the event), we trigger the activated event.

Wrapup

The basic idea here is that we have created an extensible tab system. The core of the tab system is just the necessary events, and we added a set of default event handlers for the events. Adding functionality, for the most part, simply required binding additional functionality to those events, and occasionally deleting a handler.

There are a few unconventional techniques here ($$, setupPlugin), but they make it easier to implement the basic idea espoused in this article. It’s only one possible implementation, and I’d love to hear about tweaks to this approach or entirely different event-driven designs.

Better Module Organization

As I said in my last post, Carl and I have been working on a more modular version of ActionController. As we fleshed out the feature set, we had a few needs that aren’t directly addressed by Ruby’s built-in feature-set.

  • Modules occasionally depended on other modules (there’s no point in having Layouts without Renderer), but including Renderer into Layout meant that we couldn’t have setup on Renderer that got applied to the controller class itself. In this case, Renderer adds a “_view_paths” inheritable accessor to new Controller classes that is used to store a list of paths containing templates. If we included Renderer into Layouts, and then Layouts into ActionController::Base, that setup would happen on Layouts, which is wrong.
  • We used the def self.included(klass) klass.class_eval { ... } end idiom a whole lot. In fact, that’s the only thing we used the included hook for, except…
  • Extending ClassMethods onto the class.

When I was at Locos X Rails, I spent some time with Evan, and he argued that using the included hook should be done only after trying other abstractions. After speaking for a few minutes, Evan suggested abstracting away the above ideas in a higher-level abstraction that wrapped include. We could then more directly control the inclusion process, and even add our own hooks where needed.

I ended up with:

module AbstractController
  module Callbacks
    setup do
      include ActiveSupport::NewCallbacks
      define_callbacks :process_action
    end
    ...
  end
end

replacing:

module AbstractController
  module Callbacks
    def self.included(klass)
      klass.class_eval do
        include ActiveSupport::NewCallbacks
        define_callbacks :process_action
        extend ClassMethods
      end
    end
    ...
  end
end

For dependencies, I replaced:

module AbstractController
  module Helpers
 
    def self.included(klass)
      klass.class_eval do
        extend ClassMethods
        unless self < ::AbstractController::Renderer
          raise "You need to include AbstractController::Renderer before including " \
                "AbstractController::Helpers"
        end
        extlib_inheritable_accessor :master_helper_module
        self.master_helper_module = Module.new
      end
    end
    ...
  end
end

with

module AbstractController
  module Helpers
    depends_on Renderer
 
    setup do
      extlib_inheritable_accessor :master_helper_module
      self.master_helper_module = Module.new
    end
    ...
  end
end

And finally, the Base controller itself could now be replaced with:

module ActionController
  class Base2 < AbstractBase
    use AbstractController::Callbacks
    use AbstractController::Helpers
    use AbstractController::Logger
 
    use ActionController::HideActions
    use ActionController::UrlFor
    use ActionController::Renderer # just for clarity -- not required
    use ActionController::Layouts
  end
end

from:

module ActionController
  class Base2 < AbstractBase
    include AbstractController::Callbacks
    include AbstractController::Renderer
    include AbstractController::Helpers
    include AbstractController::Layouts
    include AbstractController::Logger
 
    include ActionController::HideActions
    include ActionController::UrlFor
    include ActionController::Layouts
    include ActionController::Renderer
  end
end

It’s not a tremendous change, but it definitely reduces the likelihood of accidental mistakes, and makes the actual usage a lot clearer. Of course, we will need to document this new mechanism, but it has already simplified the necessary mental model of the setup.
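To make the mechanism concrete, here is a minimal sketch of how a `setup`/`depends_on`/`use` DSL along these lines could be implemented. This is illustrative only, not the actual Rails code; the names `ModuleSetup`, `apply_to`, and `PluggableBase` are my own inventions for the example.

```ruby
# Illustrative sketch only -- not the real Rails implementation.
# Extend ModuleSetup into any module that wants the setup/depends_on DSL.
module ModuleSetup
  def setup(&block)
    @_setup_block = block
  end

  def depends_on(*mods)
    (@_dependencies ||= []).concat(mods)
  end

  # Called by the target class's `use`: pull in dependencies first, then
  # include the module, extend ClassMethods if present, and run the setup
  # block in the context of the target class (not an intermediate module).
  def apply_to(klass)
    Array(@_dependencies).each { |dep| klass.use(dep) }
    klass.send(:include, self)
    klass.extend(const_get(:ClassMethods)) if const_defined?(:ClassMethods, false)
    klass.class_eval(&@_setup_block) if @_setup_block
  end
end

class PluggableBase
  def self.use(mod)
    return if include?(mod) # each module is applied at most once
    mod.apply_to(self)
  end
end

module Renderer
  extend ModuleSetup
  setup { attr_accessor :view_paths } # runs on the controller class itself
end

module Helpers
  extend ModuleSetup
  depends_on Renderer

  module ClassMethods
    def master_helper_module
      @master_helper_module ||= Module.new
    end
  end
end

class MyController < PluggableBase
  use Helpers # pulls in Renderer automatically
end
```

Because the setup block is class_eval’d on the final class, Renderer’s `attr_accessor` lands on `MyController` rather than on an intermediate module, which is exactly the inheritable-accessor problem described above.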

As always, thanks for reading!

Another Dispatch: AbstractController

On Monday, Carl started pairing with me on Rails 3. This week, we worked on further fleshing out the AbstractController. A few principles we’re following:

  • The AbstractController should be a low-level API. Nobody should be using the AbstractController directly, and subclasses of AbstractController (like ActionController::Base) are expected to provide their own #render method, since rendering means different things depending on the context.
  • However, AbstractController should provide enough facilities to avoid reinventing the wheel. For instance, the subclasses might be responsible for figuring out which layout to use, but should not need to know how to render a template with a layout.
  • It is a common desire to be able to add new options to render. This should be possible in an isolated way without interfering with other parts of the controller. We achieved this by making it possible to do:
module ActionController
  module Layouts
    def render_to_string(options)
      options[:_layout] = options[:layout] || _layout
      super
    end
 
    def _layout
      # implementation here
    end
  end
end
 
class ActionController::Base < ActionController::HTTP
  include ActionController::Layouts
end

In this case, options[:_layout] is used by AbstractController’s Layouts module. If you are implementing a subclass of AbstractController and want to make it possible to supply or use a layout, simply make a new module that sets the :_layout key in the options hash and calls super. It may well turn out that all subclasses have the same logic for implicit layouts. If that’s the case, we can move _layout up into AbstractController and let subclasses invoke it.
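Stripped of everything Rails-specific, the shape of that pattern is easy to demonstrate. This is a stand-in sketch (the class names and method bodies here are invented for illustration, not the real AbstractController code):

```ruby
# Stand-in sketch of the options-merging render chain; not real Rails code.
class AbstractBase
  # The lowest layer only ever looks at internal keys like :_layout.
  def render_to_string(options)
    layout = options[:_layout] || "none"
    "rendered #{options[:_prefix]}/template with layout #{layout}"
  end
end

module Layouts
  # Translate the public :layout option (or an implicit default) into
  # the internal :_layout key, then pass the hash up the chain.
  def render_to_string(options)
    options[:_layout] = options[:layout] || _layout
    super
  end

  def _layout
    "application" # implicit layout; hypothetical default
  end
end

class MyController < AbstractBase
  include Layouts
end

MyController.new.render_to_string(_prefix: "posts")
# => "rendered posts/template with layout application"
```

Each module in the chain only touches the options hash and calls super, so modules like Layouts can be stacked without knowing about each other.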

The specifics aren’t particularly important (yet), because we’re still working out what exactly needs to be modular, and what is more appropriately shared. But the bottom line is that we’re working toward some simple APIs that people can use to build their own customized controllers as well as extend built-in controllers without fear of breaking something else. Effectively, the API is “take in options hash, modify it if appropriate (possibly adding in keys to be used by AbstractController), super”.

Of course, we will need to document what the AbstractController keys are (so far we have :_prefix to specify the path to the template name to use, and :_layout to specify the layout to use). The specifics of all of this may change, and a crucial part of this is to cleanly document the architecture and API (my biggest frustration with the similarly architected current ActionController::Base is just how difficult it is to discover what’s going on), but I think we’re on the right track.

Finally, we’ve begun adding a large number of happy-path tests using Rack::Test, which have been very effective in helping to design and develop the more modular code. I’ll have more on this in the next couple of days, but the tests we’ve been writing have been very cool.

alias_method_chain in models

As people know, there’s been a fair bit of back-and-forth between me, the apparent foe of alias_method_chain, and folks who feel that alias_method_chain is a perfectly reasonable API that people should not blindly hate.

There are basically two use-cases for alias_method_chain:

  1. Organizing internal code
  2. Modifying existing code

Using alias_method_chain to organize internal code is an interesting discussion that I will hopefully continue to have into the future. Today, I want to address uses of alias_method_chain to override methods in ActionController::Base or ActiveRecord::Base. People who do this are blindly using the technique that has been most evangelized as the solution to all their problems when Ruby comes with a perfectly good solution.

Consider this post from vaporbase, which I am decidedly not picking on. It represents a common idiom that people have been trying to use in Rails. First, the usage in a model:

class Foo < ActiveRecord::Base
  include FooBar
end

Second, the code implementation:

module FooBar
  module ClassMethods
    def find_with_bar( *args ) 
      find_without_bar( *args )
      #...or whatever 
    end
  end
 
  def self.included(base)
    base.class_eval do
      extend ClassMethods
      class << self
        alias_method_chain :find, :bar
      end
    end
  end  
end

This is exactly equivalent to:

module FooBar
  def self.included(base)
    base.extend(ClassMethods)
  end
 
  module ClassMethods
    def find(*args)
      super
      #...or whatever
    end
  end
end

That’s right… if you’re looking to modify subclasses of ActiveRecord::Base or ActionController::Base, keep in mind that you’re (gasp) in an OO language with inheritance and super.

If you want to modify all of the models in your application, create your own custom ActiveRecord::Base subclass, and inherit from that throughout your application. That’s what inheritance is there for!
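To see that the super-based version really works, here is a self-contained demonstration using a stand-in base class in place of ActiveRecord::Base (the names Base, FooBar, and the string tags are invented for the example):

```ruby
# Stand-in for ActiveRecord::Base, just to make the example runnable.
class Base
  def self.find(*args)
    "found #{args.inspect}"
  end
end

module FooBar
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def find(*args)
      # Do the extra work, then defer to the original implementation.
      "with_bar(" + super + ")"
    end
  end
end

class Foo < Base
  include FooBar
end

Foo.find(1)
# => "with_bar(found [1])"
```

Because `extend` inserts ClassMethods into Foo’s singleton class ahead of Base’s, `super` inside ClassMethods#find dispatches straight to Base.find; no aliasing is needed.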

Another example

Another example, by a very, very smart person, goes even further overboard.

First, he started with:

class Post < ActiveRecord::Base
  class << self
    def find_with_tags(*args)
      options = extract_options_from_args!(args)
      if tags = options.delete(:tags)
        options[:select] ||= 'posts.*'
        options[:joins]  ||= ''
        options[:joins]  << <<-END
          INNER JOIN posts_tags AS inner_posts_tags ON posts.id = inner_posts_tags.post_id
          INNER JOIN tags AS inner_tags ON inner_tags.id = inner_posts_tags.tag_id
        END
        add_to_conditions(options, tags.map { 'inner_tags.name = ?' }.join(' OR '), *tags)
      end
      find_without_tags(*(args + [options]))
    end
    alias_method_chain :find, :tags
 
    def find_with_query(*args)
      options = extract_options_from_args!(args)
      if query = options.delete(:query)
        if query.empty?
          add_to_conditions(options, 'false')
        else
          term = "%#{query}%" 
          add_to_conditions(options, "posts.content LIKE ? OR posts.title LIKE ?", term, term)
        end
      end
      find_without_query(*(args + [options]))
    end
    alias_method_chain :find, :query
 
    protected
 
    def add_to_conditions(options, condition, *args)
      condition = args.empty? ? condition : [condition, *args]
      if options[:conditions].nil?
        options[:conditions] = condition
      else
        options[:conditions] = sanitize_sql(options[:conditions]) + " AND (#{sanitize_sql(condition)})" 
      end
    end
  end
end

Noticing it wasn’t very DRY, he resorted to metaprogramming:

class Post < ActiveRecord::Base
  class << self
    def handle_find_option(name, &block)
      eigenclass = class << self; self; end
      eigenclass.send :define_method, "find_with_#{name}_handled" do |*args|
        options = extract_options_from_args!(args)
        if option = options.delete(name)
          block[options, option]
        end
        send("find_without_#{name}_handled", *(args + [options]))
      end
      eigenclass.send :alias_method_chain, :find, "#{name}_handled" 
    end
  end
end
 
class Post < ActiveRecord::Base
  handle_find_option(:tags) do |options, tags|
    options[:select] ||= 'posts.*'
    options[:joins]  ||= ''
    options[:joins]  << <<-END
      INNER JOIN posts_tags AS inner_posts_tags ON posts.id = inner_posts_tags.post_id
      INNER JOIN tags AS inner_tags ON inner_tags.id = inner_posts_tags.tag_id
    END
    add_to_conditions(options, tags.map { 'inner_tags.name = ?' }.join(' OR '), *tags)
  end
 
  handle_find_option(:query) do |options, query|
    if query.empty?
      add_to_conditions(options, 'false')
    else
      term = "%#{query}%" 
      add_to_conditions(options, "posts.content LIKE ? OR posts.title LIKE ?", term, term)
    end
  end
end

An alternative, using super:

class Post < ActiveRecord::Base
  class << self
    def find(*args)
      args << {} unless args.last.is_a?(Hash) # ensure an options hash is actually passed through to super
      options = args.last
      add_tag_conditions(options)
      add_query_conditions(options)
      super
    end
 
    private
    def add_tag_conditions(options)
      if tags = options.delete(:tags)
        options[:select] ||= 'posts.*'
        options[:joins]  ||= ''
        options[:joins]  << <<-END
          INNER JOIN posts_tags AS inner_posts_tags ON posts.id = inner_posts_tags.post_id
          INNER JOIN tags AS inner_tags ON inner_tags.id = inner_posts_tags.tag_id
        END
        add_to_conditions(options, tags.map { 'inner_tags.name = ?' }.join(' OR '), *tags)
      end
    end
 
    def add_query_conditions(options)
      if query = options.delete(:query)
        if query.empty?
          add_to_conditions(options, 'false')
        else
          term = "%#{query}%" 
          add_to_conditions(options, "posts.content LIKE ? OR posts.title LIKE ?", term, term)
        end
      end      
    end
 
    def add_to_conditions(options, condition, *args)
      condition = args.empty? ? condition : [condition, *args]
      if options[:conditions].nil?
        options[:conditions] = condition
      else
        options[:conditions] = sanitize_sql(options[:conditions]) + " AND (#{sanitize_sql(condition)})" 
      end
    end
  end
end

We just override find, have it modify the options as appropriate, and call super. This same technique works fine for your own applications’ ActionController modifications, and in any case where the framework API involves subclassing. Folks: this is what subclassing is FOR!
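To see the shape of the super-based pattern without any ActiveRecord machinery, here is a minimal, self-contained sketch. `Base` is a stand-in for the framework class, and the names are illustrative, not real ActiveRecord API:

```ruby
# A stand-in "framework" base class; its find just returns the options
# it received so we can observe what the subclass added.
class Base
  def self.find(options = {})
    options
  end
end

class Post < Base
  def self.find(options = {})
    # Translate the custom :tags option into a condition, then let
    # super handle the (now vanilla) options hash.
    if tags = options.delete(:tags)
      options[:conditions] = tags.map { 'tags.name = ?' }.join(' OR ')
    end
    super
  end
end
```

Calling `Post.find(:tags => %w(ruby rack))` returns an options hash whose `:conditions` is `"tags.name = ? OR tags.name = ?"`, with `:tags` consumed along the way.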

Rack as a Transformative Figure

I had occasion to think about Rack a few times today. First of all, I was writing some proposals to a few conferences about Rails, which always gets me thinking about architecture. In this case, it got me thinking about how Rack is influencing Rails’ architecture moving forward. Second, a friend told me he was considering moving from an archaic web language to Python (specifically Django). I got to thinking about how the future of Ruby is intertwined with Rack, and why that might not be the case with Python and WSGI, unfortunately.

Like Ruby, Python has a single dominant web framework in Django. According to Google Trends, Django gets about eight times as many programming-related searches as the next-most-popular framework, TurboGears, which in turn gets 1.5 to 2x as many as other frameworks like Pylons and CherryPy. Naturally, Rails dominates the Ruby space even more thoroughly (in the last 12 months, Rails beat the next-most-popular framework, Merb, by more than 60 times), but the effect is roughly the same: many more people contribute to the Django and Rails ecosystems than to the TurboGears and Merb ecosystems.

Fortunately for Ruby, however, Rails (our dominant framework) has thoroughly embraced Rack, an abstraction layer and specification for the code between a web framework and a web server. This embrace of Rack has brought much-needed oxygen to the specification, and has allowed Merb, Rails, Sinatra, and others to cooperate in a shared space. In the next few months, Merb and Rails will be making their routers a shared Rack component, and the same is true for a number of smaller elements, like parameter parsing. In part, this is possible because unlike WSGI, Rack is a specification *and* a library. In the WSGI universe, the WebOb library provides similar functionality, but is not as widely shared.

In particular, while Django itself can run on WSGI servers, Django does not use WebOb, nor is its middleware WSGI-compliant. Let’s look at a specific example, a piece of middleware that wraps all requests in Django in a transaction:

from django.db import transaction
 
class TransactionMiddleware(object):
    """
    Transaction middleware. If this is enabled, each view function will be run
    with commit_on_response activated - that way a save() doesn't do a direct
    commit, the commit is done when a successful response is created. If an
    exception happens, the database is rolled back.
    """
    def process_request(self, request):
        """Enters transaction management"""
        transaction.enter_transaction_management()
        transaction.managed(True)
 
    def process_exception(self, request, exception):
        """Rolls back the database and leaves transaction management"""
        if transaction.is_dirty():
            transaction.rollback()
        transaction.leave_transaction_management()
 
    def process_response(self, request, response):
        """Commits and leaves transaction management."""
        if transaction.is_managed():
            if transaction.is_dirty():
                transaction.commit()
            transaction.leave_transaction_management()
        return response

As you can see, Django allows multiple methods in its middleware (process_request, process_response, process_exception, and process_view). WSGI middleware, like Rack middleware, instead uses a single callable that wraps the entire request. Here’s how the above Django middleware would look in Rack:

class TransactionMiddleware
  def initialize(app)
    @app = app
  end
 
  def call(env)
    transaction.enter_transaction_management
    transaction.managed(true)
    begin
      response = @app.call(env)
    rescue
      # On an exception, roll back and re-raise
      transaction.rollback if transaction.dirty?
      transaction.leave_transaction_management
      raise
    end
    if transaction.managed?
      transaction.commit if transaction.dirty?
      transaction.leave_transaction_management
    end
    response
  end
end

Leaving aside the differences in language, the Django middleware is incompatible with WSGI for no discernible reason. This means that Django middleware cannot be shared with the rest of the (admittedly smaller) Python web ecosystem, which makes it harder, for instance, to use Django’s ORM separately from Django (because you’d have to write your own transaction-wrapping middleware).

The big win of WSGI and Rack is that like Unix pipes, they make it easy to mix and match web framework elements to develop something that works for you. It makes it easier to develop entire frameworks, like CloudKit, that can be used as simple plugins to larger frameworks like Rails. Rails’ embrace of Rack, in the long term, is going to significantly open up experimentation in the community to non-Rails solutions that can be used WITH RAILS while they are being ironed out. In fact, being able to trivially run a Sinatra app inside a Rails app is an important design goal for Rails moving forward.
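The pipe analogy can be made concrete in plain Ruby: a Rack application is just anything that responds to call(env) and returns a status/headers/body triple, so middleware composes by wrapping one callable in another. This sketch uses no actual Rack code, and the names are illustrative:

```ruby
# The innermost "app": any object responding to #call(env) and
# returning [status, headers, body].
app = lambda { |env| [200, { 'Content-Type' => 'text/plain' }, ['hello']] }

# A middleware that transforms the body on the way back out.
class Upcase
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers, body.map { |chunk| chunk.upcase }]
  end
end

# Wrapping works like a pipe: env flows in, the response flows back out.
status, headers, body = Upcase.new(app).call({})
```

Because every layer speaks the same protocol, you can stack as many of these as you like, in any order, from any framework.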

When you look inside the Rails middleware folder, you see things like this in the Rails Failsafe middleware:

def call(env)
  @app.call(env)
rescue Exception => exception
  # Reraise exception in test environment
  if env["rack.test"]
    raise exception
  else
    failsafe_response(exception)
  end
end

Instead of assuming something specific about Rails’ test mode, this middleware lets the community come together around specifications that all Ruby frameworks can use together (in this case, a generic “test mode”). And this middleware can be used in Sinatra, Merb, or CloudKit by simply requiring “action_dispatch/middleware/failsafe” (as of Rails 3) and then adding ActionDispatch::Failsafe as a middleware. No other Rails assumptions are made.
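A minimal sketch of the pattern (not the actual Rails source): the middleware needs nothing but the app it wraps and the env, so any framework that speaks Rack can reuse it.

```ruby
class Failsafe
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  rescue Exception => exception
    # In a shared "test mode", re-raise so the test framework sees the
    # error; otherwise serve a last-resort 500 page.
    raise exception if env['rack.test']
    [500, { 'Content-Type' => 'text/html' }, ['<h1>500 Internal Server Error</h1>']]
  end
end
```

Nothing here knows about Rails: the behavior toggle lives in the env, where any Rack-speaking test harness can set it.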

The bottom line is that this is going to be very powerful moving forward, and it’s too bad that Django’s insistence on a go-it-alone approach is damaging the Python community’s attempts to do the same.

On a side note, I want to be clear that I am aware that Django can run inside of a WSGI server, and can use WSGI middleware. However, Django middleware cannot be used by WSGI, and a recent post on the topic, which advocated the use of WSGI middleware in Django apps, had to be sure to include “I’m not trying to stir up any controversy, I’m not saying we should stop making Django middleware or anything like that”. I, for one, think that’s a shame.

S3 Moneta Store (also, a Rails update)

As I had hoped, moneta has been encouraging various members of the community to play with using a unified key/value abstraction where they would typically have used just a Hash. Both Cloudkit and DataMapper have been playing with using moneta for various parts of their infrastructure, and Anthony Eden just contributed an S3 adapter.

How to use moneta in your own project

It’s actually quite simple. Any place where you’re currently using or considering using a Hash for its key/value properties, simply continue to use the Hash, but provide a configuration option for swapping in a different object instead.

Here’s a potential example from DataMapper. The original code:

def initialize(second_level_cache = nil)
  @cache = {}
  @second_level_cache = second_level_cache
end

And the code with moneta:

module DataMapper
  cattr_accessor :identity_map_klass
  self.identity_map_klass = Hash
end

def initialize(second_level_cache = nil)
  @cache = DataMapper.identity_map_klass.new
  @second_level_cache = second_level_cache
end

As demonstrated above, you can use a bare Hash as a moneta store, with one caveat. A bare Hash does not support Moneta’s expiration features. If you want to be able to assume those features, change the above to:

module DataMapper
  cattr_accessor :identity_map_klass
  self.identity_map_klass = Moneta::Memory
end

def initialize(second_level_cache = nil)
  @cache = DataMapper.identity_map_klass.new
  @second_level_cache = second_level_cache
end

Moneta::Memory inherits from Hash, but adds expiration features. Of course, if you go with Moneta::Memory, you will need to include Moneta as a dependency (if you used a bare hash, the user could choose whether or not to use Moneta, but expiration would not be available).

Rails

I’ve been traveling an insane amount recently, but I’ve still been working on Rails. I’ll have a more thorough post on this soon, but I’ve started work on a more modular base class for ActionController::Base and ActionMailer::Base that should also serve as a base class for a Rails port of Merb’s parts. You can follow along at the abstract_controller branch of my github fork. I pulled in my callbacks branch as well, which is so far working wonderfully.

What’s the Point?

So after my announcement about moneta yesterday, a very common question was “what’s the point of this?”

There are basically two targets:

  • If you need a cache store that might eventually have to scale, you can use Moneta::Memory to start, and then scale up to Moneta::Memcache or Moneta::TokyoCabinet, etc. It’s certainly possible to do that right now, but you’d have to retool some of your code. With Moneta, key/value stores are literally just drop-in replacements for each other.
  • More interestingly, libraries that want to provide sugar around key/value stores (e.g. Rails and Merb’s caching support) can use Moneta as their backend in much the same way as they use Rack to connect to web servers or DataObjects to connect to backend databases. By providing a simple API, libraries can say “we support moneta stores”, and then any stores created by the community will just work. One interesting idea that I heard from dkubb last night was moving DataMapper’s IdentityMap to be compatible with Moneta (pretty darn easy, since Moneta’s API is a subset of the Hash API), and supporting plugging in other moneta stores. This would allow processes or even multiple servers to share an IdentityMap, providing a simple write-through cache. Combined with something like Memcache, this could be a powerful combination.
  • The bottom line is that by making key/value stores an abstracted idea, the community can build a slew of stores for Moneta, which will then work with whatever front-ends use them. The simplicity of the idea belies the potential power that comes from creating easy-to-understand APIs and letting the community at large take care of the actual implementation.
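As a sketch of the library angle: a caching helper only needs the Hash subset of the API, so a plain Hash or any Moneta-style store drops in unchanged. FragmentCache is a hypothetical name, not a real library:

```ruby
class FragmentCache
  def initialize(store = {})
    @store = store  # a plain Hash, or any Moneta-style store
  end

  # Return the cached value for key, computing and storing it on a miss.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end
end
```

The helper never names a backend; swapping the default Hash for a memcached-backed store is a one-line configuration change for the user.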

Initial Release of Moneta: Unified Key/Value Store API

I’m happy to announce the (very) initial release of a new library called moneta, which aims to provide a unified interface for key/value stores. The interface mirrors the key/value API of Hash, but not the iteration API, since iteration is not possible across all key/value stores. It is likely that a future release will add support for enumeration by keeping a list of keys in a key (very meta :P )

In order to prove out the API, I created five highly experimental stores for the initial release:

  • A basic file store: It uses the file name as the key, the file contents as the value, and xattrs to hold expiration information. The file store requires the xattr gem
  • An xattrs-only store: This uses a single file’s xattrs for keys and values, and a second file’s xattrs to hold an identical hash with expiration information
  • An in-memory store: This was the first store I wrote, purely to prove out the API. It uses a Hash internally, and a second hash for expiration information
  • A DataMapper store: Uses any DataMapper-supported storage with three columns: the_key, value, and expiration
  • A memcached store: Uses memcached and its native expiration

Note that the initial release also does not do any kind of locking: a Moneta store should be treated like a standard Hash and locked appropriately by the consumer. Since the stores themselves perform no locking, they should probably not be shared between processes at this time (i.e. they are only experimental). It should be pretty straightforward to add locking to most of the stores.
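In-process locking could be as simple as wrapping a store’s operations in a Mutex. This is a sketch, and it only guards threads within one process, not multiple processes:

```ruby
require 'thread'

# Wraps any Hash-like store so that each operation holds a Mutex.
class LockingStore
  def initialize(store)
    @store = store
    @lock  = Mutex.new
  end

  def [](key)
    @lock.synchronize { @store[key] }
  end

  def []=(key, value)
    @lock.synchronize { @store[key] = value }
  end

  def delete(key)
    @lock.synchronize { @store.delete(key) }
  end

  def key?(key)
    @lock.synchronize { @store.key?(key) }
  end
end
```

Because the wrapper exposes the same Hash subset, consumers don’t need to know whether they were handed a locked store or a bare one.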

Likely the only locking issue in the memcached store is in #delete, which does:

def delete(key)
  value = self[key]
  @cache.delete(key) if value
  value
end

As a result, it is currently possible for delete to return a value that was modified between the read and the actual deletion. For most use-cases this is rather unlikely to matter, but keep in mind that pretty much all of the stores are unoptimized and not concurrency-safe across processes. Treat the stores, at the moment, as proofs of concept for the overall API.

Some details

The Moneta API is purposely extremely similar to the Hash API. In order to support an identical API across stores, it does not support iteration or partial matches, but that might come in a future release.

The API:

#initialize(options)
options differs per-store, and is used to set up the store

#[](key)
retrieve a key. if the key is not available, return nil

#[]=(key, value)
set a value for a key. if the key is already used, clobber it.
keys set using []= will never expire

#delete(key)
delete the key from the store and return the current value

#key?(key)
true if the key exists, false if it does not

#has_key?(key)
alias for key?

#store(key, value, options)
same as []=, but you can supply an :expires_in option, 
which will specify a number of seconds before the key
should expire. In order to support the same features
across all stores, only full seconds are supported

#update_key(key, options)
updates an existing key with a new :expires_in option.
if the key has already expired, it will not be updated.

#clear
clear all keys in this store
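To make the listing above concrete, here is a minimal in-memory implementation of that API. It is a sketch, not the actual Moneta::Memory code:

```ruby
class MemoryStore
  def initialize(options = {})
    @data    = {}
    @expires = {}  # key => Time after which the key is gone
  end

  # Retrieve a key; returns nil if missing or expired.
  def [](key)
    key?(key) ? @data[key] : nil
  end

  # Keys set with []= never expire.
  def []=(key, value)
    @expires.delete(key)
    @data[key] = value
  end

  # Same as []=, but honors an :expires_in option (whole seconds).
  def store(key, value, options = {})
    self[key] = value
    @expires[key] = Time.now + options[:expires_in] if options[:expires_in]
    value
  end

  # Give an existing, unexpired key a new :expires_in.
  def update_key(key, options = {})
    return unless key?(key)
    @expires[key] = Time.now + options[:expires_in] if options[:expires_in]
  end

  # Lazily expires keys on access.
  def key?(key)
    if @expires[key] && Time.now > @expires[key]
      @data.delete(key)
      @expires.delete(key)
    end
    @data.key?(key)
  end
  alias has_key? key?

  # Delete the key and return its current value.
  def delete(key)
    value = self[key]
    @data.delete(key)
    @expires.delete(key)
    value
  end

  def clear
    @data.clear
    @expires.clear
  end
end
```

Expiration here is checked lazily on access, which is one reasonable strategy for a single-process store; a real backend like memcached handles it natively.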
