Automatic Flushing: The Rails 3.1 Plan

preamble: this post explains, in some detail, how we will implement a nice performance boost for Rails developers. Understanding the details might help gain the full benefits of the optimization, but you will gain some benefits even if you have no idea how it works.

As you've probably seen, DHH announced that we'd be looking at flushing in Rails 3.1 to improve the client-side performance of typical Rails applications.

The most obvious solution, and one that already exists in plugin form, is to allow a layout to have a new flush method, which would immediately flush the contents of the layout to the browser. By putting the flush method below the JavaScript and CSS includes, the browser could begin downloading and evaluating those static assets while the server continues building the page.

Unfortunately, this solution has a major problem: it requires a fairly significant change in the current model of how people build applications. In general, for performance optimizations (including client-side optimizations), we like to make the default as fast as possible, without asking people to understand a brand new paradigm, centered around the optimization.

The problem lies in the fact that a Rails layout is essentially a template with a bunch of holes to fill in.

<html>
  <head>
    <title><%= yield :title %></title>
    <%= javascript_include_tag :defaults %>
    <%= yield :extra_javascripts %>
    <%= stylesheet_link_tag :defaults %>
    <%= yield :extra_stylesheets %>
  </head>
  <body>
    <%= yield :sidebar %>
    <%= yield %>
  </body>
</html>

I this simple example, each yield is a slot that is filled in by the template (usually via content_for). In order to achieve this, Rails evaluates the template first, which populates a Hash with each piece of content. Next, it renders the layout, and each yield checks the Hash for that content. In short, because of the way layouts work, Rails renders the template first, and then the layout.

To get around this, one option would be to say that everything before the flush must not use yield, and must be able to run before the template. Unfortunately, it's somewhat common for people to set up a content_for(:javascripts) in a template, to keep the JavaScript needed for a particular snippet of HTML close to the HTML. This means that not only does the user have to be careful about what can go above and below the flush, he can no longer use content_for for things high up in the template, which is a fairly significant change to the overall design of Rails applications.

For Rails 3.1, we wanted a mostly-compatible solution with the same programmer benefits as the existing model, but with all the benefits of automatic flushing. After a number of very long discussions on the topic, José Valim came up with the idea of using Ruby 1.9 fibers to jump back and forth between the template and layout.

Let's start by taking a look at a very simplified version of the current Rails rendering pipeline. First, we set up a Buffer object purely for logging purposes, so we can see what's happening as we push things onto the buffer.

module Basic
  class Buffer < String
    def initialize(name, context)
      @name    = name
    end

    def <<(value)
      super

      puts "#{@name} is pushing #{value.inspect}"
    end
  end
end

Next, we create a simple version of ActionView::Base. We implement the content_for method simply, to print out a bit of logging information and stash the value into the @content_for Hash. Note that the real version is pretty similar, with some added logic for capturing the value of the block from ERB.

module Basic
  class ViewContext
    def initialize
      @buffer      = Buffer.new(:main, self)
      @content_for = {}
    end

    def content_for(name, value = nil)
      value = yield if block_given?
      puts "Setting #{name} to #{value.inspect}"
      @content_for[name] = value
    end

    def read_content(name)
      @content_for[name]
    end
  end
end

Next, we create a number of methods on the ViewContext that look like compiled ERB templates. In real life, the ERB (or Haml) compiler would define these methods.

module Basic
  class ViewContext
    def layout
      @buffer << "<html><head>"
      @buffer << yield(:javascripts).to_s
      @buffer << yield(:stylesheets).to_s
      @buffer << "</head><body>"
      @buffer << yield.to_s
      @buffer << yield(:not_existant).to_s
      @buffer << "</body></html>"
      @buffer
    end

    def template
      buffer =  Buffer.new(:template, self)
      content_for(:javascripts) do
        "<script src='application.js'></script>"
      end
      content_for(:stylesheets) do
        "<link href='application.css' rel='stylesheet' />"
      end
      puts "Making a SQL call"
      sleep 1 # Emulate a slow SQL call
      buffer << "Hello world!"
      content_for(:body, buffer)
    end
  end
end

Finally, we define the basic rendering logic:

module Basic
  class ViewContext
    def render
      template
      layout { |value| read_content(value || :body) }
    end
  end
end

As you can see, we first render the template, which will fill up the @content_for Hash, and then call the layout method, with a block which pulls the value from that Hash. This is how yield :javascripts in a layout works.

Unfortunately, this means that the entire template must be rendered first, including the (fake) slow SQL query. We'd prefer to flush the buffer after the JavaScripts and CSS are determined, but before the SQL query is made. Unfortunately, that requires running half of the template method, then continuing with the layout method, retaining the ability to resume the template method later.

You can think of the way that templates are currently rendered (in Rails 2.x and 3.0) like this:

Unfortunately, this makes it very hard to get any more performance juice out without asking the end-developer to make some hard choices. The solution we came up with is to use Ruby 1.9 fibers to allow the rendering to jump back and forth between the template and layout.

Instead of starting with the template and only rendering the layout when ready, we'll start with the layout, and jump over to the template when a yield is called. Once the content_for that piece is provided by the template, we can jump back to the layout, flush, and continue rendering. As we need more pieces, we can jump back and forth between the template and layout, flushing as we fill in the holes specified by the yield statements.

The implementation is mostly straight-forward:

require "fiber"

module Fibered
  class ViewContext < Basic::ViewContext
    def initialize
      super
      @waiting_for = nil
      @fiber       = nil
    end

    def content_for(name, value = nil)
      super
      @fiber.resume if @waiting_for == name
    end

    def read_content(name)
      content = super
      return content if content

      begin
        @waiting_for = name
        Fiber.yield
      ensure
        @waiting_for = nil
      end

      super
    end

    def layout
      @fiber = Fiber.new do
        super
      end
      @fiber.resume
      @buffer
    end

    def render
      layout { |value| read_content(value || :body) }
      template
      @fiber.resume while @fiber.alive?
      @buffer
    end
  end
end

For our fibered implementation, we'll inherit from Basic::ViewContext, because we want to be able to use the same templates as we used in the original implementation. We update the content_for, read_content, layout and render methods to be fiber-aware. Let's take them one at a time.

def layout
  @fiber = Fiber.new do
    super
  end
  @fiber.resume
  @buffer
end

First, we wrap the original implementation of layout in a Fiber, and start it right away. Next, we modify the read_content method to become Fiber-aware:

def read_content(name)
  content = super
  return content if content

  begin
    @waiting_for = name
    Fiber.yield
  ensure
    @waiting_for = nil
  end

  super
end

If the @content_for Hash already has the content, return it right away. Otherwise, say that we're waiting for the key in question, and yield out of the Fiber. We modify the render method so that the layout is rendered first, followed by the template. As a result, yielding out of the layout will start the template's rendering.

def render
  layout { |value| read_content(value || :body) }
  template
  @fiber.resume while @fiber.alive?
  @buffer
end

Next, modify the content_for method so that when the content we're waiting for is provided, we jump back into the layout.

def content_for(name, value = nil)
  super
  @fiber.resume if @waiting_for == name
end

With this setup, the layout and template will ping-pong back and forth, with the layout requesting data, and the template rendering only as far as it needs to go to provide the data requested.

Finally, let's update the Buffer to take our fibered implementation into consideration.

module Basic
  class Buffer < String
    def initialize(name, context)
      @name    = name
      @fibered = context.fibered?
    end

    def <<(value)
      super

      if @fibered
        puts "Flushing #{value.inspect}" if @fibered
      else
        puts "#{@name} is pushing #{value.inspect}"
      end
    end
  end

  class ViewContext
    def fibered?
      false
    end
  end
end

module Fibered
  class ViewContext
    def fibered?
      true
    end
  end
end

Now that we're rendering the layout in order, we can flush as we go, instead of being forced to wait for the entire template to render before we can start flushing.

It's worth mentioning that optimal flushing performance will be based on the order of the content_for in your template. If you run your queries first, then put the expensive template rendering, and only finally do the content_for(:javascript) at the end, the flushing behavior will look like this:

Instead of flushing quickly, before the SQL call, things are barely better than they are in Rails 2.3, when the entire template must be rendered before the first flush. Because things are no worse, even in the worst-case scenario, we can make this the default behavior. Most people will see some benefit from it, and people interested in the best performance can order their content_for blocks so they cause the most beneficial flushing.

Even for people willing to put in the effort, this API is better than forcing a manual flush, because you can still put your content_for blocks alongside the templates that they are related to.

Look for this feature in Rails 3.1!

Small Caveat

For the purposes of this simplified example, I assumed that content_for can only be run once, immediately setting the value in the @content_for Hash. However, in some cases, people want to accumulate a String for a particular value. Obviously, we won't be able to flush until the full String for that value is accumulated.

As a result, we'll be adding a new API (likely called provide), which will behave exactly the same as content_for, but without the ability to accumulate. In the vast majority of cases, people will want to use provide (for instance, provide :sidebar), and get all the benefits of autoflushing. In a few cases, people will want to be able to continue accumulating a String, and will still be able to use content_for, but the template will not return control to the layout when that happens.

Also note that this (fairly detailed) explanation is not something that you will need to understand as a Rails developer. Instead, you will continue to go about your development as before, using provide if you have just a single piece of content to add, and content_for if you have multiple pieces of content to add, and Rails will automatically optimize flushing for you as well as we can.

Automatic Flushing: The Rails 3.1 Plan

Small Caveat

Enjoy this Post?