Yehuda Katz is a member of the Ember.js, Ruby on Rails and jQuery Core Teams; he spends his daytime hours at the startup he founded, Tilde Inc. Yehuda is co-author of the best-selling jQuery in Action and Rails 3 in Action. He spends most of his time hacking on open source (his main projects include Thor, Handlebars and Janus) or traveling the world doing evangelism work. He can be found on Twitter as @wycats and on GitHub.

Automatic Flushing: The Rails 3.1 Plan

Preamble: this post explains, in some detail, how we will implement a nice performance boost for Rails developers. Understanding the details might help you gain the full benefits of the optimization, but you will gain some benefits even if you have no idea how it works.

As you’ve probably seen, DHH announced that we’d be looking at flushing in Rails 3.1 to improve the client-side performance of typical Rails applications.

The most obvious solution, and one that already exists in plugin form, is to allow a layout to have a new flush method, which would immediately flush the contents of the layout to the browser. By putting the flush method below the JavaScript and CSS includes, the browser could begin downloading and evaluating those static assets while the server continues building the page.

Unfortunately, this solution has a major problem: it requires a fairly significant change in the current model of how people build applications. In general, for performance optimizations (including client-side optimizations), we like to make the default as fast as possible, without asking people to understand a brand new paradigm, centered around the optimization.

The problem lies in the fact that a Rails layout is essentially a template with a bunch of holes to fill in.

<html>
  <head>
    <title><%= yield :title %></title>
    <%= javascript_include_tag :defaults %>
    <%= yield :extra_javascripts %>
    <%= stylesheet_link_tag :defaults %>
    <%= yield :extra_stylesheets %>
  </head>
  <body>
    <%= yield :sidebar %>
    <%= yield %>
  </body>
</html>

In this simple example, each yield is a slot that is filled in by the template (usually via content_for). In order to achieve this, Rails evaluates the template first, which populates a Hash with each piece of content. Next, it renders the layout, and each yield checks the Hash for that content. In short, because of the way layouts work, Rails renders the template first, and then the layout.

To get around this, one option would be to say that everything before the flush must not use yield, and must be able to run before the template. Unfortunately, it’s somewhat common for people to set up a content_for(:javascripts) in a template, to keep the JavaScript needed for a particular snippet of HTML close to the HTML. This means that not only does the user have to be careful about what can go above and below the flush, he can no longer use content_for for things high up in the template, which is a fairly significant change to the overall design of Rails applications.

For Rails 3.1, we wanted a mostly-compatible solution with the same programmer benefits as the existing model, but with all the benefits of automatic flushing. After a number of very long discussions on the topic, José Valim came up with the idea of using Ruby 1.9 fibers to jump back and forth between the template and layout.

Let’s start by taking a look at a very simplified version of the current Rails rendering pipeline. First, we set up a Buffer object purely for logging purposes, so we can see what’s happening as we push things onto the buffer.

module Basic
  class Buffer < String
    def initialize(name, context)
      @name    = name
    end
 
    def <<(value)
      super
 
      puts "#{@name} is pushing #{value.inspect}"
      self # return the buffer, as String#<< does, so appends can be chained
    end
  end
end

Next, we create a simple version of ActionView::Base. We implement the content_for method simply, to print out a bit of logging information and stash the value into the @content_for Hash. Note that the real version is pretty similar, with some added logic for capturing the value of the block from ERB.

module Basic
  class ViewContext
    def initialize
      @buffer      = Buffer.new(:main, self)
      @content_for = {}
    end
 
    def content_for(name, value = nil)
      value = yield if block_given?
      puts "Setting #{name} to #{value.inspect}"
      @content_for[name] = value
    end
 
    def read_content(name)
      @content_for[name]
    end
  end
end

Next, we create a number of methods on the ViewContext that look like compiled ERB templates. In real life, the ERB (or Haml) compiler would define these methods.

module Basic
  class ViewContext
    def layout
      @buffer << "<html><head>"
      @buffer << yield(:javascripts).to_s
      @buffer << yield(:stylesheets).to_s
      @buffer << "</head><body>"
      @buffer << yield.to_s
      @buffer << yield(:not_existant).to_s
      @buffer << "</body></html>"
      @buffer
    end
 
    def template
      buffer =  Buffer.new(:template, self)
      content_for(:javascripts) do
        "<script src='application.js'></script>"
      end
      content_for(:stylesheets) do
        "<link href='application.css' rel='stylesheet' />"
      end
      puts "Making a SQL call"
      sleep 1 # Emulate a slow SQL call
      buffer << "Hello world!"
      content_for(:body, buffer)
    end
  end
end

Finally, we define the basic rendering logic:

module Basic
  class ViewContext
    def render
      template
      layout { |value| read_content(value || :body) }
    end
  end
end

As you can see, we first render the template, which will fill up the @content_for Hash, and then call the layout method, with a block which pulls the value from that Hash. This is how yield :javascripts in a layout works.

Unfortunately, this means that the entire template must be rendered first, including the (fake) slow SQL query. We’d prefer to flush the buffer after the JavaScripts and CSS are determined, but before the SQL query is made. That, however, requires running half of the template method, then continuing with the layout method, while retaining the ability to resume the template method later.

You can think of the way that templates are currently rendered (in Rails 2.x and 3.0) like this:

[Figure: flush.001.png]

Unfortunately, this makes it very hard to get any more performance juice out without asking the end-developer to make some hard choices. The solution we came up with is to use Ruby 1.9 fibers to allow the rendering to jump back and forth between the template and layout.

[Figure: flush.002.png]

Instead of starting with the template and only rendering the layout when ready, we’ll start with the layout, and jump over to the template when a yield is called. Once the content_for that piece is provided by the template, we can jump back to the layout, flush, and continue rendering. As we need more pieces, we can jump back and forth between the template and layout, flushing as we fill in the holes specified by the yield statements.
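Before looking at the real thing, here is a minimal sketch of the jumping itself, using nothing but a bare Fiber. The `layout` fiber, the `events` array and the values passed around are illustrative names only, not part of Rails; the point is just that `Fiber.yield` pauses the layout and `resume` continues it exactly where it left off.

```ruby
events = []

# The "layout" runs inside a fiber and pauses whenever it needs a piece
# of content that has not been supplied yet.
layout = Fiber.new do
  events << "layout: <head> opened"
  scripts = Fiber.yield(:javascripts)   # pause and ask for :javascripts
  events << "layout: got #{scripts}"
  body = Fiber.yield(:body)             # pause again and ask for :body
  events << "layout: got #{body}"
  :done                                 # value of the final resume
end

# The "template" side: resume the layout, see what it asks for, supply it.
wanted = layout.resume                  # runs until the first Fiber.yield
events << "template: layout wants #{wanted}"
wanted = layout.resume("app.js")        # becomes the result of Fiber.yield
events << "template: layout wants #{wanted}"
result = layout.resume("Hello world!")  # layout runs to completion

puts events
```

The events alternate between the two sides, which is the ping-pong the diagram describes. The implementation below hides the explicit `resume` calls inside content_for and read_content.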

The implementation is mostly straightforward:

require "fiber"
 
module Fibered
  class ViewContext < Basic::ViewContext
    def initialize
      super
      @waiting_for = nil
      @fiber       = nil
    end
 
    def content_for(name, value = nil)
      super
      @fiber.resume if @waiting_for == name
    end
 
    def read_content(name)
      content = super
      return content if content
 
      begin
        @waiting_for = name
        Fiber.yield
      ensure
        @waiting_for = nil
      end
 
      super
    end
 
    def layout
      @fiber = Fiber.new do
        super
      end
      @fiber.resume
      @buffer
    end
 
    def render
      layout { |value| read_content(value || :body) }
      template
      @fiber.resume while @fiber.alive?
      @buffer
    end
  end
end

For our fibered implementation, we’ll inherit from Basic::ViewContext, because we want to be able to use the same templates as we used in the original implementation. We update the content_for, read_content, layout and render methods to be fiber-aware. Let’s take them one at a time.

def layout
  @fiber = Fiber.new do
    super
  end
  @fiber.resume
  @buffer
end

First, we wrap the original implementation of layout in a Fiber, and start it right away. Next, we modify the read_content method to become Fiber-aware:

def read_content(name)
  content = super
  return content if content
 
  begin
    @waiting_for = name
    Fiber.yield
  ensure
    @waiting_for = nil
  end
 
  super
end

If the @content_for Hash already has the content, return it right away. Otherwise, say that we’re waiting for the key in question, and yield out of the Fiber. We modify the render method so that the layout is rendered first, followed by the template. As a result, yielding out of the layout will start the template’s rendering.

def render
  layout { |value| read_content(value || :body) }
  template
  @fiber.resume while @fiber.alive?
  @buffer
end

Next, modify the content_for method so that when the content we’re waiting for is provided, we jump back into the layout.

def content_for(name, value = nil)
  super
  @fiber.resume if @waiting_for == name
end

With this setup, the layout and template will ping-pong back and forth, with the layout requesting data, and the template rendering only as far as it needs to go to provide the data requested.
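To see the ping-pong in one place, here is a condensed, self-contained sketch of the same idea. It is not the code above verbatim: `SketchView`, its `flush` method and the `log` array are hypothetical names added so the order of events can be inspected, and content_for takes a plain value rather than a block.

```ruby
require "fiber" # for Fiber#alive? on older Rubies; a no-op on newer ones

class SketchView
  attr_reader :log

  def initialize
    @content_for = {}
    @waiting_for = nil
    @output      = []   # each entry stands for one chunk flushed to the client
    @log         = []
  end

  def content_for(name, value)
    @content_for[name] = value
    @fiber.resume if @waiting_for == name   # jump back into the layout
  end

  def read_content(name)
    return @content_for[name] if @content_for.key?(name)
    begin
      @waiting_for = name
      Fiber.yield                           # jump out to the template
    ensure
      @waiting_for = nil
    end
    @content_for[name]
  end

  def flush(chunk)
    @log << "flush #{chunk}"
    @output << chunk
  end

  def layout
    flush "<html><head>"
    flush yield(:javascripts).to_s
    flush "</head><body>"
    flush yield(:body).to_s
    flush "</body></html>"
  end

  def template
    content_for(:javascripts, "<script src='application.js'></script>")
    @log << "slow SQL call"
    content_for(:body, "Hello world!")
  end

  def render
    @fiber = Fiber.new { layout { |name| read_content(name) } }
    @fiber.resume                      # layout runs until its first missing slot
    template                           # template fills slots, resuming the layout
    @fiber.resume while @fiber.alive?  # finish any slots the template never filled
    @output.join
  end
end

view = SketchView.new
html = view.render
puts view.log
```

In the log, the opening tags and the script tag are flushed before the "slow SQL call" entry, and the body only afterwards, which is the behavior we are after.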

Finally, let’s update the Buffer to take our fibered implementation into consideration.

module Basic
  class Buffer < String
    def initialize(name, context)
      @name    = name
      @fibered = context.fibered?
    end
 
    def <<(value)
      super
 
      if @fibered
        puts "Flushing #{value.inspect}"
      else
        puts "#{@name} is pushing #{value.inspect}"
      end
      self # return the buffer, as String#<< does, so appends can be chained
    end
  end
 
  class ViewContext
    def fibered?
      false
    end
  end
end
 
module Fibered
  class ViewContext
    def fibered?
      true
    end
  end
end

Now that we’re rendering the layout in order, we can flush as we go, instead of being forced to wait for the entire template to render before we can start flushing.

It’s worth mentioning that optimal flushing performance will be based on the order of the content_for calls in your template. If you run your queries first, then do the expensive template rendering, and only do the content_for(:javascripts) at the very end, the flushing behavior will look like this:

[Figure: flush.003.png]

Instead of flushing quickly, before the SQL call, things are barely better than they are in Rails 2.3, when the entire template must be rendered before the first flush. Because things are no worse, even in the worst-case scenario, we can make this the default behavior. Most people will see some benefit from it, and people interested in the best performance can order their content_for blocks so they cause the most beneficial flushing.
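The effect of ordering can be sketched with a small, hypothetical harness. `render_log`, `early` and `late` are names made up for this illustration: the harness runs a two-slot layout in a fiber and returns the order in which things happened, and each template is a lambda that is handed a callable for supplying content.

```ruby
require "fiber" # for Fiber#alive? on older Rubies; a no-op on newer ones

# Hypothetical harness: renders a two-slot layout in a fiber and returns
# the order of events for the given template lambda.
def render_log(template)
  log, content, waiting = [], {}, nil

  layout = Fiber.new do
    [:javascripts, :body].each do |slot|
      unless content.key?(slot)
        waiting = slot
        Fiber.yield              # wait for the template to supply this slot
        waiting = nil
      end
      log << "flush #{slot}"
    end
  end

  supply = lambda do |name, value|
    content[name] = value
    layout.resume if waiting == name   # hand control back to the layout
  end

  layout.resume                        # layout runs until its first missing slot
  template.call(supply, log)
  layout.resume while layout.alive?    # finish slots that were never supplied
  log
end

# JavaScripts supplied first: the head can be flushed before the slow work.
early = lambda do |supply, log|
  supply.call(:javascripts, "app.js")
  log << "slow SQL call"
  supply.call(:body, "Hello world!")
end

# JavaScripts supplied last: nothing is flushed until the very end.
late = lambda do |supply, log|
  log << "slow SQL call"
  supply.call(:body, "Hello world!")
  supply.call(:javascripts, "app.js")
end

p render_log(early)
p render_log(late)
```

With `early`, the first flush happens before the slow call; with `late`, every flush comes after it, which is no worse than rendering the whole template up front.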

Even for people willing to put in the effort, this API is better than forcing a manual flush, because you can still put your content_for blocks alongside the templates that they are related to.

Look for this feature in Rails 3.1!

Small Caveat

For the purposes of this simplified example, I assumed that content_for can only be run once, immediately setting the value in the @content_for Hash. However, in some cases, people want to accumulate a String for a particular value. Obviously, we won’t be able to flush until the full String for that value is accumulated.

As a result, we’ll be adding a new API (likely called provide), which will behave exactly the same as content_for, but without the ability to accumulate. In the vast majority of cases, people will want to use provide (for instance, provide :sidebar), and get all the benefits of autoflushing. In a few cases, people will want to be able to continue accumulating a String, and will still be able to use content_for, but the template will not return control to the layout when that happens.
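A rough sketch of how the two calls might differ follows. This is an assumption-laden illustration and not the Rails implementation: `SlotView`, its `@final` bookkeeping and the `log` array are hypothetical. The only behavior it tries to capture is the one described above, namely that provide hands control back to the layout while an accumulating content_for does not.

```ruby
require "fiber" # for Fiber#alive? on older Rubies; a no-op on newer ones

class SlotView
  attr_reader :log

  def initialize
    @content_for   = {}
    @final         = {}     # names whose value is known to be complete
    @waiting_for   = nil
    @template_done = false
    @log           = []
  end

  # Accumulates: may be called several times for one name, so the layout
  # cannot know the value is complete and is NOT resumed here.
  def content_for(name, value)
    @content_for[name] = @content_for.fetch(name, "") + value
  end

  # Final: the value is complete, so control can return to the layout.
  def provide(name, value)
    @content_for[name] = value
    @final[name] = true
    @fiber.resume if @waiting_for == name
  end

  def read_content(name)
    unless @final[name] || @template_done
      @waiting_for = name
      Fiber.yield           # wait for provide, or for the template to finish
      @waiting_for = nil
    end
    @content_for[name].to_s
  end

  def layout
    @log << "flush head"
    @log << "flush sidebar: #{read_content(:sidebar)}"
    @log << "flush scripts: #{read_content(:javascripts)}"
  end

  def template
    provide(:sidebar, "links")
    content_for(:javascripts, "a.js ")
    @log << "slow SQL call"
    content_for(:javascripts, "b.js")
  end

  def render
    @fiber = Fiber.new { layout }
    @fiber.resume
    template
    @template_done = true
    @fiber.resume while @fiber.alive?
    @log
  end
end

log = SlotView.new.render
puts log
```

The sidebar, supplied with provide, is flushed before the slow call; the accumulated scripts are only flushed once the template has finished and the full String is known.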

Also note that this (fairly detailed) explanation is not something that you will need to understand as a Rails developer. Instead, you will continue to go about your development as before, using provide if you have just a single piece of content to add, and content_for if you have multiple pieces of content to add, and Rails will automatically optimize flushing for you as well as we can.

52 Responses to “Automatic Flushing: The Rails 3.1 Plan”

This is well thought out and looks like it might work. Would make a big difference for some sites I run that have page creation slow downs in certain situations.

Good work :)

Sounds good (from the small bits I understood). I may have overlooked it, but any guesstimates so far on how much of a performance improvement auto-flushing will yield?

If this new scheme requires Ruby 1.9, does that mean that Rails 3.1 will be dropping support for 1.8.7? I kind of hope so – I think it would really help push the Ruby community over to 1.9.

Rails 3.1 will support Ruby 1.8, but probably won’t turn on this optimization.

I might’ve missed something here, but what about @variables in the templates? For instance, I might set my title in the template using @title, and then access @title in the layout, since the template is run first. Will this break with the new implementation?

Interesting. Are there any potential issues for web servers or browsers with this new flushing system? Or is this pretty standard stuff for most web servers and modern browsers?

It’s pretty much universally supported

You are quite right. Passing content between template and layout will need to be done via content_for and provide with this optimization enabled. It should result in an extremely large client-side improvement, so it’ll be worth making the (optional) change.

Our priority with this feature was to allow the most possible flushing without removing functionality.

We were able to achieve that goal, but not with exactly the same API in all cases.

So, since we don’t know what the content of the accumulated mass is (I don’t remember eating corn!), as long as we use fiber, the autoflushing will work. :)

its good. its good. its GOOD!

What happens when an exception occurs after one or more flushes have already occurred?

Brilliant approach. And very clearly explained. I eagerly look forward to it.

This encourages having SQL queries initiated by the view, after the header has rendered.
This seems antithetical to MVC to me.

Perhaps there is a way to have an around_filter for your controller that can generate and flush the contents of the header, leaving the body to be filled by the normal render() call. Of course, then you’ll have to be able to discern what the layout should be before action processing, so it isn’t a great design, either.

- Aaron

out of interest, where do these discussions on the topic take place?

And with queries in rails 3 executing lazily, the server should be able to do its first flush pretty snappily. Brilliant!

One thing that’s not clear though: how will this work with Rack?
A middleware that prepended execution time to the response wouldn’t be compatible with the advantages of flushing.
Will middlewares have to be rewritten to be flush-friendly, or what?

@Aaron: In rails3 SQL queries already are initiated by the view in most cases – the query is only triggered when you try to iterate over the collection or whatever.

@Joseph: Good question. I suspect the standard will be to halt rendering at that point, although continuing on and showing the exception inline is an interesting, if ugly, possibility.

Perhaps it would be possible to send the browser something to redirect it to a clean error page?

@Aaron B
When you’re using the rails scoped finders they are lazy-evaluated. So you could set the query up in your controller but it won’t actually run it until you try to iterate over it in the view. So that fits this model perfectly.

There are a few options:

<script>document.write("error page")</script>

<script>window.location = "http://path/to/error"</script>

and possibly a meta tag. There are enough options that I’m not too worried about it.

Ok, but does anyone even put JavaScript in the head anymore? Ever since YSlow and Page Speed, most devs have been putting them just before .

…just before </body>

@arthur, @bracken, lazy-evaluated queries are only those that happen for view-only purposes. A big portion of time is spent on business logic and authentication, and it would be great to have the headers sent before any of that takes place.

Why not override << on @buffer to flush at certain lengths? That would give the added benefit of being able to control the chunk size, like 2k. I would prefer to manually call flush after the html head and have Rails flush at certain lengths than wait on yields, which may be very long for reports, long lists, etc.

Just a side-question, since this post is about templating in Rails, is it also possible to do inline partial similar to this plugin: http://github.com/acunote/template_inliner? For partial-heavy application, inlining simple partial would be a speed boost as well, since Rails don’t need to re-evaluate the partial all the time.

@Alex: In the example cases shown in the Template Inliner README, it would be more appropriate to use

<%= render :partial => 'partial', :collection => objects %>

This way the partial template is looked up only once and is compiled only once (unless I’m much mistaken). Have a look at the code yourself, it’s in ActionView::Template and ActionView::Partials.

Yes, does this limit the JS to appear in the header?, as Christian Romney asks.

Footer JS is almost standard now

We recently implemented flushing in our Rails 2.2 install for http://www.songkick.com. It’s made a big difference to the apparent performance of the site. Looking forward to seeing it in 3.1

@Arthur Gunn

Was thinking of the same Rack issue,
a Rack response looks something like this [200, { 'Content-Type' => 'text/html' }, 'Hello World']

In my world any Rack middleware assumes a string as the third element.

Not sure if it’s allowed to be a stream instead.

Comments?

I’ve been doing some research for this earlier, and my conclusion was: This is *very hard*, if not impossible, to implement automatically. The main problem is that it’s impossible to handle exceptions correctly without making the whole stack aware of it.

Currently, when an exception occurs, the system can simply change the response (since the response hasn’t been sent to the client yet, but is only buffered inside the system). With this approach, a response can be in x different states: before flushing, after the 1st flushing, … and after the xth flushing. And after the 1st flushing, the status, headers and some content has been sent to the client.

Imagine that something raises an exception after the 1st flushing. Then a 200 status has already been sent, together with some headers and some content. First of all, the system has to make sure the HTML is valid and at least give the user some feedback. It’s not impossible, but still quite a hard problem (because ERB doesn’t give us any hint of where tags are open/closed). The system also needs to take care of all x different states and return correct HTML in each of them.

Another issue is that we’re actually sending an error page with a 200 status. This means that the response is cacheable with whatever caching rules you decided on earlier in the controller (before you knew that an error would occur). Suddenly you have your 500.html cached all over the place: on the client side, in your reverse proxy and everywhere else.

Let’s not forget that exceptions don’t always render the error page, but do other things as well. For instance, sometimes an exception is raised to tell the system that the user needs to be authenticated or doesn’t have permission to do something. These are often implemented as Rack middlewares, but with automatic flushing they *also* need to take care of each of the x states. And if a middleware, for instance, needs to redirect the user, it can’t change the status/headers to a 302/Location if it’s already in the 1st state, and therefore needs to inject a window.location=’foo’ in a cacheable 200 response.

Of course, the views *shouldn’t* really raise any exceptions because they should be dumb. However, in Rails it’s very common to defer the expensive method calls to the view. The controller sets everything up, but it’s not until it needs to be rendered that it’s actually called. This increases the possibility that an exception is raised in the rendering phase.

Maybe I’m just not smart enough, but I just can’t come up with a way to tackle all of these problems (completely automated) without requiring any changes in the app.

Have you started working on it, or only designing it? I can hardly see any branch on GitHub related to this feature. There is a streaming branch, but it has not been updated since July.

great feature, but I think that exception/error issue will be some kind of a problem. It reminds me of the Java days where you almost always have direct streaming back to the client. I had some issues with a proper error handling back then.

And JRuby will have a hard time implementing fibers in a performant way (I assume).

Will this cause problems for people using nested layout techniques?

Isn’t this essentially a very clever and roundabout reimplementation of normal method calls?

I.e., if content_for(:key) { … } could be compiled into a rendering method that the layout could simply call when needed, that would avoid the need to use fibers to accomplish this, and performance would not depend on having the optimal order of calls to content_for in the template.

It does depend on the template compiler having special semantics for content_for, and means that content_for blocks are called as needed rather than in the order of definition, which perhaps makes this approach unsuitable for template languages like ERB and HAML. But, gosh, it makes me appreciate a pure-Ruby template library like erector all the more.

I don’t think this approach is the correct direction. I have missed streaming flush in Rails for a long time, but I think that automatic flushing is such a dangerous approach…

How about rescue_from with automatic flush enabled by default? Even if the automatic flush feature will be provided in Rails 3.1 I don’t think it should be enabled by default.

But even the automatic flush doesn’t help that much for lots of cases where streaming is a concern. It would be great to think in some API for allowing manual flush statements for situations like rendering a big report table that may need some time to process each line.

Another feature I miss a lot is the possibility to tell Rails in some action that “this will take a lot of time to finish. Please fork this process and process next request, but don’t allow more than n concurrent requests of this kind”…

But anyway, I’m happy you are already considering flush rendering for 3.1!

P.S: It took me the whole morning to read this post and I ended-up re-reading several Wikipedia pages such as Fibers, coroutines, semi-coroutines, continuations and several other related subjects :) And I’m pretty sure I’ll re-read all of them again some day, as I’ve already done in the past! ;)

Very interesting. This is something I remember from my .net days it can be very useful in certain situations. Good to see you guys working on a simple solution to what had the potential to be a very complex topic.

+1 on Christian Romney.
I also mentioned this during the RailsConf 2010 announcement about JavaScript in the head… it makes our sites slower, not faster.

But still, the fiber enhancement on flushing seems to provide a boost in performance. Is there any real-life comparison of these changes and their performance effect already?

@arthur it seems like the middleware should be updated, as you noticed. The middleware loses its transparency when it requires understanding the HTML (and JavaScript, according to Yehuda’s suggestion) in order to decide what to do.

@yehuda, any suggestion on how to handle that error case when it’s not html+javascript? It would also break the HTTP protocol if I return a 200 OK for an error page. The consequences of breaking HTTP compatibility are several. One I can think of right now: imagine Google thinking it successfully indexed my page with a “window.location” js response and a dumb response, because I have answered 200 and a js redirect, while it previously had a real successful copy of my page already cached: some SEO power lost here.
Because we are dealing with an http problem (headers first, body afterwards), with a html+js solution, it would be nice to think of an http solution first.

Regarding error handling, we’ve been using a trick in production at
patch.com which would fit perfectly with progressive rendering.

#render is modified to swallow exceptions and add them to the Rack
environment. That means partials that have errors in them simply
render nothing. The exceptions are injected at the end of the body in
a middleware, and are styled with CSS so the message is nicely
positioned and styled in non-production modes.

In production, we simply don’t render the errors, which means an error
in the home page doesn’t bring down the site for us – it just causes
the offending partial to be omitted. We still receive an email alert
so the problem gets fixed.

I’ll be adding this option to template_streaming shortly.

For those concerned about putting javascript at the head of the page,
you can actually get better results this way as long as it doesn’t
block script loading. I describe this in the example at
http://github.com/oggy/template_streaming (which I admit is nowhere
near as beautifully explained as Yehuda did here…).

About automatic flushing, I think it will be fantastic as long as
there is a (configurable) throttling option, so I could ensure data
isn’t flushed more than, say, every 50ms. I also plan to add this to
template_streaming shortly.

@george on javascript in the head, you are right, and we agree: using ajax is the solution, not script srcing it as is done by default in Rails (and most of its plugins). So in order to benefit from it, one could change the default behavior from script srcing it to ajax loading it, breaking existing apps that rely on pre srcing those scripts

Great plugin George

I’d be lying if I said I ever looked into flushing before this post, or that I really had a grasp on what flushing was. I appreciated the explanation about how Rails currently handles sending the response to the browser, and how core hopes to change it for gains.

How does this play together with etag http caching? As far as I understand the default behaviour of rails is to take the MD5 sum of the rendered page as the etag.

That Valim fella is a smart cookie!

Teh featurz I wantz. You haz dem.

This looks really cool, and as you say nice performance benefits, but the comments give me an equal amount of information to contemplate. I’m sure you guys will figure it all out.

Erik Fonselius: The rack spec says the third element has to respond to #each which must yield strings. Using a string itself works in 1.8 but breaks in 1.9. If you hit up the debugger, you’ll see in Rails it’s an ActionDispatch::Response (I think), which of course isn’t a string, but responds to #each, yielding the content.

In my case I have some middleware that basically invalidates the flushing, but I bet if I thought about it some more I could sort that out… I’ll just have to wait for 3.1!

Great level of information here. There is so much data around about this topic that sometimes you can’t see the wood for that trees but you have pitched this at just the best degree so that the lay individual can understand – thank you!

That’s some sick code dude. Need to explore Rails more.

Error handling indeed seems to be an issue, but you have to wonder when an existing page should ever return anything but a 200 in a human interface? I would hope any modern app is equipped with some kind of friendly error capture and feedback.

@Michel – It is a show-stopper for sites that rely on SEO to return a 200 if anything but the full page renders. You typically don’t want Google publishing a partial page. This is why a Javascript document.write solution won’t work.

@Michel – It is a show-stopper for sites that rely on SEO to return a 200 if anything but the full page renders. You typically don’t want Google publishing a partial page. This is why a Javascript document.write solution won’t work.

This is mostly a non-issue, not a show-stopper…

IF you get an error in a template, and IF it’s Google-bot that happens to index that specific template at the time of the error, and IF the partial state affects SEO, a whole lotta ifs…

That’s a …show stopper? Really?

Seems like you get your priorities from some dumb manager type…

Great news! Will be looking forward to this improvement in Rails 3.1.

I realize it’s not the same thing, but those of you who’re interested in speeding up JS performance for the time being, check out headjs. http://github.com/headjs/headjs

Check out the comparison page. Blew me away.
http://headjs.com/test/script.html

What’s the status on this? Is this feature still coming in Rails 3.1?
