SafeBuffers and Rails 3.0

As you may have read, Rails 3 adds XSS protection by default. This means that you no longer have to manually escape user input with the h helper, because Rails will escape it for you automatically.

However, it's not as simple as all that. Consider the following:

Hello <strong>friends</strong>!

<%= tag(:p, some_text) %>
<%= some_text %>

In the above example, we have a few different scenarios involving HTML tags. First off, Rails should not escape the strong tag surrounding "friends", because it is unambiguously not user input. Second, Rails should escape some_text inside the <p> tag, but not the <p> tag itself. Finally, Rails should escape the some_text in the final <%= %> tag.

If some_text is <script>evil_js</script>, the above should output:

Hello <strong>friends</strong>!

<p>&lt;script&gt;evil_js&lt;/script&gt;</p>
&lt;script&gt;evil_js&lt;/script&gt;

In order to make this happen, we have introduced a new pervasive concept called html_safe into Rails applications. If a String is html_safe (which Rails determines by calling html_safe? on the String), ERB may insert it unaltered into the output. If it is not safe, ERB must first escape it before inserting it into the output.

def tag(name, options = nil, open = false, escape = true)
  "<#{name}#{tag_options(options, escape) if options}#{open ? ">" : " />"}".html_safe
end

Here, Rails creates the tag, telling tag_options to escape the contents, and then marks the entire body as safe. As a result, the <p> and </p> will emerge unaltered, while Rails will escape the user-supplied content.
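To make the html_safe? decision concrete, here is a minimal sketch in plain Ruby, without Rails. The names SketchSafeString and insert_into_output are hypothetical and exist only for this illustration; the real check happens inside Rails' ERB handling, and this sketch uses the standard library's ERB::Util.html_escape for the escaping step.

```ruby
require "erb"

# Hypothetical stand-in for a String that has been marked as safe.
class SketchSafeString < String
  def html_safe?
    true
  end
end

# Hypothetical stand-in for what happens to the result of a <%= %> tag:
# safe values are inserted unaltered, everything else is escaped first.
def insert_into_output(output, value)
  if value.respond_to?(:html_safe?) && value.html_safe?
    output << value.to_s
  else
    output << ERB::Util.html_escape(value.to_s)
  end
end

output = +""
insert_into_output(output, SketchSafeString.new("<strong>friends</strong>"))
insert_into_output(output, "<script>evil_js</script>")
puts output
# prints: <strong>friends</strong>&lt;script&gt;evil_js&lt;/script&gt;
```

The plain String never claims to be safe, so it is the only part that gets escaped.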

The first implementation of this, in Koz's rails-xss plugin, accomplished the above requirements by adding a new flag to all Strings. Rails, or Rails applications, could mark any String as safe, and Rails overrode + and << to mark the resulting String appropriately based on the input Strings.

However, during my last performance pass of Rails, I noticed that overriding every String concatenation resulted in quite a bit of performance overhead. Worse, the overhead grew linearly with the number of <%= %> tags in a template, so larger templates didn't absorb the cost (as they would if the cost were paid once per template).

Thinking about the problem more, I realized (and confirmed with Koz, Jeremy, and Evan Phoenix of Rubinius) that we could implement roughly the same feature set in a more performant way, with a smaller API impact on Ruby. Because the problem itself is reasonably complex, I won't go into a lot of detail about the old implementation, but will explain how you should use the XSS protection with the new implementation. If you already used Koz's plugin or are working with the prereleases of Rails, you'll notice that today's commit changes very little.

SafeBuffer

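In Rails 3, the ERB buffer is an instance of ActiveSupport::SafeBuffer. SafeBuffer inherits from String, overriding +, concat and << so that a safe String (another SafeBuffer) is concatenated directly, while an unsafe String (a plain String) is escaped first and then concatenated.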

Calling html_safe on a plain String returns a SafeBuffer wrapper. Because SafeBuffer inherits from String, Ruby creates this wrapper extremely efficiently (just sharing the internal char * storage).
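The behavior of the overridden methods can be sketched in a few lines of plain Ruby. SketchSafeBuffer is a hypothetical, simplified stand-in written for this post, not the ActiveSupport source, and it uses the standard library's ERB::Util.html_escape for escaping.

```ruby
require "erb"

# Simplified stand-in for ActiveSupport::SafeBuffer (an assumption for
# illustration, not the Rails implementation).
class SketchSafeBuffer < String
  def html_safe?
    true
  end

  # Escape anything that is not itself a SketchSafeBuffer before appending.
  def concat(value)
    value = ERB::Util.html_escape(value.to_s) unless value.is_a?(SketchSafeBuffer)
    super(value)
  end

  def <<(value)
    concat(value)
  end

  # Return a new buffer and leave the receiver untouched.
  def +(other)
    dup.concat(other)
  end
end

buffer = SketchSafeBuffer.new("<p>")
buffer << "<script>evil_js</script>"     # plain String: escaped first
buffer << SketchSafeBuffer.new("</p>")   # already safe: appended as-is
puts buffer
# prints: <p>&lt;script&gt;evil_js&lt;/script&gt;</p>

joined = SketchSafeBuffer.new("a") + "<b>"
puts joined.class
# prints: SketchSafeBuffer
```

Because the result of + is still a buffer, safety carries through chained concatenations without any flag on ordinary Strings.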

As a result of this implementation, I was starting to see a lot of the following idiom in the codebase:

buffer << other_string.html_safe

Here, Rails is creating a new SafeBuffer for the other_string, then passing it to the << method of the original SafeBuffer, which then checks to see if it is safe. For cases like this, I created a new safe_concat method on the buffer which uses the original, native concat method, skipping both the need to create a new SafeBuffer and the need to check it.

Similarly, concat and safe_concat in ActionView proxy to the concat and safe_concat on the buffer itself, so you can use safe_concat in a helper if you have some HTML you want to concatenate to the buffer with no checks and without escaping.
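As a rough sketch of the difference between the two paths, the following plain-Ruby example keeps a handle on String's native concat under the name safe_concat before overriding concat. SketchSafeBuffer is again a hypothetical stand-in for illustration, not the Rails class.

```ruby
require "erb"

# Hypothetical, simplified buffer for illustration only.
class SketchSafeBuffer < String
  # Capture String's native concat before it is overridden below.
  alias_method :safe_concat, :concat

  def concat(value)
    value = ERB::Util.html_escape(value.to_s) unless value.is_a?(SketchSafeBuffer)
    super(value)
  end

  def <<(value)
    concat(value)
  end
end

html = "<br />"

checked = SketchSafeBuffer.new
checked << SketchSafeBuffer.new(html)   # allocates a wrapper, then checks it

unchecked = SketchSafeBuffer.new
unchecked.safe_concat(html)             # native append: no wrapper, no check

puts checked == unchecked
# prints: true
```

Both buffers end up with the same content; safe_concat simply gets there without the extra allocation and the safety check, which is why it should only ever receive HTML you trust.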

ERB uses safe_concat internally on the parts of the template outside of <% %> tags, which means that with the changes I pushed today, the XSS protection code adds no overhead in those cases (basically, all of the plain text in your templates).

Finally, ERB can now detect the raw helper at compile time, so if you do something like <%= raw some_stuff %>, ERB will use safe_concat internally, skipping the runtime creation of a SafeBuffer and checks for html_safety.
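To illustrate what choosing the append strategy at compile time means, here is a toy template compiler in plain Ruby. It is only a sketch under heavy assumptions: compile_sketch and escape_unless_safe are hypothetical names, it understands nothing but <%= %> tags, and it is not how the Rails ERB handler is actually written.

```ruby
require "erb"

# Runtime path for ordinary <%= %> tags: escape unless the value says it is safe.
def escape_unless_safe(value)
  return value.to_s if value.respond_to?(:html_safe?) && value.html_safe?
  ERB::Util.html_escape(value.to_s)
end

# Hypothetical toy compiler: turns a template into Ruby source that appends
# to `out`. The decision about how to append is made here, once, rather than
# on every render.
def compile_sketch(template)
  source = +"out = +''\n"
  template.split(/(<%=.*?%>)/m).each do |chunk|
    next if chunk.empty?
    tag = chunk.match(/\A<%=\s*(.*?)\s*%>\z/m)
    if tag.nil?
      source << "out << #{chunk.inspect}\n"               # static text: plain append
    elsif (raw = tag[1].match(/\Araw\s+(.+)\z/m))
      source << "out << (#{raw[1]}).to_s\n"               # raw: plain append, no check
    else
      source << "out << escape_unless_safe(#{tag[1]})\n"  # checked at render time
    end
  end
  source << "out\n"
end

compiled = compile_sketch("<p><%= raw trusted %> and <%= untrusted %></p>")
trusted = "<em>hi</em>"
untrusted = "<script>evil_js</script>"
rendered = eval(compiled, binding)
puts rendered
# prints: <p><em>hi</em> and &lt;script&gt;evil_js&lt;/script&gt;</p>
```

Only the last kind of line in the generated source does any work related to XSS protection when the template renders; static text and raw output compile down to a bare append.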

Summary

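In summary, the XSS protection has the following characteristics. If you pass a plain String into a <%= %>, Rails always escapes it. If you pass a SafeBuffer into a <%= %>, Rails does not escape it; to get a SafeBuffer from a String, call html_safe on it, and the only cost is the html_safe? check. If you use the raw helper in a <%= %>, ERB detects it at template compile time, so that concatenation skips both the creation of a SafeBuffer and the safety check. Finally, Rails does not escape any part of a template outside of ERB tags, and because ERB handles this with safe_concat, the XSS protection adds no overhead there either.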

In comparison, the initial implementation of XSS protection added overhead to every concatenation or + of Strings, even if the app used the raw helper, and even for plain Strings in templates.

That said, I want to extend personal thanks to Koz for getting the first draft out the door. It worked, demonstrated the concept, and let the community test it out. All in all, an excellent first pass.