Archive for May, 2008

By Thor’s Hammer!

Over the past few months, I’ve become more and more disillusioned with the current state of Ruby’s scripting support. Sure, we have optparse and a range of other solutions, but there’s no full-stack package for writing robust binaries.

Enter thor.

The idea behind thor initially came from my work on a textmate binary (more on that later today), which would manage installed TextMate bundles. Sure, there’s the getbundle bundle, and you can manually get bundles from subversion, but I wanted a single binary that handled all of that.

While I was building it, I decided to map the commands that people could enter to a class and its methods. I created a small file that I included with the textmate binary called class_cli, and created some syntax for mapping things, like so (note that this is not the final syntax):

class MyApp
  include CLI

  desc "list [list]", "list some items"
  def list(list = "stuff")
    puts list.split(/,\s*/).join("\n")
  end
end

MyApp.start

Assuming the binary is called app, you could then do app list "one,two,three", which would print:

one
two
three
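
For readers wondering how a class can be mapped to commands like this, here is a minimal sketch. It is not the original class_cli file: the CLI module below, its ClassMethods helper, and the commands table are assumptions made for illustration. It relies only on Ruby's standard included and method_added hooks.

```ruby
# Hypothetical stand-in for the class_cli idea; not the original file.
module CLI
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Remember the usage string and description for the next method defined.
    def desc(usage, description)
      @pending_desc = [usage, description]
    end

    # Ruby calls this hook every time an instance method is defined.
    def method_added(name)
      return unless @pending_desc
      commands[name.to_s] = @pending_desc
      @pending_desc = nil
    end

    def commands
      @commands ||= {}
    end

    # Treat the first argument as the command name and pass the rest along.
    def start(argv = ARGV)
      command, *args = argv
      if commands.key?(command)
        new.public_send(command, *args)
      else
        commands.each_value { |usage, description| puts "#{usage}  # #{description}" }
      end
    end
  end
end

class MyApp
  include CLI

  desc "list [list]", "list some items"
  def list(list = "stuff")
    puts list.split(/,\s*/).join("\n")
  end
end

MyApp.start(["list", "one,two,three"])
```

Only methods that follow a desc call are registered, so helper methods on the class would not be exposed as commands in this sketch.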

In the case of the textmate binary, I created methods for list, install, installed, and uninstall.

This worked nicely, so I wanted to extract it out for use in other tools. I started the Hermes project on github, but while drinking with Chris Wanstrath, I was convinced that the name Thor would be more appealing. Oh, the things that alcohol can do. I was also convinced of a couple other things (again, in a slightly inebriated state):

  • rake and sake needed a replacement for scripting (though not for their role as a make replacement)
  • nobody was going to use Hermes/Thor if it needed extra boilerplate like MyApp.start

Long story short, Chris convinced me to make Thor a full-fledged scripting solution. So I did.

Thor, as it exists today, has two components:

  • The Thor superclass, which works exactly like CLI/Hermes, except that you inherit from it instead of including it (class MyApp < Thor).
  • The Thor runner, which can run Thortasks that are in a local directory or installed from a remote location

The thor runner allows you to make files like:

# module: random

class Amazing < Thor
  desc "describe NAME", "say that someone is amazing"
  method_options :forcefully => :boolean
  def describe(name, opts)
    ret = "#{name} is amazing"
    puts opts["forcefully"] ? ret.upcase : ret
  end

  desc "hello", "say hello"
  def hello
    puts "Hello"
  end
end

If you call the file *.thor or Thorfile and place it in your current directory, any directory above you, or tasks/*.thor, you can then invoke the thorfile in any of the following ways:

$ thor -T
Tasks
-----
amazing:describe NAME [--forcefully]   say that someone is amazing
amazing:hello                          say hello

$ thor amazing:hello
Hello

$ thor amazing:describe "This blog reader"
This blog reader is amazing

$ thor amazing:describe "This blog reader" --forcefully
THIS BLOG READER IS AMAZING
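
The lookup described above (current directory, any directory above it, or tasks/*.thor) could be sketched roughly as follows. The helper name find_thorfiles is hypothetical, and stopping at the first directory that contains a match is an assumption; the real runner's search logic may differ.

```ruby
require "pathname"

# Hypothetical lookup helper; the real thor runner may search differently.
THORFILE_PATTERNS = ["Thorfile", "*.thor", "tasks/*.thor"]

def find_thorfiles(start_dir = Dir.pwd)
  # ascend yields the starting directory first, then each parent up to the root.
  Pathname.new(start_dir).expand_path.ascend do |dir|
    found = THORFILE_PATTERNS.flat_map { |pattern| Dir[File.join(dir.to_s, pattern)] }.sort
    # Assumption: stop at the first directory that has any thor files.
    return found unless found.empty?
  end
  []
end

puts find_thorfiles
```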

You can also install local tasks or remote tasks to your system thor cache and make them available anywhere:

$ thor install task.thor
Your Thorfile contains:
# module: random

class Amazing < Thor
  desc "describe NAME", "say that someone is amazing"
  method_options :forcefully => :boolean
  def describe(name, opts)
    ret = "#{name} is amazing"
    puts opts["forcefully"] ? ret.upcase : ret
  end

  desc "hello", "say hello"
  def hello
    puts "Hello"
  end
end
Do you wish to continue [y/N]? y
Storing thor file in your system repository

$ thor installed
Name      Modules
----      -------
random    amazing

Tasks
-----
amazing:describe NAME [--forcefully]   say that someone is amazing
amazing:hello                          say hello

$ thor amazing:hello
Hello

… same as above …

You can also specify a URL instead of a file name; like sake, thor uses open-uri to get the files. You uninstall or update thor modules based on the short name that was provided; if # module: name exists at the top of the file, thor will use that by default. You can also use thor install task.thor --as my_short_name. If you don’t provide a short name, thor will ask for one.

Later, you can do thor update short_name and thor will remember where you got the module from and try to update it. thor uninstall short_name will remove the module from your list of installed modules.

Of course, thor -T (or thor list) will list local tasks and system-wide tasks in the resulting list, so you don’t need a separate tool to track your thortasks, and a local task can be made into a system task trivially.

Finally, thor itself is self-hosting; the thor runner uses the Thor superclass. As a result, I added some more features to the superclass as I built the runner:

  • You can map short names to their full name: map "-T" => :list, which is why thor -T and thor list are identical
  • You can provide additional options that get passed in as a Hash (method_options :as => :required). The available option types are :required, :optional, and :boolean. The resulting hash is passed in as the final parameter to your method and the pretty-printed help automatically includes them in the usage screen.
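
To make the alias map and the typed options a little more concrete, here is a rough sketch of how arguments might be turned into the Hash that reaches your method. The parse_options helper, the ALIASES table, and the exact meaning given to :required and :optional are assumptions for illustration, not Thor's actual implementation.

```ruby
# Hypothetical helper; names and exact semantics are assumptions.
def parse_options(args, spec)
  args = args.dup
  opts = {}
  positional = []

  until args.empty?
    arg = args.shift
    name = arg[/\A--(.+)\z/, 1]          # "--forcefully" => "forcefully"
    kind = name && spec[name.to_sym]

    case kind
    when :boolean
      opts[name] = true                  # the flag's presence means true
    when :required, :optional
      opts[name] = args.shift            # the next argument is the value
    else
      positional << arg                  # not a declared option
    end
  end

  missing = spec.select { |key, type| type == :required && opts[key.to_s].nil? }.keys
  raise ArgumentError, "missing required option(s): #{missing.join(', ')}" unless missing.empty?

  [positional, opts]
end

# Short names resolve to full command names before dispatch.
ALIASES = { "-T" => "list" }

argv = ["describe", "This blog reader", "--forcefully"]
command = ALIASES.fetch(argv.first, argv.first)
positional, opts = parse_options(argv.drop(1), :forcefully => :boolean)

puts command            # describe
puts positional.inspect # ["This blog reader"]
puts opts.inspect
```

A dispatcher built on this would call the method named by command with the positional arguments followed by opts as the final parameter.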

Take a look at the thor repository on github, and specifically, the thor runner for more information on how it all fits together.

Best Things Come in Threes

I was at the Dallas Tech Fest this past weekend, and had the opportunity to meet up with some cool guys in the DataMapper community (Sam, Adam, Ben and Bryan). While I was there, I ended up hacking out a few unrelated tools that I thought I’d share with the community.

benchwarmer

Benchwarmer is an improved DSL for doing benchmarks (hat tip for the name to br0nette). It provides options for grouping, and produces output like:

Running the benchmarks 100000 times each...

                         Option 1 |   TWO | Option 3 |
 -----------------------------------------------------
 Squeezing with #squeeze     0.15 |  0.15 |     0.14 |
              with #gsub     0.38 |  0.35 |     0.36 |
 -----------------------------------------------------
 Spliting    with #split     0.43 |  0.51 |     0.61 |
             with #match     0.29 |  0.35 |     0.38 |
 -----------------------------------------------------

You get that output by doing:

  Benchmark.warmer(TIMES) do
    columns :one, :two, :three
    titles :one => "Option 1", :three => "Option 3"      

    group("Squeezing") do
      report "with #squeeze" do
        one { "abc//def//ghi//jkl".squeeze("/") }
        two { "abc///def///ghi///jkl".squeeze("/") }
        three { "abc////def////ghi////jkl".squeeze("/") }
      end
      report "with #gsub" do
        one { "abc//def//ghi//jkl".gsub(/\/+/, "/") }
        two { "abc///def///ghi///jkl".gsub(/\/+/, "/") }
        three { "abc////def////ghi////jkl".gsub(/\/+/, "/") }
      end
    end

    group("Spliting") do
      report "with #split" do
        one { "aaa/aaa/aaa.bbb.ccc.ddd".split(".") }
        two { "aaa//aaa//aaa.bbb.ccc.ddd.eee".split(".") }
        three { "aaa///aaa///aaa.bbb.ccc.ddd.eee.fff".split(".") }
      end
      report "with #match" do
        one { "aaa/aaa/aaa.bbb.ccc.ddd".match(/\.([^\.]*)$/) }
        two { "aaa//aaa//aaa.bbb.ccc.ddd.eee".match(/\.([^\.]*)$/) }
        three { "aaa///aaa///aaa.bbb.ccc.ddd.eee.fff".match(/\.([^\.]*)$/) }
      end
    end
  end

Most of that is optional; you can get usable benchmarks by stripping the DSL down to:

  Benchmark.warmer(TIMES) do
    report "squeezing with #squeeze" do
      "abc//def//ghi//jkl".squeeze("/")
    end
    report "squeezing with #gsub" do
      "abc//def//ghi//jkl".gsub(/\/+/, "/")
    end
  end

which produces:


                         Results |
----------------------------------
 squeezing with #squeeze    0.15 |
    squeezing with #gsub    0.34 |
----------------------------------
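
If you are curious how a report-style DSL like this can be put together, the following is a small sketch built on Ruby's standard Benchmark.realtime and instance_eval. The MiniWarmer class and its methods are hypothetical and much simpler than benchwarmer itself (no columns, titles, or groups).

```ruby
require "benchmark"

# Hypothetical miniature of a report-collecting DSL; not benchwarmer's code.
class MiniWarmer
  attr_reader :results

  def initialize(times)
    @times = times
    @results = {}
  end

  # Run the block @times times and record the elapsed wall-clock seconds.
  def report(label, &block)
    @results[label] = Benchmark.realtime { @times.times(&block) }
  end

  # Right-align the labels so the timings line up in one column.
  def print_table
    width = @results.keys.map(&:length).max
    @results.each do |label, seconds|
      puts format("%#{width}s  %8.2f |", label, seconds)
    end
  end

  def self.run(times, &dsl)
    warmer = new(times)
    warmer.instance_eval(&dsl) # lets the block call `report` with no receiver
    warmer.print_table
    warmer.results
  end
end

results = MiniWarmer.run(100_000) do
  report("squeezing with #squeeze") { "abc//def//ghi//jkl".squeeze("/") }
  report("squeezing with #gsub") { "abc//def//ghi//jkl".gsub(/\/+/, "/") }
end
```

The timings will vary from machine to machine, so no sample output is shown for this sketch.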

It is available at github. I will add the appropriate stuff so you can do gem install wycats-benchwarmer. For now, you can just check out the git repo and do rake install.

I extracted this out of the benchmarks I was doing as I was building…

10x Faster Rails and Merb Inflector

On the plane over to Dallas Tech Fest, I thought it would be nice to try and improve the performance of the Rails Inflector, which is currently pretty slow (albeit probably not a bottleneck). Merb already uses the Facets English Inflector, which is 2x or so faster, but I was pretty sure I could do even better.

When I analyzed the English Inflector, I noticed a few things:

  • They were doing something similar to Rails: looping over a list of regexen and picking the correct ones
  • Without exception, the correct choice was the longest string that matched the end of the word.
  • Matching a string (like fooses) against a regex anchored to the end of the word, like (foo|foos|fooses)$, will always match the longest string
  • Neither Rails nor English was caching the resulting words, which don’t change, and in the case of inflecting Rails models and controllers, are a small universe of total words

As a result, I did two optimizations:

  • I packed all of the regexen into a single regex, and got rid of the sort-by-longest-string code. I then did a simple sub! against the word, pulling the results out of the rules Hash (that already existed in English)
  • English already cached irregular words (its first step was to look in the irregular words hash for the word in question), so I simply extended this cache to include any word already found.
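
Here is a minimal sketch of those two optimizations together. The rule table, constant names, and the fallback of appending "s" are assumptions chosen for illustration and are far smaller than the real English or Rails rule sets. Because the combined regex is anchored to the end of the word, the leftmost match is also the longest matching suffix, so no sort-by-length step is needed.

```ruby
# Hypothetical rule table: singular suffix => plural suffix.
PLURAL_RULES = {
  "rf"  => "rves",  # dwarf => dwarves
  "ero" => "eroes", # hero  => heroes
  "o"   => "os",    # photo => photos
  "fe"  => "ves"    # wife  => wives
}

# One regex for every suffix, anchored to the end of the word.
SUFFIX_RE = /(#{Regexp.union(PLURAL_RULES.keys).source})\z/

# Every word already computed is remembered here.
PLURAL_CACHE = {}

def pluralize(word)
  PLURAL_CACHE[word] ||= begin
    result = word.dup
    # sub! returns nil when no rule matched, so fall back to appending "s".
    result.sub!(SUFFIX_RE) { PLURAL_RULES[$1] } || result << "s"
    result
  end
end

puts pluralize("hero")    # heroes ("ero" wins over "o")
puts pluralize("photo")   # photos
puts pluralize("dwarf")   # dwarves
puts pluralize("account") # accounts
```

The second call for any given word is a plain Hash lookup, which is where most of the gain for a small, repeated vocabulary (such as model and controller names) would come from.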

Between these two optimizations, I was able to get a speedup of around 10x over Rails, with a huge boost for simple pluralizations (not so much for things like “person” => “people”). Rails also caught a bunch of cases in their tests that were not supported by English, so I added support for things like capital versions (“Person” => “People”) and partial words (“foo_child” => “foo_children”).

Here are the benchwarmer results:

                                         OLD  | NEW   | RAILS
Simple: account => accounts    Singular  0.16 |  0.02 |  0.25
                                 Plural  0.14 |  0.03 |  0.24
-------------------------------------------------------------
Simple: American => Americans  Singular  0.16 |  0.02 |  0.27
                                 Plural  0.16 |  0.31 |  0.27
-------------------------------------------------------------
Abnormal: dwarf => dwarves     Singular  0.07 |  0.04 |  0.17
                                 Plural  0.06 |  0.03 |  0.17
-------------------------------------------------------------
Abnormal: hero => heroes       Singular  0.05 |  0.03 |  0.20
                                 Plural  0.06 |  0.02 |  0.21
-------------------------------------------------------------
One Way: cactus => cactuses    Singular  0.11 |  0.02 |  0.26
                                 Plural  0.07 |  0.02 |  0.26
-------------------------------------------------------------
One Way: wife => wives         Singular  0.11 |  0.02 |  0.16
                                 Plural  0.13 |  0.03 |  0.14
-------------------------------------------------------------
Uncountable: fish => fish      Singular  0.03 |  0.02 |  0.03
                                 Plural  0.02 |  0.02 |  0.03
-------------------------------------------------------------
Exception: person => people    Singular  0.02 |  0.03 |  0.09
                                 Plural  0.03 |  0.02 |  0.11
-------------------------------------------------------------

It does break back-compat a bit with Rails, as the mechanism for adding new rules is simpler in order to be compatible with English’s faster inflector algorithm. I also had to remove Inflector#clear, which I wasn’t sure anyone was actually using (it allowed the clearing of very specific types of rules, which can no longer be supported as irregular words and regular singularization/pluralization rules are dumped in the same list for efficiency).

The new way of defining rules is:

Inflector.inflections do
  # One argument means singular and plural are the same.

  word 'equipment'
  word 'information'
  word 'money'
  # ... snip ...
  word 'Swiss'     , 'Swiss'
  word 'virus'     , 'viri'
  word 'octopus'   , 'octopi'
  # ... snip ...
  rule 'person' , 'people', true
  rule 'shoe'   , 'shoes', true
  rule 'hive'   , 'hives', true
  rule 'man'    , 'men', true
  rule 'rf'     , 'rves'
  rule 'ero'    , 'eroes'
  # ... snip ...
  singular_rule 'of' , 'ofs' # proof
  singular_rule 'o'  , 'oes' # hero, heroes
  # ... snip ...
  plural_rule 's'   , 'ses'
  plural_rule 'ive' , 'ives' # don't want to snag wife
  plural_rule 'fe'  , 'ves'  # don't want to snag perspectives
end

All of the inflector tests still pass in Rails, with the exception of the #clear tests, which were too coupled to the old implementation to salvage (they actually introspected into specific ivars).

The Rails modifications are available on my Rails branch on github. I hope to be able to push the changes upstream if the Rails core folks are amenable :).

OSX Window Resizing Tools

I was going crazy trying to figure out a way to cordon off a section of my screen for screencasts (the center 800×600 for instance), and ended up in crazy AppleScript-land. I ended up with two useful scripts:

  • center, which takes any window, resizes it to a provided size, and centers it on the screen
  • maximize, which takes any window and maximizes it to fill the screen (even for crazy-zoom-windows like Safari)

Both scripts take drawers into consideration, so you can resize a window like TextMate to the correct size.

It is available at github and comes with a convenient raketask for installing that will compile and copy your scripts into the appropriate folders. It also comes with a raketask that will install FastScripts Lite, which will allow you to bind keys to these scripts (see the README for full details).