Yehuda Katz is a member of the Ember.js, Ruby on Rails and jQuery Core Teams; he spends his daytime hours at the startup he founded, Tilde Inc. Yehuda is co-author of the best-selling jQuery in Action and Rails 3 in Action. He spends most of his time hacking on open source (his main projects, along with others such as Thor, Handlebars and Janus) or traveling the world doing evangelism work. He can be found on Twitter as @wycats.
I’m Running to Reform the W3C’s TAG
December 7th, 2012
Elections for the W3C’s Technical Architecture Group are underway, and I’m running!
There are nine candidates for four open seats. Among the nine candidates, Alex Russell, Anne van Kesteren, Peter Linss, and Marcos Cáceres are running on a reform platform. What is the TAG, and what do I mean by reform?
What is the TAG?
According to the TAG’s charter, it has several roles:
- to document and build consensus around principles of Web architecture
- to interpret and clarify these principles when necessary
- to resolve issues involving general Web architecture brought to the TAG
- to help coordinate cross-technology architecture developments inside and outside W3C
As Alex has said before, the existing web architecture needs reform that would make it more layered. We should be able to explain the declarative parts of the spec (like markup) in terms of lower level primitives that compose well and that developers can use for other purposes.
And the W3C must coordinate much more closely with TC39, the (very active) committee that is designing the future of JavaScript. As a member of both TC39 and the W3C, I believe that it is vital that as we build the future of the web platform, both organizations work closely together to ensure that the future is both structurally coherent and pleasant for developers of the web platform to use.
Developers
I am running as a full-time developer on the web platform to bring that perspective to the TAG.
For the past several years, I have lobbied for more developer involvement in the standards process through the jQuery organization. This year, the jQuery Foundation joined both the W3C and ECMA, giving web developers direct representatives in the consensus-building process of building the future.
Many web developers take a very cynical attitude towards the standards process, still burned from the flames of the first browser wars. As a group, web developers also have a very pragmatic perspective: because we can’t use new features in the short-term, it’s very costly to take an early interest in standards that aren’t even done yet.
Of course, as a group, we developers don’t hesitate to complain about standards that didn’t turn out the way we would like.
(The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHOULD”, “SHOULD NOT”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC2119.)
The W3C and its working groups MUST continue to evangelize to developers about the importance of participating early and often. We MUST help more developers understand the virtues of broad solutions and looking beyond specific present-day scenarios. And we MUST evolve to think of web developers not simply as “authors” of content, but as sophisticated developers on the most popular software development platform ever conceived.
Layering
When working with Tom Dale on Ember.js, we often joke that our APIs are layered, like a delicious cake.
What we mean by layering is that our high-level features are built upon publicly exposed lower-level primitives. This gives us the freedom to experiment with easy-to-use concise APIs, while making it possible for people with slightly different needs to still make use of our hard implementation work. In many cases, such as in our data abstraction, we have multiple layers, making it possible for people to implement their requirements at the appropriate level of abstraction.
It can be tempting to build primitives and leave it up to third parties to build the higher level APIs. It can also be tempting to build higher level APIs only for particular scenarios, to quickly solve a problem.
Both approaches are prevalent on the web platform. Specs like IndexedDB are built at a very low level of abstraction, leaving it up to library authors to build a higher level of abstraction. In contrast, features like App Cache are built at a high level of abstraction, for a particular use-case, with no lower level primitives to use if a user’s exact requirements do not match the assumptions of the specification.
Alex’s effort on this topic is focused on Web Components and Shadow DOM, an effort to explain the semantics of existing tags in terms of lower-level primitives. These primitives allow web developers to create new kinds of elements that can have a similar level of sophistication to the built-in elements. Eventually, it should be possible to describe how existing elements work in terms of these new primitives.
Here’s another example a layer deeper: many parts of the DOM API have magic behavior that is extremely difficult to explain in terms of the exposed API of ECMAScript 3. For example, the innerHTML property has side-effects, and ES3 does not provide a mechanism for declaring setters. The ECMAScript 5 specification provides some additional primitives that make it possible to explain more of the existing DOM behavior in terms of JavaScript. While designing ECMAScript 6, the committee has repeatedly discussed how certain new features could help explain more of the DOM API.
Today, the web platform inherits a large number of existing specifications designed at one of the ends of the layering spectrum. I would like to see the TAG make an explicit effort to describe how the working groups can reform existing APIs to have better layering semantics, and to encourage them to build new specifications with layering in mind.
TC39 and JavaScript
Today, developers of the web platform increasingly use JavaScript to develop full-blown applications that compete with their native counterparts.
This has led to a renaissance in JavaScript implementations, and more focus on the ECMAScript specification itself by TC39. It is important that the evolution of JavaScript and the DOM APIs take one another into consideration, so that developers perceive them as harmonious, rather than awkward and ungainly.
Any developer who has worked with NodeList and related APIs knows that the discrepancies between DOM Array-likes and JavaScript Arrays cause pain. Alex has talked before about how standardizing subclassing of built-in objects would improve this situation. This would allow the W3C to explicitly subclass Array for its Array-like constructs in a well-understood, compatible way. That proposal will be strongest if it is championed by active members of both TC39 and the HTML working group.
Similarly, TC39 has worked tirelessly on a proposal for loading JavaScript in an environment-agnostic way (the “modules” proposal). That proposal, especially the aspects that could impact the network stack, would be stronger with the direct involvement of an interested member of a relevant W3C working group.
As the web’s development picks up pace, the W3C cannot see itself as an organization that interacts with ECMA at the periphery. It must see itself as a close partner with TC39 in the development and evolution of the web platform.
Progress
If that (and Alex’s similar post) sounds like progress to you, I’d appreciate your organization’s vote. My fellow reformers Alex Russell, Anne van Kesteren, Peter Linss and Marcos Cáceres are also running for reform.
AC reps for each organization can vote here and have 4 votes to allocate in this election. Voting closes near the end of the month, and it’s also holiday season, so if you work at a member organization and aren’t the AC rep, please, find out who that person in your organization is and make sure they vote.
As Alex said:
The TAG can’t fix the web or the W3C, but I believe that with the right people involved it can do a lot more to help the well-intentioned people who are hard at work in the WGs to build in smarter ways that pay all of us back in the long run.
Follow Me to Google+
September 10th, 2012
I wrote my first post on this blog in January 2007.
In 2007, this blog was the easiest way I had to write my thoughts down for people who cared to read them. I wrote long posts and short posts (but mostly long posts). I wrote deeply technical posts. I wrote proposals. I wrote introductory posts.
I did not post often.
In 2012, there are many more ways to write and reach an audience. I write whimsically on Twitter. I write personally on Facebook. More and more, I find that I write casually on Google+.
Without the 140-character constraint of Twitter, I can start writing and stop when I reach the end of a thought. Unlike the long-form nature of my blog, I find myself writing often, whenever something is on my mind. If you’re interested in reading that sort of thing, follow my Google+ profile. Because I never remember to include one person’s Google+ account in my reading rotation, I made it easy: plus.yehudakatz.com.
I’ll keep posting long-form pieces here. To keep things simple, I’ll always link to them from Google+. If you follow me there, I’ll make sure you always know when I post something, wherever that happens to be. If you care, join me on Google+.
August Tokaido Update
August 6th, 2012
It’s been a while since I posted anything on my blog, and I figured I’d catch everyone up on the work I’ve been doing on Tokaido.
Components
Tokaido itself is made up of a number of components, which I am working on in parallel:
- Ruby binary build, statically compiled
- A logging and alerting UI
- Remote Notifications for Rails 3+
- Integration with Puma Express
- Integration with code quality tools
- Resolve bundler issue related to binary builds and deployment to Heroku
- Work with community members to start shipping binary builds of popular gems (gates on fixing the bundler bug)
A number of people are doing parts of the work to make Tokaido a reality. I specifically want to thank:
- Michal Papis of the rvm team for using the sm framework to make the Tokaido build maintainable over time and doing the heavy lifting to take the initial spike I did and get a reproducible binary build
- Terence Lee of Heroku for packaging up these early binary builds for use at several Rails Girls events, with great success!
Ruby Binary Build, Statically Compiled
This is the first work I did, a few months ago, with help from Michal Papis. I detailed the hard parts of making Ruby statically compiled in June’s status update. Since then, Terence Lee (@hone02) has used the binary build at several Rails Girls events, dramatically reducing the time needed for Rails installation on OSX. In addition to the Tokaido binary build, Terence also precompiled a number of gems, and built a script to download the entire thing as a zip, install it into the user’s home directory, and modify their ~/.profile to include it.
This strategy works well for trainings, but has a number of limitations:
- Since it relies on precompiled gems already existing in the gem home, the gems cannot be removed or upgraded without requiring a C compiler. The correct solution is what gem authors do for Windows: ship versions of binary gems with specific OSX designations. Currently, binary gems do not interact well with Heroku deploys (or anyone using bundle install --deployment), so we have been working to resolve those issues before foisting a bunch of new binary gems on the world. See below.
- It relies on modifying all instances of the Terminal, which means that some system edge-cases leak into this solution. It also relies on modifying ~/.profile, which may not work depending on what shell the user is using and what other startup scripts the user is running. For Tokaido, we will have a way to launch a Terminal that reliably injects itself without these problems, and without polluting all instances of Terminal.
- It is difficult to upgrade components, like patch levels of Ruby or versions of C dependencies like libyaml. Tokaido.app will store its copy of Ruby and gems in a sandbox, loadable from the UI (as I described in the second bullet), which makes it easy for the .app to upgrade patch levels of Ruby or whatever components it needs.
Because I understand that many people WANT the ability to have their Ruby take over their terminal, Tokaido will integrate with rvm (and rbenv, if possible) to mount Tokaido.app as the default version of Ruby in your shell.
A Logging and Alerting UI
The primary UI for Tokaido will be a logging and alerting UI for your Rails application. I expect that you will spend the most time in the “Requests” UI, which will show you a list of the previous requests, and the log of the current request. My goal is to use the logging UI to improve on the typical “tail the log” experience. I’ve been working with Austin Bales, a designer at do.com, to develop some mockups, and hope to have something for people to look at in the next few weeks.
In addition to an improved logging experience, Tokaido will alert you when something has gone wrong in your request. This may include exceptions (5xx) or information provided by plugins (bullet can let you know when your code is triggering an N+1 query; see the README for more information). The list of prior requests will highlight requests with issues, and problems in the notifications tray will directly link you to the request where the error occurred.
If everything goes well, the Tokaido UI will replace your logging workflow, without impacting the rest of the tasks you perform from the command-line.
Remote Notifications for Rails 3+
Rails 3 shipped with a new instrumentation API, which provides detailed information about many events that happen inside of Rails. This system is used by Rails’ own logging system, and I would like to use it in Tokaido as well.
The instrumentation API provides all of the information we need, but no built-in way to communicate those notifications across processes. Because Tokaido.app will not run in the same process as the Rails app, I have been working on a gem (remote_notifications) that provides a standard way to send these notifications to another process.
This gem also backports Aaron’s Rails instrumentation work (see his talk at Railsberry for more information) to Rails 3.0 and 3.1, which makes it possible to build a reliable tree from notifications, even on systems with low clock resolution.
It also includes a mechanism for subscribing to notifications sent from another process using the regular notifications API, and pluggable serializers and deserializers. It isn’t quite done yet, but I should have an initial release soon.
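To make the idea of pluggable serializers concrete, here is a minimal sketch of sending an instrumentation-style event from one end of a pipe to the other. It is not the remote_notifications API (which is unreleased at the time of writing): the `Event` struct, `JsonSerializer`, `publish` and `receive` are all hypothetical names, and an in-process `IO.pipe` stands in for whatever channel connects the Rails process to Tokaido.app.

```ruby
require "json"

# Hypothetical event shape, loosely modeled on what an instrumentation
# event carries (name, timing, payload). Not the remote_notifications API.
Event = Struct.new(:name, :started_at, :finished_at, :payload) do
  def duration_ms
    ((finished_at - started_at) * 1000).round(3)
  end
end

# A pluggable serializer: any object responding to dump/load would do.
module JsonSerializer
  def self.dump(event)
    JSON.generate(event.to_h)
  end

  def self.load(line)
    data = JSON.parse(line)
    Event.new(data["name"], data["started_at"], data["finished_at"], data["payload"])
  end
end

# The sender writes one serialized event per line.
def publish(io, event, serializer = JsonSerializer)
  io.puts(serializer.dump(event))
end

# The receiver reads lines until EOF and rebuilds the events.
def receive(io, serializer = JsonSerializer)
  io.each_line.map { |line| serializer.load(line) }
end

# IO.pipe stands in for a real cross-process channel (an assumption).
reader, writer = IO.pipe
publish(writer, Event.new("process_action.action_controller", 10.0, 10.25, { "status" => 200 }))
writer.close

events = receive(reader)
puts "#{events.first.name} took #{events.first.duration_ms}ms"
```

Keeping the serializer behind a two-method interface is what would let a different wire format be swapped in without touching the sending or receiving code.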
Integration with Puma Express
Puma is a new threaded web server built by Evan Phoenix of Rubinius fame. Puma Express manages Puma servers, automatically setting up the DNS (appname.dev) for you. This is similar to the approach used by Pow, but without a Node dependency.
Tokaido will integrate with Puma Express, so you will not need to manually boot and shut down your server. Because Tokaido also takes care of logging, there shouldn’t be any need for dedicated tabs with persistent running processes when you use Tokaido.
You will also be able to install an executable into your system (a la GitX and Textmate) to make it easy to boot up a Tokaido in the context of the current application:
$ tokaido .
This will be especially useful for people using the rvm “mounted” Tokaido.
Integration with Code Quality Tools
By default, Tokaido will integrate with flog, flay and rails-best-practices. It will periodically scan your Rails app for problems and notify you via the app’s notification tray. This mechanism will be extensible, so a future version of Tokaido might integrate with Code Climate or support code quality plugins.
Binary Gems and Heroku
Bundler provides a mechanism for deployment (bundle install --deployment) that specifically rejects deploys that require changes to the Gemfile.lock. When deploying a Gemfile.lock that was built using binary gems from a different platform, this mechanism rejects the deploy. At present, only Windows makes heavy use of binary gems, and Heroku’s solution to date has been to simply remove the Gemfile.lock and re-resolve dependencies.
Unfortunately, this solution eliminates the most important guarantee made by bundler, and is untenable in the long-term. If OS X users started to use binary gems more broadly, this would bring this problem into much wider circulation. Also, because platform-specific gems can contain alternate dependencies (see, for example, Nokogiri 1.4.2), it is important that bundle install support alternate platforms more extensively than with a naïve solution.
There are a few possible solutions:
- Do not use --deployment on Heroku. This would allow bundler to update the Gemfile.lock for the new platform. It would also mean that if a developer didn’t update their Gemfile.lock before deploying, they might run unexpected code. This is a short-term fix, but would probably alleviate most of the symptoms without introducing a lot of new caveats.
- Fix --deployment to allow changes to the Gemfile.lock, but only in response to a genuine platform change. Unfortunately, because of cases like Nokogiri 1.4.2, this could theoretically result in totally different code running in production.
- Provide a mechanism for developers to explicitly add a deployment environment: bundle platform add x86_64-linux. This would pre-resolve gems across all available platforms and ensure that the same gem versions could be used everywhere.
- Improve the bundler resolution mechanism to allow gems with the same name, version and dependencies to be treated the same. Because dependencies can also exhibit this problem (a dependency of some JRuby gem could be another JRuby gem with its own non-standard dependencies), this would need to ensure that the entire subtree was the same.
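The last option can be sketched as a recursive comparison. This is only an illustration of the idea, not bundler's resolver: `GemSpec` is a hypothetical stand-in for a real gem specification, and the gem names are made up.

```ruby
# Hypothetical, simplified stand-in for a gem specification.
GemSpec = Struct.new(:name, :version, :platform, :dependencies)

# Two specs are treated as interchangeable when they agree on name and
# version and their whole dependency subtrees are pairwise interchangeable.
# The platform is deliberately ignored.
def interchangeable?(a, b)
  return false unless a.name == b.name && a.version == b.version
  return false unless a.dependencies.size == b.dependencies.size

  a_deps = a.dependencies.sort_by(&:name)
  b_deps = b.dependencies.sort_by(&:name)
  a_deps.zip(b_deps).all? { |x, y| interchangeable?(x, y) }
end

# Made-up gems: a source gem, a precompiled OSX variant with the same
# dependencies, and a JRuby variant that pulls in an extra dependency.
xml_dep    = GemSpec.new("xmlbits", "2.0", "ruby", [])
jvm_dep    = GemSpec.new("jvmshim", "0.1", "java", [])
source_gem = GemSpec.new("parsely", "1.0", "ruby", [xml_dep])
darwin_gem = GemSpec.new("parsely", "1.0", "x86_64-darwin", [xml_dep])
java_gem   = GemSpec.new("parsely", "1.0", "java", [xml_dep, jvm_dep])

puts interchangeable?(source_gem, darwin_gem)
puts interchangeable?(source_gem, java_gem)
```

Under this rule a precompiled variant can stand in for the source gem without touching the lock file, while a variant with different dependencies still forces a real re-resolve.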
The last solution is most promising, because the platform-specific gems Tokaido is concerned with are simply precompiled variants of the same gem. In the vast majority of cases, this simply means that the gem author is saving the end-developer the step of using a C compiler, but doesn’t actually change a lot else.
Unfortunately, Rubygems itself doesn’t distinguish between precompiled variants of gems and gems with different dependencies, so we will have to add smarts into bundler to detect the difference. My guess is that we will use a short-term fix like the first option while working on a longer-term fix like the last option.
Work Time
As I said in the original Kickstarter, I planned to take time off from work to work on the project. I have structured that time by working mornings on Tokaido, and shifting client work to the afternoons and evenings.
I don’t have a specific ship-date in mind for Tokaido yet, but you should start seeing more work-product as the project progresses, starting over the next few weeks.
Thanks for your patience. It shall be rewarded!
Tokaido Status Update: Implementation Details
June 5th, 2012
Hey guys!
Since my last update, Tokaido was fully funded, and I’ve been hard at work planning, researching and working on Tokaido.
So far, we have a working binary build of Ruby, but no setup chrome. Because the binary build already exists, Terence Lee was able to experiment with it at a recent Rails Girls event, with great success:
Our @hone02 made such improvements to the installment last night with @wycats that everyone’s done and mingling! #awesome #railsgirls_ams
— Rails Girls (@railsgirls) May 29, 2012
Many thanks to Terence for putting together a simple installer script that we could use to test whether the core build worked on a wide variety of OSX systems.
One thing that I mentioned in my original proposal was a desire to work closely with others working on related projects. Very soon after my project was announced, I teamed up with Michal Papis of the rvm team to make the core statically compiled distribution something that would work outside of the GUI part of Tokaido.
We decided to use the sm scripting framework to build Tokaido, to make it easy to share code between rvm2, Tokaido, and the Unix Rails Installer. The majority of the work I have done so far has been in researching how to properly build a portable Ruby, and working with Michal to build the solution in terms of the sm framework. The rest of this blog post discusses the results of that research, for those interested.
The discussion in this blog post is specific to Mac OSX.
Portable Build
The hardest technical part of the project is creating a portable binary build of Ruby that can be moved around to various machines. What do I mean by that?
When you compile Ruby using the normal ./configure && make, the resulting binary is not portable between machines for a number of reasons.
Hard-Coded Paths
By default, the compiled Ruby comes with a binary .bundle file for each compiled part of the standard library. For example, Aaron Patterson wrote the psych library as a C library. When you compile Ruby, you get psych.bundle as part of the distribution. When a Ruby program invokes require "psych", the system’s dynamic loader will load in psych.bundle.
By default, Ruby hard-codes the path to the Ruby dynamic library (libruby.1.9.1.dylib) into psych.bundle. Since Psych uses C functions from Ruby, this path is used by the dynamic loader to make sure that the required Ruby dependency is available. We can use a tool called otool to see all of the dependencies for a particular library:
$ otool -L psych.bundle
/Users/wycats/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/x86_64-darwin11.3.0/psych.bundle:
    /Users/wycats/.rvm/rubies/ruby-1.9.3-p194/lib/libruby.1.9.1.dylib (compatibility version 1.9.1, current version 1.9.1)
    /Users/wycats/.rvm/usr/lib/libyaml-0.2.dylib (compatibility version 3.0.0, current version 3.2.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 159.1.0)
    /usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
The second line in the output references the libruby.1.9.1.dylib using an absolute path on my local machine. If I take these binaries and give them to you, the linker won’t be able to find libruby and won’t load Psych.
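If you want to check a build for this kind of problem automatically, a small script can scan otool output for dependencies outside the system directories. The sketch below works on the sample output from above; `non_portable_dependencies` is a hypothetical helper, and the list of "system" prefixes is my assumption about what is safe to rely on.

```ruby
# Prefixes treated as "ships with the OS" for this sketch (an assumption).
SYSTEM_PREFIXES = ["/usr/lib/", "/System/Library/"].freeze

# Returns the dependency paths from `otool -L` output that point outside
# the system prefixes, i.e. the ones that would break on another machine.
# Lines without a "(compatibility version ...)" suffix, such as the first
# line naming the inspected file, are skipped. Paths containing spaces
# are not handled.
def non_portable_dependencies(otool_output)
  otool_output.lines
              .map { |line| line.strip[/\A(\S+) \(compatibility version/, 1] }
              .compact
              .reject { |path| SYSTEM_PREFIXES.any? { |prefix| path.start_with?(prefix) } }
end

SAMPLE = <<~OUTPUT
  /Users/wycats/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/x86_64-darwin11.3.0/psych.bundle:
      /Users/wycats/.rvm/rubies/ruby-1.9.3-p194/lib/libruby.1.9.1.dylib (compatibility version 1.9.1, current version 1.9.1)
      /Users/wycats/.rvm/usr/lib/libyaml-0.2.dylib (compatibility version 3.0.0, current version 3.2.0)
      /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 159.1.0)
      /usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
OUTPUT

puts non_portable_dependencies(SAMPLE)
```

For the sample, this flags the libruby and libyaml entries and leaves the two /usr/lib libraries alone.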
In addition to the problem with the compiled .bundle files, the compiler also hardcodes the paths in the Ruby binary itself:
$ otool -L `which ruby`
/Users/wycats/.rvm/rubies/ruby-1.9.3-p194/bin/ruby:
    /Users/wycats/.rvm/rubies/ruby-1.9.3-p194/lib/libruby.1.9.1.dylib (compatibility version 1.9.1, current version 1.9.1)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 159.1.0)
    /usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
Finally, the location of the standard library is hardcoded into the Ruby binary:
$ strings /Users/wycats/.rvm/rubies/ruby-1.9.3-p194/lib/libruby.1.9.1.dylib | grep rvm
/Users/wycats/.rvm/rubies/ruby-1.9.3-p194
/Users/wycats/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/site_ruby/1.9.1
/Users/wycats/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/site_ruby/1.9.1/x86_64-darwin11.3.0
/Users/wycats/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/site_ruby
/Users/wycats/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/vendor_ruby/1.9.1
/Users/wycats/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/vendor_ruby/1.9.1/x86_64-darwin11.3.0
/Users/wycats/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/vendor_ruby
/Users/wycats/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1
/Users/wycats/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/x86_64-darwin11.3.0
Fortunately, the C Ruby folks know about this problem, and include an (undocumented?) flag that you can pass to ./configure, --enable-load-relative. This flag fixes the problems with hardcoded paths:
Instead of creating a separate libruby.1.9.1.dylib that the ruby executable links to, this flag includes the compiled binary code inside of the ruby executable.
$ ./configure --enable-load-relative --prefix=/Users/wycats/Code/ruby/build
... snip ...
$ make && make install
... snip ...
$ otool -L build/bin/ruby
build/bin/ruby:
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 159.1.0)
    /usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
You can see that Ruby still links against a few system dynamic libraries. These dynamic libraries are extremely stable in OSX, and aren’t a problem for binary distributions.
In order to enable compilation of native extensions, this build of Ruby distributes an archive file instead of a dylib. As we will see later, the OSX linker knows how to automatically handle this.
This flag also affects psych.bundle:
$ otool -L build/lib/ruby/1.9.1/x86_64-darwin11.4.0/psych.bundle
build/lib/ruby/1.9.1/x86_64-darwin11.4.0/psych.bundle:
    /usr/local/lib/libyaml-0.2.dylib (compatibility version 3.0.0, current version 3.2.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 159.1.0)
    /usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
External Dependencies
In addition to the general problem of hardcoded paths, there’s another issue lurking in the above otool output for Psych. After eliminating the hardcoded path to a local Ruby, we are still left with a link to /usr/local/lib/libyaml-0.2.dylib. Unfortunately, libyaml doesn’t come with OSX, so if I take this distribution of Ruby and hand it off to a fresh system, Psych will fail to find libyaml at runtime and fail to load.
A number of the .bundle files that Ruby ships with have similar external dependencies. In general, these dependencies ship with OSX, but some, like openssl, may not last more than another release or two. In addition, the specific versions of these dependencies shipped with OSX may change over time, possibly resulting in different behavior on different systems.
In general, we can eliminate these problems by including the binaries we need into the .bundle files, instead of asking the operating system’s dynamic loader to find them at runtime.
The OSX linker’s (ld) behavior in this respect is interesting:
- The linker starts with a list of paths to search for libraries
- When compiling a program, it may need a particular dependency (psych needs libyaml)
- It searches through the path for that library. Both libyaml.dylib and libyaml.a will suffice.
- If the linker finds a .dylib first, it will dynamically link that dependency.
- If the linker finds a .a first, it will statically link that dependency. By statically linking, we mean that it simply includes the binary into the outputted compiled file
- If the linker finds a directory containing both a .a and a .dylib, it will dynamically link the dependency
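The search order described in the list above can be modeled in a few lines. This is a simplified model of the behavior as I have described it, not ld's actual algorithm: `resolve_library` is a hypothetical function, and the directory listings are hard-coded data rather than real filesystem lookups.

```ruby
# A simplified model of the search described above, not ld's real algorithm.
# `directories` is an ordered list of [path, filenames] pairs standing in
# for the linker's search path (hypothetical data, no filesystem access).
def resolve_library(name, directories)
  dylib  = "lib#{name}.dylib"
  static = "lib#{name}.a"

  directories.each do |dir, files|
    # Within a single directory, the dynamic library wins over the archive.
    return [:dynamic, File.join(dir, dylib)]  if files.include?(dylib)
    return [:static,  File.join(dir, static)] if files.include?(static)
  end

  nil # not found anywhere on the search path
end

search_path = [
  ["/Users/wycats/Code/ruby/deps", ["libyaml.a"]],
  ["/usr/local/lib",               ["libyaml.a", "libyaml.dylib"]]
]

p resolve_library("yaml", search_path)
p resolve_library("yaml", search_path.reverse)
```

With the deps directory first, the model picks the archive and links statically; with the order reversed, /usr/local/lib is searched first and its .dylib wins. This is why the directory holding the .a files has to come first in the search path, and why it should contain no .dylib.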
In this case, our goal is to get the linker to statically link libyaml. In order to do this, we will need to build a libyaml.a and get the directory containing that file to the front of the linker's path.
In the case of libyaml, getting a .a looks like this:
$ wget http://pyyaml.org/download/libyaml/yaml-0.1.4.tar.gz
... snip ...
$ tar -xzf yaml-0.1.4.tar.gz
$ cd yaml-0.1.4
$ ./configure --disable-shared
... snip ...
$ make
... snip ...
$ otool -L src/.libs/libyaml.a
Archive : src/.libs/libyaml.a
src/.libs/libyaml.a(api.o):
src/.libs/libyaml.a(reader.o):
src/.libs/libyaml.a(scanner.o):
src/.libs/libyaml.a(parser.o):
src/.libs/libyaml.a(loader.o):
src/.libs/libyaml.a(writer.o):
src/.libs/libyaml.a(emitter.o):
src/.libs/libyaml.a(dumper.o):
We now have a libyaml.a. Note that the configure flag for getting a .a for a given library is not particularly standardized. Three popular ones: --static, --enable-static, --disable-shared.
Next, we need to move libyaml.a into a directory with any other .a files we want to use and pass them to the compilation process:
$ LDFLAGS="-L/Users/wycats/Code/ruby/deps" ./configure --enable-load-relative --prefix=/Users/wycats/Code/ruby/build
... snip ...
$ make && make install
... snip ...
$ otool -L build/lib/ruby/1.9.1/x86_64-darwin11.4.0/psych.bundle
build/lib/ruby/1.9.1/x86_64-darwin11.4.0/psych.bundle:
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 159.1.0)
    /usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
And voila! We now have a psych.bundle that does not depend on libyaml.dylib. Instead, libyaml is now included in psych.bundle itself. This moves us a step closer to having a portable Ruby build.
We will want to repeat this process for every part of the Ruby standard library with external dependencies (openssl, readline, and zlib are some others). Even though libyaml is the only library that does not ship with OSX, eliminating external dependencies on the operating system insulates our build from changes that Apple makes in the future. Of the dependencies, OpenSSL is the most problematic, as it has already been deprecated in Lion.
The sm Framework
This is where the sm (scripting management) framework comes into play. The goal of the sm framework is to encapsulate solutions to these concerns into reusable libraries. In particular, it abstracts the idea of downloading and compiling a package, and common requirements, like static compilation.
For example, let's take a look at the libyaml library.
The first important file here is config/defaults:
version=0.1.4
base_url=http://pyyaml.org/download/libyaml
configure_flag_static=--disable-shared
This specifies the current version, the URL to download the tarball from, and importantly for us, the configure flag that libyaml expects in order to build a .a file. We added that third line because we needed it for Tokaido. This satisfies one of the major goals of the project: to get as much of the code as possible into shared code instead of code that is specific to Tokaido.
The other important file in the libyaml library is shell/functions:
#!/bin/sh

libyaml_prefetch()
{
  package define \
    file "yaml-${package_version}.${archive_format}" \
    dir "yaml-${package_version}"
}

libyaml_preconfigure()
{
  os is darwin ||
    autoreconf -is --force > autoreconf.log 2>&1 ||
    __sm.package.error "Autoreconf of ${package_name} ${package_version} failed! " "$PWD/autoreconf.log"
}
The sm framework defines a series of steps that a package install goes through:
#   preinstall
#
#   prefetch
# fetch
#   postfetch
#   preextract
# extract
#   prepatch
# patch
#   preconfigure
# configure
#   postconfigure
#   prebuild
# build
#   preinstall
# install
#   preactivate
# activate
#   postactivate
#
#   postinstall
The indented functions above are user-defined. Functions like fetch and configure are defined by sm.
In our case, the libyaml library defines two of those steps: prefetch and preconfigure. The prefetch function allows us to provide extra information to the fetch method, which specifically allows the prefetch to override the package_file (${package_file:="${package_name}-${package_version}.${archive_format}"}). In our case, even though the package name is libyaml, we want to download the file yaml-0.1.4.tar.gz.
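To illustrate how built-in steps and optional user-defined hooks fit together, here is a small model in Ruby. It is illustrative only: sm itself is a shell framework and this is not its code. `STEPS`, `run_package` and the hook bodies are hypothetical, and the hooks just record what they would do.

```ruby
# Illustrative model only: sm is a shell framework, and this is not its code.
STEPS = %w[fetch extract patch configure build install activate].freeze

# Runs each built-in step in order, calling optional user-defined pre/post
# hooks around it. `hooks` maps hook names (e.g. "prefetch") to callables.
def run_package(hooks, log = [])
  STEPS.each do |step|
    hooks["pre#{step}"]&.call(log)
    log << step # stands in for the framework's own implementation of the step
    hooks["post#{step}"]&.call(log)
  end
  log
end

# A package like libyaml only needs to define the hooks it cares about.
libyaml_hooks = {
  "prefetch"     => ->(log) { log << "prefetch: use yaml-0.1.4.tar.gz" },
  "preconfigure" => ->(log) { log << "preconfigure: autoreconf unless darwin" }
}

puts run_package(libyaml_hooks)
```

The point of the design is that a package only describes where it deviates from the defaults; everything else is inherited from the framework.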
The openssl library is somewhat more complicated. As with libyaml, we needed to teach sm how to install openssl statically. You can check out the commit to see how easy that was.
The great thing about getting this stuff into sm is that there is now a shared resource to answer questions like "how do you statically build openssl". The work I did with Michal to improve the information for the libraries that Ruby depends on can now be used by anyone else trying to build Ruby (or anything else with those dependencies for that matter).
Tokaido is an sm library
Tokaido itself is an sm library! This means that the core Tokaido build will work on Linux, so it can be used to create a standalone distribution of Ruby for Linux, and maybe even the core of a Tokaido for Linux!
The Tokaido package implements a lot of the sm hooks, so check it out to learn more about what you can do using sm's package API.
In my next post, I'll talk about the architecture of the Tokaido UI component.
Tokaido: My Hopes and Dreams
April 13th, 2012
A few weeks ago, I started a Kickstarter project to fund work on a long-term, sustainable binary build of Ruby. The outpouring of support was great, and I have far exceeded my original funding goal. First, I’d like to thank everyone in the community who contributed large and small donations. This Kickstarter couldn’t have been as successful as it has been without the hundreds (650 at latest count!) of individual donations by interested Rubyists.
In this post, I want to talk about what my goals are for this project, and why I think it will be a tool that everyone, myself included, will use.
What is Tokaido
The name “Tokaido” (東海道 in Japanese) comes from the Tōkaidō Shinkansen bullet train line in Japan.
Precompiled, Static Ruby
At its core, Tokaido is a binary distribution of Ruby without any external dependencies on your system. This means that Ruby itself, as well as all of the compiled elements of the standard library, will come in a self-contained directory with no additional requirements.
The binary will also have no hardcoded paths, so it will be possible to move the directory anywhere on the file system and have it continue to work as long as its bin directory is on the $PATH. This is an important goal, as Tokaido will ship as a downloadable .app which should work wherever it is dropped on the file system.
Precompiled Binary Gems
Tokaido will come with all of the necessary gems to use Rails and common Rails extensions. Because some of these gems have external dependencies (for example, nokogiri depends on libxml2), Tokaido will come with precompiled versions of these gems built for OSX that do not depend on external libraries (they will be statically compiled, not dynamically linked).
As part of this project, I plan to work with people who ship common native extensions to help them build and ship binary versions of their gems for OSX without external dependencies. Luis Lavena has been doing amazing work with rake-compiler to make this process easy on gem developers, and Wayne E. Seguin and Michał Papis have been doing great work with sm, which makes it easy to precompile the needed dependencies for inclusion in the precompiled gems. These tools will be essential in the effort to make dependency-free precompiled gems a standard part of the OSX Ruby ecosystem.
I anticipate that gem authors will, in general, start distributing precompiled binary versions of their gems. If, by the time I ship the first version of Tokaido, some important gems do not yet ship as precompiled binaries, Tokaido will bootstrap the process by including the binaries.
Terminal-Based Workflow
Tokaido does not aim to replace the Terminal as the main way to work with a Rails app. It will ship an isolated environment with no external dependencies designed for use in the Terminal. The application UI will supplement, rather than replace, the normal application workflow. This is a crucial part of the overall goal to make Tokaido an excellent tool for experienced Rails developers, intermediate Rails developers, and totally new Rails developers alike.
Because many gems come with executables, and Tokaido couldn’t abstract every possible executable even if it wanted to, it is essential that new developers get used to using the Terminal as early as possible in their Rails experience, but in a way that minimizes unnecessary errors.
Extras: Code and App Health
A number of really great tools exist to help Rails applications remain healthy:
- bundle outdated lets you know when new versions of gems are available
- rails_best_practices helps find common mistakes in Rails applications
- reek, flog, flay, roodi and other metrics tools help identify Ruby smells and prioritize areas where you can reduce your technical debt
- simplecov does coverage analysis on your test suite to identify areas lacking good coverage
- bullet identifies N+1 queries and unused eager loading
- ActiveSupport::Notifications provides introspection information about Rails applications
The Tokaido UI will attempt to centralize this useful information into a health check, and help you drill in if you want to do more with the information. Because they are so useful, it will encourage you to use these tools in new applications, and provide immediate rewards if you do.
Extras: yourapp.dev
The pow dev server and PassengerPane make it possible to avoid manually starting a Rails server, and give you a friendly alias for your application at yourapp.dev instead of having to type in a port on localhost.
Tokaido will come with integrated support for this, so every app you manage through Tokaido will have an aliased development server out of the box.
Extras: rails:// (or ruby://) Protocol Handler
Tokaido will come with a protocol handler that will allow .gem files hosted on rubygems.org or other locations to be added to an app managed by Tokaido. It may also allow web sites to host installation templates that could execute against a maintained application. This will require coordination with rubygems.org, and the specifics of it may change over time, but the basic idea is to enable communication between web sites and the Tokaido app.
This idea came from a number of community members, and the specifics (especially around security and the protocol specification) definitely need fleshing out, but it seems extremely promising.
Extras: Ruby Toolbox Integration?
Ruby Toolbox has emerged as an amazing directory of Ruby libraries, categorized by type. It also has up-to-date information on Github activity and other useful indicators when evaluating tools. Several people have asked for integration with Ruby Toolbox, and assuming the author is willing, Tokaido will make it easy to use the information in Ruby Toolbox to bootstrap an app and add new functionality as your app grows.
Finding the right gem for the job is easy if you know what you’re looking for, but even experienced developers find the Ruby Toolbox useful when researching tools in an unfamiliar area.
Goals of Tokaido
Eliminate Failure Scenarios
The primary goal of Tokaido is to build a distribution of Ruby that eliminates the main failure scenarios that people encounter when trying to install Ruby. In general, things fail because of conflicts with the existing system environment (or the user’s environment). These failures do not happen to everyone, but when they happen, they are extremely difficult to recover from. This has happened to me personally and to people I have tried to get started with Ruby.
This sort of thing does not necessarily happen on every installation, but once you start going down the rabbit hole, it can be very difficult to find your way out. The environment difficulties can be caused by anything from an old MacPorts installation to a mistaken attempt to install something to the system once upon a time to something failing during a previous step in the installation process.
It also may not be enough to install a precompiled Ruby into the system, because the system environment itself may be corrupted with things like erroneous executables on the $PATH or bad dynamic libraries that get linked during native compilation. Also, later installations may do damage to the system-installed Ruby. Tokaido is a standalone environment that is loaded on top of an existing session, and therefore minimizes the possible damage that load order or subsequent installs can do.
Precompile Everything
In order to eliminate a certain class of failure scenarios, Tokaido will ship with a precompiled Ruby, which will eliminate the possibility of compilation errors when installing Ruby. This precompiled Ruby will also come with all of the dependencies it needs, like zlib, yaml and others, instead of allowing the system to try to find them at runtime. This will eliminate the possibility that changes to the system environment will cause a normally working version of Ruby to fail in some scenarios.
As above, Tokaido will also use this technique for commonly used native gems. For example, Tokaido will ship with a precompiled version of Nokogiri that comes with libxml, instead of relying on the system’s copy of libxml (incidentally, the system’s libxml has occasionally been subtly broken, necessitating installation via homebrew). I expect that this will happen because gem authors will start shipping precompiled versions of their gems for OSX. If there are still a few common gems straggling by the time Tokaido ships, we’ll bootstrap the process by shipping with binary gems we compile ourselves.
Use Tokaido Myself
Since Tokaido does not fundamentally alter a developer’s relationship with Ruby, I expect to be able to start using it for day-to-day Rails development. Some of the additional extras, like app health and built-in server support, are things I already do manually and will enjoy having as part of a larger tool. I’m also extremely interested in ideas for other extras that can add value for experienced developers without altering how people work with Ruby today.
Integration Testing
One of the coolest, unheralded things about the Rails project is the integration suite for Rails put together by Sam Ruby. It essentially transcribes the Agile Web Development With Rails book into a series of very complete integration tests for Rails. When refactoring Rails for Rails 3, this suite was extremely useful, and helped keep the number of unintentional changes to Rails to a tiny number. It also kept us honest about intentional changes.
This suite is perfect for Tokaido, as it tests every aspect of Rails, which itself touches a wide swath of Ruby. To start, Tokaido will use this suite for integration testing.
Collaboration
There are several other projects, notably rvm, trying to solve problems in a similar space. Especially when it comes to precompilation, there is no reason not to work together on the core infrastructure that powers these efforts. I have already started to work with Michał Papis, who also works on sm and rvm, on shared work that can be useful for both projects. He has been teaching me all about the work that folks associated with rvm have done to simplify things like downloading and compiling libz.a, a prerequisite for precompiled Rubies.
I will also work with authors of native gems to start shipping precompiled binary gems for OSX. I have already started working with Aaron Patterson, who works on the sqlite and nokogiri gems, and have started reaching out to several others.
I am very interested in working together with anyone working on a similar area so that the tools we are all working on can be shared between projects. This is a core mission of the Tokaido project, and something that the extra funding I got will allow me to prioritize.
Migration to “System”
Also through Michał Papis, I learned about an awesome new (still experimental) feature in rvm called mount that will mount an external Ruby into the rvm system. I will make sure that the Tokaido ruby can be added to an rvm installation. This will mean that if someone wants to migrate from Tokaido to a more advanced rvm setup, they can take their Ruby environment with them with no trouble.
I would be open to other approaches to migrating a Tokaido environment to the system as well. The goal would be to seamlessly migrate an environment to the system without introducing opportunities for failure during that process.
Looking Forward
I’m really excited about the work I’ve been doing to prepare for this project, and I’m looking forward to shipping a great environment for OSX Ruby users. I have also really enjoyed reaching out to others working in similar areas, and the prospect of collaborating with so many smart people on a shared goal.
Thanks so much!
JavaScript Needs Blocks
January 10th, 2012
While reading Hacker News posts about JavaScript, I often come across the misconception that Ruby’s blocks are essentially equivalent to JavaScript’s “first class functions”. Because the ability to pass functions around, especially when you can create them anonymously, is extremely powerful, the fact that both JavaScript and Ruby have a mechanism to do so makes it natural to assume equivalence.
In fact, when people talk about why Ruby’s blocks are different from Python’s functions, they usually talk about anonymity, something that Ruby and JavaScript share, but Python does not have. At first glance, a Ruby block is an “anonymous function” (or colloquially, a “closure”) just as a JavaScript function is one.
This impression, which I admittedly shared in my early days as a Ruby/JavaScript developer, misses an important subtlety that turns out to have large implications. This subtlety is often referred to as “Tennent’s Correspondence Principle”. In short, Tennent’s Correspondence Principle says:
“For a given expression expr, lambda expr should be equivalent.”
This is also known as the principle of abstraction, because it means that it is easy to refactor common code into methods that take a block. For instance, consider the common case of file resource management. Imagine that the block form of File.open didn’t exist in Ruby, and you saw a lot of the following in your code:
    begin
      f = File.open(filename, "r")
      # do something with f
    ensure
      f.close
    end
In general, when you see some code that has the same beginning and end, but a different middle, it is natural to refactor it into a method that takes a block. You would write a method like this:
    def read_file(filename)
      f = File.open(filename, "r")
      yield f
    ensure
      f.close
    end
And you’d refactor instances of the pattern in your code with:
    read_file(filename) do |f|
      # do something with f
    end
In order for this strategy to work, it’s important that the code inside the block look the same after refactoring as before. We can restate the correspondence principle in this case as:
    # do something with f
should be equivalent to:
    do
      # do something with f
    end
At first glance, it looks like this is true in Ruby and JavaScript. For instance, let’s say that what you’re doing with the file is printing its mtime. You can easily refactor the equivalent in JavaScript:
    try {
      // imaginary JS file API
      var f = File.open(filename, "r");
      sys.print(f.mtime);
    } finally {
      f.close();
    }
Into this:
    // read_file takes the filename first, as in the Ruby version
    read_file(filename, function(f) {
      sys.print(f.mtime);
    });
Cases like this, which are in fact quite elegant, give people the mistaken impression that Ruby and JavaScript have a roughly equivalent ability to refactor common functionality into anonymous functions.
However, consider a slightly more complicated example, first in Ruby. We’ll write a simple class that calculates a File’s mtime and retrieves its body:
    class FileInfo
      def initialize(filename)
        @name = filename
      end

      # calculate the File's +mtime+
      def mtime
        f = File.open(@name, "r")
        mtime = mtime_for(f)
        return "too old" if mtime < (Time.now - 1000)
        puts "recent!"
        mtime
      ensure
        f.close
      end

      # retrieve that file's +body+
      def body
        f = File.open(@name, "r")
        f.read
      ensure
        f.close
      end

      # a helper method to retrieve the mtime of a file
      def mtime_for(f)
        File.mtime(f)
      end
    end
We can easily refactor this code using blocks:
    class FileInfo
      def initialize(filename)
        @name = filename
      end

      # refactor the common file management code into a method
      # that takes a block
      def mtime
        with_file do |f|
          mtime = mtime_for(f)
          return "too old" if mtime < (Time.now - 1000)
          puts "recent!"
          mtime
        end
      end

      def body
        with_file { |f| f.read }
      end

      def mtime_for(f)
        File.mtime(f)
      end

    private
      # this method opens a file, calls a block with it, and
      # ensures that the file is closed once the block has
      # finished executing.
      def with_file
        f = File.open(@name, "r")
        yield f
      ensure
        f.close
      end
    end
Again, the important thing to note here is that we could move the code into a block without changing it. Unfortunately, this same case does not work in JavaScript. Let’s first write the equivalent FileInfo class in JavaScript.
    // constructor for the FileInfo class
    FileInfo = function(filename) {
      this.name = filename;
    };

    FileInfo.prototype = {
      // retrieve the file's mtime
      mtime: function() {
        try {
          var f = File.open(this.name, "r");
          var mtime = this.mtimeFor(f);

          if (mtime < new Date() - 1000) {
            return "too old";
          }
          sys.print(mtime);
        } finally {
          f.close();
        }
      },

      // retrieve the file's body
      body: function() {
        try {
          var f = File.open(this.name, "r");
          return f.read();
        } finally {
          f.close();
        }
      },

      // a helper method to retrieve the mtime of a file
      mtimeFor: function(f) {
        return File.mtime(f);
      }
    };
If we try to convert the repeated code into a method that takes a function, the mtime method will look something like:
    function() {
      // refactor the common file management code into a method
      // that takes a block
      this.withFile(function(f) {
        var mtime = this.mtimeFor(f);

        if (mtime < new Date() - 1000) {
          return "too old";
        }
        sys.print(mtime);
      });
    }
There are two very common problems here. First, this has changed contexts. We can fix this by allowing a binding as a second parameter, but it means that every time we refactor to a lambda, we need to accept a binding parameter and pass it in. The var self = this pattern emerged in JavaScript primarily because of the lack of correspondence.
This is annoying, but not deadly. More problematic is the fact that return has changed meaning. Instead of returning from the outer function, it returns from the inner one.
This is the right time for JavaScript lovers (and I write this as a sometimes JavaScript lover myself) to argue that return behaves exactly as intended, and this behavior is simpler and more elegant than the Ruby behavior. That may be true, but it doesn’t alter the fact that this behavior breaks the correspondence principle, with very real consequences.
Instead of effortlessly refactoring code with the same start and end into a function taking a function, JavaScript library authors need to consider the fact that consumers of their APIs will often need to perform some gymnastics when dealing with nested functions. In my experience as an author and consumer of JavaScript libraries, this leads to many cases where it’s just too much bother to provide a nice block-based API.
In order to have a language with return (and possibly super and other similar keywords) that satisfies the correspondence principle, the language must, like Ruby and Smalltalk before it, have a function lambda and a block lambda. Keywords like return always return from the function lambda, even inside of block lambdas nested inside. At first glance, this appears a bit inelegant, and language partisans often accuse Ruby of unnecessarily having two types of “callables”. But in my experience as an author of large libraries in both Ruby and JavaScript, it results in more elegant abstractions in the end.
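To make the distinction concrete, here is a minimal Ruby sketch of the two kinds of callable. The method names are hypothetical and chosen only for illustration; the point is that return inside a block leaves the enclosing method, while return inside a lambda leaves only the lambda.

```ruby
# Hypothetical helper names, for illustration only.

# `return` inside a block returns from the enclosing method.
def first_even_with_block(numbers)
  numbers.each do |n|
    return n if n.even?
  end
  nil
end

# `return` inside a lambda returns only from the lambda itself,
# so the loop keeps going and the method runs to its last line.
def first_even_with_lambda(numbers)
  check = lambda do |n|
    return n if n.even?
    nil
  end
  numbers.each { |n| check.call(n) }
  :finished_loop
end

puts first_even_with_block([1, 3, 4, 5]).inspect   # 4
puts first_even_with_lambda([1, 3, 4, 5]).inspect  # :finished_loop
```

The block version is the one that satisfies the correspondence principle: the body of the loop reads the same whether or not it is wrapped in a block.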
Iterators and Callbacks
It’s worth noting that block lambdas only make sense for functions that take functions and invoke them immediately. In this context, keywords like return, super and Ruby’s yield make sense. These cases include iterators, mutex synchronization and resource management (like the block form of File.open).
In contrast, when functions are used as callbacks, those keywords no longer make sense. What does it mean to return from a function that has already returned? In these cases, typically involving callbacks, function lambdas make a lot of sense. In my view, this explains why JavaScript feels so elegant for evented code that involves a lot of callbacks, but somewhat clunky for the iterator case, and Ruby feels so elegant for the iterator case and somewhat more clunky for the evented case. In Ruby’s case, (again in my opinion), this clunkiness is more from the massively pervasive use of blocks for synchronous code than a real deficiency in its structures.
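The question of what it means to return from a function that has already returned has a concrete answer in Ruby. The sketch below uses two hypothetical helper methods: one hands back a block-style proc, the other a lambda, and each is called after its defining method has finished.

```ruby
# make_block_callback and make_lambda_callback are hypothetical helpers.
def make_block_callback
  proc { return :from_block }
end

def make_lambda_callback
  lambda { return :from_lambda }
end

block_callback  = make_block_callback
lambda_callback = make_lambda_callback

# The method that created the proc has already returned, so there is
# nothing left for `return` to return from: Ruby raises LocalJumpError.
block_outcome =
  begin
    block_callback.call
  rescue LocalJumpError => error
    error.class
  end

# A lambda carries its own return target, so it works as a callback.
lambda_outcome = lambda_callback.call

puts block_outcome.inspect   # LocalJumpError
puts lambda_outcome.inspect  # :from_lambda
```

This is why function lambdas are the natural fit for callbacks that outlive the method that created them.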
Because of these concerns, the ECMA working group responsible for ECMAScript, TC39, is considering adding block lambdas to the language. This would mean that the above example could be refactored to:
    FileInfo = function(name) {
      this.name = name;
    };

    FileInfo.prototype = {
      mtime: function() {
        // use the proposed block syntax, `{ |args| }`.
        this.withFile { |f|
          // in block lambdas, +this+ is unchanged
          var mtime = this.mtimeFor(f);

          if (mtime < new Date() - 1000) {
            // block lambdas return from their nearest function
            return "too old";
          }
          sys.print(mtime);
        }
      },

      body: function() {
        this.withFile { |f| f.read(); }
      },

      mtimeFor: function(f) {
        return File.mtime(f);
      },

      withFile: function(block) {
        try {
          var f = File.open(this.name, "r");
          block(f);
        } finally {
          f.close();
        }
      }
    };
Note that a parallel proposal, which replaces function-scoped var with block-scoped let, will almost certainly be accepted by TC39, which would slightly, but not substantively, change this example. Also note block lambdas automatically return their last statement.
Our experience with Smalltalk and Ruby shows that people do not need to understand the SCARY correspondence principle for a language that satisfies it to yield the desired results. I love the fact that the concept of “iterator” is not built into the language, but is instead a consequence of natural block semantics. This gives Ruby a rich, broadly useful set of built-in iterators, and language users commonly build custom ones. As a JavaScript practitioner, I often run into situations where using a for loop is significantly more straightforward than using forEach, always because of the lack of correspondence between the code inside a built-in for loop and the code inside the function passed to forEach.
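As a rough illustration of a custom iterator in Ruby, the sketch below defines a hypothetical each_pair_of method that yields neighbouring elements, and a caller that breaks out of it early with return, exactly as it could from a built-in loop. Both method names are invented for this example.

```ruby
# each_pair_of is a hypothetical user-defined iterator: it yields
# every pair of neighbouring items to the caller's block.
def each_pair_of(items)
  items.each_cons(2) { |a, b| yield a, b }
end

# Because the block satisfies the correspondence principle, `return`
# exits first_descent directly, even though it runs two blocks deep.
def first_descent(values)
  each_pair_of(values) do |a, b|
    return [a, b] if b < a
  end
  nil
end

puts first_descent([1, 2, 5, 3, 4]).inspect  # [5, 3]
puts first_descent([1, 2, 3]).inspect        # nil
```

Nothing in each_pair_of had to anticipate the early exit; it falls out of ordinary block semantics.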
For the reasons described above, I strongly approve of the block lambda proposal and hope it is adopted.
Amber.js (formerly SproutCore 2.0) is now Ember.js
December 12th, 2011
After we announced Amber.js last week, a number of people brought Amber Smalltalk, a Smalltalk implementation written in JavaScript, to our attention. After some communication with the folks behind Amber Smalltalk, we started a discussion on Hacker News about what we should do.
Most people told us to stick with Amber.js, but a sizable minority told us to come up with a different name. After thinking about it, we didn’t feel good about the conflict and decided to choose a new name.
Henceforth, the project formerly known as SproutCore 2.0 will be known as Ember.js. Our new website is up at http://www.emberjs.com
(and yes, we know this is pretty ridiculous)
Announcing Amber.js
December 8th, 2011
A little over a year ago, I got my first serious glimpse at SproutCore, the JavaScript framework Apple used to build MobileMe (now iCloud). At the time, I had worked extensively with jQuery and Rails on client-side projects, and I had never found the arguments for the “solutions for big apps” very compelling. At the time, most of the arguments (at least within the jQuery community) focused on bringing more object orientation to JavaScript, but I never felt that they offered the layers of abstraction you really want to manage complexity.
When I first started to play with SproutCore, I realized that the bindings and computed properties were what gave it its real power. Bindings and computed properties provide a clean mechanism for building the layers of abstractions that improve the structure of large applications.
But even before I got involved in SproutCore, I had an epiphany one day when playing with Mustache.js. Because Mustache.js was a declarative way of describing a translation from a piece of JSON to HTML, it seemed to me that there was enough information in the template to also update the template when the underlying data changed. Unfortunately, Mustache.js itself lacked the power to implement this idea, and I was still lacking a robust enough observer library.
Not wanting to build an observer library in isolation (and believing that jQuery’s data support would work in a pinch), I started working on the first problem: building a template engine powerful enough to build automatically updating templates. The kernel of the idea for Handlebars (helpers and block helpers as the core primitives) came out of a discussion with Carl Lerche back when we were still at Engine Yard, and I got to work.
When I met SproutCore, I realized that it provided a more powerful observer library than anything I was considering at the time for the data-binding aspect of Handlebars, and that SproutCore’s biggest weakness was the lack of a good templating solution in its view layer. I also rapidly became convinced that bindings and computed properties were a significantly better abstraction, and allowed for hiding much more complexity, than manually binding observers.
After some months of retooling SproutCore with Tom Dale to take advantage of an auto-updating templating solution that fit cleanly into SproutCore’s binding model, we reached a crossroads. SproutCore itself was built from the ground up to provide a desktop-like experience on desktop browsers, and our ultimate plan had started to diverge from the widget-centric focus of many existing users of SproutCore. After a lot of soul-searching, we decided to start from scratch with SproutCore 2.0, taking with us the best, core ideas of SproutCore, but leaving the large, somewhat sprawling codebase behind.
Since early this year, we have worked with several large companies, including ZenDesk, BazaarVoice and LivingSocial, to iterate on the core ideas that we started from to build a powerful framework for building ambitious applications.
Throughout this time, though, we became increasingly convinced that calling what we were building “SproutCore 2.0” was causing a lot of confusion, because SproutCore 1.x was primarily a native-style widget library, while SproutCore 2.0 was a framework for building web-based applications using HTML and CSS for the presentation layer. This lack of overlap causes serious confusion in the IRC room, mailing list, blog, when searching on Google, etc.
To clear things up, we have decided to name the SproutCore-inspired framework we have been building (so far called “SproutCore 2.0”) “Amber.js”. Amber brings a proven MVC architecture to web applications, as well as features that eliminate common boilerplate. If you played with SproutCore and liked the concepts but felt like it was too heavy, give Amber a try. And if you’re a Backbone fan, I think you’ll love how little code you need to write with Amber.
In the next few days, we’ll be launching a new website with examples, documentation, and download links. Stay tuned for further updates soon.
UPDATE: The code for Amber.js is still, as of December 8, hosted at the SproutCore organization. It will be moved and re-namespaced within a few days.
How to Marshal Procs Using Rubinius
November 19th, 2011
The primary reason I enjoy working with Rubinius is that it exposes, to Ruby, much of the internal machinery that controls the runtime semantics of the language. Further, it exposes that machinery primarily in order to enable user-facing semantics that are typically implemented in the host language (C for MRI, C and C++ for MacRuby, Java for JRuby) to be implemented in Ruby itself.
There is, of course, quite a bit of low-level functionality in Rubinius implemented in C++, but a surprising number of things are implemented in pure Ruby.
One example is the Binding object. To create a new binding in Rubinius, you call Binding.setup:
    def self.setup(variables, code, static_scope, recv=nil)
      bind = allocate()

      bind.self = recv || variables.self
      bind.variables = variables
      bind.code = code
      bind.static_scope = static_scope
      return bind
    end
This method takes a number of more primitive constructs, which I will explain as this article progresses, but we can describe the constructs that make up the high-level Ruby Binding in pure Ruby.
In fact, Rubinius implements Kernel#binding itself in terms of Binding.setup.
    def binding
      return Binding.setup(
        Rubinius::VariableScope.of_sender,
        Rubinius::CompiledMethod.of_sender,
        Rubinius::StaticScope.of_sender,
        self)
    end
Yes, you’re reading that right. Rubinius exposes the ability to extract the constructs that make up a binding, one at a time, from a caller’s scope. And this is not just a hack (like Binding.of_caller for a short time in MRI). It’s core to how Rubinius manages eval, which of course makes heavy use of bindings.
Marshalling Procs
For a while, I have wanted the ability to Marshal.dump a proc in Ruby. MRI has historically disallowed it, but there’s nothing conceptually impossible about it. A proc itself is a blob of executable code, a local variable scope (which is just a bunch of pointers to other objects), and a constant lookup scope. Rubinius exposes each of these constructs to Ruby, so Marshaling a proc simply means figuring out how to Marshal each of these constructs.
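For reference, this is what the restriction looks like from plain Ruby. The snippet is a small sketch that assumes it is run on MRI, where Marshal.dump refuses a Proc with a TypeError; the variable names are illustrative.

```ruby
# Assumes MRI: Marshal.dump raises TypeError when handed a Proc.
square = proc { |x| x * x }

outcome =
  begin
    Marshal.dump(square)
    :dumped
  rescue TypeError => error
    :refused
  end

puts outcome.inspect
```
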
Let’s take a quick detour to learn about the constructs in question.
Rubinius::StaticScope
Rubinius represents Ruby’s constant lookup scope as a Rubinius::StaticScope object. Perhaps the easiest way to understand it would be to look at Ruby’s built-in Module.nesting function.
    module Foo
      p Module.nesting

      module Bar
        p Module.nesting
      end
    end

    module Foo::Bar
      p Module.nesting
    end

    # Output:
    # [Foo]
    # [Foo::Bar, Foo]
    # [Foo::Bar]
Every execution context in Rubinius has a Rubinius::StaticScope, which may optionally have a parent scope. In general, the top static scope (the static scope with no parent) in any execution context is Object.
Because Rubinius allows us to get the static scope of a calling method, we can implement Module.nesting in Rubinius:
    def nesting
      scope = Rubinius::StaticScope.of_sender
      nesting = []
      while scope and scope.module != Object
        nesting << scope.module
        scope = scope.parent
      end
      nesting
    end
A static scope also has an additional property called current_module, which is used during class_eval to define which module the runtime should add new methods to.
Adding Marshal.dump support to a static scope is therefore quite easy:
    class Rubinius::StaticScope
      def marshal_dump
        [@module, @current_module, @parent]
      end

      def marshal_load(array)
        @module, @current_module, @parent = array
      end
    end
These three instance variables are defined as Rubinius slots, which means that they are fully accessible to Ruby as instance variables, but don’t show up in the instance_variables list. As a result, we need to explicitly dump the instance variables that we care about and reload them later.
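The same marshal_dump and marshal_load hooks work on any Ruby class, which makes them easy to try outside Rubinius. The sketch below uses a hypothetical Scope class (not the Rubinius one) that explicitly chooses which instance variables to dump and rebuilds the rest on load.

```ruby
# Scope is a hypothetical stand-in class, not Rubinius::StaticScope.
class Scope
  attr_reader :name, :parent, :cache

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
    @cache = {}        # runtime-only state we do not want to dump
  end

  # Marshal calls this to ask which state should be serialized.
  def marshal_dump
    [@name, @parent]
  end

  # Marshal calls this on a freshly allocated object when loading.
  def marshal_load(array)
    @name, @parent = array
    @cache = {}
  end
end

root  = Scope.new("Object")
child = Scope.new("Foo", root)
child.cache[:hits] = 3

copy = Marshal.load(Marshal.dump(child))
puts copy.name          # Foo
puts copy.parent.name   # Object
puts copy.cache.inspect # {}
```

Note that the parent is dumped recursively through the same hooks, which is what lets a chain of scopes survive a round trip.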
Rubinius::CompiledMethod
A compiled method holds the information necessary to execute a blob of Ruby code. Some important parts of a compiled method are its instruction sequence (a list of the compiled instructions for the code), a list of any literals it has access to, names of local variables, its method signature, and a number of other important characteristics.
It’s actually quite a complex structure, but Rubinius already knows how to convert an in-memory CompiledMethod into a String, as it dumps compiled Ruby files into compiled files as part of its normal operation. There is one small caveat: this String form that Rubinius uses for its compiled method does not include its static scope, so we will need to include the static scope separately in the marshaled form. Since we already told Rubinius how to marshal a static scope, this is easy.
    class Rubinius::CompiledMethod
      def _dump(depth)
        Marshal.dump([@scope, Rubinius::CompiledFile::Marshal.new.marshal(self)])
      end

      def self._load(string)
        scope, dump = Marshal.load(string)
        cm = Rubinius::CompiledFile::Marshal.new.unmarshal(dump)
        cm.scope = scope
        cm
      end
    end
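The _dump and _load pair is the other custom serialization hook Marshal supports: _dump must return a String, and the class-level _load rebuilds an object from it. Here is a minimal plain-Ruby sketch using a hypothetical Blob class, which nests a second Marshal.dump inside _dump in the same way the code above does.

```ruby
# Blob is a hypothetical class used only to illustrate _dump/_load.
class Blob
  attr_reader :label, :bytes

  def initialize(label, bytes)
    @label = label
    @bytes = bytes
  end

  # Must return a String; here we nest another Marshal.dump so that
  # several pieces of state travel together.
  def _dump(depth)
    Marshal.dump([@label, @bytes])
  end

  # Receives the String produced by _dump and builds a new object.
  def self._load(string)
    label, bytes = Marshal.load(string)
    new(label, bytes)
  end
end

original = Blob.new("header", [1, 2, 3])
restored = Marshal.load(Marshal.dump(original))

puts restored.label          # header
puts restored.bytes.inspect  # [1, 2, 3]
```

Unlike marshal_load, which fills in an already allocated instance, _load is responsible for constructing the object itself.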
Rubinius::VariableScope
A variable scope represents the state of the current execution context. It contains all of the local variables in the current scope, the execution context currently in scope, the current self, and several other characteristics.
I wrote about the variable scope before. It’s one of my favorite Rubinius constructs, because it provides a ton of useful runtime information to Ruby that is usually locked away inside the native implementation.
Dumping and loading the VariableScope is also easy:
    class VariableScope
      def _dump(depth)
        Marshal.dump([@method, @module, @parent, @self, nil, locals])
      end

      def self._load(string)
        VariableScope.synthesize *Marshal.load(string)
      end
    end
The synthesize method is new to Rubinius master; getting a new variable scope previously required synthesizing its locals using class_eval, and the new method is better.
Rubinius::BlockEnvironment
A Proc is basically nothing but a wrapper around a Rubinius::BlockEnvironment, which wraps up all of the objects we’ve been working with so far. Its scope attribute is a VariableScope and its code attribute is a CompiledMethod.
Dumping it should be quite familiar by now.
class BlockEnvironment
  def marshal_dump
    [@scope, @code]
  end

  def marshal_load(array)
    scope, code = *array
    under_context scope, code
  end
end
The only thing new here is the under_context method, which gives a BlockEnvironment its variable scope and compiled method. Note that we dumped the static scope along with the compiled method above.
Proc
Finally, a Proc is just a wrapper around a BlockEnvironment, so dumping it is easy:
class Proc
  def _dump(depth)
    Marshal.dump(@block)
  end

  def self._load(string)
    block = Marshal.load(string)
    self.__from_block__(block)
  end
end
The __from_block__ method constructs a new Proc from a BlockEnvironment.
So there you have it: dumping and reloading Proc objects in pure Ruby using Rubinius! The full source is at https://gist.github.com/1378518.
A Proposal for ES.next Proposals
September 13th, 2011
Over the past few years, I have occasionally expressed frustration (in public and private) about the process for approving new features for the next edition of ECMAScript. In short, the process is extremely academic in nature, and is peppered with inside-baseball terms that make it nearly impossible for lay developers to provide feedback about proposed new features. This frustration was usually met with the assumption that the current process works the way it does for good reason, and that academic descriptions of the new features were the correct (and only) way to properly discuss them.
I have nothing against new features being described in the language of implementors, but I would like to propose some additions to the current process that would make it significantly easier for language consumers (especially framework and library implementors) to provide timely feedback about the proposal in the hope of making an impact before it’s too late.
I would like proposals for new features to have the following elements, in addition to whatever elements are already normally included (such as BNF for any new syntax).
What Problem Are We Solving?
At a high level, what can language users do now that they could not do before? In some cases, proposals may provide simpler or more convenient ways to achieve already-possible goals. These kinds of proposals are often just as important. For example, there is a current proposal to provide a new syntax for class creation. In this case, the new class syntax significantly improves the experience of building a common JavaScript construct.
Provide Non-Trivial Example Code Showing the Problem
If the proposal is solving a problem that exists in the wild, it should be possible to identify non-trivial examples of the problem rearing its head. At the very least, the process of identifying or synthesizing these examples will help language users understand what real-world problems the proposal is attempting to solve. At best, finding real-world examples will help refine the proposal at an early stage.
Show How the Example Code is Improved With the Proposal
After identifying or synthesizing example code to illustrate the problem, show how the code would be improved if the proposal were accepted. In some cases, the problem is as simple as “needing a large library to perform this operation” and the solution is “building common functionality into the language”. In an example from a related field, the DOM library, the problem addressed by querySelectorAll was “many incompatible implementations of a CSS3 selector library”. The solution was to build the functionality into the browser.
In this case, a mistake in the querySelectorAll specification, which was resolved by the addition of queryScopedSelectorAll in the Selectors API Level 2, could have been addressed ahead of time by evaluating real-life code that used selector engine libraries. Of course, the DOM library is not the same as the language specification, so the example is merely an analogy.
What are Alternate ES3 (or ES5) Solutions to the Same Problem?
Simply enumerate the ways that existing JavaScript developers have attempted to resolve the problem without language support. In small language changes, this overlaps considerably with the previous questions. By having proposal authors enumerate existing solutions to the problem, it will be easier for language users to identify the scope of the solution.
This will allow language users to provide feedback about how the solution on offer stacks up compared to existing pure-JS solutions.
Are there any restrictions that do not exist in original pure-JS solutions?
Are there any restrictions in the proposal that limit its utility as a solution to the problem in question, especially restrictions that do not apply to solutions currently used by language users?
If the Proposal is For New Syntax, What if Anything Does it Desugar To?
Also, if the proposal desugars, why choose this particular desugaring as opposed to alternatives?
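As a rough illustration of what a desugaring answer could look like, here is a sketch for a class-like syntax. The syntax in the comment is hypothetical and is not taken from any specific proposal, and the ES5 code below it is only one of several plausible desugarings.

```javascript
// Hypothetical new syntax (kept in a comment, since it does not parse in ES5):
//
//   class Point {
//     constructor(x, y) { this.x = x; this.y = y; }
//     norm() { return Math.sqrt(this.x * this.x + this.y * this.y); }
//   }
//
// One possible ES5 desugaring: a constructor function plus prototype methods.
function Point(x, y) {
  this.x = x;
  this.y = y;
}

Point.prototype.norm = function () {
  return Math.sqrt(this.x * this.x + this.y * this.y);
};

// An alternative desugaring could define `norm` as a non-enumerable
// property instead. The choice is observable: with the assignment above,
// a for-in loop over an instance also visits `norm`.
var p = new Point(3, 4);
```

Spelling out the desugaring this way makes the trade-off between alternatives (here, enumerable versus non-enumerable methods) something language users can comment on directly.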
If New Syntax, Can it Be Compiled to ES3 (or ES5)?
If the proposal can desugar to an older version of the specification, can a source-to-source translator be written? If so, is there a reference implementation of a source-to-source translator written in that version?
By writing such a source-to-source translator, existing language users can experiment with the new syntax easily in a browser environment without requiring a separate compilation pass. This also allows users to build an in-browser translation UI (similar to try CoffeeScript), which can improve general understanding of the new syntax and produce important feedback.
To be more specific, what I would like to see here is a general-purpose source-to-source translation engine written in ES3 with a mechanism for plugging in translation passes for specific features. If new features come with translation passes, it would be trivial for language users to try out new features incrementally in production applications (with a nice development-time workflow). This would provide usability feedback at an early enough stage for it to be useful.
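A minimal sketch of the kind of pluggable engine described above might look like the following. Every name here (createTranslator, addPass, translate, unlessPass) is hypothetical, as is the `unless` statement used as the toy feature. A real pass would operate on a parsed syntax tree, not on a regular expression.

```javascript
// Build a translator that runs registered passes in order.
function createTranslator() {
  var passes = [];
  return {
    // Register a translation pass: a function from source text to source text.
    addPass: function (name, fn) {
      passes.push({ name: name, fn: fn });
      return this; // allow chaining
    },
    // Feed the source through every registered pass, in registration order.
    translate: function (source) {
      var out = source;
      for (var i = 0; i < passes.length; i++) {
        out = passes[i].fn(out);
      }
      return out;
    }
  };
}

// Toy pass for a made-up `unless (cond) {` statement. This regex only
// handles conditions without nested parentheses; it is an illustration,
// not a parser.
function unlessPass(source) {
  return source.replace(/\bunless\s*\(([^()]*)\)\s*\{/g, 'if (!($1)) {');
}

var translator = createTranslator().addPass('unless', unlessPass);
var es3 = translator.translate('unless (ready) { wait(); }');
// es3 is now 'if (!(ready)) { wait(); }'
```

Because each feature lives in its own pass, a user could enable only the passes for the proposals they want to try.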
If a New Standard Library, Can it Be Polyfilled to ES3 (or ES5)?
If the proposal is for a new library whose syntax is valid in an earlier version of the specification, can it be implemented in terms of the existing primitives available in that version? If necessary, primitives not defined by the language, but provided historically by web browsers, can be used instead. The goal is to provide shims for older browsers so that a much broader group of people can experiment with the APIs and provide feedback.
In general, new libraries that parse in older versions of the specifications should also come with a well-defined mechanism to detect whether the feature is present, to make it easy for library and framework vendors, as well as the general public, to opt their users into the new features where appropriate.
Even if a fully backwards-compatible shim cannot be provided, it is still useful to provide a partial implementation together with a feature detection mechanism. At the very least, error-checking shims can be useful, so language users can easily understand the interface to the proposed library.
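To make the shim-plus-detection idea concrete, here is a small sketch using ES5's String.prototype.trim as the feature. The helper names (installIfMissing, trimShim) are hypothetical, and the shim is deliberately partial: it relies on the engine's own `\s` character class, which may not match the exact whitespace set the specification defines.

```javascript
// Partial shim for String.prototype.trim, written with ES3 primitives only.
function trimShim() {
  return String(this).replace(/^\s+|\s+$/g, '');
}

// Feature detection: install `impl` only when the target lacks the method.
// Returns true when the shim was installed, false when a native
// implementation was already present.
function installIfMissing(target, name, impl) {
  if (typeof target[name] === 'function') {
    return false;
  }
  target[name] = impl;
  return true;
}

// On an engine that already ships trim, this is a no-op.
installIfMissing(String.prototype, 'trim', trimShim);
```

Library authors can reuse the same detection check to decide whether to rely on the native feature or fall back to their own implementation.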
Does the Proposal Change any Existing Semantics?
In some cases, the proposal unavoidably changes existing semantics. For example, ES5 changed the semantics of an indirect call to eval (a call to an alias to eval, such as x = eval; x('some code')) to use the global environment for the evaluated code. In ES3, indirect calls to eval behaved the same as direct calls to eval.
These cases are rare, and in most cases, require an explicit opt-in (such as the directive "use strict";).
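The indirect eval change can be observed with a few lines of code. This is a small sketch that assumes an ES5-compliant engine and assumes no global variable named `secretLocal` exists. The function and variable names are made up for the example.

```javascript
function compareEvals() {
  var secretLocal = 'visible only inside compareEvals';

  // Direct call: the evaluated code runs in this function's scope,
  // so it can see `secretLocal`.
  var direct = eval('typeof secretLocal');

  // Indirect call through an alias: under ES5 the code runs in the
  // global environment, where `secretLocal` is not defined.
  var aliasedEval = eval;
  var indirect = aliasedEval('typeof secretLocal');

  return { direct: direct, indirect: indirect };
}

var result = compareEvals();
// result.direct is 'string'; result.indirect is 'undefined'
```

Code that depended on an aliased eval seeing local variables would silently change behavior, which is exactly the kind of impact a proposal should call out.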
When such changes are made, especially when they do not require an opt-in, they should be explicitly called out in the proposal to gather feedback about their likely impact on existing code. Even when they require an opt-in, information about the frequency of their use could be useful to assess the difficulty of opting in.
Since these opt-ins often enable new features as well as changing existing semantics, understanding the impact of the opt-in on existing code would help language users assess their overall utility and timeframe for adoption. This information could also help drive these decisions.
