Some folks have noticed a new landing page for dojotoolkit.org, one that includes hard numbers about the performance of Dojo vs. jQuery. Every library makes tradeoffs for speed in order to provide better APIs, but JavaScript toolkit performance shootouts obscure that reality more often than not. After all, there would hardly be a need for toolkits if the built-in APIs were livable. Our new site isn't arguing that Dojo gives you the fastest possible way to do each of the tasks in the benchmark; all we argue is that we provide the fastest implementation that you'll love using.
Smaller is better.
I gathered the numbers and stand behind them, so let me quickly outline where they come from, why they're fair, and why they matter to your app.
I took the average of three separate runs of the TaskSpeed benchmark, comparing the latest versions of Dojo and jQuery. The numbers were collected in isolated VMs on a system doing little else. You may not be able to reproduce the exact numbers, but across a similar set of runs, the relative timings should be representative.
So why is TaskSpeed a fair measuring stick? First, it performs representative tasks, and the runtime harness is calibrated to ensure statistically significant results. Second, the versions of the code for each library are written by the library authors themselves: the Dojo team contributed the Dojo versions of the baseline tasks and the jQuery team contributed theirs. If any library wants to take issue with the tests or the results, they only need to send Pete a patch. Last, the tests run in relative isolation in iframes. This isn't bulletproof -- GC interactions can do strange things and I've argued for longer runs -- but it's pretty good as these things go. I took averages of multiple runs in part to hedge against these problems.
The comparison to jQuery is fair on the basis of syntax and market share. If you compare the syntax used for Dojo's tests with the jQuery versions, you'll see that they're similarly terse and provide analogous conveniences for DOM manipulation, though the Dojo versions lose the brevity race in a few places. That's the price of speed, and TaskSpeed makes those design decisions clear. As for market share, I'll let John do the talking. It would be foolish of me to suggest that we should be comparing Dojo to some other library without simultaneously suggesting that his market share numbers are wrong, and I doubt they are.
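For a concrete sense of what "similarly terse" means, here's a rough sketch of a TaskSpeed-ish task (create some nodes, then restyle them) written both ways. This is not the actual test code from the benchmark -- just an illustration of the flavor of each API, assuming both libraries are loaded on the page:

```javascript
// jQuery flavor: build a list, append it to the page, then restyle matching nodes.
var $ul = jQuery("<ul class='fromjq'></ul>").appendTo("body");
for (var i = 0; i < 20; i++) {
  jQuery("<li>item " + i + "</li>").appendTo($ul);
}
jQuery("ul.fromjq li").addClass("highlight");

// Dojo flavor: the same task using dojo.create and dojo.query (Dojo 1.x core APIs).
var ul = dojo.create("ul", { className: "fromdojo" }, dojo.body());
for (var i = 0; i < 20; i++) {
  dojo.create("li", { innerHTML: "item " + i }, ul);
}
dojo.query("ul.fromdojo li").addClass("highlight");
```

Neither version is meaningfully harder to read or write than the other; the differences that matter show up in the timings, not the syntax.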
Given all of that, do the TaskSpeed numbers actually matter for application performance? I argue that they do for two reasons. First, TaskSpeed is explicitly designed to capture common-case web development tasks. You might argue that the weightings should be different (a discussion I'd like to see happen more openly), but it's much harder to argue that the tests do things that real applications don't. Because the toolkit teams contributed the test implementations, they provide a view to how developers should approach a task using a particular library. It's also reasonable to suspect that they demonstrate the fastest way in each library to accomplish each task. It's a benchmark, after all. This dynamic makes plain the tradeoffs between speed and convenience in API design, leaving you to make informed decisions based on the costs and benefits of convenience. The APIs, after all, are the mast your application will be lashed to.
I encourage you to go run the numbers for yourself, investigate each library's contributed tests to get a sense for the syntax that each encourages, and get involved in making the benchmarks and the libraries better. That's the approach that the Dojo team has taken, and one that continues to pay off for Dojo's users in the form of deliberately designed APIs and tremendous performance.
I've been invited by Chris Messina and some kindly folks at MSFT to participate in a panel at this year's SxSW regarding the value and/or necessity of view-source, and so with apologies to my fellow panelists, I want to get the conversation started early.
First, my position: ceteris paribus, view-source was necessary (but not sufficient) to make HTML the dominant application platform of our times. I also hold that it is under attack — not least of all from within — and that losing view-source poses a significant danger to the overall health of the web.
That's a lot to hang on the shoulders of a relatively innocent design decision, and I don't mean to imply that any system that has a view-source like feature will become dominant. But I do argue that it helps, particularly when coupled with complementary features like reliable parsing, semantic-ish markup, and plain-text content. Perhaps it's moving the goal line a bit, but when I talk about the importance of view-source, I'm more often than not discussing these properties together.
To understand the importance of view-source, consider how people learn. Some evidence exists that even trained software engineers choose to work with copy-and-pasted example code. Participants in the linked study even expressed guilt over the copy-paste-tweak method of learning, but guilt didn't change the dynamic: a blank slate and abstract documentation don't facilitate learning nearly as well as poking at an example and feeling out the edges by doing. View-source provides a powerful catalyst for creating a culture of shared learning and learning-by-doing, which in turn helps developers formulate a mental model of the relationship between input and output faster. Web developers get started by taking some code, pasting it into a file, saving, loading it in a browser, and hitting ctrl-r. They switch between editor and browser for even the most minor changes. This is a stark contrast with technologies that impose a compilation step, where the process of seeing what was done requires an intermediate stage. In other words, immediacy of output helps build an understanding of how the system will behave, and ctrl-r becomes a seductive and productive way for developers to accelerate their learning in the copy-paste-tweak loop. The only required equipment is a text editor and a web browser, tools that are free and work together instantly. That is to say, there's no waiting between when you save the file to disk and when you can view the results. It's just a ctrl-r away.
With that hyper-productive workflow as the background, view-source helps turn the entire web into a giant learning lab, and one that's remarkably resilient to error and experimentation. See an interesting technique or layout? No one can tell you "no" to figuring out how it was done. Copy some of it, paste it into your document, and you'll get something out the other side. Because browsers recover from errors gracefully, the result is a welcoming learning environment, free of the sense of inadequacy that a compile failure tends to evoke. You can see what went wrong as often as not. The evolutionary advantages of reliable parsing have helped to ensure that strict XML content comprises roughly none of the web, a decade after it was recognized as "better" by world+dog. Even the most sophisticated (or broken) content is inspectable at the layout level, and tools like Firebug and the Web Inspector accelerate the copy-paste-tweak cycle by inspecting dynamic content and allowing live changes without reloads, even on pages you don't "own". The predicate to these incredibly powerful tools is the textual, interpreted nature of HTML. There's much more to say about this, but let's instead turn to the platform's relative weaknesses as a way of understanding how view-source is easily omitted from competing technologies.
The first, and most obvious, downside to the open-by-default nature of the web is that it encourages multiple renderers. Combined with the ambiguities of reliable parsing and semantics that leave room for interpretation, it's no wonder that web developers struggle through incompatibilities. In a world where individual users each need to be convinced to upgrade to the newest version of even a single renderer, differences only in version can wreak havoc in the development process. Things that work in one place may not look exactly the same in another. This is both a strength and a weakness for the platform, but at the level of sophisticated applications, it's squarely a liability. Next, ambiguities in interpretation and semantics mean that the project of creating tooling for the platform is significantly more complex. If only one viewer is prevalent (for whatever reason), then tools only need to consume and generate code for the constraints, quirks, and performance characteristics of a single runtime. Alternate forms of this simplification include only allowing code (not markup) so as to eliminate parsing ambiguity. The code-not-markup approach yields a potentially more flexible platform and one that can begin to execute content more quickly (as Flash does). These advantages, taken together, can create an incredibly productive environment for experts in the tools that generate content: no output ambiguity, better performance, and tools that can deliver true WYSIWYG authoring. These tools can sidestep the ctrl-r cycle entirely.
"But wait," I hear you shout, "it's possible to do code-only, toolable, full-fidelity development in JavaScript!" Tools like GWT and Cappuccino generate code that generates UI, ensuring that only those who can write code (or who have tools that can) will participate, and removing the potential value of view-source for those apps. But let's be honest: view-source is almost never locally beneficial. I can hardly count the number of times I've seen the "how do I hide my code?" question from a web n00b who (rightly or wrongly) imagines there's value in it. For GWT, the fact that the output is an HTML DOM that's styled with CSS is as much annoyance as benefit. The big upside is that browsers are the dominant platform and you don't have to convince users to install some new runtime.
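To make the contrast concrete, here's a minimal sketch of "code that generates UI" -- plain DOM JavaScript, not the output of GWT or Cappuccino, and with made-up names -- where the structure exists only as imperative calls, so there's no markup for view-source to reveal and nothing for a non-programmer to latch onto:

```javascript
// A hypothetical UI built entirely from code. The served page contains only a
// script tag; the structure below never appears as markup anyone could learn from.
function buildPanel(titleText, items) {
  var panel = document.createElement("div");
  panel.className = "panel";

  var title = document.createElement("h2");
  title.appendChild(document.createTextNode(titleText));
  panel.appendChild(title);

  var list = document.createElement("ul");
  for (var i = 0; i < items.length; i++) {
    var li = document.createElement("li");
    li.appendChild(document.createTextNode(items[i]));
    list.appendChild(li);
  }
  panel.appendChild(list);
  return panel;
}

document.body.appendChild(buildPanel("Inbox", ["one", "two", "three"]));
```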
Similarly, Flex, Laszlo, GWT's UI Binder, and Silverlight have discovered the value of markup as a simple, declarative way for developers to understand the hierarchical relationships between components, but their tags correspond to completely unambiguous definitions of components, and they rely on compiled code — not reliably parsed markup — for final delivery of the UI. These tight contracts turn into an evolutionary straitjacket. Great if you're shipping compiled code down the wire that can meet the contract, but death if those tags and attributes are designed to live for long periods of time or across multiple implementations. You might be able to bolt view-source onto the output, but it'll always be optional and ad-hoc, qualities that work against it being pervasive. Put another way, the markup versions of these systems are leaky abstractions over the precise, code-centric system that undergirds both the authoring and runtime environments. This code-centric bias is incredibly powerful for toolmakers and "real" developers, but it cuts out others entirely; namely, those who won't "learn to program" or who want to build tools that inject content into the delivery format.
Whatever the strengths of code-based UI systems, they throw web crawlers for a loop. Today, most search engines deal best with text-based formats, and those search engines help make content more valuable in aggregate than it is on its own. Perhaps it's inevitable that crawlers and search engines will need to execute code in order to understand the value of content, but I remain unconvinced. As a thought experiment, consider a web constructed entirely of Flash content. Given that Flash bytecode lacks a standard, semantic way to denote a relationship between bits of Flash content, what parts of the web wouldn't have been built? What bits of your work would you do differently? What would the process be? There's an alternate path forward that suggests that we can upgrade the coarse semantics of the web to deal with ever-more-sophisticated content requirements. Or, put another way, use the features of today's toolkits and code generators as a TODO list for markup-driven features. But the jury is still out on the viability of that approach; the same dynamic that makes multiple renderers possible ensures that getting them to move in a coordinated way is much harder than the unilateral feature roadmap that plugin vendors enjoy. HTML 5 and CSS 3 work is restarting those efforts, but only time will tell if we can put down the code and pick markup back up as a means to express ourselves.
I've glossed over a lot of details here, and I haven't discussed the implications for the server side of a non-text format as our lingua franca, nor have I dug into the evolution metaphor. Many of the arguments are likewise conditional on economic assumptions. There's lots of discussion yet to have, so if you've got links to concrete research in either direction, or have an experience that bears on the debate, post in the comments! Hopefully my fellow panelists will respond in blog form, and I'll update this post when they do.
It's that time of year again, and this year OSCON is back in Portland! The deadline for submitting your talk is Feb 1st, so if you've been building or learning awesome Open Source technology, don't hesitate to get your proposal in. Remember that the process is competitive, so write your proposal with an eye toward what works and make sure to get it in before the deadline. Good luck!
Dear Tech Journalist and/or Editor:
Thank you for covering the browser market. Many users don't understand that they have a choice of browser, and by discussing the alternatives you help promote a healthy ecosystem and honest competition. In covering this important topic it's easy to be loose with terms, but some shortcuts go a bridge too far. A few are listed here, along with a rubric to help you understand why they make you (and your esteemed publication) seem less interested in hard facts than I'm sure you are.
- "JavaScript rendering"
- As I'm sure you know, JavaScript (aka "ECMAScript", aka "JScript") is a programming language, not a UI toolkit or rendering technology. Yes, JavaScript drives the UI of many modern web apps like GMail and Google Maps, but it does so through a technology called the DOM. The DOM is not part of JavaScript; it is instead bolted onto JavaScript by the browsers. "JavaScript rendering" would be a nonsensical thing to say even if you were describing the time it takes to build up a user interface. But I rarely (if ever) see such a story. Instead, this rhetorical abomination most often shows up in discussions of JavaScript benchmarks. Those benchmarks work very hard to ensure that they aren't affected by any DOM or UI operations. They test everything but rendering. In your defense, there is a strong correlation between faster JavaScript execution and faster rendering. But they are not the same thing. Best to just stay out of this particular gutter.
Acceptable alternatives: "JavaScript execution", "JavaScript performance", "DOM rendering" (but only when discussing things that measure DOM performance).
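If it helps to see the distinction, here's a rough sketch (not any particular benchmark suite) of the two things being conflated: timing pure JavaScript execution versus timing DOM work.

```javascript
// Hypothetical micro-timings, just to show the two categories.

// "JavaScript execution": pure computation, no DOM involved. This is roughly
// the territory that JavaScript benchmark suites stay inside.
var start = new Date().getTime();
var total = 0;
for (var i = 0; i < 1000000; i++) {
  total += Math.sqrt(i);
}
var jsTime = new Date().getTime() - start;

// "DOM rendering": building and attaching UI, which exercises the browser's
// DOM and layout machinery, not just the JavaScript engine.
start = new Date().getTime();
var container = document.createElement("div");
for (var i = 0; i < 1000; i++) {
  var p = document.createElement("p");
  p.appendChild(document.createTextNode("row " + i));
  container.appendChild(p);
}
document.body.appendChild(container);
var domTime = new Date().getTime() - start;

// jsTime and domTime can move independently: a faster JavaScript engine
// doesn't automatically make the second number smaller.
```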
- "Plugin"
- Strictly speaking, a browser plugin is a bit of native code (written in C or C++) that speaks a particular set of ActiveX or NPAPI interfaces and registers itself with browsers in a particular way. This definition might as well be written as "plugins are magic". The best-known examples of this class are Flash and Silverlight.
What you need to know is that there is an emerging class of things that users can install into their browsers which are similarly magical but which are not plugins. These things go by different names: "extensions", "add-ons", and (confusingly) "toolbars". I'm sure there will be others. You can think of these things as being interchangeable with each other but not with "plugins". So how do you tell which is which? A good rule of thumb is that if a web page works fine without you installing it, it's an extension. Otherwise, if you need to install something for the page to work, it's a plugin.
Acceptable alternatives: "extensions" (preferred), "add-ons", "toolbars" (overly specific, may confuse).
- "HTML 5 support"
- This is one for the nag file, since you'll need to revisit this topic in the future. The important thing for now is to be cognizant that there isn't yet a real "HTML 5". Yes, there are various drafts, and yes, some browsers are doing a great job of implementing these new features ahead of formal standardization. But it's not done yet. Saying today that something is an "HTML 5 application" or that a browser has "HTML 5 support" will cause you problems. Nobody wants to explain how what was touted as being "standard" one day became "proprietary" the next. The safest course of action here is to simply talk about "the upcoming HTML 5 standard" or "advanced web applications". HTML 5 is a powerful brand, and there's going to be an enormous amount of haggling over its meaning for years to come. Best that discussion not include references to your stories.
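For what it's worth, the piecemeal nature of "support" is easy to see in code. A minimal sketch of the kind of feature detection developers actually do today, probing individual capabilities rather than flipping some monolithic "HTML 5" switch:

```javascript
// There is no single "HTML 5 support" bit to test for; each feature ships
// (or doesn't) on its own schedule, so developers check them one at a time.
var canvas = document.createElement("canvas");
var hasCanvas = !!(canvas.getContext && canvas.getContext("2d"));
var hasLocalStorage = (typeof window.localStorage !== "undefined");
var hasVideo = !!document.createElement("video").canPlayType;

// A browser can pass any subset of these checks, which is why a blanket
// "supports HTML 5" claim is hard to defend in print.
```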
Regards,
Alex Russell
Of all the exciting stuff that's happening at Google, one of the things I've been most excited about is SPDY, Mike Belshe and Roberto Peon's new protocol that upgrades HTTP to deal with many of the new use-cases that have strained browsers and web servers in the last couple of years.
There are some obvious advantages to SPDY: header compression means that things like cookies get gzipped, not just content, and multiplexing over a single connection with priority information will allow clients and servers to cooperate to accelerate page layout based on what's important, not only what got requested first.
But the really interesting stuff from my perspective is the way SPDY enables server push, both for anticipated content and for handling Comet-style workloads. The first bit is likely to have the largest impact for the largest set of apps. Instead of trying to do things like embed images in data: URLs -- which punishes content by making it uncacheable -- SPDY allows the server to anticipate that the client will need some resource and preemptively begin sending it without changing the HTTP-level semantics of the request/response. The result is that even for non-cached requests, many fewer full round trips are required when servers are savvy about what the client will be asking for. Another way to think about it is that it allows the server to help pre-fill an empty cache. Application servers like RoR and Django can know enough about what resources a page is likely to require to begin sending them preemptively in a SPDY-enabled environment. The results in terms of how we optimize apps are nothing short of stunning. Today we work hard to tell browsers as early as possible that they'll need some chunk of CSS (per Steve's findings) and try to structure our JavaScript so that it starts up late in the game because the penalty for waiting on JS is so severe (blocked rendering, etc.). At the very edge of the envelope, this often means inlining CSS and accepting the penalty of not being able to cache things that should likely be reusable across pages. On most sites, the next page looks a lot like the previous one, after all. When implemented well, SPDY will buy us a way out of this conundrum.
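As a concrete example of the kind of contortion server push should make unnecessary, here's a rough sketch (with a hypothetical file name) of the "load your JavaScript late" dance sites do today to keep script from blocking rendering:

```javascript
// Today's workaround: delay non-critical script until after the page has
// rendered so it can't block layout. With SPDY server push, the server could
// simply start sending the script alongside the HTML instead.
function loadScriptLate(url) {
  var script = document.createElement("script");
  script.src = url;
  script.type = "text/javascript";
  document.getElementsByTagName("head")[0].appendChild(script);
}

// Wait for the load event (and a tick more) before pulling in the heavy stuff.
window.onload = function () {
  setTimeout(function () {
    loadScriptLate("/static/app.js"); // hypothetical path
  }, 0);
};
```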
And then there are the implications for Comet workloads. First, SPDY multiplexes. One socket, many requests. Statefully. By default. Awwwww yeah. That means that a client that wants to hang on to an HTTP connection (long polling, "hanging GET", <term of the week here>) isn't penalized at the server, since SPDY servers are expected to handle stateful, long-lived connections. At an architectural level, SPDY forces the issue. No one will be fielding a SPDY server that doesn't handle Comet workloads out of the box, because building one that doesn't will often be harder than building one that does. SPDY finally brings servers into architectural alignment with how many clients want to use them.
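For readers who haven't lived this, here's a minimal sketch (with a hypothetical endpoint URL) of the long-polling pattern that conventional HTTP servers find so painful, and that SPDY-era servers have to handle as a matter of course:

```javascript
// Classic long polling / "hanging GET": the client parks a request at the
// server, the server responds only when it has news, and the client
// immediately re-issues the request. Each hanging request ties up server-side
// resources on conventional HTTP stacks; a SPDY server is built for exactly
// this kind of stateful, long-lived connection.
function longPoll(url, onMessage) {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4) {
      if (xhr.status === 200) {
        onMessage(xhr.responseText);
      }
      // Whether we got data or timed out, park another request right away.
      longPoll(url, onMessage);
    }
  };
  xhr.send(null);
}

longPoll("/events", function (msg) { // "/events" is a hypothetical endpoint
  console.log("server said:", msg);
});
```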
Beyond that, SPDY allows clients to set priority information, meaning that real-time information that's likely to be small in size can take precedence on the wire over a large image request. Similarly, because it multiplexes, SPDY could be used as an encapsulation format for WebSockets, allowing one TCP socket to service multiple WebSockets. The efficiency gains here are pretty obvious: less TCP overhead and lowered potential for unintentional DoS (think portals with tons of widgets all making socket requests). There's going to need to be some further discussion about how to make new ideas like WebSockets work over SPDY, but the direction is both clear and promising. SPDY should enable a faster web both now and in the future.