Closures, Javascript And The Arrow Of Time
Introduction
In most written media, time progresses as you move down a page: mainstream computing languages are no different. Anonymous Closures are a language mechanism that, effectively, lets programmers create new control structures. Although people associate this power with exotic dynamic languages such as FORTH, Scheme and TCL, closures are becoming a feature of mainstream languages such as Javascript and PHP (and even static languages such as C#.)
Although this article talks about issues that you’ll encounter in languages such as C# and Scheme, I’m going to focus on Javascript written on top of the popular JQuery library. I do that because JQuery is a great toolkit that lets programmers and designers of all skill levels do a lot by writing very little code. Because JQuery smooths away low-level details, it lets us clearly illustrate the little weirdnesses of its programming model. Although things often work “just right” on the small scale, little strange things snowball in larger programs: a careful look at the little problems is a key step towards avoiding and mitigating problems in big RIA projects.
Time In Programming Languages
Absent control structures, execution proceeds from top to bottom in a computer program. Consider:
01 document.write('Hello');
02 document.write(' ');
03 document.write('World!');
Today’s languages are based on structured programming, which encourages us to use a limited number of control structures, such as conditional expressions
04 if (happy == true) {
05   document.write('Hello');
06 } else {
07   document.write('Goodbye Cruel');
08 }
09
10 document.write(' ');
11 document.write('World!');
and loops
12 var sum=0;
13 for(var i=0;i<x.length;i++) {
14   sum += x[i];
15 }
With GOTO expunged from the programming canon, execution proceeds from top to bottom, except for a limited number of control structures that let us return to the beginning or jump to the end of a block. It’s a simple and intuitive model that we don’t often think about, but it’s easy to write code that breaks this model when programming with closures.
Emulating Foreach() in Javascript with Closures
Popular languages such as PHP and C# have a foreach() statement that loops over the elements of a collection without the need to keep track of a counter variable. Popular Javascript frameworks such as JQuery and Dojo implement foreach functions; with the magic of closures, we can get an effect similar to a built-in foreach construct. For instance, in JQuery, we can write
16 var sum=0;
17 $.each(x,function(i,val) {
18   sum += val;
19 });
In this case, the second argument of the $.each (JQuery.each) function is an anonymous function. The special property it has, as a closure, is that the anonymous function has access to variables in the enclosing scope, namely sum. This allows the code in lines 16-19 to look and work a lot like the code in lines 12-15, despite the fact that line 18 is in a different function than line 16.
As an aside, this kind of higher-order function is often simple to implement; although the real $.each() function is capable of iterating over a wide variety of types (thus more complex), a function that can iterate over arrays is as simple as
20 function foreachArrayElement(arr,loopBody) {
21   for(var i=0;i<arr.length;i++) {
22     loopBody(i,arr[i]);
23   }
24 }
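To see the closure at work outside the browser, here is a small sketch that restates the helper and sums an array with it. The array `values` and its contents are made up for illustration; nothing here depends on JQuery.

```javascript
// A hand-rolled foreach for arrays, mirroring the helper above.
function foreachArrayElement(arr, loopBody) {
  for (var i = 0; i < arr.length; i++) {
    loopBody(i, arr[i]);
  }
}

var values = [1, 2, 3, 4]; // illustrative data
var sum = 0;

// The anonymous function is a closure: it reads and writes `sum`,
// which lives in the enclosing scope rather than in the function itself.
foreachArrayElement(values, function (i, val) {
  sum += val;
});

console.log(sum); // 10
```

The loop body never receives sum as an argument, yet the total ends up in the outer variable, which is exactly what makes the construct feel like a built-in loop.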
Breaking The Conventions of Time
Although you’ve got other choices, closures are a popular and concise way to write callbacks for asynchronous communication. Suppose that we want to GET a packet of JSON data from a server and update a user interface. In JQuery we could write:
25 $.get("JsonService", {}, function(data, textStatus) {
26   var obj=JSON.parse(data);
27   someUIElement.update(obj);
28 });
29
30 ... do more stuff ...
Note that time isn’t progressing downward in this script anymore. We go from line 25 directly to line 30. The code in lines 26-28 only executes when data gets back from the server. In this case, this behavior is caused by the asynchronous nature of XMLHttpRequest, for which $.get() is a wrapper. Similar behavior can be had by attaching an event handler to a DOM object, or even by adding a function reference to an object or the global scope, causing the anonymous function to execute when some other function, defined elsewhere, executes.
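The ordering can be reproduced without a server. In the sketch below, `fakeGet` is a hypothetical stand-in for $.get() that uses setTimeout to deliver a canned response later, and the `log` array only records the order in which things happen.

```javascript
var log = [];

// Hypothetical stand-in for $.get(): it returns immediately and
// invokes the callback later, once the current code has finished.
function fakeGet(url, callback) {
  setTimeout(function () {
    callback('{"greeting":"hello"}', "success");
  }, 0);
}

log.push("before request");
fakeGet("JsonService", function (data, textStatus) {
  var obj = JSON.parse(data);
  log.push("callback: " + obj.greeting);
  console.log(log.join(" -> "));
});
log.push("after request");
// At this point the callback has not run yet, even though it appears
// earlier in the source than the line that logged "after request".
```

Reading top to bottom suggests the callback entry should come second; the log shows it comes last.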
“Race Conditions” In Asynchronous Communication
Although it’s not wrong to break the conventions of time, doing so risks the introduction of tricky and subtle errors. Suppose we’re building an AJAX application where we’d like to cache an object, retrieving the object only if we don’t have a copy in cache. One obvious approach is to write
31 if (cachedCopy==undefined) {
32   $.get("DataService",{},function(data,textStatus) {
33     cachedCopy=data;
34     updateUIOne(data);
35   });
36 } else {
37   updateUIOne(cachedCopy);
38 }
39
40 updateUITwo();
updateUIOne(data) populates the user interface with the data. updateUITwo() makes some other change to the user interface.
Unfortunately, this code has a potential bug that’s hidden by the conventions of time. When data is in the cache, line 37 executes before line 40, so that updateUIOne(data) is called before updateUITwo(). When data is not in the cache, line 40 executes before 33 (communication callbacks don’t run in Javascript until the code being run returns to the browser.) It’s all fine if the order in which you run updateUIOne and updateUITwo doesn’t matter — and it’s a disaster if it does… This kind of code does one thing sometimes and does another thing other times: code of this type is difficult to test, and leads to the kind of bug that drives people to drink.
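The two orderings can be made visible with a toy model. In this sketch, `pending` and `returnToBrowser` are hypothetical names that imitate the browser running queued callbacks once the current code has returned, and `fakeGet` stands in for $.get(); none of this is JQuery's real machinery.

```javascript
// Toy model of the browser: callbacks wait in `pending` until the
// running code returns, which returnToBrowser() simulates.
var pending = [];
function returnToBrowser() {
  while (pending.length > 0) {
    pending.shift()();
  }
}

// Hypothetical stand-in for $.get(): the callback is queued, never
// called before the current code has finished.
function fakeGet(url, callback) {
  pending.push(function () {
    callback("payload", "success");
  });
}

var cachedCopy; // undefined until the first response arrives
var order = [];
function updateUIOne(data) { order.push("one"); }
function updateUITwo() { order.push("two"); }

function refresh() {
  if (cachedCopy === undefined) {
    fakeGet("DataService", function (data, textStatus) {
      cachedCopy = data;
      updateUIOne(data);
    });
  } else {
    updateUIOne(cachedCopy);
  }
  updateUITwo();
}

refresh();
returnToBrowser();
var cacheMissOrder = order.join(","); // "two,one"

order = [];
refresh();
returnToBrowser();
var cacheHitOrder = order.join(","); // "one,two"

console.log(cacheMissOrder, cacheHitOrder);
```

The same refresh() function produces two different orders depending only on whether the cache happens to be warm.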
The real answer to these problems is to take a systematic approach to asynchronous communication: any strategy based on band-aids that work here or there will eventually break down under the weight of growing requirements. That said, I’ll offer a few strategies for patching this kind of problem:
- If we could move updateUITwo() from line 40 to before line 31, updateUITwo() and updateUIOne() could be run in a consistent order.
- Modifying updateUIOne(data) to call updateUITwo() after it completes also results in a consistent order.
- We can schedule updateUIOne to run AFTER the executing code returns to the browser in the cached case as well, by replacing line 37 with
41 setTimeout(function() { updateUIOne(cachedCopy); },1);
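Here is a minimal sketch of the deferral idea, under the assumption that the call being deferred is the one on the cache-hit path, so that updateUIOne() always runs after the current code has returned. `fakeGet` is a hypothetical stand-in for $.get() so the example runs without a server.

```javascript
var cachedCopy; // undefined until the first response arrives
var order = [];
function updateUIOne(data) { order.push("one"); }
function updateUITwo() { order.push("two"); }

// Hypothetical stand-in for $.get(); delivers its result asynchronously.
function fakeGet(url, callback) {
  setTimeout(function () { callback("payload", "success"); }, 5);
}

function refresh() {
  if (cachedCopy === undefined) {
    fakeGet("DataService", function (data, textStatus) {
      cachedCopy = data;
      updateUIOne(data);
    });
  } else {
    // Cache hit: defer the update so it also runs after refresh() returns.
    setTimeout(function () { updateUIOne(cachedCopy); }, 1);
  }
  updateUITwo();
}

refresh(); // cache miss
var afterMiss = order.join(","); // "two": updateUIOne is still pending

setTimeout(function () {
  order = [];
  refresh(); // cache hit this time
  var afterHit = order.join(","); // also "two"
  console.log(afterMiss, afterHit);
}, 50);
```

Whether or not the cache is warm, updateUITwo() now runs first and updateUIOne() follows once control has gone back to the browser.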
Structural Instability
Let’s consider another example where the conventions of time are treacherous: suppose we need to retrieve three chunks of data from the server
42 $.get("Chunk1Server",{},function(data1,textStatus1) {
43   UpdateUIOne(data1);
44   $.get("Chunk2Server",{},function(data2,textStatus2) {
45     UpdateUITwo(data2);
46     $.get("Chunk3Server",{},function(data3,textStatus3) {
47       UpdateUIThree(data3);
48     });
49   });
50 });
Note that this specific chunk of code executes exactly the way that it looks. Although there are some gaps, execution proceeds progressively from line 42 to line 47. The “nested closure” idiom is common in Javascript code, probably because it looks a lot like the synchronous code that does the same job, but it’s treacherous: a tiny change in the code can break the conventions of time, causing strange things to happen. This property is called structural instability.
For example, the above code might be followed by a direct return to the browser, eliminating the possibility of the “race conditions” seen in the last section. If we add the following line after it:
51 UpdateUIFour();
we’ll find that UpdateUIFour() runs before the other functions. This isn’t bad behavior if we expect it, but could have spooky effects if we don’t. This example is trivial, but similar mistakes can be made in any of the closures, resulting in errors that can be quite baffling.
The structural instability of nested closures pushes complex AJAX applications towards other styles of coding, such as a state-machine organization.
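As one illustration of what such an organization might look like (I’m not prescribing a particular design), the sketch below drives the three requests from an explicit list of steps, with the index of the current step as the machine’s only state. `fakeGet`, `steps` and `runSteps` are hypothetical names, and `fakeGet` stands in for $.get() so the example runs without a server.

```javascript
var log = [];

// Hypothetical stand-in for $.get(); delivers data asynchronously.
function fakeGet(url, callback) {
  setTimeout(function () { callback("data from " + url, "success"); }, 0);
}

// Each state names the URL to fetch and what to do with the result.
var steps = [
  { url: "Chunk1Server", handle: function (data) { log.push("UI one: " + data); } },
  { url: "Chunk2Server", handle: function (data) { log.push("UI two: " + data); } },
  { url: "Chunk3Server", handle: function (data) { log.push("UI three: " + data); } }
];

// The machine's only state is the index of the current step.
function runSteps(steps, index, done) {
  if (index >= steps.length) {
    done();
    return;
  }
  fakeGet(steps[index].url, function (data, textStatus) {
    steps[index].handle(data);
    runSteps(steps, index + 1, done); // transition to the next state
  });
}

runSteps(steps, 0, function () {
  // Work that must follow all three updates goes here, explicitly.
  log.push("all chunks loaded");
  console.log(log.join("\n"));
});
log.push("request sequence started");
```

The code that runs after the whole sequence has a named home (the done callback), so a line added after runSteps() is less likely to be mistaken for something that happens last.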
Order Of Arguments
This is small compared to the other issues, but the order of arguments of callback functions can add to the confusion around time. The $.get() function provides a good example, since it supports four parameters:
51 $.get(url,parameters,callback,type);
All of the parameters except for the url are optional. It’s generally good to put less-used optional parameters towards the right, but placing the type declaration after the callback disturbs the flow of time and hurts readability. If you choose to have the return data parsed as JSON,
52 $.get(targetUrl,{},function(data,textStatus) {
53   ... handler code ...
54 },"json");
you’ll see that the type specification occurs after the callback. This, by itself, adds a small amount of confusion, but when you’ve got multiple nested closures or if you were computing the values defined after the callback, code becomes immediately difficult to understand.
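One way to keep the type next to the URL is a thin wrapper that takes the callback last. This is an illustration and not part of JQuery: `getTyped` is a hypothetical helper, and it receives the underlying get function as a parameter (in a page you might pass a function that forwards to $.get) so that the example can run against a recording stub.

```javascript
// Hypothetical helper: same call as get(url, parameters, callback, type),
// but with the callback moved to the end of the argument list.
function getTyped(get, url, parameters, type, callback) {
  return get(url, parameters, callback, type);
}

// Recording stub used here in place of $.get.
var calls = [];
function stubGet(url, parameters, callback, type) {
  calls.push({ url: url, type: type });
  callback({ ok: true }, "success");
}

var received = null;
getTyped(stubGet, "targetUrl", {}, "json", function (data, textStatus) {
  // By the time a reader reaches the handler, the type is already stated.
  received = data;
});

console.log(calls[0].type, received.ok); // json true
```

Everything that describes the request sits before the callback, so the handler is the last thing on the page as well as the last thing to run.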
Event Handlers And “COMEFROM”
In 1973 R. Lawrence Clark kiddingly introduced a COMEFROM statement in response to Edsger Dijkstra’s famous “Go To Statement Considered Harmful”. From the associated Wikipedia article, I take an example of a program in a hypothetical BASIC dialect that uses COMEFROM:
55 COMEFROM 58
56 INPUT "WHAT IS YOUR NAME? "; A$
57 PRINT "HELLO, "; A$
58 REM
COMEFROM is a bit disturbing because it involves an “action at a distance”: line 55 modifies the behavior of line 58. A person looking at line 58 in isolation would have a hard time understanding what happens in the program when execution reaches line 58.
The use of event handlers, both built in and custom, has an effect similar to COMEFROM. Consider the common example
59 $("#activeButton").click(function() {
60   ... handle button click...
61 });
Once more, the code at line 60 will execute after the code that follows line 61. Line 59 modifies the behavior of a built-in UI element, which might be defined like
62 <input type="button" id="activeButton" value="Button X">
Looking at line 62 alone, it’s impossible to tell what happens when you click on the button; an “onClick” handler in line 62 would be more immediately obvious in function. That said, events are a proven model for GUI programming that introduces a useful decoupling between the parts of a system, a looser coupling that lets us build bigger systems. In the context of JQuery, it’s great to be able to write something like
63 <div class="SpecialArea">... content ...</div>
Javascript code can then use the $() function to attach styles and event handlers and to manipulate DOM elements inside the .SpecialArea to give the area specific behaviors and appearance. This lets a developer provide a powerful but easy-to-use abstraction to a designer: adding a single class specification lets us reuse tens of CSS specifications and several event handlers. This is a big win, but we can’t forget the hidden COMEFROM that’s implied.
Events are dynamically added and removed, even in static languages such as Java and C#. Although this dynamism adds flexibility, it comes at a cost. IDEs such as Eclipse and Visual Studio can do a great job of helping programmers understand static method calls: for instance, you can click on the name of a method and see a list of places where this method is called. Because events aren’t amenable to static analysis, inappropriate use (or even appropriate use) of events impairs some of the tools we have for understanding programs.
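The effect can be reproduced with a toy event registry. The names `handlers`, `on` and `fire` below are hypothetical and far simpler than JQuery’s event system; the point is only that the line that fires the event names no function, and that the set of handlers is decided at run time.

```javascript
var handlers = {}; // event name -> list of callbacks
var log = [];

// Register a callback for an event.
function on(eventName, callback) {
  (handlers[eventName] = handlers[eventName] || []).push(callback);
}

// Run every callback registered for an event, in registration order.
function fire(eventName) {
  var list = handlers[eventName] || [];
  for (var i = 0; i < list.length; i++) {
    list[i]();
  }
}

// Somewhere in the program: behavior is attached, partly conditionally.
var dialogOpen = true; // illustrative run-time state
on("click", function () { log.push("saved form"); });
if (dialogOpen) {
  on("click", function () { log.push("closed dialog"); });
}

// Somewhere else: nothing on this line says which functions will run.
fire("click");
console.log(log.join(", ")); // saved form, closed dialog
```

A “find callers” search on either anonymous function turns up nothing, because the only link between fire("click") and the handlers is a string looked up while the program runs.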
Events are a great tool, but programmers need to understand the weirdness that underlies them. Used deliberately, events can provide a concise and scalable way to move information and direct control in an application. Poorly used events can cause software to degenerate to “spaghetti code”.
Conclusion
Closures are a powerful and concise way to express your intentions to a computer. However, closures break some of the intuitive assumptions that people use to understand software: specifically, the idea that time moves downward through the execution of a procedure. This leads to a kind of structural instability, where software that’s perfectly simple and clear at a simple level can suddenly get much harder to understand when several complications converge. This article uses JQuery-Javascript as an example not because JQuery is bad, but because it’s good: it’s a tool that helps beginning (and advanced) programmers accomplish a lot with little code. Yet, as we build increasingly complex applications, we need a clear understanding of the big and little weirdnesses of RIA programming.
Paul Houle on June 23rd 2009 in Asynchronous Communications