Closures, Javascript And The Arrow Of Time


Introduction

In most written media, time progresses as you move down a page, and mainstream computing languages are no different. Anonymous closures are a language mechanism that, effectively, lets programmers create new control structures. Although people associate this power with exotic dynamic languages such as FORTH, Scheme and TCL, closures are becoming a feature of mainstream languages such as Javascript and PHP (and even static languages such as C#).

Although this article talks about issues that you’ll encounter in languages such as C# and Scheme, I’m going to focus on Javascript written on top of the popular JQuery library. I do that because JQuery is a great toolkit that lets programmers and designers of all skill levels do a lot by writing very little code. Because JQuery smooths away low-level details, it lets us clearly illustrate little weirdnesses of its programming model. Although things often work “just right” on the small scale, little strange things snowball in larger programs. A careful look at the little problems is a key step towards avoiding and mitigating problems in big RIA projects.

Time In Programming Languages

Absent control structures, execution in a computer program proceeds from top to bottom. Consider:

01 document.write('Hello');
02 document.write(' ');
03 document.write('World!');

Today’s languages are based on structured programming, which encourages us to use a limited number of control structures, such as conditional expressions

04 if(happy==true) {
05    document.write('Hello');
06 } else {
07    document.write('Goodbye Cruel');
08 }
09
10 document.write('  ');
11 document.write('World!');

and loops

12 var sum=0;
13 for(var i=0;i<x.length;i++) {
14   sum += x[i];
15 }

With GOTO expunged from the programming canon, execution proceeds from top to bottom, except for a limited number of control structures that let us return to the beginning or jump to the end of a block. It’s a simple and intuitive model that we don’t often think about, but it’s easy to write code that breaks this model when programming with closures.

Emulating Foreach() in Javascript with Closures

Popular languages such as PHP and C# have a foreach() statement that loops over the elements of a collection without the need to keep track of a counter variable. Popular Javascript frameworks such as JQuery and Dojo implement foreach functions; with the magic of closures, we can get an effect similar to a built-in foreach construct. For instance, in JQuery, we can write

16 var sum=0;
17 $.each(x,function(i,val) {
18    sum += val;
19 });

In this case, the second argument of the $.each (JQuery.each) function is an anonymous function. The special property it has, as a closure, is that the anonymous function has access to variables in the enclosing scope, namely sum. This allows the code in lines 16-19 to look and work a lot like the code in lines 12-15, despite the fact that line 18 is in a different function than line 16.

As an aside, this kind of higher-order function is often simple to implement; although the real $.each() function is capable of iterating over a wide variety of types (thus more complex), a function that can iterate over arrays is as simple as

20 function foreachArrayElement(arr,loopBody) {
21    for(var i=0;i<arr.length;i++) {
22       loopBody(i,arr[i]);
23    }
24 }
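
To check that an array-only foreach behaves like the loop on lines 12-15, here is a minimal self-contained sketch. It defines the helper over its own arr parameter and uses it to compute a sum; the sample array is an assumption made for illustration.

```javascript
// Array-only foreach helper: calls loopBody(index, value) for each element.
function foreachArrayElement(arr, loopBody) {
  for (var i = 0; i < arr.length; i++) {
    loopBody(i, arr[i]);
  }
}

// Sample data (an assumption for illustration).
var x = [1, 2, 3];
var sum = 0;

// The anonymous function is a closure: it reads and writes `sum`,
// which lives in the enclosing scope, not inside the callback.
foreachArrayElement(x, function (i, val) {
  sum += val;
});

console.log(sum);
```

The loop counter lives inside the helper, so the calling code never sees it, which is the whole appeal of a foreach construct.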

Breaking The Conventions of Time

Although you’ve got other choices, closures are a popular and concise way to write callbacks for asynchronous communication. Suppose that we want to GET a packet of JSON data from a server and update a user interface. In JQuery we could write:

25 $.get("JsonService", {}, function(data, textStatus) {
26     var obj=JSON.parse(data);
27     someUIElement.update(obj);
28  });
29
30 ... do more stuff ...

Note that time isn’t progressing downward in this script anymore. We go from line 25 directly to line 30. The code on lines 26-28 only executes when data gets back from the server. In this case, this behavior is caused by the asynchronous nature of XMLHttpRequest, for which $.get() is a wrapper. Similar behavior can be had by attaching an event handler to a DOM object, or even by adding a function reference to an object or the global scope, causing the anonymous function to execute when some other function, defined elsewhere, executes.
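
The jump over the callback can be reproduced without a server. The sketch below is a stand-in built on assumptions: fakeGet is a hypothetical function that uses setTimeout to imitate the asynchronous delivery of $.get(), and log just records the order in which things happen.

```javascript
var log = [];

// Hypothetical stand-in for $.get(): delivers a canned response later,
// after the currently running code has finished.
function fakeGet(url, params, callback) {
  setTimeout(function () {
    callback('{"greeting":"hello"}', "success");
  }, 0);
}

log.push("request sent");
fakeGet("JsonService", {}, function (data, textStatus) {
  var obj = JSON.parse(data);
  log.push("callback ran: " + obj.greeting);
});
log.push("more stuff");

// At this point the callback has not run yet.
console.log(log.join(" | "));

setTimeout(function () {
  // By now the callback has been delivered.
  console.log(log.join(" | "));
}, 10);
```

The first line printed contains only the two synchronous entries; the callback entry shows up in the second line, even though it sits between them in the source.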

“Race Conditions” In Asynchronous Communication

Although it’s not wrong to break the conventions of time, doing so risks the introduction of tricky and subtle errors. Suppose we’re building an AJAX application where we’d like to cache an object, retrieving the object only if we don’t have a copy in cache. One obvious approach is to write

31 if (cachedCopy==undefined) {
32  $.get("DataService",{},function(data,textStatus) {
33       cachedCopy=data;
34       updateUIOne(data);
35   });
36 } else {
37   updateUIOne(cachedCopy);
38 }
39
40 updateUITwo();

updateUIOne(data) populates the user interface with the data. updateUITwo() makes some other change to the user interface.

Unfortunately, this code has a potential bug that’s hidden by the conventions of time. When data is in the cache, line 37 executes before line 40, so that updateUIOne(data) is called before updateUITwo(). When data is not in the cache, line 40 executes before 33 (communication callbacks don’t run in Javascript until the code being run returns to the browser.) It’s all fine if the order in which you run updateUIOne and updateUITwo doesn’t matter — and it’s a disaster if it does… This kind of code does one thing sometimes and does another thing other times: code of this type is difficult to test, and leads to the kind of bug that drives people to drink.
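
The inconsistency is easy to reproduce. In the sketch below, fakeGet is a hypothetical setTimeout-based stand-in for $.get(), and updateUIOne and updateUITwo simply record that they ran; none of this is real JQuery.

```javascript
// Hypothetical stand-in for $.get(): responds asynchronously.
function fakeGet(url, params, callback) {
  setTimeout(function () { callback("fresh data", "success"); }, 0);
}

// Runs the caching logic from lines 31-40 and records call order in `order`.
function loadAndRender(cachedCopy, order) {
  function updateUIOne(data) { order.push("updateUIOne"); }
  function updateUITwo() { order.push("updateUITwo"); }

  if (cachedCopy == undefined) {
    fakeGet("DataService", {}, function (data, textStatus) {
      cachedCopy = data;
      updateUIOne(data);
    });
  } else {
    updateUIOne(cachedCopy);
  }
  updateUITwo();
}

var cacheHitOrder = [];
var cacheMissOrder = [];
loadAndRender("cached data", cacheHitOrder);
loadAndRender(undefined, cacheMissOrder);

setTimeout(function () {
  console.log("cache hit:  " + cacheHitOrder.join(", "));
  console.log("cache miss: " + cacheMissOrder.join(", "));
}, 10);
```

The same function produces the two calls in opposite orders depending only on whether the cache happened to be populated.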

The real answer to these problems is to take a systematic approach to asynchronous communication: any strategy based on band-aids that work here or there will eventually break down under the weight of growing requirements. That said, I’ll offer a few strategies for patching this kind of problem:

If we could move updateUITwo() from line 40 to before line 31, updateUITwo() and updateUIOne() could be run in a consistent order.
Modifying updateUIOne(data) to call updateUITwo() after it completes also results in a consistent order.
We can schedule updateUIOne to run AFTER the executing code returns to the browser in the cached case as well, by replacing line 37 with

41    setTimeout(function() { updateUIOne(cachedCopy) },1);
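
As a sketch of the setTimeout approach, the version below defers the cache-hit call so that updateUIOne runs only after the current code has returned, whether or not the data was cached. fakeGet is a hypothetical stand-in for $.get(), and the update functions only record their order.

```javascript
// Hypothetical stand-in for $.get(): responds asynchronously.
function fakeGet(url, params, callback) {
  setTimeout(function () { callback("fresh data", "success"); }, 0);
}

// Same caching logic, but the cache-hit branch is deferred with setTimeout.
function loadAndRender(cachedCopy, order) {
  function updateUIOne(data) { order.push("updateUIOne"); }
  function updateUITwo() { order.push("updateUITwo"); }

  if (cachedCopy == undefined) {
    fakeGet("DataService", {}, function (data, textStatus) {
      cachedCopy = data;
      updateUIOne(data);
    });
  } else {
    // Deferred: runs after the current code returns to the event loop.
    setTimeout(function () { updateUIOne(cachedCopy); }, 1);
  }
  updateUITwo();
}

var hitOrder = [];
var missOrder = [];
loadAndRender("cached data", hitOrder);
loadAndRender(undefined, missOrder);

setTimeout(function () {
  console.log("cache hit:  " + hitOrder.join(", "));
  console.log("cache miss: " + missOrder.join(", "));
}, 20);
```

Both paths now run updateUITwo first and updateUIOne second. This is still a patch rather than a systematic design, since it relies on every synchronous branch remembering to defer.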

Structural Instability

Let’s consider another example where the conventions of time are treacherous: suppose we need to retrieve three chunks of data from the server, one after another:

42 $.get("Chunk1Server",{},function(data1,textStatus1) {
43   UpdateUIOne(data1);
44   $.get("Chunk2Server",{},function(data2,textStatus2) {
45       UpdateUITwo(data2);
46        $.get("Chunk3Server",{},function(data3,textStatus3) {
47            UpdateUIThree(data3);
48       });
49    });
50 });

Note that this specific chunk of code executes exactly the way that it looks. Although there are some gaps, execution proceeds progressively from line 42 to line 47. The “nested closure” idiom is common in Javascript code, probably because it looks a lot like the synchronous code that does the same job, but it’s treacherous: a tiny change in the code can break the conventions of time, causing strange things to happen. This property is called structural instability.

For example, the above code might return directly to the browser, eliminating the possibility of the “race conditions” seen in the last section. If we add the following line after it:

51 UpdateUIFour();

we’ll find that UpdateUIFour() runs before the other functions. This isn’t bad behavior if we expect it, but could have spooky effects if we don’t. This example is trivial, but similar mistakes can be made in any of the closures, resulting in errors that can be quite baffling.
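
A runnable sketch of this effect follows. It assumes a hypothetical fakeGet built on setTimeout in place of $.get(), and replaces the UpdateUI functions with entries pushed onto an order array.

```javascript
var order = [];

// Hypothetical stand-in for $.get(): responds asynchronously.
function fakeGet(url, params, callback) {
  setTimeout(function () { callback("data from " + url, "success"); }, 0);
}

fakeGet("Chunk1Server", {}, function (data1) {
  order.push("UpdateUIOne");
  fakeGet("Chunk2Server", {}, function (data2) {
    order.push("UpdateUITwo");
    fakeGet("Chunk3Server", {}, function (data3) {
      order.push("UpdateUIThree");
    });
  });
});

// Written last, but runs first: nothing above has called back yet.
order.push("UpdateUIFour");

setTimeout(function () {
  console.log(order.join(", "));
}, 50);
```

The nested callbacks still run in their written order relative to each other; only the statement added after the block jumps the queue.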

The structural instability of nested closures pushes complex AJAX applications towards other styles of coding, such as a state-machine organization.
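
As one possible reading of a state-machine organization (a sketch, not a prescribed design), the three requests can be described as a table of states with a single driver function. The names states, step and fakeGet are all hypothetical, and fakeGet stands in for $.get().

```javascript
// Hypothetical stand-in for $.get(): responds asynchronously.
function fakeGet(url, params, callback) {
  setTimeout(function () { callback("data from " + url, "success"); }, 0);
}

// Each state names the request to make and the state to enter afterwards.
var states = {
  chunk1: { url: "Chunk1Server", next: "chunk2" },
  chunk2: { url: "Chunk2Server", next: "chunk3" },
  chunk3: { url: "Chunk3Server", next: "done" }
};

var visited = [];      // records the order in which states were handled
var current = "chunk1";

// Single driver function: every transition passes through here,
// so there is one place to look to see what happens next.
function step() {
  if (current === "done") {
    visited.push("done");
    return;
  }
  var state = states[current];
  fakeGet(state.url, {}, function (data, textStatus) {
    visited.push(current + ": " + data);
    current = state.next;
    step();
  });
}

step();
```

The sequence is now data rather than indentation, so adding or reordering a step means editing the table instead of re-nesting closures.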

Order Of Arguments

This is small compared to the other issues, but the order of arguments of callback functions can add to the confusion around time. The $.get() function provides a good example, since it supports four parameters:

51 $.get(url,parameters,callback,type);

All of the parameters, except for the URL, are optional. It’s generally good to put less used optional parameters towards the right, but placing the type declaration after the callback disturbs the flow of time and hurts readability. If you choose to have the return data parsed as JSON format,

52 $.get(targetUrl,{},function(data,textStatus) {
53    ... handler code ...
54 },"json");

you’ll see that the type specification occurs after the callback. This, by itself, adds a small amount of confusion, but when you’ve got multiple nested closures or if you were computing the values defined after the callback, code becomes immediately difficult to understand.
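
One way to keep the type next to the other request settings is a thin wrapper that takes the callback last. The sketch below is hypothetical: getTyped is not part of JQuery, and its get argument stands in for $.get() so that the example can run without a browser.

```javascript
// Hypothetical wrapper: same four values as $.get(url, parameters, callback, type),
// but with the callback moved to the end so the call reads top to bottom.
function getTyped(get, url, parameters, type, callback) {
  return get(url, parameters, callback, type);
}

// Stand-in for $.get that records the arguments it receives
// and immediately invokes the callback with a canned object.
var received = null;
function recordingGet(url, parameters, callback, type) {
  received = { url: url, parameters: parameters, type: type };
  callback({ ok: true }, "success");
}

var handled = false;
getTyped(recordingGet, "targetUrl", {}, "json", function (data, textStatus) {
  handled = data.ok;
});

console.log(received.type, handled);
```

With this ordering, everything that configures the request appears before the code that handles the response, which matches the order in which things happen.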

Event Handlers And “COMEFROM”

In 1973 R. Lawrence Clark kiddingly introduced a COMEFROM statement in response to Edsger Dijkstra’s famous “Go To Statement Considered Harmful”. From the associated Wikipedia article, I take an example of a program in a hypothetical BASIC dialect that uses COMEFROM:

55 COMEFROM 58
56 INPUT "WHAT IS YOUR NAME? "; A$
57 PRINT "HELLO, "; A$
58 REM

COMEFROM is a bit disturbing because it involves an “action at a distance”: line 55 modifies the behavior of line 58. A person looking at line 58 in isolation would have a hard time understanding what happens in the program when execution reaches line 58.

The use of event handlers, both built in and custom, has an effect similar to COMEFROM. Consider the common example

59 $("#activeButton").click(function() {
60    ... handle button click...
61 });

Once more, the code at line 60 will execute after the code that follows line 61. Line 59 modifies the behavior of a built-in UI element, which might be defined like

62 <input type="button" id="activeButton" value="Button X">

Looking at line 62 alone, it’s impossible to tell what happens when you click on the button; an “onClick” handler in line 62 would be more immediately obvious in function. That said, events are a proven model for GUI programming that introduces a useful decoupling between the parts of a system, a looser coupling that lets us build bigger systems. In the context of JQuery, it’s great to be able to write something like

63 <div class="SpecialArea">... content ...</div>

Javascript code can then use the $() function to attach styles, event handlers and to manipulate DOM elements inside the .SpecialArea to give the area specific behaviors and appearance. This lets a developer provide a powerful but easy-to-use abstraction to a designer: adding a single class specification lets us reuse tens of CSS specifications and several event handlers. This is a big win, but we can’t forget the hidden COMEFROM that’s implied.

Events are dynamically added and removed, even in static languages such as Java and C#. Although this dynamism adds flexibility, it comes at a cost. IDEs such as Eclipse and Visual Studio can do a great job of helping programmers understand static method calls: for instance, you can click on the name of a method and see a list of places where this method is called. Because events aren’t amenable to static analysis, inappropriate use (or even appropriate use) of events impairs some of the tools we have for understanding programs.

Events are a great tool, but programmers need to understand the weirdness that underlies them. Used deliberately, events can provide a concise and scalable way to move information and direct control in an application. Poorly used events can cause software to degenerate to “spaghetti code”.
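
The hidden COMEFROM can be seen in a few lines without any DOM. The sketch below uses a hypothetical makeButton object in place of a real UI element; it is not JQuery or browser code, just an illustration of how a handler registered in one place changes what click() does somewhere else.

```javascript
// Minimal hypothetical button: handlers are stored in a list and
// invoked when click() is called. Not JQuery or DOM code.
function makeButton(label) {
  var handlers = [];
  return {
    label: label,
    onClick: function (handler) { handlers.push(handler); },
    click: function () {
      for (var i = 0; i < handlers.length; i++) {
        handlers[i](label);
      }
    }
  };
}

var log = [];
var button = makeButton("Button X");   // the "definition", like line 62

// Elsewhere in the program: this registration is the hidden COMEFROM.
button.onClick(function (label) {
  log.push("handled click on " + label);
});

log.push("setup finished");
button.click();   // only now does the handler body run

console.log(log.join(" | "));
```

Nothing at the point where the button is created, or where click() is called, says which code will run; that information lives only at the registration site.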

Conclusion

Closures are a powerful and concise way to express your intentions to a computer. However, closures break some of the intuitive assumptions that people use to understand software, specifically the idea that time moves downward through the execution of a procedure. This leads to a kind of structural instability, where software that’s perfectly simple and clear at a simple level can suddenly get much harder to understand when several complications converge. This article uses JQuery-Javascript as an example not because JQuery is bad, but because it’s good: it’s a tool that helps beginning (and advanced) programmers accomplish a lot with little code. Yet, as we build increasingly complex applications, we need a clear understanding of the big and little weirdnesses of RIA programming.


Paul Houle on June 23rd 2009 in Asynchronous Communications

Comments to “Closures, Javascript And The Arrow Of Time”

Keith Braithwaite responded on 23 Jun 2009 at 6:15 pm #

"although people associate this power with exotic dynamic languages such as FORTH, Scheme and TCL, closures are becoming a feature of mainstream languages such as Javascript and PHP "

Javascript is an exotic dynamic language–but this fact has been carefully concealed. Which is a shame.

Paul Houle responded on 23 Jun 2009 at 12:10 pm #

Well, it all depends on how you define “exotic.”

The actual architecture of the FORTH, Scheme and (older generation) TCL interpreters is archaic. You’ve got a lexical analyzer, but not a parser, and generally not a bytecode interpreter in the traditional sense. More recent TCL versions have moved in the bytecode direction to improve performance, but classic TCL was essentially LISP with lists implemented as space-separated strings.

Javascript is basically an ALGOL-type language with a conventional implementation, but it is certainly radical in quite a few ways.

Personally I miss ECMAScript 4; I would have liked to have seen a Javascript-like language with stronger typing, better IDE support and more support for programming in the large. It would be appealing to have a programming environment where we could share code on the client and the server, even if it would be a terrible temptation for people to make mistakes.
