Ookaboo is a collection of images that are indexed by terms from the semantic web, or, the web of linked data. Ookaboo provides both a human interface and a semantic API. Ookaboo has two goals: (i) to dramatically improve the state of the art in image search for both humans and machines, and (ii) to construct a knowledge base about the world that people live in that can be used to help information systems better understand us.
Semantic Web, Linked Data
In the semantic web, we replace the imprecise words that we use every day with precise terms defined by URLs. This is linked data because it creates a universal shared vocabulary.
For an example, in conventional image search, a person might use the word “jaguar” to search for
Note that Wikipedia has a page about each of the topics above; it’s reasonable, therefore, that we could use these URLs as a shared vocabulary for referring to these things. However, we get some benefits when we use URLs that are linked to machine-readable pages, such as
http://dbpedia.org/resource/Jaguar, or http://rdf.freebase.com/rdf/biology.itis.180593
Pages on Ookaboo are marked up with RDFa, a standard that lets semantic web tools extract machine readable information from the same pages that people view.
Named entities
Ookaboo is oriented around named entities, particularly ‘concrete’ things such as places, people and creative works. With current technology, it’s more practical to create a taxonomy of things like “Manhattan”, “Isaac Asimov” and “The Catcher In the Rye” than it is to tackle topics like “eating”, “digestion” and “love”. We believe that a comprehensive exploration of named entities will open pathways to an understanding of other terms, and hope to extend Ookaboo’s capabilities as technology advances.
Ookaboo semantic API
The Ookaboo semantic API is a simple mechanism to query Ookaboo for photographs about named entities defined by URLs. Based on JSON, it makes it easy for automated systems to find photographs about topics. You can get started in a few minutes by reading the documentation and creating an API key.
Creative Commons
All images in Ookaboo are either public domain or under Creative Commons licenses. As does Wikimedia Commons, we exclude images with a “noncommercial” clause, but unlike Wikimedia, we refuse non-free images with a claim of “fair use” and we permit images that contain a “nonderivative” clause.
Visitors to Ookaboo and users of the API are invited to use images they find in a manner consistent with their licensing. We believe that a link to the picture metadata page on Ookaboo (example here) for an image satisfies the “BY” requirement in Creative Commons licenses, because our pages trace the provenance of images. However, we advise you to contact the creator of an image if you have any questions: for instance, the creator of an image can grant you the right to use an image under terms different from those of the Creative Commons license.
Geospatial Reasoning
Ookaboo is initially focused on things that are located in space: things like countries, cities, administrative divisions, monuments, buildings and bridges. This is part of the Ontology2 strategy of exploiting “unreasonably effective” strategies for organizing information. As Ookaboo evolves, expect to see spatial knowledge reflected in both the U.I. and API.
Anonymous functions and closures are a language feature that, in many ways, allows programmers to reshape the syntax of a language. Although often associated with highly dynamic languages such as Lisp and TCL and moderately dynamic languages, such as Javascript, C# shows that closures retain their power in statically typed languages. In this article, I identify a language feature that I like from the Pascal language and ‘clone’ it using closures in C#.
Pascal (a historically important teaching language that’s as statically typed as you can get) allows programmers to define a named function inside the scope of an enclosing function. This is a mechanism for encapsulation, much like the mechanisms of object-oriented programming. Here’s an example of how nested functions work in Pascal, borrowed from Wikipedia:
function E(x: real): real;
    function F(y: real): real;
    begin
        F := x + y
    end;
begin
    E := F(3)
end;
Note that you return values in Pascal by assigning to a ‘variable’ with the same name as the function. The F function is nested inside E, so (i) F can only be called inside function E and (ii) F has access to variables defined inside function E. Now, this example is contrived (it’s an obfuscated way to write E(x)=x+3), but this is a useful tool if you’ve got a chunk of code that is reused several times inside a function.
Although C# doesn’t explicitly support nested functions, it’s easy to simulate this feature using an anonymous function (a lambda.) To do this, I use an idiom which is common in Javascript, the naming of an anonymous function by assigning it to a named variable:
// Requires "using System;" for Func<,>
static double E(double x) {
    // F closes over the parameter x of the enclosing method
    Func<double,double> F = (double y) => x+y;
    return F(3.0);
}
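For comparison, here is a minimal sketch of the JavaScript idiom mentioned above: an anonymous function is given a name by assigning it to a local variable. The names E and F simply mirror the Pascal example; the call with 4 is an arbitrary illustration.

```javascript
// Sketch: simulating a Pascal-style nested function in JavaScript.
function E(x) {
  // F is visible only inside E, and it closes over E's parameter x.
  var F = function (y) {
    return x + y;
  };
  return F(3);
}

console.log(E(4)); // 7
```

As in the C# version, F cannot be called from outside E, so the helper stays encapsulated without becoming a member of an enclosing class or object.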
I was writing an algorithm over trees the other day, and noticed that there was a chunk of code that I was repeating in multiple places inside a single function; as this chunk was increasing in complexity, I felt alarmed by the duplication. I could have spun the “inner” function into a named method of the class the “outer” function was in, but that would have meant moving certain variables out of method scope into class scope — I didn’t like this, because the rest of the class had no business accessing these variables.
I could have split out the tree algorithm into a separate class, but that bulks up the code and creates more artifacts to maintain. Splitting the algorithm into a separate class might let me enable reuse by adding extension points and using the Visitor pattern, but I could have ended up creating an interface and several new classes… while never getting around to taking advantage of that promised reuse.
Object-functional languages like C# offer programmers new choices when it comes to encapsulation, inheritance and reuse: the study of patterns and idioms used in languages such as LISP and Javascript can be fruitful for the C# programmer, and proves that many of the strengths that people associate with dynamically typed languages can be enjoyed in statically typed languages.
In most written media, time progresses as you move down a page: mainstream computing languages are no different. Anonymous closures are a language mechanism that, effectively, lets programmers create new control structures. Although people associate this power with exotic dynamic languages such as FORTH, Scheme and TCL, closures are becoming a feature of mainstream languages such as Javascript and PHP (and even static languages such as C#.)
Although this article talks about issues that you’ll encounter in languages such as C# and Scheme, I’m going to focus on Javascript written on top of the popular JQuery library: I do that because JQuery is a great toolkit that lets programmers and designers of all skill levels do a lot by writing very little code. Because JQuery smooths away low-level details, it lets us clearly illustrate little weirdnesses of its programming model. Although things often work “just right” on the small scale, little strange things snowball in larger programs, so a careful look at the little problems is a key step towards the avoidance and mitigation of problems in big RIA projects.
Absent control structures, execution proceeds from top to bottom in a computer program. Consider:
01 document.write('Hello');
02 document.write(' ');
03 document.write('World!');
Today’s languages are based on structured programming, which encourages us to use a limited number of control structures, such as conditional expressions
04 if (happy === true) {
05   document.write('Hello');
06 } else {
07   document.write('Goodbye Cruel');
08 }
09
10 document.write(' ');
11 document.write('World!');
and loops
12 var sum=0;
13 for(var i=0;i<x.length;i++) {
14   sum += x[i];
15 }
With GOTO expunged from the programming canon, execution proceeds from top to bottom, except for a limited number of control structures that let us return to the beginning or jump to the end of a block. It’s a simple and intuitive model that we don’t often think about, but it’s easy to write code that breaks this model when programming with closures.
Popular languages such as PHP and C# have a foreach() statement that loops over the elements of a collection without the need to keep track of a counter variable. Popular Javascript frameworks such as JQuery and Dojo implement foreach functions; with the magic of closures, we can get an effect similar to a built-in foreach construct. For instance, in JQuery, we can write
16 var sum=0;
17 $.each(x,function(i,val) {
18   sum += val;
19 });
In this case, the second argument of the $.each (jQuery.each) function is an anonymous function. The special property it has, as a closure, is that the anonymous function has access to variables in the enclosing scope, namely sum. This allows the code in 16-19 to look and work a lot like the code in 12-15, despite the fact that line 18 is in a different function than line 16.
As an aside, this kind of higher-order function is often simple to implement; although the real $.each() function is capable of iterating over a wide variety of types (thus more complex), a function that can iterate over arrays is as simple as
20 function foreachArrayElement(arr,loopBody) {
21   for(var i=0;i<arr.length;i++) {
22     loopBody(i,arr[i]);
23   }
24 }
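To make the idea concrete, the following self-contained sketch defines an array-only version of the helper and uses it to compute a sum in the same way as the $.each example. The sample array is an assumption chosen for illustration; this is not how JQuery implements $.each.

```javascript
// Minimal array-only "foreach" that takes the loop body as a callback.
function foreachArrayElement(arr, loopBody) {
  for (var i = 0; i < arr.length; i++) {
    loopBody(i, arr[i]);
  }
}

var x = [1, 2, 3, 4]; // illustrative data
var sum = 0;
foreachArrayElement(x, function (i, val) {
  sum += val; // the closure updates sum in the enclosing scope
});

console.log(sum); // 10
```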
Although you’ve got other choices, closures are a popular and concise way to write callbacks for asynchronous communication. Suppose that we want to GET a packet of JSON data from a server and update a user interface. In JQuery we could write:
25 $.get("JsonService", {}, function(data, textStatus) {
26   var obj=JSON.parse(data);
27   someUIElement.update(obj);
28 });
29
30 ... do more stuff ...
Note that time isn’t progressing downward in this script anymore. We go from line 25 directly to line 30. The code in lines 26-28 only executes when data gets back from the server. In this case, this behavior is caused by the asynchronous nature of XMLHttpRequest, for which $.get() is a wrapper. Similar behavior can be had by attaching an event handler to a DOM object, or even by adding a function reference to an object or the global scope, causing the anonymous function to execute when some other function, defined elsewhere, executes.
Although it’s not wrong to break the conventions of time, doing so risks the introduction of tricky and subtle errors. Suppose we’re building an AJAX application where we’d like to cache an object, retrieving the object only if we don’t have a copy in cache. One obvious approach is to write
31 if (cachedCopy==undefined) {
32   $.get("DataService",{},function(data,textStatus) {
33     cachedCopy=data;
34     updateUIOne(data);
35   });
36 } else {
37   updateUIOne(cachedCopy);
38 }
39
40 updateUITwo();
updateUIOne(data) populates the user interface with the data. updateUITwo() makes some other change to the user interface.
Unfortunately, this code has a potential bug that’s hidden by the conventions of time. When data is in the cache, line 37 executes before line 40, so that updateUIOne(data) is called before updateUITwo(). When data is not in the cache, line 40 executes before 33 (communication callbacks don’t run in Javascript until the code being run returns to the browser.) It’s all fine if the order in which you run updateUIOne and updateUITwo doesn’t matter — and it’s a disaster if it does… This kind of code does one thing sometimes and does another thing other times: code of this type is difficult to test, and leads to the kind of bug that drives people to drink.
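The ordering problem can be reproduced without a browser. The sketch below is a simulation under stated assumptions: fakeGet is a hypothetical stand-in for $.get that queues its callback, and runPendingCallbacks plays the role of the browser delivering callbacks once the running script has returned. The UI functions are replaced by entries in a log.

```javascript
// Callbacks wait in this queue until the current script has returned,
// which imitates how a browser delivers XHR callbacks.
var pending = [];

// Hypothetical stand-in for $.get: it never calls back immediately.
function fakeGet(url, params, callback) {
  pending.push(function () { callback("data from " + url); });
}

function runPendingCallbacks() {
  while (pending.length > 0) {
    pending.shift()();
  }
}

// Same shape as the cache example; UI calls are recorded in log.
function loadAndUpdate(cache, log) {
  if (cache.copy === undefined) {
    fakeGet("DataService", {}, function (data) {
      cache.copy = data;
      log.push("updateUIOne");
    });
  } else {
    log.push("updateUIOne");
  }
  log.push("updateUITwo");
}

var cache = {};
var coldLog = [];
loadAndUpdate(cache, coldLog); // nothing cached yet
runPendingCallbacks();

var warmLog = [];
loadAndUpdate(cache, warmLog); // served from the cache
runPendingCallbacks();

console.log(coldLog.join(",")); // updateUITwo,updateUIOne
console.log(warmLog.join(",")); // updateUIOne,updateUITwo
```

The same function produces two different orderings depending only on whether the cache happens to be populated, which is exactly what makes this kind of code hard to test.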
The real answer to these problems is to take a systematic approach to asynchronous communication: any strategy based on band-aids that work here or there will eventually break down under the weight of growing requirements. That said, I’ll offer a few strategies for patching this kind of problem:
41 setTimeout(function() { updateUIOne(data) },1);
Let’s consider another example where the conventions of time are treacherous: suppose we need to retrieve three chunks of data from the server
42 $.get("Chunk1Server",{},function(data1,textStatus1) {
43   UpdateUIOne(data1);
44   $.get("Chunk2Server",{},function(data2,textStatus2) {
45     UpdateUITwo(data2);
46     $.get("Chunk3Server",{},function(data3,textStatus3) {
47       UpdateUIThree(data3);
48     });
49   });
50 });
Note that this specific chunk of code executes exactly the way that it looks. Although there are some gaps, execution proceeds progressively from line 42 to 47. The “nested closure” idiom is common in Javascript code, probably because it looks a lot like the synchronous code that does the same job, but it’s treacherous: a tiny change in the code can break the conventions of time, causing strange things to happen. This property is called structural instability.
For example, the above code might return directly to the browser, eliminating the possibility of the “race conditions” seen in the last section. If we add the following line:
51 UpdateUIFour();
we’ll find that UpdateUIFour() runs before the other functions. This isn’t bad behavior if we expect it, but could have spooky effects if we don’t. This example is trivial, but similar mistakes can be made in any of the closures, resulting in errors that can be quite baffling.
The structural instability of nested closures pushes complex AJAX applications towards other styles of coding, such as a state-machine organization.
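As one illustration of what a state-machine organization could look like, here is a minimal sketch that loads the same three chunks in sequence. Everything in it is an assumption made for the example: ChunkLoader and its state names are hypothetical, and fakeGet with runPendingCallbacks simulates $.get and the browser's callback delivery so the sketch runs anywhere.

```javascript
// Simulated asynchronous transport (stand-in for $.get and the browser).
var pending = [];
function fakeGet(url, params, callback) {
  pending.push(function () { callback("data from " + url); });
}
function runPendingCallbacks() {
  while (pending.length > 0) {
    pending.shift()();
  }
}

// Hypothetical state machine: one explicit state per outstanding request.
function ChunkLoader(log) {
  this.state = "idle";
  this.log = log;
}

ChunkLoader.prototype.start = function () {
  this.state = "loading1";
  this.request("Chunk1Server");
};

ChunkLoader.prototype.request = function (url) {
  var self = this;
  fakeGet(url, {}, function (data) { self.onData(data); });
};

// Every response arrives here; the current state decides what happens next.
ChunkLoader.prototype.onData = function (data) {
  this.log.push(this.state + ": " + data);
  switch (this.state) {
    case "loading1":
      this.state = "loading2";
      this.request("Chunk2Server");
      break;
    case "loading2":
      this.state = "loading3";
      this.request("Chunk3Server");
      break;
    case "loading3":
      this.state = "done";
      break;
  }
};

var log = [];
var loader = new ChunkLoader(log);
loader.start();
runPendingCallbacks();
console.log(loader.state); // done
console.log(log.length);   // 3
```

The transitions live in one place, so adding a step means adding a state rather than another level of nesting, and code placed after start() cannot be mistaken for code that runs after the last response.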
This is small compared to the other issues, but the order of arguments of callback functions can add to the confusion around time. The $.get() function provides a good example, since it supports four parameters:
51 $.get(url,parameters,callback,type);
All of the parameters, except for the url, are optional. It’s generally good to put less used optional parameters towards the right, but placing the type declaration after the callback disturbs the flow of time and hurts readability. If you choose to have the return data parsed as JSON format,
52 $.get(targetUrl,{},function(data,textStatus) {
53   ... handler code ...
54 },"json");
you’ll see that the type specification occurs after the callback. This, by itself, adds a small amount of confusion, but when you’ve got multiple nested closures or if you were computing the values defined after the callback, code becomes immediately difficult to understand.
In 1973, R. Lawrence Clark kiddingly introduced a COMEFROM statement in response to Edsger Dijkstra’s famous “GO TO Considered Harmful”. From the associated Wikipedia article, I take an example of a program in a hypothetical BASIC dialect that uses COMEFROM:
55 COMEFROM 58
56 INPUT "WHAT IS YOUR NAME? "; A$
57 PRINT "HELLO, "; A$
58 REM
COMEFROM is a bit disturbing because it involves an “action at a distance:” line 55 modifies the behavior of line 58. A person looking at line 58 in isolation would have a hard time understanding what happens in the program when execution reaches line 58.
The use of event handlers, both built in and custom, has an effect similar to COMEFROM. Consider the common example
59 $("#activeButton").click(function() {
60   ... handle button click...
61 });
Once more, the code at line 60 will execute after the code that follows line 61. Line 59 modifies the behavior of a built-in UI element, which might be defined like
62 <input type="button" id="activeButton" value="Button X">
Looking at line 62 alone, it’s impossible to tell what happens when you click on the button; an “onClick” handler in line 62 would be more immediately obvious in function. That said, events are a proven model for GUI programming that introduces a useful decoupling between the parts of a system, a looser coupling that lets us build bigger systems. In the context of JQuery, it’s great to be able to write something like
63 <div class="SpecialArea">... content ...</div>
Javascript code can then use the $() function to attach styles, event handlers and to manipulate DOM elements inside the .SpecialArea to give the area specific behaviors and appearance. This lets a developer provide a powerful but easy-to-use abstraction to a designer: adding a single class specification lets us reuse tens of CSS specifications and several event handlers. This is a big win, but we can’t forget the hidden COMEFROM that’s implied.
Events are dynamically added and removed, even in static languages such as Java and C#. Although this dynamism adds flexibility, it comes at a cost. IDEs such as Eclipse and Visual Studio can do a great job of helping programmers understand static method calls: for instance, you can click on the name of a method and see a list of places where this method is called. Because events aren’t amenable to static analysis, inappropriate use (or even appropriate use) of events impair some of the tools we have for understanding programs.
Events are a great tool, but programmers need to understand the weirdness that underlies them. Used deliberately, events can provide a concise and scalable way to move information and direct control in an application. Poorly used events can cause software to degenerate to “spaghetti code”.
Closures are a powerful and concise way to express your intentions to a computer. However, closures break some of the intuitive assumptions that people use to understand software, specifically the idea that time moves downward through the execution of a procedure. This leads to a kind of structural instability, where software that’s perfectly simple and clear on a small scale can suddenly get much harder to understand when several complications converge. This article uses JQuery-Javascript as an example not because JQuery is bad, but because it’s good: it’s a tool that helps beginning (and advanced) programmers accomplish a lot with little code. Yet, as we build increasingly complex applications, we need a clear understanding of the big and little weirdnesses of RIA programming.
Car Pictures is a collection of car pictures under a Creative Commons license; it was built from a taxonomy that was constructed from Dbpedia and Freebase. My editor and I then used a scalable process to clean up the taxonomy, find matching images and enrich image metadata. As you can imagine, this is a process that can be applied to other problem domains: we’ve tested some of the methodology on our site Animal Photos! but this is the first site designed from start to finish based on the new technology.
Car Pictures was made possible by data sets available on the semantic web, and it will soon export data via the semantic web. We’re currently talking with people about exporting our content in a machine-readable linked data format. In our future plans, this will enable a form of semantic search that will revolutionize multimedia search: rather than searching for imprecise keywords, it will become possible to look up pictures, video and documents about named entities found in systems such as Freebase, Wikipedia, WordNet, Cyc and OpenCalais.
In the next few days we’re going to watch carefully how Car Pictures interacts with users and with web crawlers. Then we’ll be adding content, community features, and a linked data interface. In the meantime, we’re planning to build something at least a hundred times bigger.
Quite literally, thousands of people have contributed to Car Pictures, but I’d like to particularly thank Dan Brickley, who’s been helpful in the process of interpreting Dbpedia with his ARC2 RDF tools, and Kingsley Idehen, who’s really helped me sharpen my vision of what the semantic web can be.
Anyway, I’d love any feedback that you can give me about Car Pictures and any help I can get in spreading the word about it. If you’re interested in making a similar site about some other subject, please contact me: it’s quite possible that we can help.
Controversy persists to this day about the relative merits of dynamic languages such as PHP and Python versus static languages such as C# and Java. We’re finding more and more that the difference isn’t so much about static or dynamic typing, but more about the cultures of different languages. In this article, I discuss an efficient representation of trees in a SQL database, an algorithm for creating that representation, and a PHP implementation. The PHP implementation uses objects in a way foreign to many developers: rather than using objects to represent nouns (data), it uses a class to represent a verb (an algorithm.) I make the case that programmers shouldn’t feel compelled to create new classes to represent every data item: that verb objects often provide the right level of abstraction for many tasks.
Lately I’ve been collecting pictures of animals, and decided that incorporating the taxonomic database from ITIS would be a big help. I’m interested in asking questions like “What are all the species underneath the Tapiridae family?” The ITIS database uses the adjacency list representation, where each row contains a column that references the primary key of a parent row. Algorithms for the adjacency list are well known, but are awkward to implement in SQL since it takes multiple SQL statements to traverse a tree.
Nested sets are an alternative representation that makes it simple to write fast queries on trees. Like the parts explosion diagram below, components of the hierarchy are represented with contiguous numbers (parts 1-3 form one end piece of the steering knuckle.)
This article discusses the adjacency list and nested set models and presents a simple algorithm for converting an adjacency list into nested sets.
A common representation of a tree in a relational database is like this:
create table obviousTree (
    id varchar(255),
    parent varchar(255),
    primary key(id),
    foreign key(parent) references obviousTree(id)
)
The parent column is allowed to be null, so the root of the tree has a null parent. A typical tree in this representation might look like
sql> SELECT * FROM obviousTree;
+-----+--------+
| id  | parent |
|-----|--------|
| 'a' | null   |
| 'b' | 'a'    |
| 'c' | 'a'    |
| 'd' | 'b'    |
| 'e' | 'b'    |
| 'f' | 'b'    |
+-----+--------+
It’s simple in this representation to find out what the parent of a node is,
SELECT parent FROM obviousTree WHERE id=@Child
or to find the direct children of a node,
SELECT id FROM obviousTree WHERE parent=@Parent
It’s not possible, however, to write a pure SQL statement that traverses all of the descendants of a node (at least not in standard SQL.) You need to either write a stored procedure or write a program in a language like C# or PHP that implements a breadth-first or depth-first traversal of the tree. This isn’t conceptually hard, but it’s inefficient and doesn’t take full advantage of the querying power of SQL.
Joe Celko has promoted the Nested Set representation of trees in several of his books. In the nested set representation, we represent the position of each node in the tree with two numbers: the left value and the right value. We’ll call them lft and rgt in our code, since LEFT and RIGHT are reserved words in SQL.
The left and right values of a parent node enclose the left and right values of its children, so it’s easy to ask for the descendants of a node
SELECT * FROM fastTree WHERE
    lft>@AncestorLft
    AND lft<@AncestorRgt
to count the descendants
SELECT (rgt-lft-1)/2 FROM fastTree WHERE lft=@AncestorLft
or to find all the ancestors of a node
SELECT * FROM fastTree WHERE
    lft<@DescendentLft
    AND rgt>@DescendentRgt
Granted, update operations are slower and more complex in the nested set representation, but for my application, where I’m accessing a nearly 500,000-node tree that never changes, nested sets are the clear choice.
Looking at the tree above, you can see a straightforward algorithm for creating the nested set representation. We traverse the tree depth-first, keeping a counter, which we’ll call cnt as we go — it works a lot like a thumb operated tally counter:
When we first encounter a node (going down), we write cnt into the lft field of that node, then we increment cnt. When we encounter it again (going up), we write cnt into the rgt field of the node and increment cnt.
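The numbering rule can be sketched in a few lines of JavaScript. This is an illustration of the algorithm only, not the PHP script discussed in this article: the in-memory children map, the function name buildNestedSet and the use of a closure to hold cnt are all assumptions made for the example, and the data is the small a-to-f tree from the adjacency list table.

```javascript
// Adjacency list for the example tree: parent id -> array of child ids.
var children = { a: ["b", "c"], b: ["d", "e", "f"] };

// Depth-first traversal that hands out lft on the way down
// and rgt on the way back up, using a single shared counter.
function buildNestedSet(children, rootId) {
  var cnt = 1;
  var rows = [];

  function traverse(id) {
    var lft = cnt++;
    var kids = children[id] || []; // leaves have no entry
    for (var i = 0; i < kids.length; i++) {
      traverse(kids[i]);
    }
    var rgt = cnt++;
    rows.push({ id: id, lft: lft, rgt: rgt });
  }

  traverse(rootId);
  return rows;
}

var rows = buildNestedSet(children, "a");
rows.forEach(function (row) {
  console.log(row.id, row.lft, row.rgt);
});
// Rows are emitted as each node is finished:
// d 3 4, e 5 6, f 7 8, b 2 9, c 10 11, a 1 12
```

For node b this gives lft=2 and rgt=9, so (rgt-lft-1)/2 = 3, matching its three descendants d, e and f.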
Joe Celko makes a good case for implementing mutating operations on nested sets in stored procedures (see his book on trees and hierarchies in SQL.) I think he’s particularly right when it comes to adding, removing and moving nodes, where the operation requires several statements that should be done in a transaction. Transactional integrity is less important, however, in a one-time import procedure which is done in a batch; although MySQL 5.0 (which I’m using) supports stored procedures, it’s got less of a culture of stored procedure use than other databases, so I felt comfortable writing the conversion script in PHP. This script converted a tree with 477,797 nodes in less than a minute, so performance was adequate.
I’m going to store the nested set in a table that looks like
create table fastTree (
    lft integer not null,
    rgt integer not null,
    id varchar(255),
    primary key(lft),
    unique(rgt),
    unique(id)
)
I include “_Config.php”, which initializes the database connection, $conn, using the third-party ADODB library. After that I retrieve the list of parent-child relationships from the database:
require_once("./_Config.php");
$rawLink=$conn->Execute("SELECT id,parent FROM obviousTree");

// Map each parent id to the list of its children's ids
$link=array();
foreach($rawLink as $k=>$row) {
    $parent=$row["parent"];
    $child=$row["id"];
    if (!array_key_exists($parent,$link)) {
        $link[$parent]=array();
    }
    $link[$parent][]=$child;
}
Note that we’re building a complete copy of the adjacency list in RAM: the value of $link[$parent] is an array that contains the ids of all the children of the $parent. I’m doing this for two reasons: (i) the adjacency list is small enough to fit in RAM, and (ii) it minimizes the cost of I/O: if you expect to access all rows in a table, it’s a lot cheaper to do a full table scan than it is to do hundreds of thousands of index queries.
Next we define an object, TreeTransformer, that represents the algorithm. In particular, it provides a scope for the counter (cnt above, $this->count in the code), which has a lifespan apart from the recursive function that traverses the tree:
class TreeTransformer {
    function __construct($link) {
        $this->count=1;      // next lft/rgt value to hand out
        $this->link=$link;   // parent id => array of child ids
    }

    function traverse($id) {
        // going down: assign the left value
        $lft=$this->count;
        $this->count++;

        $kid=$this->getChildren($id);
        if ($kid) {
            foreach($kid as $c) {
                $this->traverse($c);
            }
        }
        // going up: assign the right value and write the row
        $rgt=$this->count;
        $this->count++;
        $this->write($lft,$rgt,$id);
    }
Traverse() is the core of the algorithm. It works in three phases: (i) it assigns the $lft value, (ii) it loops over all children, calling itself recursively for each child, and (iii) it assigns the $rgt value and writes an output record. The scope of $this->count is exactly the scope we want for the variable, and saves us from passing a counter back and forth between different levels of traverse(). Traverse calls two functions:
    function getChildren($id) {
        // leaf nodes have no entry in the adjacency list
        return isset($this->link[$id]) ? $this->link[$id] : array();
    }

    function write($lft,$rgt,$id) {
        global $conn;
        $conn->Execute("
            INSERT INTO fastTree
                (lft,rgt,id)
            VALUES
                (?,?,?)
        ",array($lft,$rgt,$id));
    }
}
This script uses a single class: instead of using classes to represent data, it uses a class to represent an algorithm, a verb. Rather than create objects to create data structures, I reuse data structures that come with PHP.
This is an extensible design.
TreeTransformer provides two extension points: getChildren() and write(). Most of the objections a person could have to this implementation could be addressed here: for instance, getChildren() could be modified to support a different data structure for the adjacency list, or even to operate in a streaming mode that does an SQL query for each node. write(), on the other hand, could be modified to avoid the limitations of the global $conn, or to change the output format. If TreeTransformer were to evolve in the future, it would make sense to push traverse() up to an AbstractTreeWriter and define getChildren(), write() and $this->link in a subclass.
Noun objects (that represent things) can be useful, but a compulsion to create noun objects can lead to an entanglement between algorithms and data structures, poor code reuse, and a proliferation of artifacts that makes for a high defect count and expensive maintenance. Even if I were using noun objects in this program, it would still make sense to implement this algorithm as a verb object: by creating interfaces for the adjacency list and for the output, I could keep the algorithm reusable while keeping the noun objects simple.
Many of the fashionable languages that people claim improve productivity, such as Lisp, Python and Ruby, put powerful data structures at the programmer’s fingertips: they encourage you to reuse data structures provided by the language rather than to create a new object in response to every situation. Functional languages such as CAML and F# show the power of programming by composition, enabled by the use of standard data structures (and interfaces.)
What’s really exciting is that these methods are becoming increasingly available and popular in mainstream languages, such as C# and PHP.
Generic classes and methods, as available in Java and C#, let you have both reusable data structures and type safety. Although there are trade-offs, you should seriously consider using a List<Thing> rather than creating a CollectionOfThings. Yes, a collection of things that you create can present exactly the interface you need. However, without a lot of care, you might find that another member of your team wrote an incompatible JohnsCollectionOfThings, and you end up writing more glue code because you’re interfacing with a third party vendor that uses a MagicBoxFullOfThings instead.
Although there are advantages to encapsulation, and advantages to creating noun objects (sometimes they do provide the right places for extension points,) programmers need to be careful when they create new classes. Every class you write, every method, and every line of code is like a new puppy: you not only need to write it, but you’ll need to maintain it in the future.
Image sources: Parts Explosion Diagram of Hummer Steering Knuckle from CHEMAXX, Handheld Tally Counter from Different Roads To Learning.
One of the challenges in writing programs in today’s RIA environments (Javascript, Flex, Silverlight and GWT) is expressing the flow of control between multiple asynchronous XHR calls. A “one-click-one-XHR” policy is often best, but you don’t always have control over your client-server protocols. A program that’s simple to read as a synchronous program can become a tangle of subroutines when it’s broken up into a number of callback functions. One answer is program translation: to manually or automatically convert a synchronous program into an asynchronous program. Starting from the theoretical foundation, this article talks about a few ways of doing that.
Thibaud Lopez Schneider sent me a link to an interesting paper he wrote, titled “Writing Effective Asynchronous XmlHttpRequests.” He presents an informal proof that you can take a program that uses synchronous function calls and common control structures such as if-else and do-while, and transform it into a program that calls the functions asynchronously. In simple language, it gives a blueprint for implementing arbitrary control flow in code that uses asynchronous XmlHttpRequests.
In this article, I work a simple example from Thibaud’s paper and talk about four software tools that automate the conversion of conventional control flows to asynchronous programming. One tool, the Windows Workflow Foundation, lets us compose long-running applications out of a collection of asynchronous Activity objects. Another two tools are jwacs and Narrative Javascript, open-source translators that translate pseudo-blocking programs in a modified dialect of JavaScript into an asynchronous program in ordinary JavaScript that runs in your browser.
I’m going to lift a simple example from Thibaud’s paper, the case of sequential execution. Imagine that we want to write a function f() that follows this logic:
function f() {
    ... pre-processing ...
    result1=MakeRequest1(argument1);
    ... think about result1 ...
    result2=MakeRequest2(argument2);
    ... think about result2 ...
    result3=MakeRequest3(argument3);
    ... think about result3 ...
    return finalResult;
}
where functions of the form MakeRequestN are ordinary synchronous functions. If, however, we were working in an environment like JavaScript, GWT, Flex, or Silverlight, server requests are asynchronous, so we’ve only got functions like:
function BeginMakeRequestN(argumentN, callbackFunction);
It’s no longer possible to express a sequence of related requests as a single function. Instead, we need to transform f() into a series of functions, like so:
function f(callbackFunction) {
    ... pre-processing ...
    BeginMakeRequest1(argument1,f1);
}

function f1(result1) {
    ... think about result1 ...
    BeginMakeRequest2(argument2,f2);
}

function f2(result2) {
    ... think about result2 ...
    BeginMakeRequest3(argument3,f3);
}

function f3(result3) {
    ... think about result3 ...
    callbackFunction(finalResult);
}
My example differs from the example on page 19 of Thibaud’s paper in a few ways. In particular, I’ve added the callbackFunction that f() uses to “return” a result to the program that calls it. Here the callbackFunction lives in a scope that’s shared by all of the fN functions, so it’s available in f3. I’ve found that when you’re applying Thibaud’s kind of thinking, it’s useful for f() to correspond to an object, of which the fN() functions are methods. [1] [2] [3]
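As a rough sketch of the object-based variant, here is one way it might look in plain JavaScript. The three beginMakeRequestN functions are hypothetical stand-ins that call back immediately with made-up arithmetic, and SequentialTask and runSequentialExample are names invented for this illustration; none of them come from Thibaud’s paper.

```javascript
// Hypothetical stand-ins for three asynchronous requests. Real ones would
// issue XHR calls; these call back immediately so the sketch is self-contained.
function beginMakeRequest1(argument, callback) { callback(argument + 1); }
function beginMakeRequest2(argument, callback) { callback(argument * 2); }
function beginMakeRequest3(argument, callback) { callback(argument - 3); }

// f() becomes an object. Each fN step becomes a method, and the state the
// steps share (including callbackFunction) lives on the object.
function SequentialTask(callbackFunction) {
  this.callbackFunction = callbackFunction;
  this.results = [];
}

SequentialTask.prototype.start = function (argument1) {
  // ... pre-processing would go here ...
  beginMakeRequest1(argument1, this.f1.bind(this));
};

SequentialTask.prototype.f1 = function (result1) {
  this.results.push(result1);            // "think about result1"
  beginMakeRequest2(result1, this.f2.bind(this));
};

SequentialTask.prototype.f2 = function (result2) {
  this.results.push(result2);            // "think about result2"
  beginMakeRequest3(result2, this.f3.bind(this));
};

SequentialTask.prototype.f3 = function (result3) {
  this.results.push(result3);            // "think about result3"
  this.callbackFunction(this.results);   // "return" the final result
};

// Reading finalResult right after start() only works because the stand-in
// requests above answer synchronously; with real XHR, use the callback.
function runSequentialExample(argument1) {
  var finalResult = null;
  new SequentialTask(function (r) { finalResult = r; }).start(argument1);
  return finalResult;
}
```

Keeping the shared state on the object means that two sequences can be in flight at once without stepping on each other, which is harder to guarantee when f1, f2 and f3 are global functions.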
Thibaud also works the implementation of if-then-else, switch, for, do-while, parallel-do and other common patterns — read his paper!
There are things missing from Thibaud’s current draft: for instance, he doesn’t consider how to implement exception handling in asynchronous applications, although it’s quite possible to do.
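One possible way to handle errors, sketched below, is the “error-first callback” convention, in which every callback receives an error (or null) as its first argument. This is a convention borrowed from elsewhere, not something taken from Thibaud’s paper, and beginParseNumber and addParsed are hypothetical names made up for the example.

```javascript
// Hypothetical asynchronous operation: calls back with (error, result).
function beginParseNumber(text, callback) {
  var n = Number(text);
  if (isNaN(n)) {
    callback(new Error("not a number: " + text));
  } else {
    callback(null, n);
  }
}

// Asynchronous counterpart of the synchronous logic
//   a = parse(x); b = parse(y); return a + b;
// where a failure in either parse propagates to the caller.
function addParsed(x, y, callback) {
  beginParseNumber(x, function (err, a) {
    if (err) { callback(err); return; }      // the "throw" travels up the chain
    beginParseNumber(y, function (err2, b) {
      if (err2) { callback(err2); return; }
      callback(null, a + b);                 // the normal "return"
    });
  });
}
```

The caller plays the role of the catch block: it checks the first argument of its callback before touching the result.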
Thinking about things systematically helps you do things by hand, but it really comes into its own when we use systematic thinking to develop tools. I can imagine two kinds of tools based on Thibaud’s ideas:
Windows Workflow Foundation is an example of the first approach.
Although it’s not designed for use in asynchronous RIA’s, Microsoft’s Windows Workflow Foundation is a new approach to writing reactive programs. Unfortunately, like a lot of enterprise technologies, WWF is surrounded by a lot of hype that obscures a number of worthwhile ideas: the book Essential Windows Workflow Foundation by Shukla and Schmidt is a lucid explanation of the principles behind it. It’s good reading even if you hate Microsoft and would never use a Microsoft product, because it could inspire you to implement something similar in your favorite environment. (I know someone who’s writing a webcrawler in PHP based on a similar approach.)
What does it do?
In WWF, you create an asynchronous program by composing a set of asynchronous Activities. Ultimately your program is a tree of Activity objects that you can assemble any way you like, but typically you’d build them with a XAML (XML) file that might look like
<Interleave x:Name="i1">
    <Sequence x:Name="s1">
        <ReadLine x:Name="r1" />
        <WriteLine x:Name="w1"
            Text="{wf:ActivityBind r1,path=Text}" />
        <ReadLine x:Name="r2" />
        <WriteLine x:Name="w2"
            Text="{wf:ActivityBind r2,path=Text}" />
    </Sequence>
    <Sequence x:Name="s2">
        <ReadLine x:Name="r3" />
        <WriteLine x:Name="w3"
            Text="{wf:ActivityBind r3,path=Text}" />
        <ReadLine x:Name="r4" />
        <WriteLine x:Name="w4"
            Text="{wf:ActivityBind r4,path=Text}" />
    </Sequence>
</Interleave>
(The above example is based on Listing 3.18 on Page 98 of Shukla and Schmidt, with some namespace declarations removed for clarity)
This defines a flow of execution that looks like:
The <Interleave> activity causes two <Sequence> activities to run simultaneously. Each <Sequence>, in turn, executes two pairs of <ReadLine> and <WriteLine> activities in order. Note that the attribute values that look like {wf:ActivityBind r3,path=Text} wire the output of a <ReadLine> activity to the input of a <WriteLine> activity.
Note that <Interleave>, <Sequence>, <ReadLine> and <WriteLine> are all asynchronous activities defined by classes Interleave, Sequence, ReadLine and WriteLine that all implement Activity. An activity can invoke other activities, so it’s possible to create new control structures. Activities can wait for things to happen in the outside world (such as a web request or an email message) by listening to a queue. WWF also defines an elaborate model for error handling.
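To make the idea of composable activities concrete, here is a small JavaScript sketch of the same shape. It is not the WWF API: the execute(done) protocol, the LogActivity leaf and the runWorkflowExample driver are all assumptions invented for illustration. Each leaf parks its completion callback in a queue, standing in for an activity that waits on the outside world.

```javascript
// An "activity" here is any object with execute(done); calling done()
// signals that the activity has completed.

// Runs its children one after another.
function Sequence(children) { this.children = children; }
Sequence.prototype.execute = function (done) {
  var children = this.children;
  function runFrom(i) {
    if (i === children.length) { done(); return; }
    children[i].execute(function () { runFrom(i + 1); });
  }
  runFrom(0);
};

// Starts all of its children at once and completes when the last one does.
function Interleave(children) { this.children = children; }
Interleave.prototype.execute = function (done) {
  var remaining = this.children.length;
  if (remaining === 0) { done(); return; }
  this.children.forEach(function (child) {
    child.execute(function () {
      remaining -= 1;
      if (remaining === 0) { done(); }
    });
  });
};

// Hypothetical leaf: logs its name, then waits until someone drains the queue.
function LogActivity(name, log, pending) {
  this.name = name;
  this.log = log;
  this.pending = pending;
}
LogActivity.prototype.execute = function (done) {
  this.log.push(this.name);
  this.pending.push(done);
};

function runWorkflowExample() {
  var log = [], pending = [], finished = false;
  function leaf(name) { return new LogActivity(name, log, pending); }
  var root = new Interleave([
    new Sequence([leaf("a1"), leaf("a2")]),
    new Sequence([leaf("b1"), leaf("b2")])
  ]);
  root.execute(function () { finished = true; });
  // Drain the queue in arrival order, as an event loop would.
  while (pending.length > 0) { pending.shift()(); }
  return { log: log, finished: finished };
}
```

With this driver the two sequences take turns, so the log alternates between the "a" and "b" activities, which is the interleaving the XAML example describes.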
Although other uses are possible, WWF is intended for the implementation of server applications that implement workflows. Imagine, for instance, a college applications system, which must wait for a number of forms from the outside, such as
and that needs to solicit internal input from
The state of a workflow can be serialized to a database, so the workflow can be something that takes place over a long time, such as months or weeks — multiple instances of the workflow can exist at the same time.
WWF looks like a fun environment to program for, but I don’t know if I’d trust it for a real business application. Why? I’ve been building this sort of application for years using relational databases, and I know that it’s possible to handle the maintenance situations that occur in real life with a relational representation: both the little tweaks you need to make to a production system from time to time, and the more major changes required when your process changes. Systems based on object serialization, such as WWF, tend to have trouble when you need to change the definition of objects over time.
I can say, however, that the Shukla and Schmidt book is so clear that an ambitious programmer could understand enough of the ideas behind WWF to develop a similar framework that’s specialized for developing asynchronous RIAs in Javascript, Java, or C# pretty quickly. Read it!
Another line of attack on asynchronous programming is the creation of compilers and translators that transform a synchronous program into an asynchronous program. This is particularly popular in Javascript, where open-source tools such as jwacs (Javascript With Advanced Continuation Syntax) let you write code like this:
function main() {
    document.getElementById('contentDiv').innerHTML =
        '<pre>'
        + JwacsLib.fetchData('GET', 'dataRows.txt')
        + '</pre>';
}
Jwacs adds four new keywords to the Javascript language: internally, it applies transformations like the ones in the Thibaud paper. Although it looks as if the call to JwacsLib.fetchData blocks, in reality, it splits the main() function into two halves, executing the function by a form of cooperative multitasking.
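To picture the split, here is a hand-written approximation of the two halves. It is not the actual output of jwacs, and fetchDataAsync is a hypothetical callback-style fetch that answers immediately so the sketch can run on its own. The element and the fetch function are passed in as parameters only so the code does not depend on a browser.

```javascript
// Hypothetical callback-style fetch; a real one would wrap XMLHttpRequest.
function fetchDataAsync(method, url, callback) {
  callback("rows from " + url);   // stand-in: answers immediately
}

// First half: everything up to the pseudo-blocking call.
function main(contentDiv, fetch) {
  fetch('GET', 'dataRows.txt', function (data) {
    mainContinuation(contentDiv, data);
  });
}

// Second half: the "rest of the function", run when the data arrives.
function mainContinuation(contentDiv, data) {
  contentDiv.innerHTML = '<pre>' + data + '</pre>';
}
```

In a browser you might call main(document.getElementById('contentDiv'), fetchDataAsync). Control returns to the event loop between the two halves, which is where the cooperative multitasking comes from.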
Narrative Javascript is a similar open-source translator that adds ->, a “yielding” operator to Javascript. This signals the translator to split the enclosing function, and works for timer and UI event callbacks as well as XHR. Therefore, it’s possible to write a pseudo-blocking sleep() function like:
function sleep(millis) {
    var notifier = new EventNotifier();
    setTimeout(notifier, millis);
    notifier.wait->();
}
Narrative Javascript doesn’t remember the special nature of the sleep() function, so you need to call it with the yielding operator too. With it, you can animate an element like so:
for (var i = 0; i < frameCount - 1; i++) {
    var nextValue = startValue + (jumpSize * i);
    element.style[property] = nextValue + "px";
    sleep->(frequency);
}
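For comparison, this is roughly what the same loop looks like when translated by hand into ordinary JavaScript. It is a sketch, not the code Narrative Javascript generates: the function name animate, the schedule parameter (which defaults to setTimeout) and the done callback are assumptions added so that the example is self-contained.

```javascript
// The loop body becomes step(i); the pseudo-blocking sleep becomes a timer
// whose callback runs the next iteration.
function animate(element, property, startValue, jumpSize,
                 frameCount, frequency, schedule, done) {
  schedule = schedule || setTimeout;   // injectable so no real timer is needed
  function step(i) {
    if (!(i < frameCount - 1)) {       // loop condition from the original
      if (done) { done(); }
      return;
    }
    var nextValue = startValue + (jumpSize * i);
    element.style[property] = nextValue + "px";
    schedule(function () { step(i + 1); }, frequency);
  }
  step(0);
}
```

The loop variable has to become a parameter of step() because each iteration now runs in a separate turn of the event loop.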
You can use the yielding operator to wait on user interface events as well. If you first define
function waitForClick(element) {
    var notifier = new EventNotifier();
    element.onclick = notifier;
    notifier.wait->();
}
you can call it with the yielding operator to wait for a button press
theButton.innerHTML = "go right";
waitForClick->(theButton);
theButton.innerHTML = "-->";
... continue animation ...
The RIFE Continuation Engine implements something quite similar in Java, but it translates at the bytecode level instead of at the source code level: it aims to transform the server side of web applications, rather than the client, by allowing the execution of a function to span two separate HTTP requests.
It’s possible to systematically transform a function that’s written in terms of conventional control structures and synchronous function calls into a collection of functions that performs the same logic using asynchronous function calls. A paper by Thibaud Lopez Schneider points the way, and is immediately useful for RIA programmers who need to convert conventional control structures in their heads into asynchronous code.
A longer-term strategy is to develop frameworks and languages that make it easier to express desired control flows for asynchronous programs. The Windows Workflow Foundation from Microsoft is a fascinating attempt to create a specialized language for assembling asynchronous programs from a collection of Activity objects. jwacs and Narrative Javascript are bold attempts to extend the Javascript language so that people can express asynchronous logic as pseudo-threaded programs. The RIFE Continuation Engine demonstrates that this kind of behavior can be implemented in more static languages such as Java and C#. Although none of these tools are ready for production-quality RIA work, they may lead to something useful in the next few years.
Many people have independently discovered a new design pattern, the “Multiton”, which, like the “Singleton”, is an initialization pattern in the style of the Design Patterns book. Like the Singleton, the Multiton provides a method that controls the construction of a class: instead of maintaining a single copy of an object in an address space, the Multiton maintains a Dictionary that maps keys to unique objects.
The Multiton pattern can be used in systems that store persistent data in a back-end store, such as a relational database. The Multiton pattern can be used to maintain a set of objects that are mapped to objects (rows) in a persistent store: it applies obviously to object-relational mapping systems, and is also useful in asynchronous RIA’s, which need to keep track of user interface elements that are interested in information from the server.
An alternate use case of Multitons, seen in the “Multicore” version of the PureMVC framework, is the extension of the Singleton pattern to support multiple instances of a system in a single address space.
As useful as the Multiton pattern is, this article explains how Multitons use references in a way that doesn’t work well with conventional garbage collection. Multitons are a great choice when the number of Multitons is small, but they may leak memory unacceptably when more than a few thousand are created. Future posts will describe patterns, such as the Captive Multiton, that provide the same capabilities with more scalable memory management. Subscribe to our RSS feed to keep informed.
In our last article on Model-View Separation in Asynchronous RIA’s, we used a Singleton object that represented an entire table in a relational database. This object maintained a list of listeners that were interested in the contents of a table. In this case, the amount of information in the table was small, and often used in the aggregate, so retrieving a complete copy of the table was a reasonable level of granularity. We could imagine a situation, however, where the number of records and size of the records is enough that we need to transfer records individually. (This specific case is an outline of an implementation for Silverlight: a GWT implementation would be similar, and details specific to GWT are talked about in a previous post.)
Imagine, for instance, a BlogPosting object, which represents a post in a blog, which in turn has an integer primary key. The BlogPosting object is a multiton, so you’d write
[01] var posting=BlogPosting.GetInstance(postId);
to get the instance of BlogPosting that corresponds to postId. Client objects can’t really write something like
[02] TitleField.Text=posting.Title
because the operation of retrieving text from the server is asynchronous, and won’t complete in time to supply a value, either on line [01] or [02]. More reasonably, a BlogPostingViewer can register itself against a BlogPosting instance so it will be notified when information is available about the blog posting.
[03] public class BlogPostingViewer: UserControl,IBlogPostingListener {
[04]     protected int PostId;
[05]
[06]     public BlogPostingViewer(int postId) {
[07]         PostId=postId;
[08]         BlogPosting.GetInstance(postId).AddListener(this);
[09]     }
[10]
[11]     public void Dispose() {
[12]         BlogPosting.GetInstance(PostId).RemoveListener(this);
[13]         base.Dispose();
[14]     }
This example shows a pattern usable in a Silverlight application, unlike the GWT style in the model-view article. The Dispose() method will need to be called manually when the BlogPostingViewer is no longer needed, since it will never be garbage collected so long as a reference to it inside the BlogPosting exists. (This points to a general risk of memory leaks with Multitons that we’ll talk about later.) This problem can be addre
The BlogPostingViewer goes on to implement the IBlogPostingListener interface, updating the visual appearance of the user interface to reflect information from the server:
[15]     public void UpdatePosting(BlogPostingData d) {
[16]         if (d==null) {
[17]             ClearUserInterface(); // user-defined method blanks out UI
[18]             return;
[19]         }
[20]         TitleField.Text=d.Title;
[21]         ...
[22]     }
     }
We assume that BlogPostingData represents the state of the BlogPosting at a moment in time, distinct from the BlogPosting, which represents the BlogPosting as a persistent object. BlogPostingData might (roughly) correspond to the columns of a relational table and look something like:
[23] public class BlogPostingData {
[24]     public string Title { get; set;}
[25]     public Contributor Author { get; set; }
[26]     public string Body { get; set;}
[27]     public Category[] AssociatedCategories { get; set;}
[28]     ...
[29] }
We could then add a BlogPostingViewer to the user interface and schedule its initialization by writing
[30] var viewer=new BlogPostingViewer(PostId);
[31] OuterControl.Children.Add(viewer);
[32] BlogPosting.GetInstance(PostId).Fetch();
Note that line [32] tells the BlogPosting instance to retrieve a copy of the posting from the server (an instance of BlogPostingData) and call UpdatePosting() on all of the listeners. Therefore, there will be a time between line [30] and the time when the async call started on line [32] gets back when the BlogPostingViewer is empty (not initialized with BlogPostingData). The BlogPostingViewer must therefore be designed so that nothing bad happens when it’s in that state: it has to show something reasonable to the user and not crash the app if the user clicks a button that isn’t ready yet.
(In a more developed application, the BlogPosting could keep a cache of the latest BlogPostingData. This could improve responsiveness by updating the BlogPostingViewer at the moment it registers, or reduce the bandwidth requirements of a Fetch() by checking a timestamp or checksum against the server. Just watch out for the unintended consequences of multiple code paths.)
Here’s an implementation of a Multiton in C# that’s not too different from the Java implementation from Wikipedia.
class BlogPosting {
    #region Initialization
    private static readonly Dictionary<int,BlogPosting> _Instances =
        new Dictionary<int,BlogPosting>();

    private BlogPosting(int key) {
        ... construct the object ...
    }

    public static BlogPosting GetInstance(int key) {
        lock(_Instances) {
            BlogPosting instance;
            if (_Instances.TryGetValue(key,out instance)) {
                return instance;
            }

            instance = new BlogPosting(key);
            _Instances.Add(key, instance);
            return instance;
        }
    }
    #endregion

    ... the rest of the class ...
}
I’m pretty sure that a version of this could be created in C# with slightly sweeter syntax that would look like
BlogPosting.Instance[postId]
but this doesn’t address the weak implementation of static inheritance in many popular languages that requires us to cut-and-paste roughly 20 lines of code for each Multiton class, rather than being able to reuse inheritance logic. The Ruby Applications Library, on the other hand, contains a Multiton class that can be used to bolt Multiton behavior onto a class. It would be interesting to see what could be accomplished with PHP 5.3’s late static binding.
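In a language with first-class functions, the boilerplate can be factored out in a few lines. The sketch below shows one way to do it in JavaScript; makeMultiton is a hypothetical helper invented for this illustration, not part of any library mentioned here. No lock is needed because browser JavaScript runs on a single thread.

```javascript
// Wraps any factory function so that each key is constructed at most once.
function makeMultiton(construct) {
  var instances = new Map();   // key -> the unique instance for that key
  return function getInstance(key) {
    if (!instances.has(key)) {
      instances.set(key, construct(key));
    }
    return instances.get(key);
  };
}

// Usage: bolt Multiton behavior onto a plain constructor.
function BlogPosting(postId) { this.postId = postId; }
var getBlogPosting = makeMultiton(function (postId) {
  return new BlogPosting(postId);
});
```

Calling getBlogPosting with the same key twice hands back the same object. The registry still holds a strong reference to every instance, so the helper only removes the repeated code and leaves the memory behavior unchanged.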
Multitons, unfortunately, don’t interact well with garbage collectors. Once a Multiton is created, the static _Instances dictionary will maintain a reference to every Multiton in the system, so that Multitons won’t be collected, even if no active references exist.
You might think you could manually remove Multitons from the _Instances dictionary, but this won’t be entirely reliable. In the case above, each BlogPosting maintains a list of IBlogPostingListeners. You could, in principle, scavenge BlogPostings with an empty set of listeners, but that doesn’t stop a class from squirreling away a copy of a BlogPosting that will later conflict with a new BlogPosting that somebody creates by using BlogPosting.GetInstance().
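The conflict is easy to reproduce. The following JavaScript sketch uses a simplified, hypothetical Posting class and a scavenge() function that are made up for this illustration; it shows how one key can end up with two live objects.

```javascript
var instances = new Map();

function Posting(key) {
  this.key = key;
  this.listeners = [];
}

function getInstance(key) {
  if (!instances.has(key)) { instances.set(key, new Posting(key)); }
  return instances.get(key);
}

// Naive cleanup: forget every posting that nobody is listening to.
function scavenge() {
  instances.forEach(function (posting, key) {
    if (posting.listeners.length === 0) { instances.delete(key); }
  });
}

function demonstrateConflict() {
  var stashed = getInstance(7);  // some object keeps this reference around
  scavenge();                    // no listeners, so the registry forgets it
  var fresh = getInstance(7);    // the registry builds a second object for key 7
  return stashed === fresh;      // false: the "one object per key" promise broke
}
```

Once that happens, updates delivered to one of the two objects never reach listeners registered on the other.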
WeakReferences, as available in dot-Net and the full Java platform (as opposed to GWT), are not an answer to this problem, because references work backwards in this case: a BlogPosting is collectable if (i) no references to the BlogPosting exist outside the _Instances dictionary, and (ii) the BlogPosting doesn’t hold references to other objects that may need to be updated in the future.
The severity of this issue depends on the number of Multitons created and the size of the Multitons. If the granularity of Multitons is coarse, and you’ll only create five of them, there’s no problem. 1000 Multitons that each consume 1 kilobyte will consume about a megabyte of RAM, which is inconsequential for most applications these days. However, this amounts to a scaling issue: an application that works fine when it creates 50 Multitons could break down when it creates 50,000.
One answer to this problem is a design that (i) restricts access to Multitons so that references to them can’t be saved by arbitrary objects and (ii) manages Multitons with a kind of reversed reference count, so that Multitons are discarded when they no longer hold useful information. I call this a Captive Multiton, and this will be the subject of our next exciting episode: subscribe to our RSS feed so you won’t miss it.
So far as I can tell, Multitons have been independently discovered by many developers in recent years. I used Multitons (I called them “Parameterized Singletons”) in the manner above in a GWT application that I developed in summer 2007. The PureMVC Framework uses Multitons to allow multiple instances of the framework to exist in an address space. A reusable Multiton implementation exists in Ruby.
The Multiton Pattern is an initialization pattern in the sense defined in the notorious “Design Patterns” Book. Multitons are like Singletons in that they use static methods to control access to a private constructor, but instead of maintaining a single copy of an object in an address space, a Multiton maintains a mapping from key values to objects. A number of uses are emerging for Multitons: (i) Multitons are useful when we want to use something like the Singleton pattern, but support multiple named instances of a system in an address space and (ii) Multitons can be a useful representation of an object in a persistent store, such as a relational database. Multitons, however, are not collected properly by conventional garbage collectors: this is harmless for applications that create a small number of Multitons, but poses a scaling problem when Multitons are used to represent a large number of objects of fine granularity. A future posting will introduce a Captive Multiton that solves this problem: subscribe to our RSS feed to follow this developing story.
When people start developing RIA’s in environments such as Silverlight, GWT, Flex and plain JavaScript, they often write asynchronous communication callbacks in an unstructured manner, putting them wherever is convenient: often in an instance member of a user interface component (Silverlight and GWT) or in a closure or global function (JavaScript).
As applications become more complex, several problems almost invariably occur, and they force the development of an architecture that decouples communication event handlers from the user interface. A straightforward solution is to create a model layer that’s responsible for notifying interested user interface components about data updates.
This article uses a simple example application to show how a first-generation approach to data updates breaks down and how introducing a model-view split makes for a reliable and maintainable application.
(This is one of a series of articles on RIA architecture: subscribe to the Gen5 RSS feed for future installments.)
Imagine a blogging application that works like the WordPress blog used on this site. This application consists of a number of forms, one of which is used to write a new post:
This form lets you fill out two text fields: a title and the body of the post. It also contains a dropdown list of categories, and gives you the option of adding a new category. Categories are represented (server-side) in a table in a relational database that looks like:
[01] CREATE TABLE categoryList (
[02]     id integer primary key auto_increment,
[03]     name varchar(255)
[04] );
Adding a category to the database requires a call to the server that adds a row to the database and returns the new category list, which is then used to update the dropdown list. I’ll show you samples of the app in a pseudocode in an imaginary environment which combines the best of Silverlight and GWT. First we initialize the form and set an event handler that’s called when somebody clicks on the AddCategoryButton:
[05] class CreatePostForm {
[06]     protected TextBox Title;
[07]     protected ListBox Category;
[08]     protected TextBox AddCategoryName;
[09]     protected Button AddCategoryButton;
[10]     protected RichTextArea Body;
[11]     protected Button Submit;
[12]
[13]     public CreatePostForm() {
[14]         ... initialize and lay out UI elements ...
[15]
[16]         AddCategoryButton.OnClick += AddCategoryButton_OnClick;
[17]
[18]         ... finish construction ...
[19]     }
Leaving out error handling and other details, the job of the event handler is to pass the name of the new category to the server. The event handler is defined as an instance method of CreatePostForm:
[20]     protected void AddCategoryButton_OnClick(object sender, EventArgs e) {
[21]         Server.Instance.AddCategory(AddCategoryName.Text,AddCategory_Completed);
[22]     }
The AddCategory RPC call is defined on a Singleton called Server, and takes two arguments: (1) a string with the name of the new category, and (2) a reference to the callback function that gets called when the RPC call is complete. The callback, AddCategory_Completed, is also an instance method:
[23]     protected void AddCategory_Completed(List<ListBoxItem> items) {
[24]         Category.Items = items;
[25]     }
ListBoxItem is a class that represents a single row in a ListBox, which has properties ListBoxItem.Id and ListBoxItem.Name. This is simple and straightforward code, and it ought to be maintainable, right?
Let’s see
Well, when we finish writing the class, we notice the first problem – a minor problem. There are two buttons on the form, so we need two event handlers and two callback functions. As a UI class gets complicated, it can accumulate quite a few callback functions, and it can get tricky keeping track of them all. Careful naming, code organization, and the use of #region in C# can help organize the code, but it’s easy to build UI controls that have tens of methods in which we can get lost.
Over time, we’ll add more forms to the app, and pretty soon we’ll add another form that has a category list. Perhaps this is a form used by administrators to search for posts: let’s call it AdminSearchForm. AdminSearchForm also contains a ListBox, called Categories. It’s a protected field of AdminSearchForm, but we need to update it when the administrator adds a new category. It seems reasonable to add a public method to AdminSearchForm:
[26] public class AdminSearchForm {
[27]     ...
[28]     protected ListBox Categories;
[29]     ...
[30]     public void UpdateCategoryList(List<ListBoxItem> items) {
[31]         Categories.Items=items;
[32]     }
[33] }
Now we update the AddCategory_Completed function so it updates the AdminSearchForm:
[34] public class CreatePostForm {
[35]     ...
[36]     protected void AddCategory_Completed(List<ListBoxItem> items) {
[37]         Category.Items=items;
[38]         App.Instance.MainTabPanel.AdminSearchForm.UpdateCategoryList(items);
[39]     }
[40] }
Not too bad, eh? Just four more lines of transparent code to update AdminSearchForm, even if line [38] has a rather ugly coupling to the detailed structure of the application.
Over the next few weeks, we add a few more dropdown lists to the application, we keep doing the same thing, and it’s fine for a while. Then we start running into problems:
I’m sorry to admit that, when I built my first GWT app, I ran into all of the above problems, plus a number of others. I tried a number of ad hoc solutions until I was forced to sit down and develop an architecture (the one below) that doesn’t run out of steam. Today, you can do better.
Ok, the plan is to create two classes: CategoryList and CategoryListBox that work together to solve the problem of updating CategoryList boxes. CategoryList is a singleton: it keeps track of the current state of the category list and keeps a list of clients that need to know when the list is updated.
The code for CategoryList looks like:
[41] public class CategoryList {
[42]     private static CategoryList _Instance;
[43]     public static CategoryList Instance {
[44]         get {
[45]             if (_Instance==null)
[46]                 _Instance=new CategoryList();
[47]
[48]             return _Instance;
[49]         }
[50]     }
[51]
[52]     private List<ListBoxItem> Items {get; set;}
[53]     private List<ICategoryListener> Listeners;
[54]     private CategoryList() { ... construct ...};
Java programmers might notice a few C#-isms here, in particular the way the class defines a static property called Instance that other classes use. We don’t, however, use the C# event mechanism, because it doesn’t do exactly what we want to do.
We call UpdateItems when there’s a change in the category list, or when we initialize the category list when the application starts. UpdateItems is an ordinary method, although a C# stylist would probably make the Items property public and put the following logic in the setter:
[55]     public void UpdateItems(List<ListBoxItem> items) {
[56]         Items=items;
[57]         foreach(var l in Listeners) {
[58]             l.UpdateItems(Items);
[59]         }
[60]     }
CategoryListBoxes will register and unregister themselves with the CategoryList with the following methods:
[61]     public void AddListener(ICategoryListener l) {
[62]         Listeners.Add(l);
[63]         l.UpdateItems(Items);
[64]     }
[65]     public void RemoveListener(ICategoryListener l) {
[66]         Listeners.Remove(l);
[67]     }
[68] }
Note that we could have built all of this logic into the CategoryListBox, but by introducing the CategoryList class and the ICategoryListener interface, we’ve decoupled the model from the view, and given ourselves the option to create new visual representations of the category list. (WordPress, for instance has a distinct representation of the category list on the “Manage Category” screens and more than one way you can show a category list to your viewers.)
An interesting point is that AddListener immediately updates the listener when it registers itself. This is a pattern that handles asynchronous initialization: so long as the Items property starts out as something harmless, ICategoryListeners formed before app initialization is completed will be initialized when the application initialization code calls UpdateItems. If an ICategoryListener is created later, it gets initialized upon registration — either way you’re covered without having to think about it.
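Here is a compact JavaScript sketch of that initialize-on-registration behavior. It mirrors the CategoryList described above but is written as an assumption-laden illustration: the lower-case method names and the RecordingView class are invented for the example, and the Singleton plumbing is left out.

```javascript
function CategoryList() {
  this.items = [];       // harmless starting value until the server answers
  this.listeners = [];
}

// Called during app initialization and whenever the category list changes.
CategoryList.prototype.updateItems = function (items) {
  this.items = items;
  this.listeners.forEach(function (listener) {
    listener.updateItems(items);
  });
};

// Registering also pushes the current state, so a late listener is
// initialized immediately and an early one is updated later.
CategoryList.prototype.addListener = function (listener) {
  this.listeners.push(listener);
  listener.updateItems(this.items);
};

CategoryList.prototype.removeListener = function (listener) {
  var i = this.listeners.indexOf(listener);
  if (i >= 0) { this.listeners.splice(i, 1); }
};

// Hypothetical view that simply remembers what it was last told to show.
function RecordingView() { this.shown = null; }
RecordingView.prototype.updateItems = function (items) { this.shown = items; };
```

A view registered before the first updateItems() call shows the empty list and is filled in later, while a view registered afterwards gets the current items at once, so neither case needs special handling in the view.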
Let’s take a look at the CategoryListBox, which extends ListBox and implements ICategoryListener.
[70] public class CategoryListBox: ListBox, ICategoryListener {
It implements ICategoryListener by implementing the UpdateItems method:
[71]     public void UpdateItems(List<ListBoxItem> items) {
[72]         Items=items;
[73]     }
We’re going to implement registration and deregistration GWT style, because GWT has particularly strict requirements for how we can access UI components. We’re only allowed to manipulate UI components that are attached to the underlying HTML document tree. By registering and deregistering when the component is attached and detached, components get updated at the proper times:
[74]     public void OnAttach() {
[75]         super.OnAttach();
[76]         CategoryList.Instance.AddListener(this);
[77]     }
[78]
[79]     public void OnDetach() {
[80]         CategoryList.Instance.RemoveListener(this);
[81]         super.OnDetach();
[82]     }
The GWT style is particularly nice in that it prevents long-lasting circular references between the view and the model: once you remove the view from the user interface, the reference in the model goes away. Silverlight is more forgiving in where you can register the control: you can do it in either the constructor or the Loaded event, but I don’t see an equivalent Unloaded event which could be used for automatic deregistration, so manual deregistration may be necessary to prevent memory leaks.
So what have we got?
We’ve got a CategoryListBox control that works together with the CategoryList singleton to keep itself updated. So long as we call CategoryList.UpdateItems() during the initialization process, we can just include a CategoryListBox where we want it and never worry about initialization or updating. We can even create new ICategoryListeners if we want to make other visual controls that display the category list. This is a path to simple and scalable development.
The Model-View-Controller paradigm is a perennially popular buzzword in computing. The phrase was coined in the early 1980s to describe a particular implementation in Smalltalk, which was one of the first implementations of a modern GUI. The Controller is a third component that mediates between the View, the Model and their environment. Although Controllers are widespread in server-based web applications, the Controller often withers away in today’s GUI environments, because its functions are often implemented by the event-handling mechanisms that come with the environment. In this case, “Controller”-like logic is embedded in certain methods of the CategoryList.
Note that there are two objects here that could be called a “Model”. I’m calling the CategoryList a model because it has a 1-1 relationship with an object on the server: the categoryList table. CategoryList is a relatively persistent object that lasts for the lifetime of the RIA. There’s another kind of “Model” object, the List<ListBoxItem> that is stored in the Items property of CategoryList and is passed to an ICategoryListener during initialization or update. That object represents the state of the categoryList table at a particular instant in time. The generic List<> is an adequate representation of the state of categoryList, although there are many cases where we might want to define a new class to represent the momentary state of a server object.
Something else funny about CategoryList is that it doesn’t export a public Items property. It certainly could, but I chose not to because a getter for an asynchronous model object is making an empty promise.
A getter in a synchronous application can always initialize or update itself before returning: a similar method in an asynchronous object must return to it’s caller before it can receive information from the server. As asynchronous model can return a cached value of Items if available, but it can make a much firmer promise to deliver correct updates of Items when they become available. CategoryList does, however, deliver a cached copy of Items to CategoryListeners after registration, as this is an effective and efficient mechanism for initialization.
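Here is a minimal sketch of that registration-and-update contract, written as plain C# so it runs outside Silverlight. It is an illustration under assumptions, not the actual CategoryList code: the Register and UpdateCategoryList names, the use of List<string> in place of List<ListBoxItems>, and the fetch delegate that stands in for the asynchronous server call are all hypothetical, and the singleton plumbing is left out.

```csharp
using System;
using System.Collections.Generic;

// Views implement this to receive the category list from the model.
// (The method name is an assumption made for this sketch.)
public interface ICategoryListener
{
    void UpdateCategoryList(List<string> items);
}

// Long-lived model: caches the latest list and pushes it to its listeners.
public class CategoryList
{
    private readonly List<ICategoryListener> listeners = new List<ICategoryListener>();
    private readonly Action<Action<List<string>>> fetch;
    private List<string> items; // stays null until the first response arrives

    // 'fetch' is a hypothetical stand-in for the asynchronous server request.
    public CategoryList(Action<Action<List<string>>> fetch)
    {
        this.fetch = fetch;
    }

    public void Register(ICategoryListener listener)
    {
        listeners.Add(listener);
        // Hand over the cached copy, if there is one, so a view that
        // registers late still starts out in sync.
        if (items != null) listener.UpdateCategoryList(new List<string>(items));
    }

    public void UpdateItems()
    {
        fetch(result =>
        {
            items = result;
            foreach (var listener in listeners)
                listener.UpdateCategoryList(new List<string>(items));
        });
    }
}

// A trivial "view" that just records what it was told to display.
public class RecordingView : ICategoryListener
{
    public List<string> Shown = new List<string>();
    public int Updates;

    public void UpdateCategoryList(List<string> items)
    {
        Shown = items;
        Updates++;
    }
}

public static class CategoryListDemo
{
    public static void Main()
    {
        var model = new CategoryList(cb => cb(new List<string> { "News", "Sports" }));
        var early = new RecordingView();
        model.Register(early);   // nothing cached yet, so no update is delivered
        model.UpdateItems();     // the "server" answers and early is updated
        var late = new RecordingView();
        model.Register(late);    // receives the cached copy immediately
        Console.WriteLine(early.Shown.Count + " " + late.Shown.Count);
    }
}
```

Note that there is no public Items getter: a view only ever sees the list when the model hands it over, either at registration (from the cache) or when a response arrives.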
Would it be possible to define only a temporary ‘model’ class and put a single Controller class in charge of updates? Sure. I think that would make more sense in a dynamically typed language like Javascript than it does in Java or C#, since it would be hard for such a Controller to enforce type-safety. Could we call CategoryList a Controller? Perhaps, but I think CategoryList is a logical place to locate methods that manipulate the categoryList — it really is a representation of a persistent object.
This is a good start, but we haven’t entirely solved the RIA architecture problem. Let’s talk about some of the issues we’d face if we generalized this approach:
I’ll be elaborating on these issues in future postings: subscribe to my RSS feed to keep up to date!
It’s simple to initialize and update data in the simplest RIAs, but asynchronous communications make it increasingly difficult as applications grow in complexity. A simple approach to data updating that is reliable and maintainable is to create a set of persistent model classes that maintain:
Model objects are responsible for updating View objects, which, in turn, are responsible for registering themselves with the Model. The result of this is that View objects can be used composably in the UI: View objects can be added to the user interface without explicitly writing code to manage data updates.
Although this pattern can be applied immediately, we’ll get the most out of it when it (or a similar pattern) is incorporated in client-side RIA frameworks. There are only a few client-side frameworks today (for instance, Cairngorm and PureMVC) but I think we’ll see exciting developments in the next year: subscribe to the Gen5 RSS feed to keep up with developments.
Silverlight 2 Beta 2 has changed the concurrency model for asynchronous communications. In Silverlight 2 Beta 1, asynchronous requests always returned on the UI Thread. This was convenient, since updates to the user interface can only be done via the UI thread. As of Silverlight 2 Beta 2, asynchronous callbacks are fired in worker threads that come from a thread pool: although this potentially allows for better performance via concurrency, it increases the risk of race conditions between callbacks; more importantly, some mechanism is necessary to make code that updates the user interface run in the UI thread.
It’s straightforward to execute a function in the UI thread by using the Dispatcher property of any ScriptObject. The tricky part is that ScriptObjects are part of the user interface, so you can only access the Dispatcher property from the UI thread. At first this seems like a chicken-and-egg situation: you need a Dispatcher to get to the UI thread, but you need to be in the UI thread to get a Dispatcher.
This dilemma can be resolved by accessing a Dispatcher in your App.xaml.cs file and stashing it away in a static variable on application startup:
private void Application_Startup(object sender, StartupEventArgs e) {
    ...
    UIThread.Dispatcher = RootVisual.Dispatcher;
}
UIThread is a simple static class:
public static class UIThread {
    public static Dispatcher Dispatcher { get; set; }
    public static void Run(Action a) {
        Dispatcher.BeginInvoke(a);
    }
}
At some point in the future, you can transfer execution to the UIThread by scheduling a function to run in it.
public void ProcessHttpResponse(...) {
    ...
    UIThread.Run(UpdateUserInterface);
}
The parameter of Run is an Action delegate, which is a function that returns void and takes no parameters. That’s fine and good, but what if you want to pass some parameters to the function that updates the UI? The usual three choices for passing state in asynchronous applications apply, but it’s particularly easy and fun to use a closure here:
public void ProcessHttpResponse(...) {
    ...
    var elementToUpdate=...;
    var updateWithValue=...;
    UIThread.Run(delegate() {
        UpdateUserInterface(elementToUpdate,updateWithValue);
    });
}
If your application is complex, and you have nested asynchronous calls, you’re left with an interesting question: where is the best place to switch execution to the UI thread? You can switch to the UI Thread as soon as you get back from an HttpRequest or a WCF call and you must switch to the UI Thread before you access any methods or properties of the user interface. What’s best?
It is simple and safe to switch to the UI Thread immediately after requests return from the server. If you’re consistent in doing this, you’ll never have trouble accessing the UI thread, and you’ll never have trouble with race conditions between returning communication responses. On the other hand, you’ll lose the benefit of processing responses concurrently, which can improve speed and responsiveness on today’s multi-core computers.
It’s straightforward to exploit concurrency when requests can be processed independently. For instance, imagine a VRML viewer written in Silverlight. Displaying a VRML model would require the parsing of a file, the construction of the scene graph and the initialization of a 3-d engine, which may require building data structures such as a BSP Tree. Doing all of this work in the UI Thread would make the application lock up while a model is loading. It would be straightforward, instead, to construct all of the data structures required by the 3-d engine, and attach the fully initialized 3-d engine to the user interface as the last step. Since the data structures would be independent of the rest of the application, thread safety and liveness are nonissues.
Matters can get more complicated, however, if the processing of a request requires access to application-wide data; response handlers running in multiple threads will probably corrupt shared data structures unless careful attention is paid to thread safety. One simple approach is to always access shared data from the UI Thread, and to transfer control to the UI Thread with UIThread.Run before accessing shared variables.
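The idea of confining shared data to one thread can be sketched without Silverlight. In the example below, SingleThreadQueue is a hypothetical stand-in for UIThread: instead of a Dispatcher it uses a BlockingCollection from the regular .NET libraries, and the thread that calls Pump plays the role of the UI thread. The worker threads, the sizes they "parse" and the totals dictionary are invented for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Stand-in for UIThread: every posted action runs on the thread that pumps.
public class SingleThreadQueue
{
    private readonly BlockingCollection<Action> queue = new BlockingCollection<Action>();

    // Safe to call from any thread.
    public void Run(Action a) { queue.Add(a); }

    // Signals that no more actions will be posted.
    public void Complete() { queue.CompleteAdding(); }

    // Executes queued actions on the calling thread until the queue
    // has been completed and drained.
    public void Pump()
    {
        foreach (var action in queue.GetConsumingEnumerable()) action();
    }
}

public static class SharedDataDemo
{
    public static int Process(int responses)
    {
        var owner = new SingleThreadQueue();
        // Shared and not thread-safe: only touched inside owner.Run(...).
        var totals = new Dictionary<string, int> { { "bytes", 0 } };
        var workers = new List<Thread>();

        for (int i = 1; i <= responses; i++)
        {
            int size = i * 10; // fresh variable per iteration, captured below
            var worker = new Thread(() =>
            {
                int parsed = size; // independent work happens on the worker
                owner.Run(() => { totals["bytes"] += parsed; });
            });
            workers.Add(worker);
            worker.Start();
        }

        // In a real UI the pump is the event loop that is already running;
        // here we wait for the workers and then drain the queue ourselves.
        foreach (var worker in workers) worker.Join();
        owner.Complete();
        owner.Pump();
        return totals["bytes"];
    }

    public static void Main()
    {
        Console.WriteLine(Process(4)); // 10 + 20 + 30 + 40 = 100
    }
}
```

The workers never touch totals directly, so no locks are needed around it; the only synchronized object is the queue itself.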
Silverlight 2 Beta 2 introduces a major change in the concurrency model for asynchronous communication requests. Unlike SL2B1, where asynchronous requests executed on the user interface thread, SL2B2 launches asynchronous callbacks on multiple threads. Although this model offers better performance and responsiveness, it requires Silverlight programmers to explicitly transfer execution to the UI thread before accessing UI objects: most SL2B1 applications will need to be reworked.
This article introduces a simple static class, UIThread, which makes it easy to schedule execution in the UI Thread.
int AddToCount(int amount,string countId) {
    int countValue=GetCount(countId);
    return countValue+amount;
}
This doesn’t work if the GetCount function is asynchronous, where we need to write something like
void AddToCountBegin(int amount,string countId,CountCallback outerCallback) {
    GetCountBegin(countId,AddToCountCallback);
}
void AddToCountCallback(int countValue) {
    ... some code to get the values of amount and outerCallback ...
    outerCallback(countValue+amount);
}
Several things change in this example: (i) the AddToCount function gets broken up into two functions: one that does the work before the GetCount invocation, and one that does the work after GetCount completes. (ii) We can’t return a meaningful value from AddToCountCallback, so it needs to ‘return’ a value via a specified callback function. (iii) Finally, the values of outerCallback and amount aren’t automatically shared between the functions, so we need to make sure that they are carried over somehow.
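There are three ways of passing context from a function that calls an asynchronous function to the callback function: (1) as a context object that is handed to the asynchronous function and passed on to the callback, (2) in static or instance variables of the class that the callback belongs to, and (3) in a closure.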
Let’s talk about these alternatives:
In this case, a context object is passed to the asynchronous function, which passes the context object to the callback. The advantage here is that there aren’t any constraints on how the callback function is implemented, other than accepting the context object as a parameter. In particular, the callback function can be static. A major disadvantage is that the asynchronous function has to support this: it has to accept a state object which it later passes to the callback function.
The implementation of HttpWebRequest.BeginGetResponse(AsyncCallback a,Object state) in the Silverlight libraries is a nice example. If you wish to pass a context object to the AsyncCallback, you can pass it in the second parameter, state. Your callback function will implement the AsyncCallback delegate, and will get something that implements IAsyncResult as a parameter. The state that you passed into BeginGetResponse will come back in the IAsyncResult.AsyncState property. For example:
class MyHttpContext {
    public HttpWebRequest Request;
    public SomeObject FirstContextParameter;
    public AnotherObject AnotherContextParameter;
}
protected void myHttpCallback(IAsyncResult abstractResult) {
    MyHttpContext context = (MyHttpContext) abstractResult.AsyncState;
    HttpWebResponse Response=(HttpWebResponse) context.Request.EndGetResponse(abstractResult);
}
public void doHttpRequest(...) {
    ...
    MyHttpContext context=new MyHttpContext();
    context.Request=Request;
    context.FirstContextParameter = ... some value ...;
    context.AnotherContextParameter = .. another value ...;
    // The callback and the context object travel together in one call.
    Request.BeginGetResponse(myHttpCallback,context);
}
Note that, in this API, the Request object needs to be available in myHttpCallback because myHttpCallback gets the response by calling the HttpWebRequest.EndGetResponse() method. We could simply pass the Request object in the state parameter, but we’re passing an object we defined, MyHttpContext, because we’d like to carry additional state into myHttpCallback.
Note that the corresponding mechanism for doing XMLHttpRequests in GWT, the RequestBuilder object, doesn’t allow using method (1) to pass context information: there is no state parameter. In GWT you need to use method (2) or (3) to pass context at the RequestBuilder or GWT RPC level. You’re free, of course, to use method (1) when you’re chaining asynchronous callbacks; however, method (2) is more natural in Java where, instead of a delegate, you need to pass an object reference to designate a callback function.
Functions (or Methods) are always attached to a class in C# and Java: thus, the state of a callback function can be kept in either static or instance variables of the associated class. I don’t advise using static variables for this, because it’s possible for more than one asynchronous request to be in flight at a time: if two requests store state in the same variables, you’ll introduce race conditions that will cause a world of pain. (See how race conditions arise in asynchronous communications.)
Method 2 is particularly effective when both the calling and the callback functions are methods of the same class. Using objects whose lifecycle is linked to a single asynchronous request is an effective way to avoid conflicts between requests (see the asynchronous command pattern and asynchronous functions.)
Here’s an example, lifted from the asynchronous functions article:
public class HttpGet : IAsyncFunction<String> {
    private Uri Path;
    private CallbackFunction<String> OuterCallback;
    private HttpWebRequest Request;
    public HttpGet(Uri path) { Path = path; }
    public void Execute(CallbackFunction<String> outerCallback) {
        OuterCallback = outerCallback;
        try {
            Request = (HttpWebRequest)WebRequest.Create(Path);
            Request.Method = "GET";
            // BeginGetResponse pairs with the EndGetResponse call in InnerCallback.
            Request.BeginGetResponse(InnerCallback,null);
        } catch (Exception ex) {
            OuterCallback(CallbackReturnValue<String>.CreateError(ex));
        }
    }
    public void InnerCallback(IAsyncResult result) {
        try {
            HttpWebResponse response = (HttpWebResponse) Request.EndGetResponse(result);
            TextReader reader = new StreamReader(response.GetResponseStream());
            OuterCallback(CallbackReturnValue<String>.CreateOk(reader.ReadToEnd()));
        } catch(Exception ex) {
            OuterCallback(CallbackReturnValue<String>.CreateError(ex));
        }
    }
}
Note that two pieces of context are being passed into the callback function: an HttpWebRequest object named Request (necessary to get the response) and a CallbackFunction<String> delegate named OuterCallback that receives the return value of the asynchronous function.
Unlike Method 1, Method 2 makes it possible to keep an unlimited number of context variables that are unique to a particular case in a manner that is both typesafe and oblivious to the function being called — you don’t need to cast an Object to something more specific, and you don’t need to create a new class to hold multiple variables that you’d like to pass into the callback function.
Method 2 comes into its own when it’s used together with polymorphism, inheritance and initialization patterns such as the factory pattern: if the work done by the requesting and callback methods can be divided into smaller methods, a hierarchy of asynchronous functions or commands can reuse code efficiently.
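To tie Method 2 back to the AddToCount example from the start of this article, here is a rough sketch in which one object is created per request and its fields carry amount and outerCallback into the callback. CountService is a hypothetical stand-in for whatever provides GetCountBegin: it queues its replies and delivers them later, to imitate responses arriving after the calling code has returned.

```csharp
using System;
using System.Collections.Generic;

public delegate void CountCallback(int value);

// Hypothetical service with an asynchronous GetCount.
public class CountService
{
    private readonly Dictionary<string, int> counts;
    private readonly Queue<Action> pending = new Queue<Action>();

    public CountService(Dictionary<string, int> counts) { this.counts = counts; }

    public void GetCountBegin(string countId, CountCallback callback)
    {
        int value = counts[countId];
        pending.Enqueue(() => callback(value)); // reply later, not inline
    }

    // Simulates the responses coming back from the server.
    public void DeliverResponses()
    {
        while (pending.Count > 0) pending.Dequeue()();
    }
}

// One instance per request: the instance fields are the context.
public class AddToCount
{
    private readonly CountService service;
    private readonly int amount;
    private readonly string countId;
    private CountCallback outerCallback;

    public AddToCount(CountService service, int amount, string countId)
    {
        this.service = service;
        this.amount = amount;
        this.countId = countId;
    }

    public void Execute(CountCallback outerCallback)
    {
        this.outerCallback = outerCallback;
        service.GetCountBegin(countId, InnerCallback);
    }

    private void InnerCallback(int countValue)
    {
        // amount and outerCallback are simply there: no casts, no context class.
        outerCallback(countValue + amount);
    }
}

public static class AddToCountDemo
{
    public static List<int> Run()
    {
        var service = new CountService(
            new Dictionary<string, int> { { "a", 10 }, { "b", 20 } });
        var results = new List<int>();
        // Two requests in flight at once, each with its own object and state.
        new AddToCount(service, 1, "a").Execute(results.Add);
        new AddToCount(service, 5, "b").Execute(results.Add);
        service.DeliverResponses();
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Run())); // 11,25
    }
}
```

Because each request owns its own AddToCount instance, the two requests in flight can't overwrite each other's amount, which is exactly the problem static variables would have.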
In both C# and Java, it’s possible for a method defined inside a method to have access to variables in the enclosing method. In C# this is a matter of creating an anonymous delegate, while in Java it’s necessary to create an anonymous class.
Using closures results in the shortest code, if not the most understandable code. In some cases, execution proceeds in a straight downward line through the code, much like a synchronous version of the code. However, people sometimes get confused by the indentation, and, more seriously, parameters after the closure definition and code that runs immediately after the request is fired end up in an awkward place (after the definition of the callback function.)
public class HttpGet : IAsyncFunction<String> {
    private Uri Path;
    public HttpGet(Uri path) { Path = path; }
    public void Execute(CallbackFunction<String> outerCallback) {
        try {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(Path);
            request.Method = "GET";
            request.BeginGetResponse(delegate(IAsyncResult result) {
                try {
                    // request and outerCallback are captured from the enclosing method.
                    HttpWebResponse response = (HttpWebResponse) request.EndGetResponse(result);
                    TextReader reader = new StreamReader(response.GetResponseStream());
                    outerCallback(CallbackReturnValue<String>.CreateOk(reader.ReadToEnd()));
                } catch(Exception ex) {
                    outerCallback(CallbackReturnValue<String>.CreateError(ex));
                }
            },null); // <--- note parameter value after delegate definition
        } catch (Exception ex) {
            outerCallback(CallbackReturnValue<String>.CreateError(ex));
        }
    }
}
The details are different in C# and Java: anonymous classes in Java can access static and instance members of the enclosing class, but they can access local variables of the enclosing method only if those are declared final, which makes it impossible for captured locals to be stomped on while an asynchronous request is in flight. C# anonymous delegates, on the other hand, capture the local variables themselves rather than copies of their values. Since every invocation of a method gets its own locals, most of the time this prevents asynchronous requests from interfering with one another, unless a single method fires multiple asynchronous requests that share a captured variable, in which case counter-intuitive things can happen.
In addition to receiving return value(s), callback functions need to know something about the context they run in: to write reliable applications, you need to be conscious of where this information is; better yet, have a strategy for where you’re going to put it. Closures, created with anonymous delegates (C#) or classes (Java) produce the shortest code, but not necessarily the clearest. Passing context in an argument to the callback function requires the cooperation of the called function, but it makes few demands on the calling and callback functions: the calling and callback functions can both be static. When a single object contains both calling and callback functions, context can be shared in a straightforward and typesafe manner; and when the calling and callback functions can be broken into smaller functions, opportunities for efficient code reuse abound.
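The counter-intuitive behaviour in C# is easy to reproduce without any networking. In this small sketch (the class and method names are made up for illustration) the callbacks are collected in a list and invoked after the loop has finished, the way responses to several requests fired from one method would arrive later:

```csharp
using System;
using System.Collections.Generic;

public static class ClosureCaptureDemo
{
    // Every callback captures the same loop variable 'i', so by the time
    // the callbacks run they all see its final value.
    public static List<int> SharedVariable()
    {
        var callbacks = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            callbacks.Add(() => i);
        }
        return InvokeAll(callbacks);
    }

    // Copying into a variable declared inside the loop body gives each
    // callback its own variable.
    public static List<int> CopiedVariable()
    {
        var callbacks = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int requestId = i;
            callbacks.Add(() => requestId);
        }
        return InvokeAll(callbacks);
    }

    private static List<int> InvokeAll(List<Func<int>> callbacks)
    {
        var results = new List<int>();
        foreach (var callback in callbacks) results.Add(callback());
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", SharedVariable())); // 3,3,3
        Console.WriteLine(string.Join(",", CopiedVariable())); // 0,1,2
    }
}
```

If a method fires more than one request, declaring a fresh variable for each request before creating the delegate keeps the callbacks from sharing state by accident.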