Archive for April, 2008

Embrace Dynamic PHP

Mat Byrne recently posted source code for a dynamic domain object in PHP which takes advantage of the dynamic nature of PHP.  It’s a good example of how programmers can take advantage of the unique characteristics of a programming language.

Statically typed languages such as C# and Java have some advantages: they run faster, and IDEs can understand the code well enough to save typing (with your fingers), help you refactor your code, and help you fix errors. Although there are a lot of things I like about symfony, it feels like a Java framework that’s invaded the PHP world. In Java, Eclipse would help you deal with the endless getters and setters and domain object methods with 40-character names.

The limits of polymorphism are a serious weakness of today’s statically typed languages.  C# and Java apps that I work with are filled with if-then-else or case ladders when they need to initialize a dynamically chosen instance of one of a set of classes that subclass a particular base class or that implement a particular interface.  Sure,  you can make a HashMap or Dictionary that’s filled with Factory objects,  but any answer for that is cumbersome.  In PHP,  however,  you can write

$class_name="Plugin_Module_{$plugin_name}";
$instance = new $class_name($parameters);

This is one of several patterns which make it possible to implement simple but powerful frameworks in PHP.

Mat, on the other hand, uses the ‘magic’ __call() method to implement get and set methods dynamically. This makes it possible to ‘implement’ getters and setters simply by populating a list of variables, and drastically simplifies the construction and maintenance of domain objects. A commenter suggests that he go a step further and use the __get() and __set() methods to implement properties. It’s quite possible to implement active records in PHP with a syntax like

$myTable = $db->myTable;
$row = $myTable->fetch($primaryKey);
$row->Name="A New Name";
$row->AccessCount = $row->AccessCount+1;
$row->save();

I’ve got an experimental active record class that introspects the database (no configuration file!) and implements exactly the above syntax,  but it currently doesn’t know anything about joins and relationships.  It would be a great day for PHP to have a database abstraction that is (i) mature,  (ii) feels like PHP,  and (iii) solves (or reduces) the two-artifact problem of maintaining both a database schema AND a set of configuration files that control the active record layer.

The point of this post isn’t that dynamically typed languages are better than statically typed languages, but rather that programmers should make the most of the features of the language they use: no PHP framework has become the ‘rails’ of PHP because no PHP framework has made the most of the dynamic nature of the PHP language.

Once Asynchronous, Always Asynchronous

Oliver Steele writes an excellent blog about coding style, and has written some good articles on asynchronous communications with a focus on JavaScript.

Minimizing Code Paths In Asynchronous Code, a recent post of his, is about a lesson that I learned the hard way with GWT that applies to all RIA systems that use asynchronous calls. His example is the same case I encountered, where a function might return a value from a cache or might query the server to get the value: an obvious way to do this in pseudocode is:

function getData(...arguments..., callback) {
   if (... data in cache...) {
      callback(...cached data...);
      return;
   }
   cacheCallback = anonymousFunction(...return value...) {
      ... store value in cache...
      callback(...return value...);
   }
   getDataFromServer(...arguments..., cacheCallback);
}

At first glance this code looks innocuous,  but there’s a major difference between what happens in the cached and uncached case.  In the cached case,  the callback() function gets called before getData() returns — in the uncached case,  the opposite happens.  What happens in this function has a global impact on the execution of the program,  opening up two code paths that complicate concurrency control and introduce bugs that can be frustrating to debug.

This function can be made more reliable if, in the cached case, it schedules callback() to run after the currently running code completes. In JavaScript, this can be done with setTimeout(). In Silverlight, use System.Windows.Threading.Dispatcher to schedule the callback to run in the UI thread.
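A minimal JavaScript sketch of this fix. It assumes a hypothetical `fetchFromServer(key, callback)` that stands in for the real server call and is simulated here with a timer; the `cache` map and the key names are likewise illustrative. In both the cached and the uncached path, the callback runs only after `getData()` has returned.

```javascript
// Hypothetical stand-in for an asynchronous server call (not from the post).
function fetchFromServer(key, callback) {
  setTimeout(() => callback("value-for-" + key), 10);
}

const cache = new Map();

function getData(key, callback) {
  if (cache.has(key)) {
    const cached = cache.get(key);
    // Defer the callback so it never runs before getData() returns.
    setTimeout(() => callback(cached), 0);
    return;
  }
  fetchFromServer(key, (value) => {
    cache.set(key, value);
    callback(value);
  });
}

// Usage: the caller sees the same ordering whether or not the key is cached.
getData("user:1", (value) => console.log("callback got", value));
console.log("getData returned");
```

The cost is a short delay on cache hits; the benefit is that callers have only one ordering to reason about.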

Asynchronous Functions

Asynchronous Commands are a useful way to organize asynchronous activities, but they don’t have any way to pass values or control back to a caller. This post contains a simple Asynchronous Function library that lets you do that. In C# you call an Asynchronous Function like:

void CallingMethod(...) {
    ... do some things ...
    IAsyncFunction<String> httpGet=new HttpGet(... parameters...);
    httpGet.Execute(CallbackMethod);
}

void CallbackMethod(CallbackReturnValue<String> crv) {
    if (crv.Error!=null) { ... handle Error,  which is an Exception ...}
    String returnValue=crv.Value;
    ... do something with the return value ...
}

We’re using generics so that return values can be passed back in a type safe manner. The type of the return value of the asynchronous function is specified in the type parameter of IAsyncFunction and CallbackReturnValue.

Asynchronous functions catch exceptions and pass them back in the CallbackReturnValue. This makes it possible to propagate exceptions back to the caller, as in synchronous functions. The code to do this has to be replicated manually in each asynchronous function, although it can be put into a wrapper delegate.
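The wrapper idea can be sketched in JavaScript as follows. This is an illustration, not the library from the post: `asAsyncFunction` and the `{ error, value }` result shape are hypothetical names chosen to mirror `Execute` and `CallbackReturnValue`.

```javascript
// Hypothetical helper: turns a function that may throw into an
// asynchronous function that reports { error, value } to its callback.
function asAsyncFunction(compute) {
  return {
    execute(callback) {
      setTimeout(() => {
        let result;
        try {
          result = { error: null, value: compute() };
        } catch (e) {
          result = { error: e, value: undefined };
        }
        // Called outside the try block, so an exception thrown by the
        // callback itself is not mistaken for a failure of compute().
        callback(result);
      }, 0);
    },
  };
}

// Usage: the caller checks error first, as CallbackMethod does above.
const parseConfig = asAsyncFunction(() => JSON.parse("{not json"));
parseConfig.execute((crv) => {
  if (crv.error !== null) {
    console.log("failed:", crv.error.name);
    return;
  }
  console.log("parsed:", crv.value);
});
```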

You could do the same thing in Java, but the CallbackMethod would need to be a class that implements an interface rather than a delegate.

Reliable Distributed Systems

Developing distributed systems can be difficult, and many of the patterns that are successful in developing conventional applications (such as constructing complex operations by composing simpler operations) lead to applications that work… some of the time. Although researchers have known it for years, a new generation of practitioners is learning the hard way that there’s an intractable contradiction between scalability, reliability and data integrity.

Ken Birman’s textbook Reliable Distributed Systems is an excellent introduction to this brave new world, focused on the construction of systems that are reliable, that is, systems that keep working when something goes wrong. This is critical for rich internet applications (that work over an unreliable public internet) and for applications that run on large clusters (where there’s a lot of hardware to fail). If you find his text pricey, you’ll appreciate the slides from his Cornell course available on his home page.

The Asynchronous Command Pattern for HTTP in Silverlight and GWT

When you’re writing RIA applications in an environment like Silverlight or GWT, you’re restricted to doing asynchronous HTTP calls to the server. This leaves you with a number of tricky choices, such as where to put your callback functions. To be specific, imagine we’ve created a panel in the user interface where a user enters information, then clicks on a button to submit it. The first place you might think of putting the callback is in the class for the panel, something like

public class MyPanel:StackPanel {
    ... other functions ...

    void SubmitButton_Click(Object sender,EventArgs e) {
        ... collect data from forms ...
        ServerWrapper.DoSubmission(formData,SubmissionCallback);
    }

    void SubmissionCallback(SubmissionResult result) {
        ... update user interface ...
    }
}

(Although code samples are in C#, the language I’m using now, I developed this pattern when working on a Java project.) This is a straightforward pattern for the simplest applications, but it runs out of steam when your application becomes more complex. It can become confusing to keep track of your callback functions when your object does more than one kind of asynchronous call: for instance, if it has multiple buttons. If the same action can be done on the server from more than one place in the UI, it’s not clear where the callback belongs.

One answer to the problem is to use the Command Pattern to organize asynchronous activities into their own classes that contain both the code that initiates an asynchronous request and the callback that runs when the request completes.
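A rough JavaScript sketch of the idea, not the post’s actual implementation. `serverWrapper`, `SubmitFormCommand`, and the `panel.status` field are hypothetical names, and the server is simulated with a timer. The point is that the request and its callback live together in one class, so the panel only has to create and run the command.

```javascript
// Hypothetical stand-in for the server wrapper; replies asynchronously.
const serverWrapper = {
  doSubmission(formData, callback) {
    setTimeout(() => callback({ ok: true }), 0);
  },
};

// The command owns both the request and the code that handles the reply.
class SubmitFormCommand {
  constructor(panel, formData) {
    this.panel = panel;       // the piece of UI this command updates
    this.formData = formData; // the data collected from the form
  }

  // Starts the asynchronous request.
  execute() {
    serverWrapper.doSubmission(this.formData, (result) => this.onComplete(result));
  }

  // Runs when the request completes.
  onComplete(result) {
    this.panel.status = result.ok ? "Saved " + this.formData.name : "Submission failed";
  }
}

// Usage: any button handler can run the same command.
const panel = { status: "idle" };
new SubmitFormCommand(panel, { name: "Ada" }).execute();
```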

Optimistic Locking For Retrieving Result Sets

I’m in the middle of updating my Silverlight code to use asynchronous HTTP requests — fortunately, I spent last summer writing a GWT application, where HTTP requests have always been asynchronous, so I’ve got a library of patterns for solving common problems.

For instance, suppose that you’re doing a search, and then you’re displaying the result of the search. The most reliable way to do this is to use Pattern Zero: do a single request to the server that retrieves all the information. In that case you don’t need to worry about what happens if, out of 20 HTTP requests, one fails.

Sometimes you can’t redesign the client-server protocol, or you’d like to take advantage of caching, in which case you might do something like this (in pseudocode):

getAListOfResults(new AsyncCallback {
    ... clearGUI();
    foreach(result as item) {
        fetchItem(item,new AsyncCallback {
            ... addItemToGui()
        });
    }
});

First we retrieve a list of items, then we retrieve information about each item: this is straightforward, but not always reliable. Even if your application runs in a single thread, as it would in GWT or if you did everything in the UI thread in Silverlight, you can still have race conditions: for instance, results can come back in a random order, and getAListOfResults() can be called more than once by multiple callbacks — that’s really the worst of the problems, because it can cause results to appear more than once in the GUI.

There are a number of solutions to this problem, and a number of non-solutions. A simple solution is to make sure that getAListOfResults() never gets called until the result set has come back. I was able to do that for quite a while last summer, but the application finally reached a level of complexity where it was impossible… or would have required a major redesign of the app. Another is to use pessimistic locking: to not let getAListOfResults() run while result sets are coming back — I think this can be made to work, but if you’re not careful, your app can display stale data or permanently lock up.

Fortunately there’s a pattern to retrieve result sets using optimistic locking that displays fresh data and can’t fail catastrophically.
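The pattern itself is not included in this excerpt. As an illustration only, one common way to get this behavior is to stamp each refresh with a version number and ignore any reply that belongs to a superseded refresh. The JavaScript sketch below assumes that approach; the fake server functions, the `gui` array, and `currentVersion` are all hypothetical.

```javascript
// Hypothetical fake server: every call answers after a short delay.
function getAListOfResults(callback) {
  setTimeout(() => callback(["a", "b", "c"]), 5);
}
function fetchItem(id, callback) {
  setTimeout(() => callback({ id, title: "Item " + id }), 5);
}

const gui = [];         // stands in for the list shown on screen
let currentVersion = 0; // bumped each time a new refresh starts

function refresh() {
  const myVersion = ++currentVersion;
  getAListOfResults((ids) => {
    if (myVersion !== currentVersion) return; // superseded: ignore the list
    gui.length = 0;                           // clear the display
    ids.forEach(() => gui.push("loading"));   // one placeholder per result
    ids.forEach((id, index) => {
      fetchItem(id, (item) => {
        if (myVersion !== currentVersion) return; // stale reply: ignore it
        gui[index] = item.title; // fixed slot, so arrival order does not matter
      });
    });
  });
}

// Usage: overlapping refreshes are harmless; only the latest one is displayed.
refresh();
refresh();
```

Nothing is ever locked, so a lost reply cannot freeze the app: the worst case is a placeholder that stays until the next refresh.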
