Archive for July, 2008

Stop Catching Exceptions!

Motivation

It’s clear that a lot of programmers are uncomfortable with exceptions [1] [2]; in the feedback of an article I wrote about casting, it seemed that many programmers saw the throwing of a NullReferenceException at a cast to be an incredible catastrophe.

In this article, I’ll share a philosophy that I hope will help programmers overcome the widespread fear of exceptions. It’s motivated by five goals:

  1. Do no harm
  2. To write as little error handling code as possible,
  3. To think about error handling as little as possible
  4. To handle errors correctly when possible,
  5. Otherwise errors should be handled sanely

To do that, I

  1. Use finally to stabilize program state when exceptions are thrown
  2. Catch and handle exceptions locally when the effects of the error are local and completely understood
  3. Wrap independent units of work in try-catch blocks to handle errors that have global impact

This isn’t the last word on error handling, but it avoids many of the pitfalls that people fall into with exceptions. By building upon this strategy, I believe it’s possible to develop an effective error handling strategy for most applications: future articles will build on this topic, so keep posted by subscribing to the Generation 5 RSS Feed.

Continue Reading »

The Multiton Design Pattern

Introduction

Many people have independely discovered a new design pattern, the “Multiton”, which, like the “Singleton” is an initialization pattern in the style of the Design Patterns book. Like the Singleton, the Multiton provides a method that controls the construction of a class: instead of maintaining a single copy of an object in an address space, the Multiton maintains a Dictionary that maps keys to unique objects.

The Multiton pattern can be used in systems that store persistent data in a back-end store, such as a relational databases. The Multiton pattern can be used to maintain a set of objects are mapped to objects (rows) in a persistent store: it applies obviously to object-relational mapping systems, and is also useful in asynchronous RIA’s, which need to keep track of user interface elements that are interested in information from the server.

An alternate use case of Mulitons, seen in the “Multicore” version of the PureMVC framework, is the extension of the Singleton pattern to support multiple instances of a system in a single address space.

As useful as the Multiton pattern is, this article explains how Multitons use references in a way that doesn’t work well with conventional garbage collection. Multitons are a great choice when the number of Multitons is small, but they may leak memory unacceptablely when more than a few thousand are created. Future posts will describe patterns, such as the Captive Multiton, that provide the same capabilities with more scalable memory management — subscribe to our RSS feed to keep informed.

Continue Reading »

The Role Of The Model in Silverlight, GWT and Javascript

Introduction

When people start developing RIA’s in environments such as Silverlight, GWT, Flex and plain JavaScript, they often write asynchronous communication callbacks in an unstructured manner, putting them wherever is convenient — often in an instance member of a user interface component (Silverlight and GWT) or in a closure or global function (JavaScript.)

Several problems almost invariably occur as applications become more complex that force the development of an architecture that decouples communication event handlers from the user interface: a straightforward solution is to create a model layer that’s responsible for notifying interested user interface components about data updates.

This article uses a simple example application to show how a first-generation approach to data updates breaks down and how introducing a model-view split makes for a reliable and maintainable application.

(This is one of a series of articles on RIA architecture: subscribe to the Gen5 RSS feed for future installments.)

Example Application: Blogging And The Category Dropdown

Imagine a blogging application that works like the WordPress blog used on this site. This application consists of a number of forms, one of which is used to write a new post:

This form lets you fill out two text fields: a title and the body of the post. It also contains a dropdown list of categories, and gives you the option of adding a new category. Categories are represented (server-side) in a table in a relational database that looks like:

[01] CREATE TABLE categoryList (
[02]     id                 integer primary key auto_increment,
[03]     name               varchar(255)
[04] ); Continue Reading »

The Semantics of Dictionaries, Maps and Hashtables

Introduction

The first language I used that put dictionaries on my fingertips was Perl, where the solution to just about any problem involved writing something like

$hashtable{$key}=$value;

Perl called a dictionary a ‘hash’,  a reference to the way Perl implemented dictionaries.  (Dictionaries are commonly implemented with hashtables and b-trees,  but can also be implemented with linked-list and other structures.)  The syntax of Perl is a bit odd, as you’d need to use $, # or % to reference scalar,  array or hash variables in different contexts,  but dictionaries with similar semantics became widespread in dynamic languages of that and succeeding generations, such as Python, PHP and Ruby.  ‘Map’ container classes were introduced in Java about a decade ago,  and programmers are using dictionaries increasingly in static languages such as Java and C#.

Dictionaries are a convenient and efficient data structure, but there’s are areas in which different mplementations behave differently: for instance,  in what happens if you try to access an undefined key.   I think that cross-training is good for developers,  so this article compares this aspect of the semantics of dictionaries in four popular languages:  PHP,  Python,  Java and C#.

Use cases

There are two use cases for dictionaries, so far as error handling is concerned:

  1. When you expect to look up undefined values, and
  2. When you don’t

Let’s look at three examples:

Computing A Histogram

One common use for a dictionary is for counting items, or recording that items in a list or stream have been seen. In C#, this is typically written something like:

[01] var count=Dictionary<int,int>();
[02] foreach(int i in inputList) {
[03]   if (!counts.Contains(i))
[04]       count[i]=0;
[05]
[06]   count[i]=count[i]+1
[07] }

The Dictionary count now contains the frequency of items inputList, which could be useful for plotting a histogram. A similar pattern can be used if we wish to make a list of unique items found in inputList. In either case,  looking up values that aren’t already in the hash is a fundamental part of the algorithm.

Processing Input

Sometimes, we’re getting input from another subsystem, and expect that some values might not be defined. For instance, suppose a web site has a search feature with a number of optional features, and that queries are made by GET requests like:

[08] search.php?q=kestrel
[09] search.php?q=admiral&page=5
[10] search.php?q=laurie+anderson&page=3&in_category=music&after_date=1985-02-07

In this case, the only required search parameter is “q”, the query string — the rest are optional. In PHP (like many other environments), you can get at GET variables via a hashtable, specifically, the $_GET superglobal, so (depending on how strict the error handling settings in your runtime are) you might write something like

[11] if ($_GET["q"])) {
[12]     throw new InvalidInputException("You must specify a query");
[13] }
[14]
[15] if($_GET["after_date"]) {
[16]  ... add another WHERE clause to a SQL query ...
[17] }

This depends, quite precisely, on two bits of sloppiness in PHP and Perl: (a) Dereferencing an undefined key on a hash returns an undefined value, which is something like a null. (b) both languages have a liberal definition of true and false in an if() statement. As a result, the code above is a bit quirky. The if() at line 11 evaluates false if q is undefined, or if q is the empty string. That’s good. However, both the numeric value 0 and the string “0″ also evaluate false. As a result, this code won’t allow a user to search for “0″, and will ignore an (invalid) after_date of 0, rather than entering the block at line [16], which hopefully would validate the date.

Java and C# developers might enjoy a moment of schadenfreude at the above example, but they’ve all seen, written and debugged examples of input handling code that just as quirky as the above PHP code — with several times the line count. To set the record straight, PHP programmers can use the isset() function to precisely test for the existence of a hash key:

[11] if (isset($_GET["q"]))) {
[12]     throw new InvalidInputException("You must specify a query");
[13] }

The unusual handling of “0″ is the kind of fault that can survive for years in production software:  so long as nobody searches for “0″,  it’s quite harmless.  (See what you get if you search for a negative integer on Google.)  The worst threat that this kind of permissive evaluation poses is when it opens the door to a security attack,  but we’ve also seen that highly complex logic that strives to be “correct” in every situation can hide vulnerabilities too.

Relatively Rigid Usage

Let’s consider a third case: passing a bundle of context in an asynchronous communications call in a Silverlight application written in C#. You can do a lot worse than to use the signatures:

[14] void BeginAsyncCall(InputType input,Dictionary<string, object> context,CallbackDelegate callback);
[15] void CallbackDelegate(ReturnType returnValue,Dictionary<string,object> context);

The point here is that the callback might need to know something about the context in which the asynchronous function was called to do it’s work. However, this information may be idiosyncratic to the particular context in which the async function is called,  and is certainly not the business of the asynchronous function. You might write something like

[16] void Initiator() {
[17]   InputType input=...;
[18]   var context=Dictionary<string,object>();
[19]   context["ContextItemOne"]= (TypeA) ...;
[20]   context["ContextItemTwo"]= (TypeB) ...;
[21]   context["ContextItemThre"] = (TypeC) ...;
[22]   BeginAsyncCall(input,context,TheCallback);
[23] }
[24]
[25] void TheCallback(ReturnType output,Dictionary<string,object> context) {
[26]   ContextItemOne = (TypeA) context["ContextItemOne"];
[27]   ContextItemTwo = (TypeB) context["ContextItemTwo"];
[28]   ContextItemThree = (TypeC) context["ContextItemThree"];
[29]   ...
[30] }

This is nice, isn’t it?  You can pass any data values you want between Initiator and TheCallback. Sure,  the compiler isn’t checking the types of your arguments,  but loose coupling is called for in some situations.  Unfortunately it’s a little too loose in this case,  because we spelled the name of a key incorrectly on line 21.

What happens?

The [] operator on a dot-net Dictionary throws a KeyNotFoundException when we try to look up a key that doesn’t exist.   I’ve set a global exception handler for my Silverlight application which,  in debugging mode,  displays the stack trace.  The error gets quickly diagnosed and fixed.

Four ways to deal with a missing value

There are four tools that hashtables give programmers to access values associated with keys and detect missing values:

  1. Test if key exists
  2. Throw exception if key doesn’t exist
  3. Return default value (or null) if key doesn’t exist
  4. TryGetValue

#1: Test if key exists

PHP:    isset($hashtable[$key])
Python: key in hashtable
C#:     hashtable.Contains(key)
Java:   hashtable.containsKey(key)

This operator can be used together with the #2 or #3 operator to safely access a hashtable.  Line [03]-[04] illustrates a common usage pattern.

One strong advantage of the explicit test is that it’s more clear to developers who spend time working in different language environments — you don’t need to remember or look in the manual to know if the language you’re working in today uses the #2 operator or the #3 operator.

Code that depends on the existence test can be more verbose than alternatives,  and can  be structurally unstable:  future edits can accidentally change the error handling properties of the code.  In multithreaded environments,  there’s a potential risk that an item can be added or removed between the existance check and an access — however,  the default collections in most environment are not thread-safe,  so you’re likely to have worse problems if a collection is being accessed concurrently.

#2 Throw exception if key doesn’t exist

Python: hashtable[key]
C#:     hashtable[key]

This is a good choice when the non-existence of a key is really an exceptional event.  In that case,  the error condition is immediately propagated via the exception handling mechanism of the language,  which,  if properly used,  is almost certainly better than anything you’ll develop.  It’s awkward,  and probably inefficient,  if you think that non-existent keys will happen frequently.  Consider the following rewrite of the code between [01]-[07]

[31] var count=Dictionary<int,int>();
[32] foreach(int i in inputList) {
[33]   int oldCount;
[34]   try {
[35]       oldCount=count[i];
[36]   } catch (KeyNotFoundException ex) {
[37]       oldCount=0
[38]   }
[39]
[40]   count[i]=oldCount+1
[41] }

It may be a matter of taste,  but I think that’s just awful.

#3 Return a default (often null) value if key doesn’t exist

PHP:    $hashtable[key] (well,  almost)
Python: hashtable.get(key, [default value])
Java:   hashtable.get(key)

This can be a convenient and compact operation.  Python’s form is particularly attractive because it lets us pick a specific default value.  If we use an extension method to add a Python-style GetValue operation in C#,  the code from [01]-[07] is simplified to

[42] var count=Dictionary<int,int>();
[43] foreach(int i in inputList)
[44]   count[i]=count.GetValue(i,0)+1;

It’s reasonable for the default default value to be null (or rather,  the default value of the type),  as it is in Python,  in which case we could use the ??-operator to write

[42] var count=Dictionary<int,int>();
[43] foreach(int i in inputList)
[44]   count[i]=(count.GetValue(i) ?? 0)+1;

(A ?? B equals A if A is not null,  otherwise it equals B.)   The price for this simplicity is two kinds of sloppiness:

  1. We can’t tell the difference between a null (or default) value associated with a key and no value associated with a key
  2. The potential of null value exports chaos into the environment:  trying to use a null value can cause a NullReferenceException if we don’t explictly handle the null.  NullReferenceExceptions don’t bother me if they happen locally to the function that returns them,  but they can be a bear to understand when a null gets written into an instance variable that’s accessed much later.

Often people don’t care about 1,  and the risk of 2 can be handled by specifying a non-null default value.

Note that PHP’s implementation of hashtables has a particularly annoying characteristic.  Error handling in php is influenced by the error_reporting configuration variable which can be set in the php.ini file and other places.  If the E_STRICT bit is not set in error_reporting,   PHP barrels on past places where incorrect variable names are used:

[45] $correctVariableName="some value";
[46] echo "[{$corectValiableName}]"; // s.i.c.

In that case, the script prints “[]” (treats the undefined variable as an empty string) rather than displaying an error or warning message.  PHP will give a warning message if E_STRICT is set,  but then it applies the same behavior to hashtables:  an error message is printed if you try to dereference a key that doesn’t exist — so PHP doesn’t consistently implement type #3 access.

#4 TryGetValue

There are quite a few methods (Try-* methods) in the .net framework that have a signature like this:

[47] bool Dictionary<K,V>.TryGetValue(K key,out V value);

This method has crisp and efficient semantics which could be performed in an atomic thread-safe manner:  it returns true if finds the key,  and otherwise returns false.  The output parameter value is set to the value associated with the key if a value is associated with the key,  however,  I couldn’t find a clear statement of what happens if the key isn’t found.  I did a little experiment:

[48] var d = new Dictionary<int, int>();
[49] d[1] = 5;
[50] d[2] = 7;
[51] int outValue = 99;
[52] d.TryGetValue(55, out outValue)
[53] int newValue = outValue;

I set a breakpoint on line 53 and found thate the value of outValue was 0,  which is the default value of the int type.  It seems,  therefore,  that TryGetValue returns the default value of the type when it fails to find the key.  I wouldn’t count on this behavior,  as it is undocumented.

The semantics of TryGetValue are crisp and precise.  It’s particularly nice that something like TryGetValue could be implemented as an atomic operation,  if the underyling class is threadsafe.  I fear,  however,  that TryGetValue exports chaos into it’s environment.  For instance,  I don’t like declaring a variable without an assignment,  like below:

[54] int outValue;
[55] if (d.TryGetValue(55,outValue)) {
[56] ... use outValue ...
[57] }

The variable outValue exists before the place where it’s set,  and outside of the block where it has a valid value.  It’s easy for future maintainers of the code to try to use outValue between lines [54]-[55] or after line [57].  It’s also easy to write something like 51],  where the value 99 is completely irrelevant to the program.  I like the construction

[58] if (d.Contains(key)) {
[59]    int value=d[key];
[60]    ... do something with value ...
[61] }

because the variable value only exists in the block [56]-[58] where it has a defined value.

Hacking Hashables

A comparison of hashtables in different languages isn’t just academic.  If you don’t like the operations that your language gives you for hashtables,  you’re free to implement new operations.  Let’s take two simple examples.  It’s nice to have a Python-style get() in PHP that never gives a warning message,  and it’s easy to implement

[62] function array_get($array,$key,$defaultValue=false) {
[63]   if (!isset($array[$key]))
[64]      return $defaultValue;
[65]
[66]   return $array[$key];
[67] }

Note that the third parameter of this function uses a default value of false,  so it’s possible to call it in a two-parameter form

[68] $value=array_get($array,$key);

with a default default of false,  which is reasonable in PHP.

Extension methods make it easy to add a Python-style get() to C#;  I’m going to call it GetValue() to be consistent with TryGetValue():

[69] public static class DictionaryExtensions {
[70]   public static V GetValue<K, V>(this IDictionary<K, V> dict, K key) {
[71]      return dict.GetValue(key, default(V));
[72]   }
[73]
[74]   public static V GetValue<K, V>(this IDictionary<K, V> dict, K key, V defaultValue) {
[75]      V value;
[76]      return dict.TryGetValue(key, out value) ? value : defaultValue;
[77]   }
[78] }

Conclusion

Today’s programming languages put powerful data structures,  such as dictionaries,  on your fingertips.  When we look closely,  we see subtle differences in the APIs used access dictionaries in different languages.  A study of the different APIs and their consequences can help us think about how to write code that is more reliable and maintainable,  and informs API design in every language

kick it on DotNetKicks.com

Extension Methods, Nulls, Namespaces and Precedence in C#

Introduction

Extension methods are the most controversial feature that Microsoft has introduced in C# 3.0.  Introduced to support the LINQ query framework,  extension methods make it possible to define new methods for existing classes.

Although extension methods can greatly simplify code that uses them,  many are concerned that they could transform C# into something that programmers find unrecognizable,  or that C#’s namespace mechanisms are inadequate for managing large systems that use extension methods.  Adoption of the LINQ framework,  however,  means that extension methods are here to stay,  and that .net programmers need to understand how to use them effectively,  and,  in particular,  how extension methods are different from regular methods.

This article discusses three ways in which extension methods differ from regular methods:

  1. Extension methods can be called on null objects without throwing an exception
  2. Extension methods cannot be called inside of a subclass without the use of ‘this’
  3. The precedence rules for extension methods

The Menace of Null Values

The treatment of null values is one of the major weaknesses of today’s generation of languages.  Although C# makes it possible to make nullable versions of value types,  such as int? and guid?,  there’s nothing in C# or Java like the “NOT NULL” declaration in SQL.  As a result,  handling nulls is a significant burden to writing correct code.  Consider the simple case where you want to write

[01] someObject.DoSomething();

(where DoSomething is an ordinary instance method)  When I type something like this,  Resharper often highlights the line of code to warn me that someObject might be null.  In some cases I might be confident that it never will,  but if there is any change that it will be null,  I’ll need to write something like

[02] if(someObject!=null) {
[03]    someObject.DoSomething();
[04] }

or maybe

[05] if(someObject==null)
[06]    return;
[07] someObject.DoSomething();

Alternatively I could accepted that an exception could be thrown by the invocation and decide to catch it (or not catch it) elsewhere in the application.  In two cases out of three,  one line of code gets bulked up to three.  Worse than that,  I need to make a decision at that point about what to when there’s an error condition — each decision is a case where somebody can make the wrong decision.  Even if coders make the wrong decision 5% of the time,  that would be 50 time bombs in your code for every 1000 method invocations.  (Oliver Steele works out a particularly outrageous but common case where it takes 10 lines of null-checking code to protect 1 line of working code.)

Extension Methods Can Accept Null Values

What does this have to do with extension methods?

Unlike ordinary instance methods,  extension methods do not automatically throw an exception if you call them on a null-valued object.  Depending on your point of view,  this can be (i) a gotcha,  or (ii) a useful tool for simplifying your code.  Here’s a little example:

[08] namespace ExtensionMethodTest {
[09]
[10]   static class ObjectExtension {
[11]       static public bool IsNull(this object o) {
[12]            return o == null;
[13]       }
[14]    }
[15]
[16]    class Program {
[17]        static void Main(string[] args) {
[18]            String s1 = "hello";
[19]            Console.WriteLine(s1.IsNull());
[20]            String s2 = null;
[21]            Console.WriteLine(s2.IsNull());
[22]            Console.WriteLine(s2.ToUpper());
[23]        }
[24]    }
[25] }

This example does something a bit bold:  it attaches an extension method to object,   adding an extenson method to every object in the system.  This method,  object.IsNull() returns true if object is null and false if it isn’t.  Some people might see this as a nice example of syntactic sugar,  others may see it as reckless.  What’s important is that it works:  if you run this program from the command line,  line [21] will print ‘true’,  while line [22],  which uses an ordinary method,  will throw a NullReferenceException.

Events and Extension Methods for Delegates

Chris Brandsma works out a practical example of how extension methods can be used to fix a broken and dangerous API.  That is,  the event handling mechanism commonly used in C#:

[26] public eventEventHandler<EventArgs> OnLoadData;
[27] ...
[28] OnLoadData += SomeEventHandler;
[29] ...
[30] OnLoadData(this, argument);

OnLoadData is a MulticastDelegate.  You can attach an unlimited number of real delegates to it.  The sample above works great if you attach at least one delegate,  but it fails with a NullReferenceException if you don’t.  Perhaps this isn’t a problem for you,  because you’re smart and you write

[31] if (OnLoadData==null) {
[32]     OnLoadData(this,argument)
[33] }

Unfortunately,  there are two little problems with that.  First,  none of us program in a vacuum,   so many of us will end up having to maintain or use objects where somebody forgot to include a null check.   Secondly,  the example between lines [31] and [33] isn’t thread safe.  It’s possible that a method can be removed from OnLoadData between the time of the null check and the call!

It turns out that extension methods can be added to delegates,  so Chris created a really nice extension method called Fire() that encapsulates the error check code between 31-33.   Now you can just write the code you wanted to write:

[34] OnLoadData.Fire(this,argument);

and be confident that knowledge about threads and quirks of the type system is embedded in an extension method.

You must use this to access an extension method inside a subclass

Suppose you’re building a Silverlight application and you’d like your team to have an important method that incorporates something tricky on their fingertips.  For instance,  suppose you’re implementing error handling in an event handler that’s responding to a user-initiated event or an async callback.   You can always write

[35] if(... something wrong...) {
[36]    ... several lines of code to display dialog box ...
[37]    return;
[38] }

But this is something that (i) programmers don’t want to do to begin with,  (ii) that programmers will have to do tens or hundreds of times,  and (iii) isn’t going to be in the main line of testing.  It’s a quality problem waiting to happen.  It’s imperative,  therefore,  to reduce the amount of code to do the right thing as much as possible…  To make it easier to do the right thing than to do the wrong thing.   It’s tempting to define an extension method like:

[39] public static void ErrorDialog(this UserControl c, string message) {
[40]    throw new ErrorMessageException(message);
[41] }

and catch the ErrorMessageException in the global error handler.  (The “method that doesn’t return” is effective,  because it avoids the need to repeat the return,  which occassionaly seems to vanish when people write repetitive error handling code.)  You’d think that this simplifies the code inside the UserControls you write to:

[42] if (... something wrong...)  {
[43]    ErrorDialog(...);
[44] }

But it turns out that line [43] doesn’t actually work,  and you need to write

[45] if (... something wrong...) {
[46]    this.ErrorDialog(...);
[47] }

in which case you might as well use an ordinary static method on a helper class.

What’s wrong with extension methods?

I’ve seen two arguments against extension methods:  (i) extension methods could make code hard to understand (and hence maintain) and (ii) extension methods are vulnerable to namespace conflicts.  I think (i) is a specious argument,  but (ii) is serious.

I think (i) splits into two directions.  First there’s the practical problem that a programmer is going to see some code like

[48] String s="somebody@example.com";
[49] if (s.IsValidEmailAddress()) {
[50]     ... do something ...
[51] }

and wonder where the heck IsValidEmailAddress() comes from,  where it’s documented,  and so forth.  Practically,  Visual Studio understands extension methods well,  so a user that clicks on “Go To Definition” is going to get a quick answer.

Going further,  however,  one can imagine that extension methods could transform C# unrecognizably:  I think of a friend of mine who,  in the 1980′s,  liked FORTRAN better than C,  and abused preprocessor macros so he could write C code that looked like FORTRAN.   This is connected with a fear of lambda expressions,  and other features that derive from functional programming.  For instance,  that beginning programmers just won’t get it.

We’ll see how it all works out,  but I think that new features in C# are going to help the language evolve in a way more like jquery and prototype have transformed javascript.  Microsoft is bringing concepts that have been locked in the ivory tower for decades into the mainstream:  all programming languages are going to benefit in the long term.

Extension methods,  precedence and namespaces

Here’s the killer.

I can make extension methods available by just adding a namespace to my .cs file with a using directive.  The compiler scans the namespace for extension methods in static classes,  and makes them available.  Pretty easy,  right?  Well,  what happens if two extension methods with the same name get declared in two namespaces which get included in the same file?  What if we define an extension method on class A,  but there’s a conventional method with the same name on class B?  What if file One.cs uses namesspace C,  and Two.cs uses namespace D,   so that ThisExtensionMethod means something different in One.cs and Two.cs?

There are real problems in how extension methods interact with namespaces.  These problems aren’t as fatal as namespace conflicts were in C (and C++ before namespaces),  but they are for real.

One answer is to avoid the use of extension methods entirely,  but that causes the loss of the benefits.  Anyone who uses extension methods should take a close look at the C# version 3.0 specification and think about how precedence rules effect their work:

(i) Instance methods take precedence over extension methods.  The definition of an instance method makes extension methods with the same name inaccessable.  This happens at the level of methods,  not method groups,  so two methods with the same name but different signatures can be handled by an extension method and instance method respectively.
(ii) Once the compiler tries extension methods,  processing works backwards from the closest enclosing namespace declaration outward,  trying extension methods defined in using groups.
(iii) The compiler throws an error when there are two or more extension methods that are candidates for a spot.

Matt Manela demonstrates an interesting example on the MSDN forums.  With three examples,  he demonstrates that the existence of an instance method (that overrides both extension methods) will suppress the generation of an error message about a conflict between extension methods.  This indicates that potentially conflicting extension methods in two namespaces will only cause an error if an attempt is made to use them.

Mitigating Namespace Conflicts

Overall,  conflicts between extension methods in different namespaces will not result in catastrophic consequences:

  1. The compiler raises an error if there is any ambiguity as to which extension method to apply at a particular invocation — code won’t silently change behavior upon adding a new namespace in a using directive.
  2. The compiler does not throw an error if potentially conflicting extension methods are declared in two different namespaces including in distinct using directives if those extension methods are not used — therefore,  conflicting extension methods won’t automatically prevent you from using any namespaces you choose.
  3. If there is a conflict,  either between two extension methods or an extension method and an instance methods,  you can always call a specific extension method like an ordinary static example.  For instance,  in the case above:

ObjectExtension.IsNull(someObject);

You won’t end up in a situation where an extension method becomes unavailable because of a conflict — you’ll just be forced to use an uglier syntax.  I do see two real risks:

  1. You can end up using an extension method that you don’t expect if you’re not keeping track of which using directives are in your file,  and
  2. An instance method can silently shadow an extension method.  A change in the definition of a method could cause the behavior of a (former) extension method cal to change in a suprising way.  On the other hand,  this could be a useful behavior if you’d like a subclass to override a behavior defined in an extension method.

A common bit of advice that I’ve seen circulating is that extension methods should be defined in separate namespaces,  so that it would be possible to include or not include extension methods associated with a namespace to avoid conflicts.  I think this is based on superstition,  for,  as we’ve seen,  conflicting extension methods do not preclude the use of two namespaces;  this advice is certainly not followed in the System.Linq namespace,  which defines a number of valuable extension methods in the System.Linq.Enumerable static class.

Conclusion

We’re still learning how to use extension methods effectively.  Although extension methods have great promise,  they’re difference from ordinary instance methods in a number of ways.  Some of these,  like the difference in null handling,  are minor,  and could potentially be put to advantage.  Others,  such as the interaction with namespaces in large projects,   are more challenging.  It’s time to start building on our experiences to develop effective patterns for using extension methods.

kick it on DotNetKicks.com