Extension Methods, Nulls, Namespaces and Precedence in C#

Introduction

Extension methods are the most controversial feature that Microsoft has introduced in C# 3.0. Introduced to support the LINQ query framework, extension methods make it possible to define new methods for existing classes.

Although extension methods can greatly simplify the code that uses them, many are concerned that they could transform C# into something that programmers find unrecognizable, or that C#’s namespace mechanisms are inadequate for managing large systems that use extension methods. Adoption of the LINQ framework, however, means that extension methods are here to stay, and that .NET programmers need to understand how to use them effectively and, in particular, how extension methods differ from regular methods.

This article discusses three ways in which extension methods differ from regular methods:

  1. Extension methods can be called on null objects without throwing an exception
  2. Extension methods cannot be called inside of a subclass without the use of ‘this’
  3. The precedence rules for extension methods

The Menace of Null Values

The treatment of null values is one of the major weaknesses of today’s generation of languages. Although C# makes it possible to make nullable versions of value types, such as int? and Guid?, there’s nothing in C# or Java like the “NOT NULL” declaration in SQL for reference types. As a result, handling nulls is a significant burden when writing correct code. Consider the simple case where you want to write

[01] someObject.DoSomething();

(where DoSomething is an ordinary instance method). When I type something like this, ReSharper often highlights the line of code to warn me that someObject might be null. In some cases I might be confident that it never will be, but if there is any chance that it will be null, I’ll need to write something like

[02] if(someObject!=null) {
[03]    someObject.DoSomething();
[04] }

or maybe

[05] if(someObject==null)
[06]    return;
[07] someObject.DoSomething();

Alternatively, I could accept that an exception could be thrown by the invocation and decide to catch it (or not catch it) elsewhere in the application. In two cases out of three, one line of code gets bulked up to three. Worse than that, I need to decide at that point what to do when there’s an error condition, and each such decision is a chance for somebody to get it wrong. Even if coders make the wrong decision only 5% of the time, that would be 50 time bombs in your code for every 1000 method invocations. (Oliver Steele works out a particularly outrageous but common case where it takes 10 lines of null-checking code to protect 1 line of working code.)

Extension Methods Can Accept Null Values

What does this have to do with extension methods?

Unlike ordinary instance methods, extension methods do not automatically throw an exception if you call them on a null-valued object. Depending on your point of view, this can be (i) a gotcha, or (ii) a useful tool for simplifying your code. Here’s a little example:

[08] namespace ExtensionMethodTest {
[09]    using System;
[10]    static class ObjectExtension {
[11]        public static bool IsNull(this object o) {
[12]            return o == null;
[13]        }
[14]    }
[15]
[16]    class Program {
[17]        static void Main(string[] args) {
[18]            String s1 = "hello";
[19]            Console.WriteLine(s1.IsNull());
[20]            String s2 = null;
[21]            Console.WriteLine(s2.IsNull());   // extension method: safe on null
[22]            Console.WriteLine(s2.ToUpper());  // instance method: throws
[23]        }
[24]    }
[25] }

This example does something a bit bold: it attaches an extension method to object, which adds the method to every object in the system. This method, IsNull(), returns true if the object it is called on is null and false if it isn’t. Some people might see this as a nice example of syntactic sugar; others may see it as reckless. What’s important is that it works: if you run this program from the command line, line [19] prints ‘False’, line [21] prints ‘True’, and line [22], which uses an ordinary instance method, throws a NullReferenceException.
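To show how that null tolerance can be put to work, here is a minimal sketch. The names StringExtensions, OrEmpty, NullDemo and Shout are hypothetical, invented for this illustration; nothing like them is assumed to exist in the framework.

```csharp
using System;

namespace ExtensionMethodTest {

    // Hypothetical helper class, not part of the .NET framework.
    public static class StringExtensions {
        // Safe to call on a null reference: the compiler turns
        // s.OrEmpty() into StringExtensions.OrEmpty(s), so no member
        // is ever dereferenced on s itself.
        public static string OrEmpty(this string s) {
            return s == null ? "" : s;
        }
    }

    public static class NullDemo {
        public static string Shout(string s) {
            // No explicit null check is needed before ToUpper(),
            // because OrEmpty() never returns null.
            return s.OrEmpty().ToUpper();
        }
    }
}
```

With a helper along these lines, the three-line guard shrinks back to a single expression, at the cost of the reader having to know that OrEmpty() is an extension method.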

Events and Extension Methods for Delegates

Chris Brandsma works out a practical example of how extension methods can be used to fix a broken and dangerous API, namely the event handling mechanism commonly used in C#:

[26] public event EventHandler<EventArgs> OnLoadData;
[27] ...
[28] OnLoadData += SomeEventHandler;
[29] ...
[30] OnLoadData(this, argument);

OnLoadData is a MulticastDelegate, to which you can attach any number of real delegates. The sample above works great if you attach at least one delegate, but it fails with a NullReferenceException if you don’t, because an event with no subscribers is null. Perhaps this isn’t a problem for you, because you’re smart and you write

[31] if (OnLoadData != null) {
[32]     OnLoadData(this, argument);
[33] }

Unfortunately, there are two little problems with that. First, none of us program in a vacuum, so many of us will end up having to maintain or use objects where somebody forgot to include a null check. Second, the code in lines [31] to [33] isn’t thread safe: another thread can remove the last handler from OnLoadData between the null check and the call, leaving it null!

It turns out that extension methods can be added to delegates, so Chris created a really nice extension method called Fire() that encapsulates the null-check code in lines [31] to [33]. Now you can just write the code you wanted to write:

[34] OnLoadData.Fire(this,argument);

and be confident that knowledge about threads and quirks of the type system is embedded in an extension method.
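I don’t reproduce Chris’s code here, but a method like Fire() could plausibly look like the sketch below. The class names EventHandlerExtensions and Loader are my own assumptions for the illustration, and the real Fire() may well differ in its details.

```csharp
using System;

// Hypothetical sketch of a Fire() extension method for events.
public static class EventHandlerExtensions {
    public static void Fire<T>(this EventHandler<T> handler, object sender, T args)
        where T : EventArgs {
        // 'handler' is a copy of the delegate reference taken at the call
        // site, and delegate instances are immutable, so a concurrent
        // unsubscribe cannot turn it into null between check and call.
        if (handler != null) {
            handler(sender, args);
        }
    }
}

// Hypothetical class that raises the event from the article.
public class Loader {
    public event EventHandler<EventArgs> OnLoadData;

    public void Load() {
        // Works whether or not anybody has subscribed.
        OnLoadData.Fire(this, EventArgs.Empty);
    }
}
```

The design choice worth noticing is that the null check and the thread-safety argument both rest on the same fact: the extension method receives the delegate as an ordinary parameter, so it is free to be null and it cannot change underneath the method.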

You must use ‘this’ to access an extension method inside a subclass

Suppose you’re building a Silverlight application and you’d like your team to have at their fingertips an important method that incorporates something tricky. For instance, suppose you’re implementing error handling in an event handler that’s responding to a user-initiated event or an async callback. You can always write

[35] if(... something wrong...) {
[36]    ... several lines of code to display dialog box ...
[37]    return;
[38] }

But this is something that (i) programmers don’t want to do to begin with, (ii) programmers will have to do tens or hundreds of times, and (iii) isn’t going to be in the main line of testing. It’s a quality problem waiting to happen. It’s imperative, therefore, to reduce the amount of code needed to do the right thing as much as possible, so that it’s easier to do the right thing than the wrong thing. It’s tempting to define an extension method like:

[39] public static void ErrorDialog(this UserControl c, string message) {
[40]    throw new ErrorMessageException(message);
[41] }

and catch the ErrorMessageException in the global error handler. (The “method that doesn’t return” is effective because it avoids the need to repeat the return, which occasionally seems to vanish when people write repetitive error handling code.) You’d think that this simplifies the code inside the UserControls you write to:

[42] if (... something wrong...)  {
[43]    ErrorDialog(...);
[44] }

But it turns out that line [43] doesn’t compile: extension methods are only considered when the call names an explicit receiver, so you need to write

[45] if (... something wrong...) {
[46]    this.ErrorDialog(...);
[47] }

in which case you might as well use an ordinary static method on a helper class.
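The behavior is easy to reproduce outside of Silverlight. In the sketch below, Widget is a hypothetical stand-in for UserControl, and LoginWidget, WidgetExtensions and this particular definition of ErrorMessageException are invented for the example.

```csharp
using System;

// Hypothetical stand-in for a UI base class such as UserControl.
public class Widget { }

// Assumed definition; the article only uses the exception by name.
public class ErrorMessageException : Exception {
    public ErrorMessageException(string message) : base(message) { }
}

public static class WidgetExtensions {
    // The "method that doesn't return": it always throws.
    public static void ErrorDialog(this Widget w, string message) {
        throw new ErrorMessageException(message);
    }
}

public class LoginWidget : Widget {
    public void Submit(string user) {
        if (string.IsNullOrEmpty(user)) {
            // ErrorDialog("User name required");    // does not compile: no receiver
            this.ErrorDialog("User name required");   // compiles: 'this' is the receiver
        }
    }
}
```

An unqualified call is only looked up among the members of the class and its base classes, which is why the commented-out line is rejected even though LoginWidget is a Widget.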

What’s wrong with extension methods?

I’ve seen two arguments against extension methods: (i) extension methods could make code hard to understand (and hence maintain) and (ii) extension methods are vulnerable to namespace conflicts. I think (i) is a specious argument, but (ii) is serious.

I think (i) splits into two directions. First, there’s the practical problem that a programmer is going to see some code like

[48] String s="somebody@example.com";
[49] if (s.IsValidEmailAddress()) {
[50]     ... do something ...
[51] }

and wonder where the heck IsValidEmailAddress() comes from, where it’s documented, and so forth. Practically, Visual Studio understands extension methods well, so a user who clicks on “Go To Definition” is going to get a quick answer.

Going further, however, one can imagine that extension methods could transform C# unrecognizably: I think of a friend of mine who, in the 1980s, liked FORTRAN better than C, and abused preprocessor macros so he could write C code that looked like FORTRAN. This is connected with a fear of lambda expressions and other features that derive from functional programming: for instance, the fear that beginning programmers just won’t get it.

We’ll see how it all works out, but I think that the new features in C# are going to help the language evolve much as jQuery and Prototype have transformed JavaScript. Microsoft is bringing concepts that have been locked in the ivory tower for decades into the mainstream: all programming languages are going to benefit in the long term.

Extension methods, precedence and namespaces

Here’s the killer.

I can make extension methods available by just adding a namespace to my .cs file with a using directive. The compiler scans the namespace for extension methods in static classes and makes them available. Pretty easy, right? Well, what happens if two extension methods with the same name get declared in two namespaces which get included in the same file? What if we define an extension method on class A, but there’s a conventional method with the same name on class B? What if file One.cs uses namespace C, and Two.cs uses namespace D, so that ThisExtensionMethod means something different in One.cs and Two.cs?

There are real problems in how extension methods interact with namespaces. These problems aren’t as fatal as namespace conflicts were in C (and in C++ before namespaces), but they are real.

One answer is to avoid the use of extension methods entirely, but that causes the loss of the benefits. Anyone who uses extension methods should take a close look at the C# version 3.0 specification and think about how precedence rules affect their work:

(i) Instance methods take precedence over extension methods. The definition of an applicable instance method makes extension methods with the same name and signature inaccessible. This happens at the level of individual methods, not method groups, so two methods with the same name but different signatures can be handled by an extension method and an instance method respectively.
(ii) Once the compiler tries extension methods, processing works outward from the closest enclosing namespace declaration, at each level trying the extension methods declared in that namespace and those brought in by its using directives.
(iii) The compiler reports an error when two or more extension methods found at the same level are equally good candidates for a call.
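Rules (i) and (ii) can be sketched in a single file. All of the names below (Imported, App, Greeter, Greet, Farewell and so on) are hypothetical and exist only for this illustration.

```csharp
using System;
using Imported;   // brings Imported.GreeterExtensions into scope

namespace Imported {
    public static class GreeterExtensions {
        // Same signature as the instance method below: never chosen.
        public static string Greet(this App.Greeter g, string name) {
            return "extension: " + name;
        }
        // No instance method accepts an int, so this one is used.
        public static string Greet(this App.Greeter g, int times) {
            return "extension: x" + times;
        }
        // Competes with App.LocalExtensions.Farewell, which is closer.
        public static string Farewell(this App.Greeter g) {
            return "imported";
        }
    }
}

namespace App {
    public class Greeter {
        public string Greet(string name) { return "instance: " + name; }
    }

    public static class LocalExtensions {
        public static string Farewell(this Greeter g) { return "local"; }
    }

    public static class PrecedenceDemo {
        // Rule (i): the instance method wins.
        public static string ByName() { return new Greeter().Greet("Ann"); }
        // Rule (i): different signature, so the extension method is used.
        public static string ByCount() { return new Greeter().Greet(3); }
        // Rule (ii): the extension in the enclosing namespace wins
        // over the one imported by the file-level using directive.
        public static string Goodbye() { return new Greeter().Farewell(); }
    }
}
```

Note that none of these calls produces a warning, which is exactly why the shadowing in ByName() and Goodbye() can go unnoticed.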

Matt Manela demonstrates an interesting example on the MSDN forums. With three examples, he demonstrates that the existence of an instance method (which takes precedence over both extension methods) will suppress the error message about a conflict between the extension methods. This indicates that potentially conflicting extension methods in two namespaces will only cause an error if an attempt is made to use them.
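This is not Matt’s code, but the effect he describes can be reproduced with a sketch along these lines. The namespaces First, Second and Model and the Report class are hypothetical names chosen for the illustration.

```csharp
using System;
using First;
using Second;

namespace First {
    public static class FirstExtensions {
        public static string Describe(this Model.Report r) { return "first"; }
    }
}

namespace Second {
    public static class SecondExtensions {
        // Same name and signature as First.FirstExtensions.Describe.
        public static string Describe(this Model.Report r) { return "second"; }
    }
}

namespace Model {
    public class Report {
        // The instance method is found first, so the compiler never
        // gets as far as comparing the two extension methods.
        public string Describe() { return "instance"; }
    }

    public static class ShadowDemo {
        public static string Run() { return new Report().Describe(); }
    }
}
```

If the instance method were removed, the call in Run() would have two equally good extension candidates and the compiler would reject it as ambiguous.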

Mitigating Namespace Conflicts

Overall, conflicts between extension methods in different namespaces will not result in catastrophic consequences:

  1. The compiler raises an error if there is any ambiguity as to which extension method to apply at a particular invocation, so code won’t silently change behavior upon adding a new namespace in a using directive.
  2. The compiler does not report an error if potentially conflicting extension methods are declared in two different namespaces included by distinct using directives, as long as those extension methods are not used. Conflicting extension methods therefore won’t automatically prevent you from using any namespaces you choose.
  3. If there is a conflict, either between two extension methods or between an extension method and an instance method, you can always call a specific extension method like an ordinary static method. For instance, in the case above:

ObjectExtension.IsNull(someObject);
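The same escape hatch works when two imported namespaces really do collide. The following is a minimal sketch with hypothetical names (First, Second, Label, AmbiguityDemo); the ambiguous call is left commented out so that the file compiles.

```csharp
using System;
using First;
using Second;

namespace First {
    public static class FirstExtensions {
        public static string Label(this string s) { return "first:" + s; }
    }
}

namespace Second {
    public static class SecondExtensions {
        public static string Label(this string s) { return "second:" + s; }
    }
}

namespace App {
    public static class AmbiguityDemo {
        public static string UseFirst(string s) {
            // return s.Label();   // rejected: ambiguous between First and Second
            return First.FirstExtensions.Label(s);    // explicit static call
        }

        public static string UseSecond(string s) {
            return Second.SecondExtensions.Label(s);  // explicit static call
        }
    }
}
```

Importing both namespaces is harmless by itself; the error only appears at the line that actually uses the conflicting name with extension method syntax.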

You won’t end up in a situation where an extension method becomes unavailable because of a conflict; you’ll just be forced to use an uglier syntax. I do see two real risks:

  1. You can end up using an extension method that you don’t expect if you’re not keeping track of which using directives are in your file, and
  2. An instance method can silently shadow an extension method. A change in the definition of a method could cause the behavior of a (former) extension method call to change in a surprising way. On the other hand, this could be a useful behavior if you’d like a subclass to override a behavior defined in an extension method.

A common bit of advice that I’ve seen circulating is that extension methods should be defined in separate namespaces, so that it would be possible to include or not include extension methods associated with a namespace to avoid conflicts. I think this is based on superstition, for, as we’ve seen, conflicting extension methods do not preclude the use of two namespaces; this advice is certainly not followed in the System.Linq namespace, which defines a number of valuable extension methods in the System.Linq.Enumerable static class.

Conclusion

We’re still learning how to use extension methods effectively. Although extension methods have great promise, they differ from ordinary instance methods in a number of ways. Some of these differences, like the difference in null handling, are minor and could potentially be put to advantage. Others, such as the interaction with namespaces in large projects, are more challenging. It’s time to start building on our experiences to develop effective patterns for using extension methods.
