Thursday, December 20, 2007

Upgrading to Windows XP SP2

After months of soul-searching, I made the gut-wrenching decision today to upgrade my home PC to Windows XP SP2.

Upgrade from Vista, that is.

I'm completely convinced that Vista is not designed to run on single-core/processor machines.  I've run Vista on work machines without any hiccups, with Aero Glass going full on.  I thought I had a semi-decent home PC:

  • AMD Athlon XP 2800+
  • 2 GB RAM

Alas, it was not enough to net me more than about 2.9 on the Windows Experience Index.  UAC annoys the hell out of me, most file operations take forever, and I'm denied access to simple operations like creating a folder on my D: drive.  At work, I'll turn all of these safety features off, as I'm okay running with scissors in a development environment.  I have no idea how a home user deals with all of it; I sure couldn't.  Hopefully Vista's SP1 will fix these issues.

Tuesday, December 18, 2007

Extension methods and primitive obsession

In another water-cooler argument today, a couple of coworkers didn't like my extension method example.  One main objection is that it violates instance semantics: you expect a method call on a null instance to throw.  Extension methods break that convention, leading developers to question every method call and wonder whether it's an extension method or not.  For example, you can run into scenarios like this:

string nullString = null;

bool isNull = nullString.IsNullOrEmpty();

In normal circumstances, the call to IsNullOrEmpty would throw a NullReferenceException.  Since we're using an extension method, we leave it up to the developer of the extension method to determine what to do with null references.

Since there's no way for the API to describe whether an extension method handles null references, or how it handles them, this can lead to quite a bit of confusion for clients of that API, and later for those maintaining code that uses extension methods.
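
For instance, two hypothetical extension methods (invented here for illustration) can make opposite choices about nulls, and nothing in their signatures tells the caller which is which:

public static class StringNullExtensions
{
    // quietly treats null as "no words"
    public static int WordCount(this string value)
    {
        return value == null ? 0 : value.Split(' ').Length;
    }

    // insists on a real instance, just like an instance method would
    public static string Reversed(this string value)
    {
        if (value == null)
            throw new ArgumentNullException("value");

        char[] characters = value.ToCharArray();
        Array.Reverse(characters);
        return new string(characters);
    }
}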

In addition to the problems with null references (which, as Elton pointed out, could be better handled with design-by-contract), some extension method examples online show more than a whiff of the "Primitive Obsession" code smell:

Dealing with primitive obsession

In both of the examples above (Scott cites David's example), an extension method is used to determine if a string is an email:

string email = txtEmailAddress.Text;

if (! email.IsValidEmailAddress())
{
    // oh noes!
}
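
The extension method behind that call isn't shown here, but it's typically a thin wrapper around a regex, something like this sketch (the class name is invented, and it reuses the same simplified pattern as the Email class below):

public static class EmailStringExtensions
{
    public static bool IsValidEmailAddress(this string value)
    {
        // same simplified pattern used by the Email class later in this post
        return !string.IsNullOrEmpty(value)
            && Regex.IsMatch(value, @"^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$");
    }
}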

It's something I've done a hundred times, taking raw text from user input and performing some validation to make sure it's the "right" kind of string I want.  But where do you stop with validation?  Do you assume all throughout the application that this string is the correct kind of string, or do you duplicate the validation?

An alternative approach is to accept that classes are your friend, and create a small class to represent your "special" primitive.  Convert back and forth at the boundaries between your system and customer-facing layers.  Here's the new Email class:

public class Email
{
    private readonly string _value;
    private static readonly Regex _regex = new Regex(@"^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$");

    public Email(string value)
    {
        if (!_regex.IsMatch(value))
            throw new ArgumentException("Invalid email format.", "value");

        _value = value;
    }

    public string Value
    {
        get { return _value; }
    }

    public static implicit operator string(Email email)
    {
        return email.Value;
    }

    public static explicit operator Email(string value)
    {
        return new Email(value);
    }

    public static Email Parse(string email)
    {
        if (email == null)
            throw new ArgumentNullException("email");

        Email result = null;

        if (!TryParse(email, out result))
            throw new FormatException("Invalid email format.");

        return result;
    }

    public static bool TryParse(string email, out Email result)
    {
        // a null or malformed string fails the parse instead of throwing
        if (email == null || !_regex.IsMatch(email))
        {
            result = null;
            return false;
        }

        result = new Email(email);
        return true;
    }
}

I did a few things to make the Email class easy for developers to use and to let it play well with strings and other use cases:

  • Made Email immutable
  • Defined conversion operators to and from string
  • Added the Try-Parse pattern

The usage of the Email class closely resembles usage for other string-friendly types, such as DateTime:

string inputEmail = txtEmailAddress.Text;

Email email;

if (! Email.TryParse(inputEmail, out email))
{
    // oh noes!
}

txtEmailAddress.Text = email;

Now I can go back and forth between strings and my Email class, plus I provided a way to convert without throwing exceptions.  This looks very similar to code dealing with textual date representations.

Yes, but

The final Email class takes more code to write than the original extension method.  However, now that we have a single class that plays nice with primitives, additional Email behavior has a nice home.  With a class in place, I can now model more expressive emails, such as ones that include names like "Ricky Bobby <ricky.bobby@rb.com>".  Once the home is created, behavior can start moving in.  Otherwise, validation would be sprinkled throughout the system at each user boundary, such as importing data, GUIs, etc.
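
As a hypothetical sketch of that direction (not code from the original posts), a display name could live alongside the address instead of in loose strings scattered around the app:

public class NamedEmail
{
    private static readonly Regex _pattern =
        new Regex(@"^\s*(?<name>[^<]+?)\s*<(?<address>[^>]+)>\s*$");

    public NamedEmail(string displayName, Email address)
    {
        DisplayName = displayName;
        Address = address;
    }

    public string DisplayName { get; private set; }
    public Email Address { get; private set; }

    // parses strings like "Ricky Bobby <ricky.bobby@rb.com>"
    public static NamedEmail Parse(string value)
    {
        Match match = _pattern.Match(value);

        if (!match.Success)
            throw new FormatException("Expected a \"Name <address>\" format.");

        return new NamedEmail(match.Groups["name"].Value,
                              new Email(match.Groups["address"].Value));
    }

    public override string ToString()
    {
        return string.Format("{0} <{1}>", DisplayName, Address.Value);
    }
}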

If you find yourself piling logic onto primitives, it's a strong indicator you're suffering from primitive obsession, and a nice, small, specialized class can eliminate a lot of the duplication that obsession tends to create.

Dead Google Calendar gadget

This morning I received an interesting yet disturbing message on the Google Calendar gadget on my iGoogle home page:

Great gadget that it was, I think I might be a little more discerning about what gadgets I put on the home page.  Word of warning: you probably don't want to google "donkey-punching"; it's definitely NSFW.  It looks like Google changed something, broke the gadget, and the gadget author decided to let everyone know through an... interesting means.

ALT.NET summary blog

If the ALT.NET mailing list is too much to keep up with, as it is the Mother of All Firehoses (MOAF), several folks have pointed out a nice summary blog:

Alt.Net Pursefight!

It keeps a nice daily ego check and blow-by-blow commentary of some of the more interesting comment wars going on there.  Pretty funny.

Friday, December 14, 2007

Ruby-style Array methods in C# 3.0

A while back I played with Ruby-style loops in C# 3.0.  That left me jealous of other fun Ruby constructs I couldn't find in C#, a couple of which are the "each" and "each_with_index" methods for arrays.  Here's an example, from thinkvitamin.com:

my_vitamins = ['b-12', 'c', 'riboflavin']

my_vitamins.each do |vitamin|
  puts "#{vitamin} is tasty!"
end
=> b-12 is tasty!
=> c is tasty!
=> riboflavin is tasty!

With both Arrays and List<T> in .NET, this is already possible: 

string[] myVitamins = {"b-12", "c", "riboflavin"};

Array.ForEach(myVitamins,
    (vitamin) =>
    {
        Console.WriteLine("{0} is tasty", vitamin);
    }
);

var myOtherVitamins = new List<string>() { "b-12", "c", "riboflavin" };

myOtherVitamins.ForEach(
    (vitamin) =>
    {
        Console.WriteLine("{0} is very tasty", vitamin);
    }
);

There are a few problems with these implementations, however:

  • Inconsistent between types
  • IEnumerable<T> left out
  • Array has a static method, whereas List<T> is instance
  • Index is unknown

Since T[] implicitly implements IEnumerable<T>, we can create a simple extension method to handle any case.

Without index

I still like the "Do" keyword in Ruby to signify the start of a block, and I'm not a fan of the readability (or "solubility", whatever) of the "ForEach" method.  Instead, I'll borrow from the loop-style syntax I created in the previous post that uses a "Do" method:

myVitamins.Each().Do(
    (vitamin) =>
    {
        Console.WriteLine("{0} is tasty", vitamin);
    }
);

To accomplish this, I'll need something to add the "Each" method, and something to provide the "Do" method.  Here's what I came up with:

public static class RubyArrayExtensions
{
    public class EachIterator<T>
    {
        private readonly IEnumerable<T> values;

        internal EachIterator(IEnumerable<T> values)
        {
            this.values = values;
        }

        public void Do(Action<T> action)
        {
            foreach (var item in values)
            {
                action(item);
            }
        }
    }

    public static EachIterator<T> Each<T>(this IEnumerable<T> values)
    {
        return new EachIterator<T>(values);
    }
}

The "Each" generic method is an extension method that extends anything that implements IEnumerable<T>, which includes arrays, List<T>, and many others.  IEnumerable<T> is ripe for extension, as .NET 3.5 introduced dozens of extension methods for it in the System.Linq.Enumerable class.  With these changes, I now have a consistent mechanism to perform an action against an array or list of items:

string[] myVitamins = { "b-12", "c", "riboflavin" };

myVitamins.Each().Do(
    (vitamin) =>
    {
        Console.WriteLine("{0} is tasty", vitamin);
    }
);

var myOtherVitamins = new List<string>() { "b-12", "c", "riboflavin" };

myOtherVitamins.Each().Do(
    (vitamin) =>
    {
        Console.WriteLine("{0} is very tasty", vitamin);
    }
);

With index

Ruby also has an "each_with_index" method for arrays, and in this case, there aren't any existing methods on System.Array or List<T> for it.  With extension methods, this is still trivial to accomplish.  I just include the index when executing the callback to the Action<T, int> passed in.  Here's the extension method with the index:

public static class RubyArrayExtensions
{
    public class EachWithIndexIterator<T>
    {
        private readonly IEnumerable<T> values;

        internal EachWithIndexIterator(IEnumerable<T> values)
        {
            this.values = values;
        }

        public void Do(Action<T, int> action)
        {
            int i = 0;
            foreach (var item in values)
            {
                action(item, i++);
            }
        }
    }

    public static EachWithIndexIterator<T> EachWithIndex<T>(this IEnumerable<T> values)
    {
        return new EachWithIndexIterator<T>(values);
    }
}

The only difference here is I keep track of an index to send back to the delegate passed in from the client side, which now looks like this:

string[] myVitamins = { "b-12", "c", "riboflavin" };

myVitamins.EachWithIndex().Do(
    (vitamin, index) =>
    {
        Console.WriteLine("{0} cheers for {1}!", index, vitamin);
    }
);

var myOtherVitamins = new List<string>() { "b-12", "c", "riboflavin" };

myOtherVitamins.EachWithIndex().Do(
    (vitamin, index) =>
    {
        Console.WriteLine("{0} cheers for {1}!", index, vitamin);
    }
);

This now outputs:

0 cheers for b-12!
1 cheers for c!
2 cheers for riboflavin!
0 cheers for b-12!
1 cheers for c!
2 cheers for riboflavin!

Pointless but fun

I don't think I'd ever introduce these into production code, as it's never fun to drop new ways to loop into others' laps.  If anything, it shows how extra parentheses can hinder readability, even if the method names themselves read better.

In any case, I now have a simple, unified mechanism to perform an action against any type that implements IEnumerable<T>, which includes arrays and List<T>.

Wednesday, December 12, 2007

Decomposing a book club

Book clubs can be a great way to foster learning and encourage growth on a team.  They aren't the only avenue for training, which might also include:

  • Formal training
  • Industry events
  • Presentations
  • Brown bag lunches
  • etc.

I always enjoyed book clubs because they gave our team a chance to discuss technical topics on a regular basis, sometimes outside of the domains we were working in.

Planning a book club

If you're starting a book club for the first time, you might have to just pick a book that everyone might be interested in without specifically asking anyone.  Having the desire to start a book club probably means you already know what deficiencies exist in your team, so you're better equipped to pick a book.

In that case, pick a deficiency and find a great book to study.  It helps if you've read it beforehand so you know what you're getting yourself into.  Book clubs need guidance and leadership during meetings to make sure learning and growth take place, and it's hard to do if the material is new to everyone. 

Software books usually come in two flavors:

  • Design
  • Technology

Examples of design books might be GoF, anything in the Fowler signature series, Pragmatic Programmer, and other books that are design/principle specific but language/technology agnostic.  Technology books cover specific subjects like ASP.NET; they're filled with reference material but offer less guidance on design.

Some books I've done book clubs on are:

  • Programming ASP.NET 2.0: Core Reference, Dino Esposito
  • The Pragmatic Programmer, Andrew Hunt and David Thomas
  • Agile Software Development: Principles, Patterns, and Practices, Robert C. Martin
  • Framework Design Guidelines, Cwalina and Abrams
  • Working Effectively with Legacy Code, Michael Feathers
  • Refactoring to Patterns, Joshua Kerievsky

As you can see, I'm heavily weighted to "Design" type books, mostly because values, principles, and practices translate well to any new technology.

Book club agenda

After we select a book, we decide on an agenda and schedule.  For internal book clubs, we try to meet once a week at lunch, covering about 30-40 pages a session.  We try to get each book club to end after 10-12 weeks, or about three months.  Everyone reads all the material and discusses it at each meeting.  Sometimes we skip chapters if the material isn't relevant or particularly insightful.

Here in the Austin area, we're forming an Austin-wide DDD book club to go over Evans' Domain-Driven Design book.  The first thing I do in determining schedule is to break down the parts and chapters into the number of pages they take up:

I laid out each of the chapters and noted what page they started on, then calculated the part and chapter lengths in separate columns.  Since a single discussion can never cover more than 40 pages in an hour, I used conditional formatting to add bars signifying lengths of the chapter for easier planning.

Chapter length distribution seemed to be all over the place, so I created a distribution chart to see it a little more clearly:

The distribution is interesting, as it shows chapter lengths that are all over the map.  Not Gaussian, not evenly distributed, but a whole bunch of short chapters, a few medium chapters, a few more big chapters, and one off-the-chart chapter.  Ideally, I'd have one 40-page chapter per session, but it doesn't always work out that way.

Suggested agendas

Sometimes, especially with the Fowler Signature Series books, the introductory chapters call out the important chapters and suggest a reading sequence.  This can guide the book club agenda.

In the case of the Evans book, he calls out the "most important" chapters, and says that all others can be read out of order, but are meant to be read in their entirety.  We might flag some chapters as "nice to have" and put them aside, to go back to later if there's time.

The meeting

Book club meetings are meant to be fun, open, inviting, and intellectually appealing.  If someone isn't engaged, they're probably on their laptop or Crack-berry, so keep an eye out for wandering minds.  Asking questions tends to get everyone involved, rather than a chorus of "yeah that makes sense, I agree, echo echo echo".

If you're leading the book club, be prepared to read ahead and have a list of talking points before you go in.  If anything, the book club leader is charged with creating discussion, not leading it.  Talking points tend to enforce focus, so discussions don't wander down tangents or social topics for too long.  "OMG did you see last night's episode of 'Heroes'?" should be kept to a minimum.

Finally, be aware of your audience.  If you're covering a topic new to a lot of folks, you might have to do some additional prodding to make sure everyone feels like they contributed.  Nobody likes to listen to someone else's conversation for an hour (well, almost nobody).

Encouraging self-improvement

Probably the biggest benefit of book clubs and other organic training like brown bags is that they create a culture of self-improvement.  Having the team engaged in book clubs, brown bags, user groups, etc. can set the bar higher when it comes to quality and pride in individual workmanship.  Plus, sometimes companies pay for the lunch, so that's always good, right?

Thursday, December 6, 2007

Don't hide the ugly

I wanted to take some time to highlight the difference between encapsulation and subterfuge.  Just so we're on the same page:

  • Encapsulation: The ability to provide users with a well-defined interface to a set of functions in a way which hides their internal workings.
  • Subterfuge: An artifice or expedient used to evade a rule, escape a consequence, hide something, etc.

When related to code, both of these techniques hide internal details of the system.  The key difference is who we're hiding the details from.  With encapsulation, we're hiding details from end-users or consumers of our API, which is a good thing.  With subterfuge, we're hiding ugly from developers needing to change the API, which can be disastrous.

Subterfuge hides the ugly, and for ugly to get fixed, I want it front and center.  Subterfuge comes in many varieties, but all achieve the same end result of hiding the ugly instead of fixing the ugly.

Region directives

Region directives in C# and VB.Net let you declare a region of code that can be collapsed or hidden.  I've used them in the past to partition a class based on visibility or structure, so I'll have a ".ctor" region or maybe a "Public Members" region.  Other regions can be collapsed so I don't need to look at them.

For example, our DataGateway class seems nice and concise, and it looks like it has only about 10-20 lines of code:
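
The screenshot didn't survive, but the collapsed class looked something like this (member names invented for illustration):

public class DataGateway
{
    #region .ctor

    public DataGateway()
    {
    }

    #endregion

    #region Legacy db access code

    // ...over 5000 lines of hand-rolled ADO.NET, string-built SQL,
    // and copy-pasted connection handling, all neatly folded out of sight...

    #endregion
}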

One small problem: note the region "Legacy db access code".  By wrapping some nasty code in a region, I've hidden the ugliness from the developer who might otherwise recognize it as a problem.  The developer doesn't know about the problem, because I've collapsed over 5000 lines of code into one small block.

If a class is big and nasty, then just let it all hang out.  If it's really bothersome, extract the bad code into a different class.  At least you'll have separation between ugly and nice.  Hiding ugly code in region directives doesn't encourage anyone to fix it, and it tends to deceive anyone glancing at the class about how crazy it might be underneath.

One type per file, please

Nothing irks me more than looking at a solution in the solution explorer, seeing a relatively small number of files, and then realizing that each file has a zillion types.  A file called "DataGateway.cs" hides all sorts of nuttiness:

public enum DataGatewayType
{

}

public abstract class DataGateway
{
    public abstract int[] GetCustomerIDs();
}

public class SqlDataGateway : DataGateway
{
    public override int[] GetCustomerIDs()
    {
        // sproc's for all!!!

        return new int[] { };
    }
}

public class OracleDataGateway : DataGateway
{
    public override int[] GetCustomerIDs()
    {
        // now we're getting enterprise-y 

        return new int[] { };
    }
}

public class MySqlGateway : DataGateway
{
    public override int[] GetCustomerIDs()
    {
        // we don't support hippies 

        return null;
    }
}

Java, as I understand it, enforces a convention of one public top-level class per file, named after the file.  I really like that convention, as it forces the developer to match their file structure to their package (assembly) structure.  No such luck in .NET, although there may be some FxCop or ReSharper rules to generate warnings.

The problem with combining multiple types into one file is that it makes it very difficult for the developer to gain any understanding of the code they're working with.  Incorrect assumptions start to arise when the file name doesn't match the internal structure.  Some of the rules I use are:

  • One type per file (enum, delegate, class, interface, struct)
  • Project name matches assembly name (or root namespace name)
  • File name matches type name
  • Folder structure matches namespaces
    • i.e. <root>\Security == RootNamespace.Security

With these straightforward rules, a developer can look at the folder structure or the solution explorer and know exactly what types are in the project, how they are organized, and maybe even understand responsibilities and architecture.  Even slight deviations from the above rules can cause unnecessary confusion for developers.  You're not doing anyone any favors by cramming 10 classes into one file, so don't do it.

Letting it all hang out

When I was young, my mom would insist on me cleaning my room once a week (the horror!).  Being the resourceful chap that I was, I found I could get the room resembling clean by cramming everything just out of sight, into the closet, under the bed, even in the bed.  This strategy worked great until I actually needed something I stashed away, and couldn't find it.  I had created the illusion of organization, but the monster was lurking just out of sight.

So instead of hiding the problem, I forced myself to deal with the mess until I cared enough to clean it up.  Eventually this led me to fix problems early and often, so they wouldn't snowball into a disastrous mess.  This wasn't out of self-satisfaction or anything like that, but laziness, as I found it was more work to deal with hidden messes than to clean them up in the first place.

If my code is a mess, I'll just let it be ugly.  Ugly gets annoying and ugly gets fixed, but ugly swept under the rug gets overlooked, until it pounces on the next unsuspecting developer.

Monday, December 3, 2007

Dealing with primitive obsession

One code smell I tend to miss a lot is primitive obsession.  Primitives are the building blocks of data in any programming language, such as strings, numbers, booleans, and so on.

Many times, primitives have special meaning, such as phone numbers, zip codes, money, etc.  Nearly every time I encounter these values, they're exposed as simple primitives:

public class Address
{
    public string ZipCode { get; set; }
}

But there are special rules for zip codes; in the US, for example, they can only come in two formats: "12345" or "12345-3467".  This logic is typically captured somewhere away from the ZipCode value, and typically duplicated throughout the application.  For some reason, I was averse to creating small objects to hold these values and their simple logic.  I don't really know why, as these data objects tend to be highly cohesive and can cut down a lot of duplication.

Beyond what Fowler walks through, I need to add a couple more features to my data object to make it really useful.

Creating the data object

First I'll need to create the data object by following the steps in Fowler's book.  I'll make the ZipCode class a DDD Value Object, and this is what I end up with:

public class Address
{
    public ZipCode ZipCode { get; set; }
}

public class ZipCode
{
    private readonly string _value;
    private static readonly Regex _format = new Regex(@"^\d{5}(-\d{4})?$");

    public ZipCode(string value)
    {
        // verify XXXXX or XXXXX-XXXX format before accepting the value
        if (value == null || !_format.IsMatch(value))
            throw new ArgumentException("Zip code must be in XXXXX or XXXXX-XXXX format.", "value");

        _value = value;
    }

    public string Value
    {
        get { return _value; }
    }
}

This is pretty much where Fowler's walkthrough stops.  But there are some issues with this implementation:

  • Now more difficult to deal with Zip in its native format, strings
  • Zip codes used to be easier to display

Both of these issues are easy to fix with the .NET Framework's conversion operators and available overrides.

Cleaning it up

First, I'll override the ToString() method and just output the internal value:

public override string ToString()
{
    return _value;
}

Lots of classes, tools, and frameworks use the ToString method to display the value of an object, and now it will use the internal value of the zip code instead of just outputting the name of the type (which is the default).

Next, I can create conversion operators to go to and from System.String.  Since zip codes are still dealt with mostly as strings in this system, and many other countries have different postal code formats, I stuck with string instead of int or some other primitive.  Here are the conversion operators, both implicit and explicit:

public static implicit operator string(ZipCode zipCode)
{
    return zipCode.Value;
}

public static explicit operator ZipCode(string value)
{
    return new ZipCode(value);
}

I prefer explicit operators when converting from primitives, and implicit operators when converting to primitives.  The Framework Design Guidelines (FDG) for conversion operators are:

DO NOT provide a conversion operator if such conversion is not clearly expected by the end users.

DO NOT define conversion operators outside of a type's domain.

DO NOT provide an implicit conversion if the conversion is potentially lossy.

DO NOT throw exceptions from implicit casts.

DO throw System.InvalidCastException if a call to a cast operator results in lossy conversion and the contract of the operator does not allow lossy conversions.

I meet all of these guidelines, so I think this implementation will work.

End result

Usability with the changed ZipCode class is much improved now:

Address address = new Address();

address.ZipCode = new ZipCode("12345"); // constructor
address.ZipCode = (ZipCode) "12345"; // explicit operator

string zip = address.ZipCode; // implicit operator

Console.WriteLine("ZipCode: {0}", address.ZipCode); // ToString method

Basically, my ZipCode class now "plays nice" with strings and code that expects strings.

With any hurdles out of the way for using simple data objects, I can eliminate a lot of duplication and scattered logic by creating small, specialized classes for all of the special primitives in my app.

Time is running out

I popped open Windows Live Writer today and got a fun message:

I thought this product was free, and I never paid for anything, so I'm a little confused how a free product can expire.  Live Writer isn't supported on Server 2003, which is what I use, so I have to jump through 80 or so hoops to get Live Writer installed on Server 2003.  Everything works perfectly fine, but now it seems I will be compelled to jump through the same hoops to upgrade to a version I don't need.  Fun times.

Thursday, November 29, 2007

Hall of shame

We keep a "Hall of Shame" of WTF-level code snippets to remind us that however bad we might think things get, it could always be worse.  It also serves as a reminder to us that we can't be complacent with ignorance, and lack of strong technical leadership in critical applications can have far-reaching negative consequences.  Here are a few fun examples.

Runtime obsolescence

I've seen this when we have members that we don't want anyone to use anymore.  Unfortunately, it didn't quite come out right:

public static bool Authenticate(string username, string password)
{
    throw new Exception("I'm obsolete, don't call me, bro!");
}

Nobody knows not to call this method until their codebase actually executes this code at runtime, which could be at any phase in development.  If it's an obscure method, it might not get called at all until production deployment if test coverage is low.

We want to communicate to developers at compile-time that members are obsolete, and we can do this with the ObsoleteAttribute:

[Obsolete("Please use the SessionContext class instead.", true)]
public static bool Authenticate(string username, string password)
{
}

Both the IDE and the compiler support this attribute to give feedback to developers when a member should no longer be used, and passing true as the second argument turns that feedback from a warning into a compile error.

Exception paranoia

I've seen exception paranoia mostly when developers don't have a thorough understanding of the .NET Framework.  A try...catch block is placed around code that can absolutely never fail, but juuuuuust in case it does:

public static MyIdentity Ident
{
    get
    {
        if (((HttpContext.Current != null) && (HttpContext.Current.User != null)) && (HttpContext.Current.User.Identity != null))
        {
            try
            {
                return (HttpContext.Current.User.Identity as MyIdentity);
            }
            catch (Exception exception)
            {
                ExceptionManager.Publish(exception);
                return null;
            }
        }
        return null;
    }
}

This is also another case of TCPS (Try-Catch-Publish-Swallow).  The "if" checks already guard against the null references the "try" block is worried about, and since we're using the "as" operator, the downcast can never throw either.  The 13 lines of code above were reduced to only one:

public static MyIdentity Ident
{
    get
    {
        return HttpContext.Current.User.Identity as MyIdentity;
    }
}

If HttpContext.Current is null, I'd rather throw the exception, as my ASP.NET app probably won't work without ASP.NET up and running.

Clever debug switches

This is probably my favorite Hall of Shame example.  So much so that we named an award at my previous employer after the pattern.  Sometimes I like to insert a block of code for debugging purposes.  If I want my debug statements to live on, I have a variety of means to do so, including compiler directives, the ConditionalAttribute, or even just source control history.  A not-so-clever approach we found was:

public void SubmitForm()
{
    if (false)
    {
        MessageBox.Show("Submit button was clicked.");
    }
}

To make the code run, you change the block to "if (true)".  This had us cracking up so much that we named our Hall of Shame award after it: the "If False Award".  Whenever we found especially heinous or corner-cutting code, it earned the award.  We never awarded ignorance, only laziness.
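
For reference, here's a sketch of the saner alternatives mentioned above, using a compiler directive or the ConditionalAttribute (the tracing method is invented for illustration):

public void SubmitForm()
{
#if DEBUG
    MessageBox.Show("Submit button was clicked.");
#endif
}

// or let the compiler strip the call sites from builds where DEBUG isn't defined
[Conditional("DEBUG")]
private void TraceSubmit()
{
    MessageBox.Show("Submit button was clicked.");
}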

Opportunities to improve

It's easy to let a "Hall of Shame" create negative and unconstructive views towards the codebase.  We always used it as a teaching tool to show that we would fix ignorance, but we would not tolerate laziness.

Every developer starts their career writing bad code.  Unfortunately, some never stop.

Intention-concealing interfaces: blob parameters

When someone is using your code, you want your code to be as explicit and easy to understand as possible.  This is achieved through Intention-Revealing Interfaces.  Evans describes the problems of opaque and misleading interfaces in Domain-Driven Design:

If a developer must consider the implementation of a component in order to use it, the value of encapsulation is lost.  If someone other than the original developer must infer the purpose of an object or operation based on its implementation, that new developer may infer a purpose that the operation or class fulfills only by chance.  If that was not the intent, the code may work for the moment, but the conceptual basis of the design will have been corrupted, and the two developers will be working at cross-purposes.

His recommendation is to create Intention-Revealing Interfaces:

Name classes and operations to describe their effect and purpose, without reference to the means by which they do what they promise.  This relieves the client developer of the need to understand the internals.  The name should conform to the Ubiquitous Language so that the team members can quickly infer their meaning.  Write a test for a behavior before creating it, to force your thinking into client developer mode.

Several times now I've run into blob parameters (I'm sure there are other names for them).  Blob parameters are amorphous, mystery parameters used in constructors and methods, usually of type ArrayList, Hashtable, object[], or even params object[].  Here's an example:

public class OrderProcessor
{
    public void Process(object[] args)
    {
        Order order = (Order) args[0];
        Customer customer = (Customer) args[1];

        // etc.
    }
}

The intentions behind this code are good, but the results can be frustrating.  This pattern leads to "Intention-Concealing Interfaces", a style completely opposite the Intention-Revealing style Eric proposes: clients need to know the intimate details of the component's internal implementation in order to use the blob parameters.

Wrong kind of extensibility

Blob parameters lead to the worst kind of coupling, beyond just "one concrete class using another concrete class".  When using blob parameters, the client gets coupled to the internal implementation of unwinding the blob.  Additionally, new client code must have intimate knowledge of the details, as the correct order of the parameters is not exposed by the blob method's interface.

When a developer needs to call the "Process" method above, they are forced to look at the internal implementation.  Hopefully they have access to the code, but otherwise they'll need to pop open Reflector to determine the correct parameters to pass in.  Instead of learning if the parameters are correct at compile-time, they have to wait until run-time.
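
For example, a call site might look like this (a hypothetical client, assuming order and customer instances are in scope); nothing in the signature stops the arguments from being swapped:

OrderProcessor processor = new OrderProcessor();

// compiles fine and works
processor.Process(new object[] { order, customer });

// also compiles fine, but throws an InvalidCastException at run-time
processor.Process(new object[] { customer, order });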

Alternatives

Several alternatives exist before resorting to blob parameters:

  • Explicit parameters
  • Creation method
  • Factory patterns
  • IoC containers
  • Generics

Each of these alternatives can be used separately or combined together to achieve both Intention-Revealing Interfaces and simple code.

Explicit Parameters

Explicit parameters just means that members should ask for exactly what they need to operate, through their signatures.  The Process method would be changed to:

public class OrderProcessor
{
    public void Process(Order order, Customer customer)
    {
        // etc.
    }
}

The Process method takes exactly what components it needs to perform whatever operations it does, and clients of the OrderProcessor can deduce this simply through the method signature.

If this signature needs to change due to future requirements, you have a few choices to deal with this change:

  • Overload the method to preserve the existing signature
  • Just break the client code

People always assume it's a big deal to break client code, but in many cases, it's not.  Unless you're shipping public API libraries as part of your product, backwards compatibility is something that can be dealt with through Continuous Integration.

Creation Method/Factory Patterns/IoC Containers

When blob parameters start to invade constructors, it's a smell that you need some encapsulation around object creation.  Instead of dealing with changing requirements through blob parameters, encapsulate object creation:

public class OrderProcessor
{
    private readonly ISessionContext _context;

    public OrderProcessor() : this(ObjectFactory.GetInstance<ISessionContext>())
    {
    }

    public OrderProcessor(ISessionContext context)
    {
        _context = context;
    }
}

Originally, the ISessionContext was a hodgepodge of different objects that I needed to pass in to the OrderProcessor class.  As these dependencies became more complex, I encapsulated them into a parameter object and introduced a factory method (the "ObjectFactory" class) to encapsulate creation.  Client code no longer needs to pass in a group of complex objects to create the OrderProcessor.

Generics

Sometimes blob parameters surface because a family of objects needs to be created or used, but all have different needs.  For example, the Command pattern that requires data to operate might look like this before generics:

public interface ICommand
{
    void Execute(object[] args);
}

public class TransferCommand : ICommand
{
    public void Execute(object[] args)
    {
        Account source = (Account) args[0];
        Account dest = (Account) args[1];
        int amount = (int) args[2];

        source.Balance -= amount;
        dest.Balance += amount;
    }
}

The ICommand interface needs to be flexible in the arguments it can pass to specific implementations.  We can use generics instead of blob parameters to accomplish the same effect:

public interface ICommand<T>
{
    void Execute(T args);
}

public struct Transaction
{
    public Account Source;
    public Account Destination;
    public int Amount;

    public Transaction(Account source, Account destination, int amount)
    {
        Source = source;
        Destination = destination;
        Amount = amount;
    }
}

public class TransferCommand : ICommand<Transaction>
{
    public void Execute(Transaction args)
    {
        Account source = args.Source;
        Account dest = args.Destination;
        int amount = args.Amount;

        source.Balance -= amount;
        dest.Balance += amount;
    }
}

Each ICommand implementation can describe its needs through the generic interface, negating the need for blob parameters.
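
The call site is now checked by the compiler instead of by casts at run-time (a sketch, assuming two Account instances are in scope):

ICommand<Transaction> transfer = new TransferCommand();

// the compiler enforces that a correctly-shaped Transaction is passed in
transfer.Execute(new Transaction(checking, savings, 100));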

No blobs

I've seen a lot of misleading, opaque code, but blob parameters take the prize for "Intention-Concealing Interfaces".  Instead of making components extensible, they make them inflexible, brittle, and incomprehensible.  Before going down that path, exhaust all the alternatives.  Every blob parameter I've created, I've later rolled back, as it quickly became difficult to deal with.

Monday, November 26, 2007

String extension methods

I got really tired of the IsNullOrEmpty static method, so I converted it into an extension method:

public static class StringExtensions
{
    public static bool IsNullOrEmpty(this string value)
    {
        return string.IsNullOrEmpty(value);
    }
}

Since that method was already static, I just put a simple wrapper around it.  All scenarios pass:

[TestMethod]
public void Matches_existing_behavior()
{
    Assert.IsFalse("blarg".IsNullOrEmpty());
    Assert.IsTrue(((string)null).IsNullOrEmpty());
    Assert.IsTrue("".IsNullOrEmpty());

    string value = null;

    Assert.IsTrue(value.IsNullOrEmpty());

    value = string.Empty;

    Assert.IsTrue(value.IsNullOrEmpty());

    value = "blarg";

    Assert.IsFalse(value.IsNullOrEmpty());
}

The only rather strange thing about extension methods is that they can be called on null references, as shown in the test above.  As long as everyone understands that extension methods are syntactic sugar and aren't actually extending the underlying types, I think it's clear enough.

The extension method is more readable and lets me work with strings instead of having to remember that it's a static method.  Static methods that reveal information about instances of the same type seem to be prime candidates for extension methods.

Some 80/20 feedback

Jeff had an interesting post today on putting programmers into two bins, an 80% and a 20%.  The 20% were the alpha-geeks, and the 80% were everyone else, "Morts" for lack of a better term.  One characterization was that the 20% folks let tech-related items push past work hours, while the 80% limited tech stuff to between the hours of 8 and 5.  I still don't like the idea of pushing people into bins, but there are really only three types of people I care about:

  • Those that lead
  • Those that listen
  • Those that don't listen

Although I've had a relatively short career in development so far, I've only met maybe one or two people who just didn't listen.  The reasons might be indifference, personal issues, politics, etc.  These people are very few, maybe 1-5% of the people I've worked with, but they're the ones I work to limit my involvement with.

I want leaders, but not all leaders as too many alpha geeks can lead to severe ego clashes.  I don't want all listeners as someone has to set direction.  But as long as I have at least one leader and enough listeners, it's enough to deliver.

Also, it's a mistake to have a goal to bring the 80% up into the 20%.  If someone wants to be in the 20%, fine, but it's not a prerequisite for a successful project.  Being in the 80% bracket just means you have different priorities, but not that you don't care.  Saying the 80% just don't care is misleading, polarizing and narcissistic.

When I used to interview people, I cared less about years of experience, certifications and the like.  Our most successful hires were those who showed a willingness to learn and improve, and an openness to listen and try new approaches.  It only took us about 3-6 months to train good mid-level developers, but it can take years (or never) to convince senior-level folks to listen.

Tuesday, November 20, 2007

Stop the madness

I've been extending a legacy codebase lately to make it a bit more testable, and a few small, bad decisions have slowed my progress immensely.  Any one decision isn't bad in and of itself, but a small bad decision multiplied a hundred times leads to significant pain.  It's death by a thousand cuts, and it absolutely kills productivity.  Some of the small decisions I'm seeing are:

  • Protected/public instance fields
  • Inconsistent naming conventions
  • Try-Catch-Publish-Swallow
  • Downcasting

The pains of these bad decisions can wreak havoc when trying to add testability to legacy codebases.

Public/protected instance fields

One of the pillars of OOP is encapsulation, which allows clients to use an object's functionality without knowing the details behind it.  From FDG:

The principle states that data stored inside an object should be accessible only to that object.

Followed immediately by their guideline:

DO NOT provide instance fields that are public or protected.

It goes on to say that access to simple public properties is optimized by the JIT compiler, so there's no performance penalty in using the better alternatives.  Here's an example of a protected field:

public class Address
{
    protected string zip;

    public string Zip
    {
        get { return zip; }
    }
}

public class FullAddress : Address
{
    private string zip4;

    public string Zip4 
    {
        get
        {
            if (Zip.Contains("-"))
            {
                zip4 = zip.Substring(zip.IndexOf("-") + 1);
                zip = zip.Substring(0, zip.IndexOf("-"));
            }
            return zip4;
        }
    }
}

There was originally a good reason to give the derived FullAddress write access to the data in Address, but there are better ways to approach it.  Here's one:

public class Address
{
    private string zip;

    public string Zip
    {
        get { return zip; }
        protected set { zip = value; }
    }
}

I've done two things here:

  • Added a protected setter to the Zip property
  • Changed the access level of the field to private

Functionally it's exactly the same for derived classes, but the design has greatly improved.  We should only declare private instance fields because:

  • Public/protected/internal violates encapsulation
  • When encapsulation is violated, refactoring becomes difficult as we're exposing the inner details
  • Adding a property later with the same name breaks backwards binary compatibility (i.e. clients are forced to recompile)
  • Interfaces don't allow you to declare fields, only properties and methods
  • C# 2.0 added the ability to declare separate visibility for individual getters and setters

There's no reason to have public/protected instance fields, so make all instance fields private.

Inconsistent naming conventions

Names of classes, interfaces, and members can convey a great deal of information to clients if used properly.  Here's a good example of inconsistent naming conventions:

public class Order
{
    public Address address { get; set; }
}

public class Quote : Order
{
    public void Process()
    {
        if (address == null)
            throw new InvalidOperationException("Address is null");
    }
}

When I'm down in the "Process" method, what is the "address" variable?  Is it a local variable?  Is it a private field?  Nope, it's a property.  Since it's declared camelCase instead of PascalCase, it leads to confusion about what we're dealing with.  If it's a local variable, which the name suggests, I might treat the value much differently than if it were a public property.

Deviations from FDG's naming conventions cause confusion.  When I'm using a .NET API that follows Java's camelCase conventions, it's just one more hoop I have to jump through.  In places where my team had public APIs we were publishing, it wasn't even up for discussion whether or not we would follow the naming conventions Microsoft used in the .NET Framework.  It just happened, as any deviation from accepted convention leads to an inconsistent and negative user experience.

It's not worth the time to argue whether interfaces should be prefixed with an "I".  That was the accepted convention, so we followed it.  Consistent user experience is far more important than petty arguments on naming conventions.  If I developed in Java, I'd happily use camelCase, as it's the accepted convention.

Another item you may notice is that there are no naming guidelines for instance fields.  This reinforces the notion that they should be declared private; the only people who should care about the names are the developers of that class, and the class itself.  Just pick a convention, stick to it, and keep it consistent across your codebase so it becomes one less decision for developers.

Try Catch Publish Swallow

Exception handling can really wreck a system if not done properly.  I think developers might be scared of exceptions, given the number of useless try...catch blocks I've seen around.  Anders Hejlsberg notes:

It is funny how people think that the important thing about exceptions is handling them. That is not the important thing about exceptions. In a well-written application there's a ratio of ten to one, in my opinion, of try finally to try catch. Or in C#, using statements, which are like try finally.

Here's an example of a useless try-catch:

public class OrderProcessor
{
    public void Process(Order order)
    {
        try
        {
            ((Quote)order).Process();
        }
        catch (Exception ex)
        {
            ExceptionManager.Publish(ex);
        }
    }
}

Here we have Try-Catch-Publish-Swallow.  We put a try block around an area of code that might fail, and catch exceptions in case it does.  To handle the exception, we publish it through some means, and then... nothing.  That's exception swallowing.

Here's a short list of problems with TCPS:

  • Exceptions shouldn't be used to make decisions
  • If there is an alternative to making decisions based on exceptions, use it (such as the "as" operator in the above code)
  • Exceptions are exceptional, and logging exceptions should be done at the highest layer of the application
  • Not re-throwing leads to bad user experience and bad maintainability, as we're now relying on exception logs to tell us our code is wrong

Another approach to the example might be:

public class OrderProcessor
{
    public void Process(Order order)
    {
        Quote quote = order as Quote;
        
        if (quote != null)
            quote.Process();
    }
}

The only problem this code will have is if "quote.Process()" throws an exception, and in that case, we'll let the appropriate layer deal with it.  Since I don't have any resources to clean up, there's no need for a try...finally.
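
When there are resources to clean up, the Hejlsberg quote points toward try...finally, or a using block, which compiles down to the same thing.  A sketch (the query and the names in it are made up):

public static DataTable LoadOrders(string connectionString)
{
    // using == try...finally: the connection is disposed even if Fill throws,
    // and the exception still bubbles up to a layer that can actually handle it
    using (SqlConnection connection = new SqlConnection(connectionString))
    using (SqlDataAdapter adapter =
        new SqlDataAdapter("SELECT * FROM Orders", connection))
    {
        DataTable orders = new DataTable();
        adapter.Fill(orders);
        return orders;
    }
}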

Downcasting

I already wrote about this recently, but it's worth mentioning again.  I spent a great deal of time removing a boatload of downcasts from a codebase, and what made it worse was that the downcasts were pointless: nothing in client code was using any additional members of the derived class.  It turned out to be one large, unnecessary hoop I needed to jump through to enable testing.

Regaining sanity

The problem with shortcuts and knowingly bad design decisions is that this behavior can become habitual, and the many indiscretions can add up to disaster.  I had a high school band director who taught me "Practice doesn't make perfect - perfect practice makes perfect."

By making good decisions every day, good habits form.  Habits practiced over time eventually become etched into your muscle memory so that they don't require any thought.  Until you run into a legacy codebase, that is, and you realize how bad your old habits were.

ReSharper in VS 2008

UPDATE 11/26/2007

Jeffrey Palermo notes that you can use R# in VS 2008 by turning off a couple of features.

Ilya Ryzhenkov, product manager at JetBrains, gives detailed information on R# support in VS 2008 and talks about R# 4.0.

After installing VS 2008 Team Suite yesterday, I thought I'd try the ReSharper 3.0 for VS 2008 download.  The verdict?  Works great unless you use any C# 3.0 feature.  Here's one example of it blowing up (but not crashing):

I'm using:

  • var keyword
  • anonymous types
  • LINQ
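
The screenshot is gone, but the query in question was along these lines (a reconstruction rather than the original code; scanning the currently loaded assemblies is an assumption):

var disposableContexts =
    from assembly in AppDomain.CurrentDomain.GetAssemblies()
    from type in assembly.GetTypes()
    where typeof(IDisposable).IsAssignableFrom(type)
        && type.Name.EndsWith("Context")
    select new { type.Name, Assembly = assembly.GetName().Name };

Console.WriteLine(disposableContexts.Count());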

That was a simple LINQ query to count the number of types in the .NET Framework that implement IDisposable and end in the word "Context" (to settle a bet).  Here's another example:

That's an experimental API for pre- and post-conditions (for pseudo-design-by-contract) that uses:

  • extension methods
  • lambda expressions

From JetBrains' website, they clearly state in the download page:

ReSharper 3.0 for Visual Studio 2008 (Codename "Orcas") only supports C# 2.0 and Visual Basic .NET 8 language features. C# 3.0 will be supported in the next major release of ReSharper.

Now I have a dilemma.  Using Visual Studio without ReSharper is like taking a trip in a horse and buggy instead of a car.  Sure, it'll get you there, but you'll have to deal with a lot of crap along the way.  Until R# v.Next comes out, I'll just have to deal with it.  If you're using VS 2008, don't install R# 3.0 if you plan on using C# 3.0.  Hopefully it'll come out soon, but I haven't seen any roadmaps yet, so I'll just wait.

Monday, November 19, 2007

Installing Visual Studio 2008

As has already been noted in numerous places, VS 2008 dropped today.  The installation process has been surprisingly smooth, given my past installation experiences.

Downloading went surprisingly quickly (after I was able to get into the subscriptions site, which took about 2 hours).  I averaged around 900KB/sec (that's bytes, not bits), about 10 times more than I expected on RTM day.

Cleaning up

First, you'll need to make sure that you don't have any Beta bits installed.  I had VS 2008 Beta 2 on my dev machine, and no problems cropped up during uninstallation.  I removed pretty much anything that had "2008" and "3.5" in its name, as I had installed quite a few of the Beta 2 releases.

Additionally, I removed them in the reverse order I installed them.  Don't uninstall VS 2008 before Team Explorer for example.  My lack of later installation issues might be attributed to this.

Installing

Even after I uninstalled the 2008 and 3.5 apps I found, the 2008 installer noted that I had some pre-release software still installed (.NET Compact Framework) that needed to be removed.  Their installers have always been good about letting you know about conflicting versions, but it's a relief that it plays well with Beta 2.

About halfway through the installation, the installer noted that I needed to close some apps.  These included IE and XML Notepad.  This always happens with me and various MS installers, but I primarily use Firefox, so I didn't have to close any windows I needed.

About an hour later, all of the installations finished (VS 2008 Team Suite, Team Explorer 2008, and MSDN).

First impressions

After the obligatory first-launch screen (which settings do you want?  Whatever ReSharper does!), I noticed that the "Recent Projects" list included the same set of projects as Beta 2, and they all opened with no issues.

The only problem so far is that there is no officially supported version of ReSharper for VS 2008 RTM, and the website says that the next major release will be the first officially supported version.  But let's be honest, any time spent without ReSharper just makes us love it even more, right?

Thursday, November 15, 2007

Progress made

I just wanted to highlight a quote from Scott Guthrie's recent post on MVC (emphasis mine):

VS 2008 Professional now includes built-in testing project support for MSTest (previously in VS 2005 this required a Visual Studio Team System SKU), and our default ASP.NET MVC project template automatically creates one of these projects when you use VS 2008

We'll also be shipping project template downloads for NUnit, MBUnit and other unit test frameworks as well, so if you prefer to use those instead you'll also have an easy one click way to create your application and have a test project immediately ready to use with it.

I know it's not all the way to "convention over configuration", but it is nice that the tool now says, "oh you want to create a website?  Ya gonna need tests."  Now it's a conscious decision on the part of the developer to "opt-out" of tests, instead of opting-in.  Just another way a tool can help guide good decisions.

As it's looking like unit testing is becoming a first-class citizen in both the tools and the frameworks, I think it's time for me to jump ship.  Mainstream is for chumps.  But seriously, it's nice to see that testing concerns are becoming more mainstream, though I think it just means the community needs to double its efforts at this precarious stage, so that growth continues in the right direction.  I don't want to have to unwind a bunch of bad design, habits, and practices just because the tooling made them easy.

I do still get demos of "right-click class in Solution Explorer, click 'Generate Unit Tests', watch bugs disappear!", so there's some work needed on the marketing front.  Unfortunately, I saw a lot of not-so-technical decision makers nodding their heads and asking, "Can the unit test template also do null checking?"  "If my tool is so awesome and RAD, can I skimp and hire brain-dead developers?"  "Does this mean I don't have to train people to be good developers, just good tool-users?"

This is a tangent, but slick tooling can lower the barrier to entry while also taking users in a direction they don't want to go (or don't know they shouldn't go).

Wednesday, November 14, 2007

A downcasting tragedy

And now, for another tale of legacy code woe, here's a gut-wrenching tragedy of OCP and downcasting, in five acts.

Exposition

We're implementing a feature where a survey link should show up when the store's configuration settings allow it, and shouldn't otherwise.  We put these settings behind a simple interface (cut down for brevity):

public interface ISettings
{
    bool ShowSurvey { get; }
}

Settings can come from many places, such as configuration files, a database, or a web service.  This is a pretty nice example of OCP: our app is open for extension through different ISettings implementations, but closed for modification, since nobody can decide to use something other than ISettings.  Since we code against an interface, the source doesn't matter.

We manage access to the settings through a static class:

public static class StoreContext
{
    public static ISettings Settings { get; set; }
}

Actual instances are set early in the ASP.NET pipeline, so later stages have that information.  The original ISettings implementations are stored in ASP.NET Session.

Now, we need to provide functional and regression tests for this feature, using Selenium.

Rising action

To help with functional and regression tests, I'm employing the record and playback state technique I mentioned earlier.  I create an "ISettings" implementation that can be serialized and deserialized into XML, which helps when I want to manually edit the settings:

[XmlRoot("Settings")]
public class SettingsDto : ISettings
{
    public SettingsDto() { }

    public SettingsDto(ISettings settings)
    {
        ShowSurvey = settings.ShowSurvey;
    }

    public bool ShowSurvey { get; set; }
}

My "download.aspx" page combines both the cart and the settings information into a single DTO, and streams this file to the client.  In my "replay.aspx" page, I take the XML, deserialize it, and put it into Session.  Finally, I redirect to the "shoppingcart.aspx" page, which pulls the ISettings implementation out of Session and sets the StoreContext variable to my deserialized SettingsDto instance.

This is all fairly straightforward, and I'm pretty happy about this solution as I can now plug in any settings I like and verify the results through automated tests.  As it is with all legacy code, things just aren't that simple.

Climax

Running the code the first time, downloading and replaying goes smoothly, and all of my values are serialized and deserialized properly as my unit tests prove.  However, when I get redirected to the "shoppingcart.aspx" page, I get a distressing exception:

System.InvalidCastException:
  Unable to cast object of type 'BusinessObjects.SettingsDto'
  to type 'BusinessObjects.DbSettings'.

The dreaded invalid cast exception: something is making some really bad assumptions and killing my code.

Falling action

Just when I thought I was making progress, the InvalidCastException came up and crushed my dreams.  Looking at the stack trace, I see that the exception is coming from the base Page class that all pages in our app derive from.  Here's the offending code:

public abstract class StorePage : Page
{
    protected DbSettings Settings
    {
        get { return (DbSettings)StoreContext.Settings; }
    }
}

A convenience property on the base class lets derived classes get at the static ISettings instance.  But on closer inspection, this property downcasts the ISettings instance coming out of StoreContext to DbSettings.  Downcasting is casting a reference held as a base type (or interface) down to a more derived type, and that's a fairly large assumption to make.  How do I know for sure that the instance really is that particular derived type, and not some other implementation?  I can't guarantee that assumption holds, as this example shows.

To make things worse, it uses an explicit cast instead of the C# "as" operator (TryCast for the VB.NET folks).  With "as", a failed downcast doesn't throw; the result is just null.  The "as" operator is a perfectly acceptable means of downcasting, though I try to avoid downcasting altogether, as it can be a smell of Inappropriate Intimacy.

Because the code assumes that the ISettings instance is actually seriously for realz a DbSettings instance, I can never take advantage of OCP and swap out different "ISettings" implementations.  Quite depressing.

Denouement

As an experiment, and because I have a nice automated build process, I tried changing the type referenced in the StorePage class from DbSettings to ISettings.  Luckily, the build succeeded, and I kept the change.
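
For reference, here's a sketch of what the corrected property looks like after that change (assuming nothing downstream needs DbSettings-specific members):

public abstract class StorePage : Page
{
    // Expose the interface, not a concrete implementation, so any ISettings
    // (DbSettings, SettingsDto, or something else entirely) will work.
    protected ISettings Settings
    {
        get { return StoreContext.Settings; }
    }
}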

If it had failed because clients absolutely needed DbSettings, I might have had to look at pulling members up into the ISettings interface, if that's possible (and if I have access to ISettings at all).  If I don't, well, I'm hosed, and my denouement would turn into a catastrophe.

The moral of this story is:

Avoid downcasting, as it bakes in potentially false assumptions about the underlying type of an object.  If you must downcast, use the "as" operator and check the result for null to determine whether the cast succeeded.  That gives us a safe downcast and avoids using exceptions as flow control.
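
Using the types from this story, that pattern looks something like this:

DbSettings dbSettings = StoreContext.Settings as DbSettings;

if (dbSettings != null)
{
    // The downcast succeeded, so DbSettings-specific members are safe to use here.
}
else
{
    // Not a DbSettings; fall back to the ISettings contract or fail with a
    // meaningful message instead of an InvalidCastException.
}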

Tip of the day

If you're going to physically demonstrate "handwaving", make sure you don't have hot chai tea in one of the waving hands.

Because it burns.

See also: Cause and effect

The sinking ship

An interesting quote from a colleague today (paraphrased):

Moving developers to projects on the legacy system is like rearranging chairs on the Titanic.

We can tidy it up, but it's still going to the bottom of the Atlantic.  Probably better just to catch the plane.

It came up in the context of investing strategically versus buying tactically.  It's not an either-or question, but it's important to be thinking about long-term goals and assigning resources appropriately to match your strategic view.  The Technical Debt metaphor can offer guidance here too, as buying tactically tends to max out your credit cards (so to speak), while investing strategically actually nets returns.

Monday, November 12, 2007

Coding without confidence

As a favor to my wife, I developed an Excel application to help her employer out with scheduling and status in a small wood manufacturing shop.  Excel was chosen as it was the lowest barrier to entry, and the quickest to get out the door.  It wasn't a mission-critical app and I wasn't getting paid or anything, so I thought I could get something delivered fairly easily.

After all, what was needed was "no big deal" and "we just need something that works".

The final solution used a lot of hidden columns to calculate intermediate results, and macros to make it easier for the end user to interact with the spreadsheet.  The problem is that there are absolutely zero tests on anything, nor can there ever be.  There's no such thing as "ExcelUnit".  I found a "VBAUnit", but it was last released over four years ago.

For a while, I couldn't figure out why I was so worried about releasing the Excel app (i.e., emailing them a copy).  Then I realized that I had been coding without any confidence that what I was delivering worked, and would continue to work.

I've been so accustomed to having tests and automated builds as my backup that coding without them felt dirty and a little scary.  I was asked "so does it work?" and my only answer was "I hope so".  I was entrenched in "CyfDD", or "Cross-your-fingers Driven Development".  Does it work?  I dunno, better cross your fingers.  CyfDD is a close cousin of "GoykapDD", or "Get-on-your-knees-and-pray Driven Development".  Releasing only once every year or two leads to GoykapDD.

Sometimes I get asked why I bother with automated tests and builds.  Frankly, without automated tests and builds, I have zero confidence in my code.  Nor should I, or should anyone else for that matter.  Confidence in code written without automated tests and builds is just delusion.

Not a good sign

At the end of lunch today at a Chinese restaurant, I cracked open the obligatory fortune cookie to learn what a stale, shrink-wrapped, mass-produced dessert could portend.

It was empty.

I ask for insight into my fate, and the gods answer in their infinite wisdom, "Mu".

Coming after a big launch at work, I really didn't know what to think about this one.  Can't I just get one cookie to tell me, "Your launch will be great success.  And the Rockets will win tonight."  Is that really too much to ask?

Wednesday, November 7, 2007

Some NAnt tips

Jason got me thinking about some other NAnt tips with his post on listing the targets in a build file.

Include NAnt in your source control

I like to put NAnt in source control and NEVER install it.  I've had too many problems with projects breaking after a NAnt upgrade for one reason or another, and even more issues with custom tasks I write.  Keeping the tools in the tree keeps your dev machine clean, keeps your build machine clean, and leads to faster dev machine setup time.  Here's my current favorite source tree structure:

  • <ProjectName/Version/Branch>
    • src
      • Where source code goes
    • lib
      • any DLLs referenced by projects, including the NUnit DLLs
    • tools
      • NAnt
      • NUnit
      • NCover
      • any other exe needed for build/development
    • docs
    • project.sln
    • project.build
    • go.bat

This was all inspired by (stolen from) TreeSurgeon.

Put your build file in your solution

I like to create a solution folder to hold extra project-related files, like my build scripts.  When they're in the solution, it makes it easier to load them up and edit them.  Since ReSharper supports many features for NAnt and MSBuild scripts, it makes sense to have them easy to get to.

I already talked about this earlier, but solution folders are a great way to create logical organization in your solution that doesn't have to correspond to the physical organization on disk.

Create a NAnt bootstrap batch file

I can't stand typing in "nant -buildfile:blahblahblah", especially when I execute the NAnt version that's in my source tree (not the one installed).  I like to create a one-line batch file called "go.bat" that bootstraps NAnt and my build file:

@tools\nant\NAnt.exe -buildfile:NBehave.build %*

The @ symbol tells cmd.exe not to echo the command itself, then I specify the location of NAnt and the build file.  The last part, "%*", expands to any arguments passed after "go.bat" and hands them straight to NAnt.exe.  For example, I can now do this:

go clean test deploy

The "clean test deploy" commands are targets in my NAnt build script.  Any other NAnt command line options can be typed in after "go".  I put this batch file at the project root level.

Use the "description" attribute to document public targets

As Jason mentioned, the "-projecthelp" NAnt switch will output a list of the targets in your build file.  By providing a description on a target, you can surface better information in that help output.  In NBehave's help output, the targets are separated into Main Targets and Sub Targets: Main Targets are targets with descriptions, and Sub Targets are those without.  I can still provide internal documentation with XML comments in the build script, and by only providing descriptions on the targets I need to call often, I can create some level of encapsulation in my build file.
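
For illustration, a described target looks something like this (the target names are made up, not NBehave's actual targets):

<target name="test" description="Compiles the solution and runs the unit tests">
    <!-- shows up under Main Targets in the -projecthelp output -->
</target>

<target name="compile">
    <!-- no description attribute, so this is listed as a Sub Target -->
</target>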

I'm sure there are a dozen better tips out there, but these are the ones I make sure to use on every project I'm involved with.

Tuesday, November 6, 2007

Joining Los Techies

How very rude of me, looking back I forgot to mention I've joined Los Techies.  A few things drew me to joining:

  • Name sounds very similar to a certain cerveza
  • The promise of a free T-Shirt
  • Being part of a community is important to me, and the shadier the better
  • My continuing quest to lower the bar wherever I go
  • Peer intimidation factor lowered by lack of MVPs, unlike other communities

My feed URL will stay the same (http://feeds.feedburner.com/GrabBagOfT), and I'll cross-post between the two sites as Google seems to like the Blogspot site. I'll continue to respond to comments on each as well.  However, I'd recommend additionally subscribing to the community feed (http://www.lostechies.com/blogs/MainFeed.aspx).  You'll have access to a greater breadth and depth of content than I could ever hope to create.

Guidance over governance

Being part of a large IT organization, I've become acutely aware of the need to provide a central policy or governance over critical IT operations.  While greenfield/product teams can largely operate in independence, the risk associated with large, complex, and business-critical systems demands some degree of oversight.  However, an IT governance team can fall into some huge traps in their effort to set and enforce policy:

  • Overstepping their bounds and enforcing governance where only guidance is needed
  • Failing to measure the outcomes of governance decisions

For governance bodies to be effective, there needs to be accountability for decisions made.  If teams feel negatively affected by governance decisions, there needs to be a feedback loop back into the decision-making process.  Strong leadership needs to be in place to ensure fact-based decisions and not personal, political, or selfishly reasoned decisions.  As governance has so many opportunities to fail, it becomes even more important to harvest policies internally, grow them from observed best-practices, and encourage adoption through prescriptive guidance.

Since how a team delivers is at least as important as what a team delivers, how can IT influence the "how"?  There are basically two approaches:

  • Guidance
  • Governance

From the team's perspective, the only difference between the two is where the decision gets made to incorporate a policy.  With guidance, the team makes the decision, and with governance, someone else does.  This not-so-subtle difference should be a top consideration when IT decides to enforce policy through governance or encourage policy through guidance.

Removing decisions

One of the key ingredients in self-organizing teams is trusting the team to make decisions about how and what to deliver.  Removing the team's ability to make decisions can quickly have negative impacts unless care is taken to get the team's buy-in.

The more decisions are removed from the team, the more the team feels like an un-creative cog in a machine, or a code monkey.  Strong teams thrive on the opportunity to be creative, and teams without creative opportunities become lethargic, disinterested, and largely brain-dead.  It's an enormous waste of resources not to harvest the talent available, so the bar needs to be exceedingly high for removing decisions from the team.

It's possible to allow team decisions to be made concerning policy by choosing the scope of the policy.  For example, instead of mandating TDD, require that features have automated tests, then provide guidance on how to do that.  Teams can then decide on specifics.

Favoring guidance

Policies are more effective if they represent a set of core values and principles.  Certain meta-principles are appropriate for governance, such as "automated builds" or "continuous integration".  Once IT governance crosses the line into specifics, that's when decisions are removed from teams and problems arise.

Instead, IT governance bodies can offer guidance on specifics, and even support.  They can provide guides to help teams get started, pointing them in the right direction.  Some teams may have enough internal experience to feed their own lessons back into the governance team, adding to the collective knowledge the entire organization can take advantage of.

As the saying goes, "people are our most important asset", so if a team is held accountable, it needs to be able to make the decisions that ensure its success.  If the team is accountable for decisions it didn't make, morale suffers accordingly.  By favoring guidance, the governance team can focus more on growing appropriate policies by measuring the impact and success of their guidance.