Tuesday, May 29, 2007

Refactoring NAnt and MSBuild build scripts

A while back, I talked about the harmful effects of "Copy Paste".  While editing some NAnt and MSBuild build scripts, I forgot about the evil twin of "Copy Paste", which is "Find and Replace" (I guess both twins are evil).  I needed to update an MSBuild script to have the correct version numbers of an application we're starting on.  Here's what the abridged MSBuild script looked like before any modifications:

<Project>
  <PropertyGroup>
    <LocalPath1>E:\builds\V2.1\US\ecomm\CoreBusinessObjectsDistribution</LocalPath1>
    <LocalPath2>E:\builds\V2.1\US\ecomm\OrderWorkflowDistribution</LocalPath2>
    <LocalPath3>E:\builds\V2.1\US\ecomm\Store.BusinessObjects.Ecommerce</LocalPath3>
    <LocalPath4>E:\builds\V2.1\US\ecomm\Store.UI</LocalPath4>
    <LocalPath5>E:\builds\V2.1\US\ecomm\Store.Utilities.Ecommerce</LocalPath5>
    <LocalPath6>E:\builds\V2.1\US\ecomm\MyCompany.Store.UI.Ecommerce</LocalPath6>
    <LocalPath7>E:\builds\V2.1\US\ecomm\MyCompany.Store.UI</LocalPath7>
    <LocalPath8>E:\builds\V2.1\US\ecomm\store</LocalPath8>
    <LocalPath9>E:\builds\V2.1\US\ecomm</LocalPath9>
    <LocalPath10>E:\builds\V2.1\US\ecomm\Deploy</LocalPath10>
  </PropertyGroup>
</Project>

This file was targeting our "V2.1" release, but I needed to update it to "V2.1.5", so all of the directory names had to be changed.  I started to whip out the ever-faithful "Ctrl-H" to perform a "Find and Replace", but I stopped myself.  This was a great opportunity for a refactoring.

Eliminating duplication

One of the major code smells is duplicated code.  But duplications don't always have to occur in code, as the previous MSBuild script showed.  I needed to change all of the references of "V2.1" to "V2.1.5", and there were two dozen examples of these, which I would need to change through "Find and Replace".

The problem with "Find and Replace" is that it can be error-prone.  The search can be case-sensitive, I might pick "Search entire word", etc.  There are so many options, I would need to try several combinations to make sure I found all of the instances I wanted to replace.  Instead of wallowing through the "Find and Replace" mud, can't I just eliminate the duplication so I only need to make one change?  Why don't we take a look at our catalog of refactorings to see if one fits for MSBuild.

Refactoring the script

Detailed in Martin Fowler's refactoring book and website, I can look up a specific code smell and find an appropriate refactoring.  There are some websites that also list out "smells to refactorings".  The one that looks the most promising is Extract Method.  MSBuild scripts don't exactly have methods, but they do have the concepts of properties and tasks.

I can introduce a property that encapsulates the commonality between all of the "LocalPathXxx" properties, which is namely the root directory.  I'll give the extracted property a good name, and then make all properties and tasks that use the root directory use my new property instead of hard-coding the path.  Here's the final script:

<Project>
  <PropertyGroup>
    <LocalPathRoot>E:\builds\V2.1.5\ecomm</LocalPathRoot>
    <LocalPath1>$(LocalPathRoot)\CoreBusinessObjectsDistribution</LocalPath1>
    <LocalPath2>$(LocalPathRoot)\OrderWorkflowDistribution</LocalPath2>
    <LocalPath3>$(LocalPathRoot)\Store.BusinessObjects.Ecommerce</LocalPath3>
    <LocalPath4>$(LocalPathRoot)\Store.UI</LocalPath4>
    <LocalPath5>$(LocalPathRoot)\Store.Utilities.Ecommerce</LocalPath5>
    <LocalPath6>$(LocalPathRoot)\MyCompany.Store.UI.Ecommerce</LocalPath6>
    <LocalPath7>$(LocalPathRoot)\MyCompany.Store.UI</LocalPath7>
    <LocalPath8>$(LocalPathRoot)\store</LocalPath8>
    <LocalPath9>$(LocalPathRoot)</LocalPath9>
    <LocalPath10>$(LocalPathRoot)\Deploy</LocalPath10>
  </PropertyGroup>
</Project>

Now in future versions (V2.2 maybe?) we'll only need to make one change, instead of several dozen.  Any time I eliminate duplication, I greatly reduce the chances for error.

So where are we?

The code smells laid out in Martin Fowler's book don't apply only to code.  As we've seen with this MSBuild script, they can apply to all sorts of other domains where duplication causes problems.  All we have to do is find appropriate mappings to the new domain for the refactorings laid out for that particular smell.  Of course, if you don't know about code smells and how to recognize them, the duplication will probably continue to live on and wreak havoc on your productivity.

My next step is to replace these horrible "LocalPathXxx" property names with intention-revealing names.  Originally, this script had comments around each property explaining what it meant.  There's nothing like using intention-revealing names to eliminate the need for comments.

No comments: