Thursday, November 29, 2007

Intention-concealing interfaces: blob parameters

When someone is using your code, you want your code to be as explicit and easy to understand as possible.  This is achieved through Intention-Revealing Interfaces.  Evans describes the problems of opaque and misleading interfaces in Domain-Driven Design:

If a developer must consider the implementation of a component in order to use it, the value of encapsulation is lost.  If someone other than the original developer must infer the purpose of an object or operation based on its implementation, that new developer may infer a purpose that the operation or class fulfills only by chance.  If that was not the intent, the code may work for the moment, but the conceptual basis of the design will have been corrupted, and the two developers will be working at cross-purposes.

His recommendation is to create Intention-Revealing Interfaces:

Name classes and operations to describe their effect and purpose, without reference to the means by which they do what they promise.  This relieves the client developer of the need to understand the internals.  The name should conform to the Ubiquitous Language so that the team members can quickly infer their meaning.  Write a test for a behavior before creating it, to force your thinking into client developer mode.

Several times now I've run into blob parameters (I'm sure there are other names for them).  Blob parameters are amorphous, mystery parameters used in constructors and methods, usually of type ArrayList, HashTable, object[], or even params object[].  Here's an example:

public class OrderProcessor
{
    public void Process(object[] args)
    {
        Order order = (Order) args[0];
        Customer customer = (Customer) args[1];

        // etc.
    }
}

The intentions behind this code are good, but the results can lead to some frustrating results.  Furthermore, this pattern leads to "Intention-Concealing Interfaces", which is style completely opposite from the Intention-Revealing style Eric proposes.  Clients need to know the intimate details of the internal implementation of the component in order to use the blob parameters.

Wrong kind of extensibility

Blob parameters lead to the worst kind of coupling, beyond just "one concrete class using another concrete class".  When using blob parameters, the client gets coupled to the internal implementation of unwinding the blob.  Additionally, new client code must have intimate knowledge of the details, as the correct order of the parameters is not exposed by the blob method's interface.

When a developer needs to call the "Process" method above, they are forced to look at the internal implementation.  Hopefully they have access to the code, but otherwise they'll need to pop open Reflector to determine the correct parameters to pass in.  Instead of learning if the parameters are correct at compile-time, they have to wait until run-time.

Alternatives

Several alternatives exist before resorting to blob parameters:

  • Explicit parameters
  • Creation method
  • Factory patterns
  • IoC containers
  • Generics

Each of these alternatives can be used separately or combined together to achieve both Intention-Revealing Interfaces and simple code.

Explicit Parameters

Explicit parameters just means that members should ask explicitly what they need to operate through their signature.  The Process method would be changed to:

public class OrderProcessor
{
    public void Process(Order order, Customer customer)
    {
        // etc.
    }
}

The Process method takes exactly what components it needs to perform whatever operations it does, and clients of the OrderProcessor can deduce this simply through the method signature.

If this signature needs to change due to future requirements, you have a few choices to deal with this change:

  • Overload the method to preserve the existing signature
  • Just break the client code

People always assume it's a big deal to break client code, but in many cases, it's not.  Unless you're shipping public API libraries as part of your product, backwards compatibility is something that can be dealt with through Continuous Integration.

Creation Method/Factory Patterns/IoC Containers

When blob parameters start to invade constructors, it's a smell that you need some encapsulation around object creation.  Instead of dealing with changing requirements through blob parameters, encapsulate object creation:

public class OrderProcessor
{
    private readonly ISessionContext _context;

    public OrderProcessor() : this(ObjectFactory.GetInstance<ISessionContext>())
    {
    }

    public OrderProcessor(ISessionContext context)
    {
        _context = context;
    }
}

Originally, the ISessionContext was a hodgepodge of different object that I needed to pass in to the OrderProcessor class.  Since these dependencies became more complex, I encapsulated them into a parameter object, and introduced a factory method (the "ObjectFactory" class) to encapsulate creation.  Client code no longer needs to pass in a group of complex objects to create the OrderProcessor.

Generics

Sometimes blob parameters surface because a family of objects needs to be created or used, but all have different needs.  For example, the Command pattern that requires data to operate might look like this before generics:

public interface ICommand
{
    void Execute(object[] args);
}

public class TransferCommand : ICommand
{
    public void Execute(object[] args)
    {
        Account source = (Account) args[0];
        Account dest = (Account) args[1];
        int amount = (int) args[2];

        source.Balance -= amount;
        dest.Balance += amount;
    }
}

The ICommand interface needs to be flexible in the arguments it can pass to specific implementations.  We can use generics instead of blob parameters to accomplish the same effect:

public interface ICommand<T>
{
    void Execute(T args);
}

public struct Transaction
{
    public Account Source;
    public Account Destination;
    public int Amount;

    public Transaction(Account source, Account destination, int amount)
    {
        Source = source;
        Destination = destination;
        Amount = amount;
    }
}

public class TransferCommand : ICommand<Transaction>
{
    public void Execute(Transaction args)
    {
        Account source = args.Source;
        Account dest = args.Destination;
        int amount = args.Amount;

        source.Balance -= amount;
        dest.Balance += amount;
    }
}

Each ICommand implementation can describe its needs through the generic interface, negating the need for blob parameters.

No blobs

I've seen a lot of misleading, opaque code, but blob parameters take the prize for "Intention-Concealing Interfaces".  Instead of components being extensible, they're inflexible, brittle, and incomprehensible.  Before going down the path of creating brittle interfaces, exhaust all alternatives before doing so.  Every blob parameter I've created I rolled back later as it quickly becomes difficult to deal with.

10 comments:

terry said...

i think those rules get a little stale when you're using weakly typed, dynamic languages.

lots of methods i resort to in ruby/rails require good documentation, because they have blob-ish parameters that are typed but aren't clear all the time. most helper methods in rails take a hash as the 3rd (or sometimes 3rd and 4th) arguments. those hashes have settings in them, like for a text field, :maxlength => 80, :size => 20, :value => "yourmom", etc.

it puts out for pretty clear code on my side (you can tell that i'm setting the max length, size, etc), but there's no way to deduce that from the interface, and i don't usually go trolling through rails code (unless i need some inspiration).

i'd like to propose a new pattern. "documented methods" as opposed to "undocumented method." undocumented methods hide intent :(

Jimmy Bogard said...

A documented method doesn't make it easy to change or not brittle. Similarly, hashes don't imply how it affects behavior. I think hashes are just a cop-out for better designs, and the only thing hashes avoid is element positioning of parameter arrays.

As far as using hashes to set properties, I'd rather have something like the C# object initializers or Java's anonymous classes.

Side-effect-free functions + intention-revealing interfaces = easier to maintain code.

Documentation that describes internal behavior or details can be worse, because I can't test documentation to make sure it's still correct. See: http://grabbagoft.blogspot.com/2007/06/problem-with-code-comments.html

But, I'm not a rails dev, so I really don't know the conventions over there.

terry said...

i think we're also comparing things from different perspectives.

mine's more from an API consumer standpoint and yours is more from an API author standpoint.

hash parameters in ruby let you get away with a type of method overloading (which ruby doesn't support otherwise) and can produce some really clean looking API consuming code.

i'm not sure what the code inside the API looks like, and as long as it's documented, i shouldn't have to. that said, naming conventions can only get you so far with this approach, which is why i miniranted about a documented-method design pattern.

Jimmy Bogard said...

I guess the difference is in .NET, passing ArrayLists, hashes, dictionaries, object[] or what have you assumes you have knowledge of the internal parts of the method. Documentation doesn't help there, as the code is the truth, not the documentation.

I trust documentation to tell me external functionality, but never internal details. In .NET, if I have to know what hash key to set for behavior to change, the API designer has failed to give me a discoverable design.

terry said...

i'm to a point where i don't see the difference.

function(val1, val2, val3)

vs

function(a => val1, b => val2, c => val3)

i think there's a lot to be said about conventions and standards, and rails is (fairly) consistent in using those. want to change the size on something? pass :size => value. want to set the action for a link? pass :action => action.

especially when you're looking at helpers to set up some pretty crazy stuff (like prototype ajax links), it's significantly more readable to see

link_to_remote :action => blah, :params => {params}, :complete => 'Effect.fade(this);'

than it is to see

ajax_link(blah, {params}, 'Effect.fade(this);')

it means i don't have to know the API i'm using nearly as well to understand what other people's code is doing.

Jimmy Bogard said...

If IDE's supported discovery of available hashes, I'd agree. But I've written enough documentation to know it's tough to keep current.

Dealing with HTML-type stuff might be different as I'm dealing more with components and not objects/classes.

If hashes are nice, why aren't all ruby methods declared solely with hashes as the argument? What's the threshold to cross where the API designer decides to use hashes instead of individual parameters?

terry said...

most rails api calls are hashes. in fact, most libraries i've used use hashes. it's much cleaner.

Todo.find(:all, :limit => 200)

is much nicer than

Query.object(Todo).add_limit(200).execute()

Jimmy Bogard said...

Most libraries? Not any .NET or Java ones I'm aware of.

Besides, your comparison is textbook straw man argument. By presenting "A" over "B", you're misrepresenting that "B" is what I'm advocating. Certainly not.

The example you showed is certainly doable in a CLR app:

Todo.FindAll().WithLimit(200)

Or

Todo.FindAll(With.Limit(200))

Or

var top200 = todoRepository.FindAll().Take(200); (NHibernate + LINQ)

Fluent interfaces are easy to make in Java and .NET. They certainly look almost as clean.

Clean to look at doesn't mean easy to maintain or develop against. Until you can prove ease of discovery of hashes, you haven't proven maintainability. Documentation and original source code are not sufficient for library, component, or API maintainability.

In the case you've shown, if I depended on the "find" method, I would set up unit tests to prove that it still accepted those hash values. These kinds of "safety valves" are code smells.

multi_io said...


public interface ICommand<T>
{
void Execute(T args);
}


Since this is supposed to be about "intention-revealing" code -- why is it that everyone calls his/her type parameters "T"? You wouldn't call all your value parameters "v", would you? So -- is this some bad habit that's been in use for 20 years so nobody gives it a second thought anymore? Just call the thing "ArgumentType" or something -- that's what it is, isn't it? It's the type that an argument to this command must have. So there.

Jimmy Bogard said...

@multi_io

Probably for the same reasons nobody uses "index1, index2, index3" for loop variable names, if "i, j, k" have the same accepted meaning.