Tuesday, July 3, 2007

Limitations of generic base classes

I thought I had created something fairly useful with a generic Value Object in a previous post.  Generic base classes are nice, and there are several recommended base classes for creating collection classes.  But whenever I try to make an interesting API with a generic base class, limitations and faulty assumptions always reduce the usefulness of that base class.  Let's start with a couple of classes that we'd like to include in a generic API:

public class Address
{
    private readonly string _address1;
    private readonly string _city;

    public Address(string address1, string city)
    {
        _address1 = address1;
        _city = city;
    }

    public string Address1
    {
        get { return _address1; }
    }

    public string City
    {
        get { return _city; }
    }
}

public class ExpandedAddress : Address
{
    private readonly string _address2;

    public ExpandedAddress(string address1, string address2, string city)
        : base(address1, city)
    {
        _address2 = address2;
    }

    public string Address2
    {
        get { return _address2; }
    }
}

Fairly basic classes, just two types of addresses, with one a subtype of the other.  So what kinds of issues do I usually run into with generic base classes?  Let's look at a few different types of generic base classes to see.

Concrete generic implementations

A concrete generic implementation is a concrete class that inherits a generic base class:

public class AddressCollection : Collection<Address>
{
    public AddressCollection() {}

    public AddressCollection(IList<Address> list) : base(list) {}
}

Following the Framework Design Guidelines, I created a concrete collections class by inheriting from System.Collections.ObjectModel.Collection<T>.  I also have the ExpandedAddress class, so how do I create a specialized collection of ExpandedAddresses?  I have a few options:

  • Create an ExpandedAddressCollection class inheriting from AddressCollection
  • Create an ExpandedAddressCollection class inheriting from Collection<ExpandedAddress>
  • Use the existing AddressCollection class and put ExpandedAddress instances in it.

All of these seem reasonable, right?  Let's take a closer look.

Inherit from AddressCollection

Here's what the ExpandedAddressCollection class would look like:

public class ExpandedAddressCollection : AddressCollection
{
    public ExpandedAddressCollection() {}

    public ExpandedAddressCollection(IList<Address> list) : base(list) {}
}

That's not very interesting, since it doesn't add any information to the original AddressCollection.  What's more, this ExpandedAddressCollection ultimately inherits Collection<Address>, not Collection<ExpandedAddress>.  Everything I try to put in or get out will be an Address, not an ExpandedAddress.  For example, this code wouldn't compile:

List<ExpandedAddress> addresses = new List<ExpandedAddress>();
addresses.Add(new ExpandedAddress("Address1", "Address2", "Austin"));

ExpandedAddressCollection addressList = new ExpandedAddressCollection(addresses);

This fails because of limitations in generic variance: C# does not support generic covariance or contravariance.  Even though ExpandedAddress is a subtype of Address, and ExpandedAddress[] is a subtype of Address[], IList<ExpandedAddress> is not a subtype of IList<Address>.
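To make the distinction concrete, here is a minimal sketch.  The VarianceSketch class and its method names are my own invention, and the address classes are trimmed-down stand-ins for the ones above.  It shows the array conversion compiling but rejecting a bad store at run time, along with the copy-by-hand workaround that the generic list forces on me:

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down stand-ins for the classes above (assumption: only City matters here).
public class Address
{
    private readonly string _city;

    public Address(string city) { _city = city; }

    public string City { get { return _city; } }
}

public class ExpandedAddress : Address
{
    public ExpandedAddress(string city) : base(city) { }
}

// Hypothetical helper class, for illustration only.
public static class VarianceSketch
{
    // Arrays are covariant: the conversion below compiles, but storing a plain
    // Address into what is really an ExpandedAddress[] is rejected at run time.
    public static bool ArrayStoreThrows()
    {
        ExpandedAddress[] expanded = new ExpandedAddress[1];
        Address[] asBase = expanded;
        try
        {
            asBase[0] = new Address("Austin");
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    // IList<ExpandedAddress> does not convert to IList<Address>, so the
    // items have to be copied into a new list one at a time.
    public static IList<Address> CopyToBaseList(IList<ExpandedAddress> source)
    {
        List<Address> result = new List<Address>();
        foreach (ExpandedAddress item in source)
        {
            result.Add(item);
        }
        return result;
    }
}
```

The run-time check on the array store is the price of array covariance; the generic list avoids that hole by refusing the conversion at compile time.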

Inherit from Collection<ExpandedAddress>

In this example, I'll just implement the ExpandedAddressCollection in the same manner as AddressCollection:

public class ExpandedAddressCollection : Collection<ExpandedAddress>
{
    public ExpandedAddressCollection() {}

    public ExpandedAddressCollection(IList<ExpandedAddress> list) : base(list) { }
}

Now my collection is strongly typed towards ExpandedAddresses, so the example I showed previously would now compile.  It seems like I'm on the right track, but I run into even more issues:

  • ExpandedAddressCollection is not a subtype of AddressCollection
  • Collection hierarchy does not match hierarchy of Addresses (one is a tree, the other is flat)
  • I can't pass an ExpandedAddressCollection into a method expecting an AddressCollection
  • Since there is no relationship between the two collection types, I can't use many patterns where a relationship is necessary

So even though my collection is strongly typed, it becomes severely limited in more interesting scenarios.

Use existing AddressCollection class

In this instance, I won't even create an ExpandedAddressCollection class.  Any time I need a collection of ExpandedAddresses, I'll use the AddressCollection class, and cast as necessary.  I won't be able to pass an IList<ExpandedAddress> to the constructor because of the variance issue, however.  If I need to include some custom logic in the collection class, I'll run into the same problems highlighted earlier if I'm forced to create a new subtype of AddressCollection.
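As a rough sketch of what "cast as necessary" ends up looking like (CastingSketch and its two methods are hypothetical names, and the classes are abbreviated copies of the ones above), I have to add items one at a time on the way in and check the type on the way out:

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Abbreviated copies of the classes above.
public class Address
{
    private readonly string _city;

    public Address(string city) { _city = city; }

    public string City { get { return _city; } }
}

public class ExpandedAddress : Address
{
    private readonly string _address2;

    public ExpandedAddress(string address2, string city) : base(city)
    {
        _address2 = address2;
    }

    public string Address2 { get { return _address2; } }
}

public class AddressCollection : Collection<Address>
{
    public AddressCollection() { }
}

// Hypothetical helper class, for illustration only.
public static class CastingSketch
{
    // The IList<Address> constructor can't take ExpandedAddress lists,
    // so each item is added individually.
    public static AddressCollection FromExpanded(IEnumerable<ExpandedAddress> source)
    {
        AddressCollection result = new AddressCollection();
        foreach (ExpandedAddress item in source)
        {
            result.Add(item);
        }
        return result;
    }

    // The indexer only hands back Address, so Address2 needs a type check.
    // Returns null when the item is a plain Address.
    public static string Address2At(AddressCollection addresses, int index)
    {
        ExpandedAddress expanded = addresses[index] as ExpandedAddress;
        return expanded == null ? null : expanded.Address2;
    }
}
```

Nothing stops a plain Address from ending up in the same collection, which is why the read side has to cope with the cast failing.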

So we've seen the limitations of concrete generic implementations.  What other options do I have?

Constrained generic base classes

I'd like a way to propagate the concrete type parameter back up to the original Collection<T>.  What if I make the AddressCollection generic as well?  Here's what that would look like:

public class AddressCollection<T> : Collection<T>
    where T : Address
{
    public AddressCollection() {}

    public AddressCollection(IList<T> list) : base(list) {}
}

public class ExpandedAddressCollection : AddressCollection<ExpandedAddress>
{
    public ExpandedAddressCollection() {}

    public ExpandedAddressCollection(IList<ExpandedAddress> list) : base(list) { }
}

So now I have a constrained base class for an AddressCollection, and an implementation for an ExpandedAddressCollection.  What do I gain from this implementation?

  • ExpandedAddressCollection is completely optional; I could just define all usage through an AddressCollection<ExpandedAddress>
  • Any AddressCollection concrete type will be correctly strongly typed for an Address type

Again, with some more interesting usage, I start to run into some problems:

  • I can never reference only AddressCollection, as I always have to give it a type parameter.
  • Once I give it a type parameter, I run into the same variance issues as before, namely AddressCollection<ExpandedAddress> is not a subtype of AddressCollection<Address>
  • Since I can never define any method in terms of solely an AddressCollection, I either need to make the containing class generic or the method generic.

For example, I can write the following code:

public void TestGenerics()
{
    AddressCollection<Address> addresses = new AddressCollection<Address>();
    addresses.Add(new Address("132 Anywhere", "Austin"));

    int count = NumberOfAustinAddresses(addresses);

    AddressCollection<ExpandedAddress> expAddresses = new AddressCollection<ExpandedAddress>();
    expAddresses.Add(new ExpandedAddress("132 Anywhere", "Apt 123", "Austin"));

    count = NumberOfAustinAddresses(expAddresses);
}

private int NumberOfAustinAddresses<T>(AddressCollection<T> addresses)
    where T : Address
{
    int count = 0;
    foreach (T address in addresses)
    {
        if (address.City == "Austin")
            count++;
    }
    return count;
}

This isn't too bad for either the implementation or the client code.  I don't even need to specify a generic parameter in the method calls.  If I can live with generic methods, this pattern might work for most situations.

The only other problem I'll have is that I might need to create subtypes of AddressCollection, like I did above with ExpandedAddressCollection.  In this case, I can continue to make each subtype generic and constrained to the derived type:

public class ExpandedAddressCollection<T> : AddressCollection<T>
    where T : ExpandedAddress
{
    public ExpandedAddressCollection() {}

    public ExpandedAddressCollection(IList<T> list) : base(list) { }
}

Again, if I can live with generic methods, I would be happy with this implementation, as I now keep the same hierarchy as my Addresses.  It is a little strange declaring ExpandedAddressCollection with a type parameter (ExpandedAddressCollection<ExpandedAddress>), but I'll live.
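Here's a small sketch of how the two-level hierarchy might be used.  HierarchySketch, CountWithSecondLine and Demo are hypothetical names, and the classes are abbreviated versions of the ones above.  The same ExpandedAddressCollection<ExpandedAddress> instance can be handed to a generic method written against either level of the collection hierarchy:

```csharp
using System;
using System.Collections.ObjectModel;

// Abbreviated versions of the classes above.
public class Address
{
    private readonly string _city;

    public Address(string city) { _city = city; }

    public string City { get { return _city; } }
}

public class ExpandedAddress : Address
{
    private readonly string _address2;

    public ExpandedAddress(string address2, string city) : base(city)
    {
        _address2 = address2;
    }

    public string Address2 { get { return _address2; } }
}

public class AddressCollection<T> : Collection<T>
    where T : Address
{
}

public class ExpandedAddressCollection<T> : AddressCollection<T>
    where T : ExpandedAddress
{
}

// Hypothetical helper class, for illustration only.
public static class HierarchySketch
{
    // Written against the base collection; only Address members are visible.
    public static int NumberOfAustinAddresses<T>(AddressCollection<T> addresses)
        where T : Address
    {
        int count = 0;
        foreach (T address in addresses)
        {
            if (address.City == "Austin")
                count++;
        }
        return count;
    }

    // Written against the derived collection; Address2 is visible here.
    public static int CountWithSecondLine<T>(ExpandedAddressCollection<T> addresses)
        where T : ExpandedAddress
    {
        int count = 0;
        foreach (T address in addresses)
        {
            if (!string.IsNullOrEmpty(address.Address2))
                count++;
        }
        return count;
    }

    public static int[] Demo()
    {
        ExpandedAddressCollection<ExpandedAddress> expAddresses =
            new ExpandedAddressCollection<ExpandedAddress>();
        expAddresses.Add(new ExpandedAddress("Apt 123", "Austin"));
        expAddresses.Add(new ExpandedAddress("", "Dallas"));

        // T is inferred as ExpandedAddress in both calls.
        return new int[]
        {
            NumberOfAustinAddresses(expAddresses),
            CountWithSecondLine(expAddresses)
        };
    }
}
```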

There's one more type of generic base class I'd like to explore.

Self-constrained generic base classes

Sometimes, I need to implement a certain generic interface, such as IEquatable<T>.  I could simply pass the concrete type as the generic parameter like this:

public abstract class ValueObject : IEquatable<ValueObject>

But if I'm trying to create an abstract or a base class for subtypes to use, I'll run into problems where derived types won't implement IEquatable<DerivedType>.  Instead, I can make this abstract class generic and self-constraining:

public abstract class ValueObject<T> : IEquatable<T>
    where T : ValueObject<T>

Now, derived types will implement IEquatable<DerivedType>, as I showed in my post on value objects.  Unfortunately, subtypes will only implement the correct IEquatable<T> for the first derived class:

public class Address : ValueObject<Address>
public class ExpandedAddress : Address

In this case, ExpandedAddress is not a subtype of ValueObject<ExpandedAddress>, and therefore only implements IEquatable<Address>.  I can't use the same tricks with the constrained generic base class, as I would need to declare Address as generic, and therefore never instantiable by itself.  The self-constrained generic base or abstract class is unfortunately only useful in hierarchies one level deep.
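A minimal sketch of that limitation follows.  The ValueObject<T> body here is a hypothetical stripped-down version rather than the one from the earlier post, and SelfConstrainedSketch is a made-up name.  The second-level class only ever sees the Address comparison, so two ExpandedAddresses that differ in Address2 still compare as equal:

```csharp
using System;

// Hypothetical minimal base class; the real ValueObject<T> from the
// earlier post is not reproduced here.
public abstract class ValueObject<T> : IEquatable<T>
    where T : ValueObject<T>
{
    public abstract bool Equals(T other);
}

public class Address : ValueObject<Address>
{
    private readonly string _city;

    public Address(string city) { _city = city; }

    public string City { get { return _city; } }

    public override bool Equals(Address other)
    {
        return other != null && _city == other._city;
    }
}

public class ExpandedAddress : Address
{
    private readonly string _address2;

    public ExpandedAddress(string address2, string city) : base(city)
    {
        _address2 = address2;
    }

    public string Address2 { get { return _address2; } }
}

// Hypothetical helper class, for illustration only.
public static class SelfConstrainedSketch
{
    // True: inherited from ValueObject<Address>.
    public static bool ImplementsEquatableOfAddress()
    {
        object expanded = new ExpandedAddress("Apt 123", "Austin");
        return expanded is IEquatable<Address>;
    }

    // False: nothing in the hierarchy implements IEquatable<ExpandedAddress>.
    public static bool ImplementsEquatableOfExpandedAddress()
    {
        object expanded = new ExpandedAddress("Apt 123", "Austin");
        return expanded is IEquatable<ExpandedAddress>;
    }

    // True: the call binds to Equals(Address), which ignores Address2.
    public static bool DifferentAddress2StillEqual()
    {
        ExpandedAddress first = new ExpandedAddress("Apt 123", "Austin");
        ExpandedAddress second = new ExpandedAddress("Apt 456", "Austin");
        return first.Equals(second);
    }
}
```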

Conclusion

So generics aren't the silver bullet I thought they would be, but there are some interesting proposals out there to allow variance for generic types.  I might not be able to cover all of the scenarios I'd like for a generic base class, but by identifying several options and their consequences, I can make a better decision on solving the problem.

2 comments:

Tom said...

This may be a bit late (I haven't checked any of your other posts) but I did run into a similar situation last week with inheritance trees and generics and wondered what you thought of this solution.

Basically, we've got several widgets that deal with different types of people, let's say EmployeeWidget, ManagerWidget and CustomerWidget, each of which shares some common functionality declared in Widget<T> where T:Person. Now, since a Manager inherits from Employee, I too thought that maybe ManagerWidget should inherit from EmployeeWidget. Seems logical. But then I ran into the problem of losing all my generics support and T being an Employee, not a Manager.

Our solution was to go ahead and have ManagerWidget inherit from Widget<Manager>. But (this is where it deviates) we then created a base class for Widget<T> to inherit from... so it's now declared Widget<T>:Widget. This allowed us to get the polymorphism back as we can just pass things around as the base Widget class. The common behavior is defined here mostly as abstract functions that then get overridden in Widget<T>, cast to T, and then executed.

Just wondering what you thought... :)
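For readers following along, the arrangement described in this comment might look roughly like the sketch below.  Every member name (Describe, DescribeCore, Name, WidgetSketch) is a guess for illustration, since the comment doesn't show the actual code:

```csharp
using System;

// Hypothetical person hierarchy.
public class Person
{
    private readonly string _name;

    public Person(string name) { _name = name; }

    public string Name { get { return _name; } }
}

public class Employee : Person
{
    public Employee(string name) : base(name) { }
}

public class Manager : Employee
{
    public Manager(string name) : base(name) { }
}

// Non-generic base: this is the type that gets passed around.
public abstract class Widget
{
    public abstract string Describe(Person person);
}

// Generic layer: overrides the untyped method, casts to T, then delegates.
public abstract class Widget<T> : Widget
    where T : Person
{
    public override string Describe(Person person)
    {
        // Throws InvalidCastException if person is not actually a T.
        return DescribeCore((T)person);
    }

    protected abstract string DescribeCore(T person);
}

public class EmployeeWidget : Widget<Employee>
{
    protected override string DescribeCore(Employee person)
    {
        return "Employee " + person.Name;
    }
}

public class ManagerWidget : Widget<Manager>
{
    protected override string DescribeCore(Manager person)
    {
        return "Manager " + person.Name;
    }
}

// Hypothetical helper class, for illustration only.
public static class WidgetSketch
{
    // Works with any widget through the non-generic base.
    public static string DescribeWith(Widget widget, Person person)
    {
        return widget.Describe(person);
    }
}
```

The trade-off is that type safety moves from compile time to run time: handing a plain Employee to a ManagerWidget compiles but fails on the cast.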

Jimmy Bogard said...

@Tom

That's a pretty good workaround. There's not a whole lot of options without variance, so you're stuck with playing with the inheritance, or trying to move towards composition.

Here's another approach:

IWidget<T> where T : Person

EmployeeWidget : IWidget<Employee>

ManagerWidget : EmployeeWidget, IWidget<Manager>

Explicitly implement the interfaces, and you're set.
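Fleshed out a little, that could look like the sketch below.  The Render method, the Name property and InterfaceSketch are placeholders I made up for the example:

```csharp
using System;

// Hypothetical person hierarchy.
public class Person
{
    private readonly string _name;

    public Person(string name) { _name = name; }

    public string Name { get { return _name; } }
}

public class Employee : Person
{
    public Employee(string name) : base(name) { }
}

public class Manager : Employee
{
    public Manager(string name) : base(name) { }
}

public interface IWidget<T> where T : Person
{
    string Render(T person);
}

public class EmployeeWidget : IWidget<Employee>
{
    // Explicit implementation: only reachable through IWidget<Employee>.
    string IWidget<Employee>.Render(Employee person)
    {
        return "Employee: " + person.Name;
    }
}

// Keeps the inheritance relationship and adds the Manager-typed interface.
public class ManagerWidget : EmployeeWidget, IWidget<Manager>
{
    string IWidget<Manager>.Render(Manager person)
    {
        return "Manager: " + person.Name;
    }
}

// Hypothetical helper class, for illustration only.
public static class InterfaceSketch
{
    public static string[] Demo()
    {
        ManagerWidget widget = new ManagerWidget();
        Manager boss = new Manager("Pat");

        // The same object can be viewed through either interface.
        IWidget<Employee> asEmployeeWidget = widget;
        IWidget<Manager> asManagerWidget = widget;

        return new string[]
        {
            asEmployeeWidget.Render(boss),
            asManagerWidget.Render(boss)
        };
    }
}
```

ManagerWidget stays a subtype of EmployeeWidget, and callers pick the behavior by choosing which interface they reference it through.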