Wednesday, September 17, 2008

Covariance and Contravariance in C# and Java

Every time I start programming something new, I inevitably come across code like this:


Of course, the above code is a complete lie; it won't work in C# or java. This is because, in these languages, the list types are not covariant. Subtypes are covariant wrt their supertype; they instance can stand in anyplace a superclass instance can. Type covariance is a narrowing convention. Lists with different generic parameters cannot stand in for each other. So, you get a compile time error. Of course, to confuse everyone more, the array types in both these languages are covariant.

In java, one can slide around this problem:
While this code will compile and do what you want, this syntax actually expresses contravariance. Contravariance is a widening convention; the return ? extends BaseClass is wider than DerivedClass.

What programmers typically want with this anti-pattern is to fit our local objects to some other signature. While, on the surface, this seems like a logical request, there are several issues lurking...

First, consider what would happens in C# and Java with covariant arrays. Wikipedia says it best:
// a is a single-element array of String
String[] a = new String[1];

// b is an array of Object
Object[] b = a;

// Assign an Integer to b. This would be possible if b really were
// an array of Object, but since it really is an array of String,
// we will get a java.lang.ArrayStoreException.
b[0] = 1;
What this means is that despite being statically typed, both languages actually have to do runtime type checks for array insertion. As the Wikipedia article points out, this is basically your only choice if you want covariant mutable lists.

However, invariant lists are exactly what we want! External parties shouldn't be messing with our members anyway. So, the crux of the problem is the lack of an actual ImmutableList class. It could be made trivially covariant and would fit right in with these languages OO style. While I'll often criticize Java for it's large number of useless classes (cough xml), data structures are one area where more definately is better.

In the end, C# doesn't have any support for generic contravariance (or covariance) at all. This seems to be by design; the CLI bytecode can theoretically support such things. Java, of course, gets a free pass because of type-erasure.

So why was it an anti-pattern?

It looks like Java has the edge here, but it's actually Microsoft who has the right idea. It's already an immutable list... so why is it returning a list at all? Instead, the method should be returning an IEnumerable (Iterable). One can easily design a class which wraps our list and spits it out in the correct format for the receiver. No incorrect assumptions about modifying lists, type casts, or unecessary copies. In C#:

Using this idea, I can keep a mutable list of mutable objects (the derived class). If someone else needs to see that data, I can use the type system to give them an immutable view of immutable objects (the base class) that is statically enforced.

If you have to use a static type system, the type system should at least help you out.

*** Note:
While I haven't looked at the bytecode for this class, I imagine that Microsoft's compiler tracks the implicit conversions. However, type-erasure in java (and the nature of Hot Spot as being more type-ignorant) would probably not require this.