rdilipk

I was intrigued by a post over at microsoft.public.dotnet.framework. I searched with MSN Live Search for a long time but couldn't find any convincing explanation, so I have reproduced the post here. I am also not sure whether this post belongs here or in the BCL group. Anyway, does anyone have any insights?

=============================================================

Looking at List<T> and LinkedList<T>, we have:

public class List<T> : IList<T>, ICollection<T>, IEnumerable<T>, IList, ICollection, IEnumerable

public class LinkedList<T> : ICollection<T>, IEnumerable<T>, ICollection, IEnumerable, ISerializable, IDeserializationCallback


Interesting that LinkedList<T> derives from ISerializable and IDeserializationCallback, but List<T> does not. Why the inconsistency?

But, more interestingly, both LinkedList<T> and List<T> derive from ICollection<T> and from ICollection.

Now compare that to Queue<T> and Stack<T>:

public class Queue<T> : IEnumerable<T>, ICollection, IEnumerable
public class Stack<T> : IEnumerable<T>, ICollection, IEnumerable


These derive from ICollection, but *not* from ICollection<T>.

So, I can pass a List<T> or a LinkedList<T> as ICollection<T>, but I *cannot* pass Queue<T> or Stack<T> as ICollection<T>.

I can pass all four collection concrete types as ICollection, but that has other drawbacks:

- It incurs the cost of boxing and unboxing.

- ICollection<T> has an Add() method, but ICollection does *not* have an Add() method. So, if I pass collections as ICollection, I cannot add to the collections in the callee.
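
The interface differences described above can be checked at run time. The following is only a minimal sketch; the InterfaceCheck class and its method names are made up for illustration and are not part of the framework.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical helper (not a framework type) for probing which
// collection interfaces a concrete collection implements.
public static class InterfaceCheck
{
    // True when the object can be passed where ICollection<T> is expected.
    public static bool IsGenericCollection<T>(object collection)
    {
        return collection is ICollection<T>;
    }

    // True when the object can be passed as the non-generic ICollection.
    public static bool IsNonGenericCollection(object collection)
    {
        return collection is ICollection;
    }
}
```

With this helper, IsGenericCollection<int> should report true for a List<int> or a LinkedList<int> and false for a Queue<int> or a Stack<int>, while IsNonGenericCollection should report true for all four.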

Why these gratuitous differences?

Cheers,


Michi.

=============================================================



Re: Visual C# Language Why does List<T> not implement ISerializable/IDeserializationCallback??

Rob Teixeira

Part of the differences are evolutionary, part are due to actual differences between the collections. I'll try to take them one at a time.

List<T> is not designed to be serialized. Period. While technologies like WCF are able to serialize almost any kind of enumerable type given certain constraints, List<T> was designed to be an optimized local utility class. In no circumstances should you serialize List<T> outside the local app domain. Part of the optimized implementation won't deal nicely with versioning, for example. In fact, if you create a serializable object and include a property of List<T>, then run code analysis, it will give you a warning to remove it.

Queue and Stack do not have normal ICollection<T> semantics. That is to say, ICollection<T> has behavioral contracts for adding and removing items, which do not apply to stacks and queues.

ICollection was created before generics existed. That imposed two limitations: first, it had to allow for read-only collections, and second, without generics, it could only provide typeless Add methods, which are less than optimal. For those reasons, and possibly some others, ICollection left it up to the implementer to specify type-specific methods such as Add, and did not create the typeless contract. If it had, you would have two conflicting Add methods, because Add(object) and Add(MyType) could resolve to the same overload in certain situations due to the variance between MyType and Object.

ICollection<T> could overcome some of these limitations with typed parameters, so it did more of the heavy lifting for you.






Re: Visual C# Language Why does List<T> not implement ISerializable/IDeserializationCallback??

Brian Grunkemeyer - MSFT

I think a review of serialization would help a bit. Let's look at these types a little more closely. Here's my view of these types:

[DebuggerTypeProxy(typeof(Mscorlib_CollectionDebugView<>))]
[DebuggerDisplay("Count = {Count}")]
[Serializable()] <-------------------------
public class List<T> : IList<T>, System.Collections.IList

[Serializable()] <-------------------------
[System.Runtime.InteropServices.ComVisible(false)]
[DebuggerTypeProxy(typeof(System_CollectionDebugView<>))]
[DebuggerDisplay("Count = {Count}")]
public class LinkedList<T>: ICollection<T>, System.Collections.ICollection, ISerializable, IDeserializationCallback {

Remember that serializability of a type, at least for the CLR's BinaryFormatter & SoapFormatter, is defined by the presence of the SerializableAttribute on the type. ISerializable is more of an implementation detail of the type, meaning that the type either needs some special versioning constraints around compatibility with previous versions of its serialized format on disk (or over the network), or that it explicitly needs some additional code to run on deserialization to restore internal invariants. Sometimes these needs can be met by marking fields as optionally serializable, but in other cases ISerializable's more general methods are necessary.

IDeserializationCallback takes this a step further, saying that deserialization of this type is so special, we must explicitly run some code after the rest of the object graph has been deserialized. As an example of where this is necessary, consider a hash table that must call GetHashCode on all keys within the hash table. Note that we can't persist out the return value of GetHashCode, since hash functions are likely to be improved from one version of a library to another. So we must call GetHashCode on the key sometime during deserialization, but after the individual key has been deserialized. If the hash table is deserialized before the keys are deserialized, then calling GetHashCode may return the wrong value. IDeserializationCallback allows you a chance of working around simple ordering problems, by providing a post-deserialization callback.
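
To make the post-deserialization callback concrete, here is a minimal sketch. KeyIndex is a made-up type for illustration, not anything in the BCL: it persists only its keys and rebuilds its hash-based lookup in OnDeserialization, so no hash code is ever written out.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

// Hypothetical example type, not part of the framework.
[Serializable]
public class KeyIndex : IDeserializationCallback
{
    // The only state that is persisted.
    private readonly string[] keys;

    // Depends on GetHashCode, so it is never persisted, only rebuilt.
    [NonSerialized]
    private Dictionary<string, int> positions;

    public KeyIndex(string[] keys)
    {
        this.keys = (string[])keys.Clone();
        Rebuild();
    }

    // Returns the position of the key, or -1 if it is not present.
    public int IndexOf(string key)
    {
        int position;
        return positions.TryGetValue(key, out position) ? position : -1;
    }

    // A formatter calls this once the whole object graph has been
    // deserialized, so every key is complete before it is hashed.
    public void OnDeserialization(object sender)
    {
        Rebuild();
    }

    private void Rebuild()
    {
        positions = new Dictionary<string, int>(keys.Length);
        for (int i = 0; i < keys.Length; i++)
        {
            positions[keys[i]] = i;
        }
    }
}
```

Calling OnDeserialization by hand, as a formatter would after restoring the keys, rebuilds the lookup with whatever hash function the current library version provides.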

The only confusing thing about ISerializable and IDeserializationCallback is that if a base type implements these interfaces, it suggests that derived types should be serializable (i.e., they should have the SerializableAttribute on them), may need to implement a deserialization constructor, and if they override these methods, they may need to call the base type's implementation in the appropriate places. The most common place where this shows up in our library is Exception - very few developers remember the deserialization constructor on their own exceptions (which only shows up as a problem once you start using multiple appdomains, remoting, or the CLR's new add-in model). But the slightly fishy serialization design issue is that the implementation details of the type tell you whether special serialization semantics are required, and this leaks through in the public interface in terms of these marker interfaces & requirements on derived types. Given that serialization of a type isn't something to be done lightly for any type with non-trivial internal state, this requirement on subclasses is well within reason.

Back to our collections - clearly, List<T> and LinkedList<T> are serializable. The original poster's question might then be, how would these types' serialization needs be different? The issue comes down to only implementation details. For List<T>, it's basically a T[], a count, and some other uninteresting state around enumeration. Assuming that T is serializable (and all subclasses of T that are stored in the List are also serializable), then List<T> can be serialized & deserialized safely.

So what about LinkedList<T>? Unlike our hash tables, it doesn't require hash functions, which may vary with the CLR or third-party library version changes. So why do anything special here? The answer is performance - we don't want to serialize out a list of LinkedListNodes. We instead write out just the interesting data. Now let's turn a critical eye to this class. Do we need the OnDeserialization method? Probably not in this version.
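
The idea of writing out only the interesting data can be sketched as follows. CompactLinkedList<T> and the entry name "Values" are assumptions made for illustration; this is not the BCL's LinkedList<T> source. Note also that newer .NET versions flag parts of the formatter-based serialization API as obsolete.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical singly linked list, not a framework type.
[Serializable]
public sealed class CompactLinkedList<T> : ISerializable
{
    private const string ValuesName = "Values"; // assumed entry name

    private sealed class Node
    {
        public T Value;
        public Node Next;
        public Node(T value) { Value = value; }
    }

    private Node head;
    private Node tail;
    private int count;

    public CompactLinkedList() { }

    // Deserialization constructor: relinks nodes from the flat array.
    private CompactLinkedList(SerializationInfo info, StreamingContext context)
    {
        T[] values = (T[])info.GetValue(ValuesName, typeof(T[]));
        foreach (T value in values) AddLast(value);
    }

    public int Count { get { return count; } }

    public void AddLast(T value)
    {
        Node node = new Node(value);
        if (tail == null) head = node; else tail.Next = node;
        tail = node;
        count++;
    }

    // Copies the element values, in list order, into a flat array.
    public T[] ToArray()
    {
        T[] values = new T[count];
        int i = 0;
        for (Node n = head; n != null; n = n.Next) values[i++] = n.Value;
        return values;
    }

    // Writes only the element values; no Node object is serialized.
    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue(ValuesName, ToArray(), typeof(T[]));
    }
}
```

Because the type implements ISerializable, a formatter asks GetObjectData for the state instead of walking the private head and tail fields, so the node objects never reach the stream.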

I hope this helps you to understand serialization more clearly.

Brian Grunkemeyer

CLR Base Class Library Team





Re: Visual C# Language Why does List<T> not implement ISerializable/IDeserializationCallback??

Rob Teixeira

Thanks for the explanation Brian!

However, let me try to clear up the confusion :-)

When I wrote that one shouldn't serialize List<T> outside the local app domain, it wasn't for technical reasons. In fact, I was hoping the implication of my earlier statement that most serializers are capable of serializing enumerable types would point to the fact that technically speaking, it's perfectly feasible (even if the enumerable type doesn't physically implement ISerializable).

However, the issue with List<T> is versioning and the lack of virtual methods that pose possible breaking changes if you have to move out of the implementation currently provided. Here's the more thorough explanation from the code analysis team blog: http://blogs.msdn.com/fxcop/archive/2006/04/27/585476.aspx






Re: Visual C# Language Why does List<T> not implement ISerializable/IDeserializationCallback??

Michi Henning

Hi Brian, thanks for the detailed reply.

I pointed out the inconsistency simply because I was curious. Thanks to your explanation, I understand now how the difference comes about.

My main comment is that, at least to me, doing things this way is bordering on abuse of inheritance. It seems that each class has a "has-a" relationship with the serialization machinery, not an "is-a" relationship. I can see how the base interface works and how that provides the necessary hooks, but it leaves a bad taste in my mouth. In particular, public inheritance used that way exposes details about the inner workings of a class that, really, are no-one's business but its own.

Anyway, I appreciate that you took the time to explain this--thank you very much.

Cheers,

Michi.





Re: Visual C# Language Why does List<T> not implement ISerializable/IDeserializationCallback??

Michi Henning

Hi Rob,

Thanks for your reply!

Rob Teixeira wrote:

Queue and Stack do not have normal ICollection<T> semantics. That is to say, ICollection<T> has behavioral contracts for adding and removing items, which do not apply to stacks and queues.

I'm thinking that I could mount the same argument for LinkedList<T>. For that, Add() appends an element, but it could equally well prepend the element, at the same run-time cost. So, I think I could argue equally well that ICollection<T> is unsuitable as a base interface of LinkedList<T>.

Conversely, I can also argue that, for stacks and queues, Add() could simply call Push() and Enqueue(), respectively.

The reason I'm raising this question is that I'm unmarshaling sequences off the wire, as part of ZeroC's Ice middleware (www.zeroc.com). What I would like to do is allow the user to determine the concrete collection type into which the data that arrives over the wire will be unmarshaled. So the user gets a LinkedList, a List, a Queue, or Stack, depending on their preference. As part of the unmarshaling code, I have something like the following method:

void fillWithData(ICollection<T> c)

The user passes an empty collection, and the implementation of this method reads the data off the wire and calls Add() to put the data into the collection.

Unfortunately, I can do this only for List and LinkedList, but not for Stack and Queue.

Alternatively, I can write the method as:

void fillWithData(ICollection c)

That works for all four collection types but, unfortunately, ICollection doesn't have an Add() method, so I can't fill the sequence.

So, I'm forced to write four overloads:

void fillWithData(List<T> c)

void fillWithData(LinkedList<T> c)

void fillWithData(Queue<T> c)

void fillWithData(Stack<T> c)

Now I can do what I want, but only at substantially increased effort.

Yet another alternative is to go back to the previous signature:

void fillWithData(ICollection c)

and use downcasts or a type test to see what I'm dealing with. But the code still is different for each collection because the four collection types use different operation names for what, conceptually, is the same operation: Add() for List<T>, AddLast() for LinkedList<T>, Enqueue() for Queue<T>, and Push() for Stack<T>.

So, no matter how hard I try, I can't treat these collections generically.

What I'm really looking for is something like C++ STL's push_back() function, which can add elements to any collection, no matter what its type.
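
One possible workaround, sketched here under assumptions rather than taken from the Ice code base, is to pass the "add" operation itself as a delegate. The Unmarshaler class, the FillWithData signature, and the wireData parameter (standing in for the values read off the wire) are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Ice or the framework.
public static class Unmarshaler
{
    // 'add' hides whether the target calls the operation Add, AddLast,
    // Enqueue, or Push, so the reading loop is written only once.
    public static void FillWithData<T>(IEnumerable<T> wireData, Action<T> add)
    {
        if (wireData == null) throw new ArgumentNullException("wireData");
        if (add == null) throw new ArgumentNullException("add");

        foreach (T item in wireData)
        {
            add(item);
        }
    }
}
```

A caller would pass list.Add, queue.Enqueue, or stack.Push directly, and a lambda such as x => linked.AddLast(x) for LinkedList<T>, because AddLast returns the new node and so does not match Action<T> as a method group. Elements pushed onto a Stack<T> come back in reverse order when the stack is enumerated.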

Cheers,

Michi.