
Why do structs need to be boxed?

In C#, any user-defined struct is automatically a subclass of System.ValueType, and System.ValueType is a subclass of System.Object.

But when we assign a struct to an object-typed reference, it gets boxed. For example:

struct A
{
    public int i;
}

A a = new A();   // must be definitely assigned before it can be read
object obj = a;  // boxing takes place here

So my question is: if A is a descendant of System.Object, can't the compiler up-cast it to object instead of boxing it?

A struct is a value type. System.Object is a reference type. The runtime stores and treats value types and reference types differently, so for a value type to be treated as a reference type it has to be boxed. At a low level, boxing copies the value from wherever it currently lives (for a local variable, typically the stack) into newly allocated memory on the heap, which also contains an object header. Reference types need that header to resolve their vtables, which enables virtual method dispatch and other reference-type features. Remember that a struct on the stack is just a value with zero type information; it doesn't contain anything like a vtable and can't be used directly to resolve dynamically dispatched methods. Besides, to treat something as a reference type you need a reference (pointer) to it, not its raw value.
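The copy made by boxing can be observed directly. The following is a minimal sketch that reuses the struct A from the question; the class and method names (BoxingCopyDemo and so on) are made up for illustration.

```csharp
using System;

struct A
{
    public int i;
}

static class BoxingCopyDemo
{
    // Boxes 'a', then changes the original, and reports what the box holds.
    public static int BoxedValueAfterMutation()
    {
        A a = new A { i = 1 };
        object obj = a;      // boxing: the value is copied into a new heap object
        a.i = 2;             // changes only the original variable
        return ((A)obj).i;   // unboxing copies the value back out
    }

    // Boxing the same value twice produces two distinct heap objects.
    public static bool TwoBoxesAreSameObject()
    {
        A a = new A { i = 1 };
        object first = a;
        object second = a;
        return ReferenceEquals(first, second);
    }

    static void Main()
    {
        Console.WriteLine(BoxedValueAfterMutation()); // prints 1
        Console.WriteLine(TwoBoxesAreSameObject());   // prints False
    }
}
```

Because the box holds its own copy, later changes to the original variable are not visible through obj.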

"So my question is: if A is a descendant of System.Object, can't the compiler upcast it to object instead of boxing it?"

At a lower level, a value does not inherit anything. As I said before, it's not really an object. The fact that A derives from System.ValueType, which in turn derives from System.Object, is defined at the abstraction level of your programming language (C#), and C# hides the boxing operation from you pretty well. You don't write anything explicit to box the value, so you can simply think of the compiler as having "upcast" the structure for you. It creates the illusion of inheritance and polymorphism for values, even though values themselves provide none of the machinery that polymorphic behavior requires.

Here's how I prefer to think about it. Consider the implementation of a variable containing a 32 bit integer. When treated as a value type, the entire value fits into 32 bits of storage. That's what a value type is: the storage contains just the bits that make up the value, nothing more, nothing less.

Now consider the implementation of a variable containing an object reference. The variable contains a "reference", which could be implemented in any number of ways. It could be a handle into a garbage collector structure, or it could be an address on the managed heap, or whatever. But it's something which allows you to find an object. That's what a reference type is: the storage associated with a variable of reference type contains some bits that allow you to reference an object.

Clearly those two things are completely different.

Now suppose you have a variable of type object, and you wish to copy the contents of a variable of type int into it. How do you do it? The 32 bits that make up an integer aren't one of these "reference" things, it's just a bucket that contains 32 bits. References could be 64 bit pointers into the managed heap, or 32 bit handles into a garbage collector data structure, or any other implementation you can think of, but a 32 bit integer can only be a 32 bit integer.

So what you do in that scenario is you box the integer: you make a new object that contains storage for an integer, and then you store a reference to the new object.
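As a rough sketch of that idea, the following compares the built-in boxing of an int with a hand-written wrapper class. IntBox is a hypothetical type invented here only to illustrate what a box conceptually is; the runtime does not use such a class.

```csharp
using System;

// Hypothetical hand-written "box": a reference type whose only job
// is to provide heap storage for one int.
sealed class IntBox
{
    public readonly int Value;
    public IntBox(int value) { Value = value; }
}

static class IntBoxingDemo
{
    // Built-in boxing and unboxing.
    public static int RoundTrip(int i)
    {
        object boxed = i;     // new object with storage for the int, reference stored in 'boxed'
        return (int)boxed;    // unboxing: copy the int back out of the object
    }

    // The same round trip written out by hand.
    public static int ManualRoundTrip(int i)
    {
        IntBox box = new IntBox(i);
        return box.Value;
    }

    static void Main()
    {
        int i = 42;
        object boxed = i;
        Console.WriteLine(boxed.GetType());     // prints System.Int32
        Console.WriteLine(RoundTrip(i));        // prints 42
        Console.WriteLine(ManualRoundTrip(i));  // prints 42
    }
}
```

The variable boxed holds a reference; the 32 bits of the integer live inside the object that the reference points to.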

Boxing is only necessary if you want to (1) have a unified type system, and (2) ensure that a 32 bit integer consumes 32 bits of memory. If you're willing to reject either of those then you don't need boxing; we are not willing to reject those, and so boxing is what we're forced to live with.

While the designers of .NET certainly didn't need to include boxing, section 4.3 of the C# Language Specification explains the intent behind it quite well, IMO:

"Boxing and unboxing enables a unified view of the type system wherein a value of any type can ultimately be treated as an object."

Because value types are not reference types (which System.Object ultimately is), the act of boxing exists in order to have a unified type system where the value of anything can be represented as an object.

This is different from, say, C++, where the type system isn't unified: there is no common base type for all types.

"If struct A is a descendant of System.Object, can't the compiler up-cast it instead of boxing?"

No, simply because according to the definition of the C# language, "up-casting" in this case is boxing.

The language specification for C# contains (in chapter 13) a catalogue of all possible type conversions. All these conversions are categorized in a specific fashion (e.g. numeric conversions, reference conversions, etc.).

  1. There are implicit type conversions from a type S to its super-type T, but these are only defined for the pattern "from a class type S to a reference type T". Because your struct A is not a class type, these conversions cannot be applied in your example.

    That is, the fact that A is (indirectly) derived from object (while correct) is simply irrelevant here. What is relevant is that A is a struct value type.

  2. The only existing conversion that matches the pattern "from a value type A to its reference super-type object" is categorized as a boxing conversion. Thus every conversion from a struct to object is by definition considered boxing.
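One observable consequence of boxing being its own conversion category: unboxing only succeeds for the exact type that was boxed, even where a numeric conversion would otherwise exist. A small sketch follows; ConversionDemo and TryUnboxAsLong are names assumed for this example.

```csharp
using System;

static class ConversionDemo
{
    // Attempts an unboxing conversion to long and reports what happened.
    public static string TryUnboxAsLong(object boxed)
    {
        try
        {
            long value = (long)boxed;   // unboxing conversion: needs a boxed long
            return "ok: " + value;
        }
        catch (InvalidCastException)
        {
            return "InvalidCastException";
        }
    }

    static void Main()
    {
        int i = 42;
        long viaNumeric = i;                       // implicit numeric conversion, no boxing
        Console.WriteLine(viaNumeric);             // prints 42
        Console.WriteLine(TryUnboxAsLong(42L));    // prints ok: 42
        Console.WriteLine(TryUnboxAsLong(42));     // prints InvalidCastException
        object boxedInt = 42;
        Console.WriteLine((long)(int)boxedInt);    // unbox as int first, then convert: prints 42
    }
}
```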

A struct is a value type by design, hence it needs to be boxed when it is turned into a reference type. A struct derives from System.ValueType, which in turn derives from System.Object.

The mere fact that a struct is a descendant of object does not mean much, since the CLR deals with structs differently at runtime than it does with reference types.
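The declared hierarchy and the value-type distinction can both be inspected through reflection. This is a minimal sketch using the struct A from the question; HierarchyDemo is a name made up for the example.

```csharp
using System;

struct A
{
    public int i;
}

static class HierarchyDemo
{
    static void Main()
    {
        // The inheritance chain as declared in metadata.
        Console.WriteLine(typeof(A).BaseType);             // prints System.ValueType
        Console.WriteLine(typeof(ValueType).BaseType);     // prints System.Object

        // A is a value type, while System.ValueType itself is a class.
        Console.WriteLine(typeof(A).IsValueType);          // prints True
        Console.WriteLine(typeof(ValueType).IsValueType);  // prints False

        // A boxed A still reports its exact type.
        A a = new A();
        object obj = a;
        Console.WriteLine(obj is A);                       // prints True
    }
}
```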

Now that the question has been answered, here is a little "trick" related to the topic:

Structs can implement interfaces. If you pass a value type to a function that expects an interface that the value type implements, the value normally gets boxed. Using generics you can avoid the boxing:

interface IFoo {...}
struct Bar : IFoo {...}

void boxing(IFoo x) { ... }
void byValue<T>(T x) where T : IFoo { ... }

var bar = new Bar();
boxing(bar);
byValue(bar);
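The difference becomes visible with a mutable struct. In the sketch below, ICounter and Counter are hypothetical stand-ins for IFoo and Bar, and the generic method takes its argument by ref (an addition to the pattern above) so that the effect on the caller's variable can be observed.

```csharp
using System;

interface ICounter
{
    void Increment();
}

struct Counter : ICounter
{
    public int Count;
    public void Increment() { Count++; }
}

static class GenericConstraintDemo
{
    // The struct is boxed at the call site; Increment runs on the boxed copy.
    public static void ViaInterface(ICounter counter) { counter.Increment(); }

    // T is Counter itself, so no box is needed; 'ref' makes the call
    // operate on the caller's variable instead of a by-value copy.
    public static void ViaGeneric<T>(ref T counter) where T : ICounter { counter.Increment(); }

    public static int CountAfterInterfaceCall()
    {
        Counter c = new Counter();
        ViaInterface(c);
        return c.Count;   // the original is untouched
    }

    public static int CountAfterGenericCall()
    {
        Counter c = new Counter();
        ViaGeneric(ref c);
        return c.Count;   // the original was incremented
    }

    static void Main()
    {
        Console.WriteLine(CountAfterInterfaceCall()); // prints 0
        Console.WriteLine(CountAfterGenericCall());   // prints 1
    }
}
```

Without ref, the generic version would still avoid the box but would work on a copy of the struct, as byValue does.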

