How does Box CIL work internally in .NET?

Let's say we have the following C# code:

public static void Main() 
{
   int v = 5;
   Object o = v;
   v = 123;
   Console.WriteLine(v + ", " + (Int32) o); // Displays "123, 5"
}

and the IL code generated is:

.locals init ([0]int32 v, [1] object o)

 // Load 5 into v.
 IL_0000: ldc.i4.5
 IL_0001: stloc.0

 // Box v and store the reference pointer in o.    <------first boxing
 IL_0002: ldloc.0
 IL_0003: box [mscorlib]System.Int32
 IL_0008: stloc.1

 // Load 123 into v.
 IL_0009: ldc.i4.s 123
 IL_000b: stloc.0

 // Box v and leave the pointer on the stack for Concat.  <------second boxing
 IL_000c: ldloc.0
 IL_000d: box [mscorlib]System.Int32

 // Load the string on the stack for Concat.
 IL_0012: ldstr ", "

 // Unbox o: Get the pointer to the Int32's field on the stack.
 IL_0017: ldloc.1
 IL_0018: unbox.any [mscorlib]System.Int32

 // Box the Int32 and leave the pointer on the stack for Concat.   <------third boxing
 IL_001d: box [mscorlib]System.Int32

 // Call Concat.
 IL_0022: call string [mscorlib]System.String::Concat(object, object, object) 

We can see that the first and second boxing work as follows:

  1. Push the first argument v onto the stack.

  2. Call box.

So it looks like when box is called, the "argument" it needs is the stack pointer, which points to the first field of v.

And the third boxing works as follows:

  1. The preceding unbox creates a value type pointer that points to the first field of the boxed instance on the heap, and this value type pointer is then pushed onto the stack.

  2. Call box.

So now it looks like when box is called, it first dereferences the stack pointer to get its content (the value type pointer that points to the heap).

So my question is: is box designed to be so versatile that it sometimes reads the stack pointer directly, and sometimes dereferences the stack pointer to get another pointer (a pointer to the heap in my case)?

The unbox.any instruction unboxes the value type and loads it onto the stack (so it copies it):

From MSDN:

The resulting object reference or value type is pushed onto the stack.

When applied to the boxed form of a value type, the unbox.any instruction extracts the value contained within obj (of type O), and is therefore equivalent to unbox followed by ldobj.
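The copying behaviour can be observed from C#. The following is a minimal sketch, not taken from the thread: the Counter struct and the UnboxCopyDemo class are hypothetical names, and it assumes the cast on the assignment line is compiled to unbox.any, as the C# compiler normally does for a plain unboxing cast.

```csharp
using System;

// Hypothetical mutable struct used only for this illustration.
public struct Counter
{
    public int Value;
}

public static class UnboxCopyDemo
{
    public static int ReadBoxedAfterMutatingCopy()
    {
        object boxed = new Counter { Value = 1 }; // box: value copied to the heap

        Counter copy = (Counter)boxed;            // unboxing cast: value copied out again
        copy.Value = 99;                          // changes only the local copy

        return ((Counter)boxed).Value;            // the boxed instance still holds 1
    }

    public static void Main()
    {
        Console.WriteLine(ReadBoxedAfterMutatingCopy()); // prints 1
    }
}
```

Because the cast yields a value rather than an address, nothing done to copy can reach the object on the heap.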

What you are thinking of is the unbox instruction:

The unbox instruction converts the object reference (type O), the boxed representation of a value type, to a value type pointer (a managed pointer, type &), its unboxed form. The supplied value type (valType) is a metadata token indicating the type of value type contained within the boxed object.

Unlike Box, which is required to make a copy of a value type for use in the object, unbox is not required to copy the value type from the object. Typically it simply computes the address of the value type that is already present inside of the boxed object.

I don't even know how you can force the compiler to use the unbox instruction (from what I've read, it isn't used in C#, or at least it wasn't used by the compiler in 2010... I've done some tests, mixing ref, boxing and unboxing, and I wasn't able to force the compiler to use it).

Mmmh... Looking at ILSpy (its authors are experts in decompiling C# code), it seems that unbox is used only in the "private" implementation of certain switch statements (the switch statement is compiled in different ways depending on the number and the types of conditions it has). The only reference to unbox is in a method called MatchLegacySwitchOnStringWithHashtable... I'd say the name is quite clear. Another reference is in the Unsafe.il file, which is "linked" to the Unsafe class of .NET. There was a proposal for an Unsafe.Unbox<T> method; it was accepted and is part of .NET now. The corresponding C# code doesn't compile:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static ref T Unbox<T>(object box) where T : struct
{
    return ref (T)box;
}

In truth, judging by the .NET source code, it is probably implemented as an intrinsic.
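For comparison, here is a hedged sketch of how Unsafe.Unbox<T> can be called from C#. It assumes a runtime or package version where System.Runtime.CompilerServices.Unsafe exposes Unbox<T>; the BoxedCounter struct and the UnboxInPlaceDemo class are hypothetical names introduced for the example. A mutable struct is used on purpose, since mutating a boxed primitive through the returned reference is discouraged.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical mutable struct used only for this illustration.
public struct BoxedCounter
{
    public int Value;
}

public static class UnboxInPlaceDemo
{
    public static int MutateInsideBox()
    {
        object boxed = new BoxedCounter { Value = 1 };

        // Unbox<T> returns a managed reference to the value stored inside
        // the boxed object; no copy of the struct is made here.
        ref BoxedCounter inside = ref Unsafe.Unbox<BoxedCounter>(boxed);
        inside.Value = 42; // writes into the heap object itself

        return ((BoxedCounter)boxed).Value; // reads the boxed instance: 42
    }

    public static void Main()
    {
        Console.WriteLine(MutateInsideBox()); // prints 42
    }
}
```

This is the behaviour the unbox instruction describes: an address into the existing box rather than a fresh value.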

Scratch all that. Charlieface has found how to force the use of unbox:

public struct MyStruct
{
    public int A;

    public int Test()
    {
        object st2 = new MyStruct();
        int a = ((MyStruct)st2).A;
        return a;
    }
}

The method Test() is compiled to:

// Methods
.method public hidebysig 
    instance int32 Test () cil managed 
{
    // Method begins at RVA 0x2050
    // Code size 25 (0x19)
    .maxstack 1
    .locals init (
        [0] valuetype MyStruct
    )

    IL_0000: ldloca.s 0
    IL_0002: initobj MyStruct
    IL_0008: ldloc.0
    IL_0009: box MyStruct
    IL_000e: unbox MyStruct
    IL_0013: ldfld int32 MyStruct::A
    IL_0018: ret
} // end of method MyStruct::Test

From some tests I'd say that the intelligent opcode in the family is ldfld: it works with value types (Test1), references to value types (Test2), and directly unboxed value types (Test3).

public struct MyStruct
{
    public int A;

    public int Test1(MyStruct st)
    {
        int a = st.A;
        return a;
    }

    public int Test2(ref MyStruct st)
    {
        int a = st.A;
        return a;
    }

    public int Test3(MyStruct st)
    {
        object st2 = st;
        int a = ((MyStruct)st2).A;
        return a;
    }
}

is compiled to:

.method public hidebysig 
    instance int32 Test1 (
        valuetype MyStruct st
    ) cil managed 
{
    // Method begins at RVA 0x2050
    // Code size 7 (0x7)
    .maxstack 8

    IL_0000: ldarg.1
    IL_0001: ldfld int32 MyStruct::A
    IL_0006: ret
} // end of method MyStruct::Test1

.method public hidebysig 
    instance int32 Test2 (
        valuetype MyStruct& st
    ) cil managed 
{
    // Method begins at RVA 0x2050
    // Code size 7 (0x7)
    .maxstack 8

    IL_0000: ldarg.1
    IL_0001: ldfld int32 MyStruct::A
    IL_0006: ret
} // end of method MyStruct::Test2

.method public hidebysig 
    instance int32 Test3 (
        valuetype MyStruct st
    ) cil managed 
{
    // Method begins at RVA 0x2058
    // Code size 17 (0x11)
    .maxstack 8

    IL_0000: ldarg.1
    IL_0001: box MyStruct
    IL_0006: unbox MyStruct
    IL_000b: ldfld int32 MyStruct::A
    IL_0010: ret
} // end of method MyStruct::Test3

The ldfld opcode is always the same, and it works with two (three) different kinds of operand: a MyStruct value and a reference to a MyStruct (and a reference to a boxed MyStruct).

@xanatos is correct. You are mixing up unbox and unbox.any.

From the ECMA-335 spec (the spec for .NET and CIL):

Part III.4.33:

Unlike the unbox instruction, for value types, unbox.any leaves a value, not an address of a value, on the stack.

Incidentally, there are instructions which take either a ref valuetype or an actual value, for example ldfld.


Furthermore, you say:

So it looks like when box is called, the "argument" needed is the stack pointer which points to the first field of v.

This is not true: ldloc.0 loads the actual value onto the stack. In all three cases box pops a value, not an address, from the evaluation stack and copies it into a newly allocated object, so it never has to choose between reading a pointer and dereferencing one.
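A small sketch can make the value-copy semantics of box visible from C#. The BoxCopyDemo class and its Run method are hypothetical names for this illustration. Each assignment of v to an object variable is assumed to compile to ldloc followed by box, as in the IL above.

```csharp
using System;

public static class BoxCopyDemo
{
    public static string Run()
    {
        int v = 5;
        object o = v;    // box: the value 5 is copied into a new heap object
        v = 123;         // only the local changes; the box is untouched

        object o2 = v;   // boxing again allocates a second, separate object
        bool sameObject = ReferenceEquals(o, o2);

        return o + ", " + o2 + ", " + sameObject;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "5, 123, False"
    }
}
```

If box had captured the address of v, the first box would show 123 after the assignment; it shows 5 because the value itself was copied at the time of boxing.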
