Strange assembly from array 0-initialization

Inspired by the question "Difference in initializing and zeroing an array in C/C++?", I decided to actually examine the assembly of an optimized release build, in my case for Windows Mobile Professional (ARM processor, built with the Microsoft Optimizing Compiler). What I found was somewhat surprising, and I wonder if someone can shed some light on my questions about it.

These two examples are examined:

byte a[10] = { 0 };

byte b[10];
memset(b, 0, sizeof(b));

They are used in the same function, so the stack looks like this:

[ ] // padding byte to reach DWORD boundary
[ ] // padding byte to reach DWORD boundary
[ ] // b[9] (last element of b)
[ ]
[ ]
[ ]
[ ]
[ ]
[ ]
[ ]
[ ]
[ ] // b[0] = sp + 12 (stack pointer + 12 bytes)
[ ] // padding byte to reach DWORD boundary
[ ] // padding byte to reach DWORD boundary
[ ] // a[9] (last element of a)
[ ]
[ ]
[ ]
[ ]
[ ]
[ ]
[ ]
[ ]
[ ] // a[0] = sp (stack pointer, at bottom)

The generated assembly, with my comments:

; byte a[10] = { 0 };

01: mov   r3, #0        // r3 = 0
02: mov   r2, #9        // 3rd arg to memset: 9 bytes, note that sizeof(a) = 10
03: mov   r1, #0        // 2nd arg to memset: 0-initializer
04: add   r0, sp, #1    // 1st arg to memset: &a[1] = a + 1, since only 9 bytes will be set
05: strb  r3, [sp]      // a[0] = r3 = 0, sets the first element of a
06: bl    memset        // continue in memset

; byte b[10];
; memset(b, 0, sizeof(b));

07: mov   r2, #0xA      // 3rd arg to memset: 10 bytes, sizeof(b)
08: mov   r1, #0        // 2nd arg to memset: 0-initializer
09: add   r0, sp, #0xC  // 1st arg to memset: sp + 12 bytes (the 10 elements
                        // of a + 2 padding bytes for alignment) = &b[0]
10: bl    memset        // continue in memset

Now, there are two things that confuse me:

  1. What's the point of lines 02 and 05? Why not just give &a[0] and 10 bytes to memset?
  2. Why aren't the padding bytes of a 0-initialized? Is that only for padding in structs?

Edit: I was too curious not to test the struct case:

struct Padded
{
    DWORD x;
    byte y;
};

The assembly for 0-initializing it:

; Padded p1 = { 0 };

01: mov   r3, #0
02: str   r3, [sp]
03: mov   r3, #0
04: str   r3, [sp, #4]

; Padded p2;
; memset(&p2, 0, sizeof(p2));

05: mov   r3, #0
06: str   r3, [sp]
07: andcs r4, r0, #0xFF
08: str   r3, [sp, #4]

Here we see in line 04 that the padding is indeed zeroed, since str (as opposed to strb) is used. Right?
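To make the struct observation concrete, here is a minimal C sketch of what lines 01-04 appear to do: two 32-bit stores of zero, the second of which covers y together with the three bytes that follow it. It assumes DWORD is a 32-bit unsigned integer, that byte is unsigned char, and that struct Padded occupies 8 bytes with y at offset 4, as on common 32-bit ABIs. The helper names zero_padded_with_word_stores and nonzero_bytes are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t DWORD;      /* assumption: 32-bit unsigned, as on Windows */
typedef unsigned char byte;  /* assumption: matches the question's `byte` */

struct Padded {
    DWORD x;
    byte  y;
};

/* Mimics lines 01-04 of the struct listing: two 32-bit stores of zero. */
void zero_padded_with_word_stores(struct Padded *p)
{
    const uint32_t zero = 0;
    unsigned char *raw = (unsigned char *)p;

    /* The sketch only makes sense for the 8-byte layout assumed above. */
    assert(sizeof(struct Padded) == 8);
    assert(offsetof(struct Padded, y) == 4);

    memcpy(raw + 0, &zero, sizeof zero);  /* str r3, [sp]     : covers x             */
    memcpy(raw + 4, &zero, sizeof zero);  /* str r3, [sp, #4] : covers y and padding */
}

/* Counts the nonzero bytes in the object representation of *p. */
size_t nonzero_bytes(const struct Padded *p)
{
    const unsigned char *raw = (const unsigned char *)p;
    size_t count = 0;
    for (size_t i = 0; i < sizeof *p; ++i)
        if (raw[i] != 0)
            ++count;
    return count;
}
```

Filling a struct Padded with 0xFF, calling zero_padded_with_word_stores on it, and then checking nonzero_bytes shows whether the trailing bytes were cleared along with the members.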

The reason for lines 02 and 05 is that you specified a 0 in the array initializer. The compiler initializes all the explicitly specified constants, then zero-fills the rest using memset. If you were to put two zeros in your initializer, you'd see it strw (word instead of byte), then memset 8 bytes.

As for the padding, it's only there to align memory accesses. The data shouldn't be used under normal circumstances, so memsetting it is wasteful.

Edit: For the record, I may be wrong about the strw assumption above. 99% of my ARM experience is reversing code generated by GCC/LLVM on the iPhone, so my assumption may not carry over to MSVC.
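As a rough illustration of the lowering described in this answer, the sketch below writes out by hand what the ARM listing appears to do for byte a[10] = { 0 };: store the explicitly initialized element, then memset the remainder. The function names init_like_compiler and all_zero are made up for this example, and byte is assumed to be unsigned char. It mirrors the listing in the question only; it is not a statement of what any compiler is guaranteed to emit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char byte;  /* assumption: matches the question's `byte` */

/* Hand-written equivalent of the listing for `byte a[10] = { 0 };`:
 * one byte store for the explicit initializer, memset for the rest. */
void init_like_compiler(byte *a, size_t n)
{
    assert(a != NULL);
    if (n == 0)
        return;
    a[0] = 0;                 /* line 05: strb r3, [sp]                */
    memset(a + 1, 0, n - 1);  /* lines 02-04 and 06: memset(a + 1, 0, 9) */
}

/* Returns 1 if every byte in [p, p + n) is zero, otherwise 0. */
int all_zero(const byte *p, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

The end state is the same as a plain memset(a, 0, 10); the only difference is how the ten bytes are split between the direct store and the library call.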

Both bits of code are bug-free. The two lines mentioned aren't smart, but you're just proving that this compiler is emitting suboptimal code.

Padding bytes are usually only initialized if that simplifies the assembly or speeds up the code. E.g., if you have padding between two zero-filled members, it's often easier to zero-fill the padding as well. Also, if you have padding at the end and your memset() is optimized for multi-byte writes, it may be faster to overwrite that padding too.
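A small sketch of the distinction this answer draws, under the assumption of a hypothetical struct Gapped whose two members are separated by padding on most ABIs. Assigning the members one by one leaves the padding bytes with unspecified values, whereas a memset over the whole object clears them as a side effect. All names here are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct; most ABIs place 3 padding bytes after `tag`. */
struct Gapped {
    unsigned char tag;
    uint32_t      value;
};

/* Zeroes only the named members; padding bytes are left unspecified. */
void zero_members(struct Gapped *g)
{
    assert(g != NULL);
    g->tag = 0;
    g->value = 0;
}

/* Zeroes the whole object representation, padding included. */
void zero_whole(struct Gapped *g)
{
    assert(g != NULL);
    memset(g, 0, sizeof *g);
}

/* Returns 1 if every byte of *g, padding included, is zero. */
int is_all_zero_bytes(const struct Gapped *g)
{
    const unsigned char *raw = (const unsigned char *)g;
    for (size_t i = 0; i < sizeof *g; ++i)
        if (raw[i] != 0)
            return 0;
    return 1;
}
```

Only zero_whole makes is_all_zero_bytes a reliable check; after zero_members the members are zero, but nothing can be assumed about the bytes in between.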

Some quick testing indicates that Microsoft's x86 compiler generates different assembly when the initializer list is empty than when it contains a zero. Maybe their ARM compiler does too. What happens if you do this?

byte a[10] = { };

Here's the assembly listing I got (with options /EHsc /FAs /O2 on Visual Studio 2008). Note that including a zero in the initializer list causes the compiler to use unaligned memory accesses to initialize the array, while the empty initializer list version and the memset() version both use aligned memory accesses:

; unsigned char a[10] = { };

xor eax, eax
mov DWORD PTR _a$[esp+40], eax
mov DWORD PTR _a$[esp+44], eax
mov WORD PTR _a$[esp+48], ax

; unsigned char b[10] = { 0 };

mov BYTE PTR _b$[esp+40], al
mov DWORD PTR _b$[esp+41], eax
mov DWORD PTR _b$[esp+45], eax
mov BYTE PTR _b$[esp+49], al

; unsigned char c[10];
; memset(c, 0, sizeof(c));

mov DWORD PTR _c$[esp+40], eax
mov DWORD PTR _c$[esp+44], eax
mov WORD PTR _c$[esp+48], ax
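The two store patterns in this listing can be replayed in C to check that they cover the same ten bytes. The sketch below is only a model: store_zero is a made-up helper that stands in for a single mov of the given width, and the offsets are relative to the start of the array, which is assumed to sit on a 4-byte boundary as in the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { ARRAY_SIZE = 10 };

/* Writes `width` zero bytes at buf + offset, standing in for one mov. */
static void store_zero(unsigned char *buf, size_t offset, size_t width)
{
    static const unsigned char zeros[4] = { 0, 0, 0, 0 };
    assert(width <= sizeof zeros);
    assert(offset + width <= (size_t)ARRAY_SIZE);
    memcpy(buf + offset, zeros, width);
}

/* Pattern of the `{ }` and memset() listings: DWORD, DWORD, WORD. */
void zero_aligned_pattern(unsigned char *buf)
{
    store_zero(buf, 0, 4);
    store_zero(buf, 4, 4);
    store_zero(buf, 8, 2);
}

/* Pattern of the `{ 0 }` listing: BYTE, DWORD, DWORD, BYTE. */
void zero_unaligned_pattern(unsigned char *buf)
{
    store_zero(buf, 0, 1);  /* the explicit initializer           */
    store_zero(buf, 1, 4);  /* DWORD store at an odd offset       */
    store_zero(buf, 5, 4);  /* DWORD store at an odd offset       */
    store_zero(buf, 9, 1);
}
```

Both functions leave the buffer fully zeroed; the second simply needs one more store, two of which land on odd offsets.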
