Why does the 64-bit VC++ compiler add nop instructions after function calls?
I've compiled the following using the Visual Studio C++ 2008 SP1 x64 C++ compiler:
I'm curious: why did the compiler add those nop instructions after those calls?
PS1. I would understand if the 2nd and 3rd nops were there to align the code on a 4-byte boundary, but the 1st nop breaks that assumption.
PS2. The C++ code that was compiled had no loops or special optimization stuff in it:
CTestDlg::CTestDlg(CWnd* pParent /*=NULL*/)
    : CDialog(CTestDlg::IDD, pParent)
{
    m_hIcon = AfxGetApp()->LoadIcon(IDR_MAINFRAME);

    //This makes no sense. I used it to set a debugger breakpoint
    ::GdiFlush();
    srand(::GetTickCount());
}
PS3. Additional info: First off, thank you everyone for your input.
Here are some additional observations:
1. My first guess was that incremental linking could have had something to do with it. But the Release build settings in Visual Studio for the project have incremental linking off.
2. This seems to affect x64 builds only. The same code built as x86 (or Win32) does not have those nops, even though the instructions used are very similar:
3. The x64 code produced by VS 2013 looks somewhat different, but it still adds those nops after some calls:
4. Dynamic vs. static linking to MFC made no difference to the presence of those nops. This one is built with dynamic linking to the MFC DLLs with VS 2013:
5. nops can appear after near and far calls as well, and they have nothing to do with alignment. Here's a part of the code that I got from IDA if I step a little bit further on:
   As you see, the nop is inserted after a far call that happens to "align" the next lea instruction on an address ending in B! That makes no sense if those were added for alignment only.
6. Since near relative calls (i.e. those that start with E8) are somewhat faster than far calls (the ones that start with FF 15 in this case), the linker may try to go with near calls first, and since those are one byte shorter than far calls, if it succeeds, it may pad the remaining space with nops at the end. But then example (5) above kind of defeats this hypothesis.
So I still don't have a clear answer to this.
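The byte counts behind the padding hypothesis above can be checked with a small sketch. The helper names `call_length` and `padding_after_rewrite` are made up for illustration and are not part of any toolchain. The only facts assumed are the standard encodings: `E8` is followed by a 4-byte relative displacement (5 bytes in total), and `FF 15` is followed by a 4-byte displacement to a memory slot holding the target (6 bytes in total).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns the encoded length in bytes of the two call
// forms discussed in the question, or 0 for anything else.
std::size_t call_length(const std::uint8_t* code, std::size_t size) {
    // E8 rel32: one opcode byte plus a 4-byte relative displacement.
    if (size >= 5 && code[0] == 0xE8) return 5;
    // FF 15 disp32: opcode and ModRM bytes plus a 4-byte displacement to the
    // memory slot that holds the target address.
    if (size >= 6 && code[0] == 0xFF && code[1] == 0x15) return 6;
    return 0;
}

// Hypothetical helper: number of one-byte nops (0x90) needed when a slot of
// `reserved` bytes ends up holding an instruction of `actual` bytes.
std::size_t padding_after_rewrite(std::size_t reserved, std::size_t actual) {
    assert(actual <= reserved);
    return reserved - actual;
}
```

Under this hypothesis, a slot reserved for the 6-byte form and rewritten with the 5-byte form would leave exactly one byte to pad, which matches a single nop but would not explain a nop after a call that kept the 6-byte form.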
This is purely a guess, but it might be some kind of SEH optimization. I say optimization because SEH seems to work fine without the NOPs too. The NOP might help speed up unwinding.
In the following example (live demo with VC2017), there is a NOP inserted after a call to basic_string::assign in test1 but not in test2 (identical but declared as non-throwing¹).
#include <stdio.h>
#include <string>

int test1() {
    std::string s = "a"; // NOP inserted here
    s += getchar();
    return (int)s.length();
}

int test2() throw() {
    std::string s = "a";
    s += getchar();
    return (int)s.length();
}

int main()
{
    return test1() + test2();
}
Assembly:
test1:
    . . .
    call std::basic_string<char,std::char_traits<char>,std::allocator<char> >::assign
    npad 1 ; nop
    call getchar
    . . .
test2:
    . . .
    call std::basic_string<char,std::char_traits<char>,std::allocator<char> >::assign
    call getchar
Note that MSVS compiles by default with the /EHsc flag (synchronous exception handling). Without that flag the NOPs disappear, and with /EHa (synchronous and asynchronous exception handling), throw() no longer makes a difference because SEH is always on.
¹ For some reason only throw() seems to reduce the code size; using noexcept makes the generated code even bigger and summons even more NOPs. MSVC...
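To compare the noexcept case mentioned in the footnote, a third variant can be compiled next to test1 and test2. This is only a sketch: the name `test3` is hypothetical and not part of the original example, and what assembly it produces depends on the compiler version and flags.

```cpp
#include <stdio.h>
#include <string>

// Hypothetical third variant: same body as test1/test2, but declared with
// noexcept instead of throw().
int test3() noexcept {
    std::string s = "a";
    s += getchar();
    return (int)s.length();
}

// The noexcept operator does not evaluate the call; it only inspects the
// function's exception specification.
static_assert(noexcept(test3()), "test3 is declared non-throwing");
```

Building this with the same flags as above and diffing the listing against test2 is one way to check the footnote's claim on a given toolchain.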
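This is special padding that lets the exception handler/unwind function correctly detect whether it is in the prologue, epilogue, or body of the function.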
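One way to picture this is a toy lookup from a return address to the part of the function it falls in. The sketch below is only an illustration of that idea, not the actual MSVC unwinder: the `Region` type, the helper names, and all offsets are invented, and real unwind data is much richer.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy model: a function is split into half-open offset ranges [begin, end).
struct Region {
    std::uint32_t begin;
    std::uint32_t end;
    std::string name;
};

// Returns the name of the region containing `offset`, or "unknown".
std::string region_of(const std::vector<Region>& table, std::uint32_t offset) {
    for (const Region& r : table) {
        assert(r.begin < r.end);
        if (offset >= r.begin && offset < r.end) return r.name;
    }
    return "unknown";
}

// The return address of a call is the address of the instruction after it.
std::uint32_t return_offset(std::uint32_t call_offset, std::uint32_t call_len) {
    return call_offset + call_len;
}
```

With made-up offsets, a 5-byte call at 0x20 returns to 0x25. If the body is [0x10, 0x25) and the epilogue starts at 0x25, the lookup reports "epilogue" even though the call belongs to the body. Placing a one-byte nop after the call extends the body to [0x10, 0x26), and the same return offset now resolves to "body".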
This is due to a calling convention in x64 which requires the stack to be 16-byte aligned before any call instruction. This is not (to my knowledge) a hardware requirement but a software one. It provides a way to be sure that when entering a function (that is, after a call instruction), the value of the stack pointer is always 8 modulo 16, thus permitting simple data alignment and stores/reads from aligned locations on the stack.
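The "8 modulo 16" arithmetic can be written out as a short sketch. The function names and the sample stack pointer value are hypothetical; the only facts assumed are that a call pushes an 8-byte return address and that the stack must be 16-byte aligned before the next call.

```cpp
#include <cstdint>

// An x64 call pushes an 8-byte return address, so RSP drops by 8.
constexpr std::uint64_t rsp_after_call(std::uint64_t rsp_before_call) {
    return rsp_before_call - 8;
}

// Hypothetical helper: smallest frame size >= `locals` that, added to the
// 8 bytes of the return address, makes RSP a multiple of 16 again.
constexpr std::uint64_t aligned_frame_size(std::uint64_t locals) {
    return ((locals + 8 + 15) / 16) * 16 - 8;
}

// Aligned before the call, 8 modulo 16 on entry to the callee.
static_assert(rsp_after_call(0x1000) % 16 == 8, "entry RSP is 8 mod 16");
// 32 bytes of locals need a 40-byte (0x28) frame to restore alignment.
static_assert(aligned_frame_size(32) == 40, "frame keeps RSP 16-byte aligned");
```

Because the entry value is always 8 modulo 16, a function can pick its frame size at compile time without inspecting the stack pointer at run time.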