
How does assembly do parameter passing: by value, reference, pointer for different types/arrays?

In an attempt to look at this, I wrote some simple code that creates variables of different types and passes them into a function by value, by reference, and by pointer:

int i = 1;
char c = 'a';
int* p = &i;
float f = 1.1;
TestClass tc; // has 2 private data members: int i = 1 and int j = 2

The function bodies were left blank because I am only looking at how the parameters are passed in.

passByValue(i, c, p, f, tc); 
passByReference(i, c, p, f, tc); 
passByPointer(&i, &c, &p, &f, &tc);

I also wanted to see how this differs for an array, and how the parameters are then accessed.

int numbers[] = {1, 2, 3};
passArray(numbers); 
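
For reference, here is a self-contained sketch of the program described above. The parameter types are inferred from the mangled names in the assembly listings (for example, `_Z9passArrayPi` demangles to `passArray(int*)`). The `TestClass` definition follows the comment in the snippet, and the `runCalls` wrapper is a hypothetical stand-in for wherever the calls actually live, so treat it as an assumption rather than the original source.

```cpp
// Sketch of the test program; signatures inferred from the mangled names.
class TestClass {
    int i = 1;
    int j = 2;
};

// Bodies are intentionally empty: only the call sites matter here.
void passByValue(int, char, int*, float, TestClass) {}
void passByReference(int&, char&, int*&, float&, TestClass&) {}
void passByPointer(int*, char*, int**, float*, TestClass*) {}
void passArray(int*) {}  // an array argument decays to a pointer to its first element

// Hypothetical driver; in the question these calls sit in main().
int runCalls() {
    int i = 1;
    char c = 'a';
    int* p = &i;
    float f = 1.1f;
    TestClass tc;
    int numbers[] = {1, 2, 3};

    passByValue(i, c, p, f, tc);
    passByReference(i, c, p, f, tc);
    passByPointer(&i, &c, &p, &f, &tc);
    passArray(numbers);
    return 0;
}
```

Compiling something like this for 32-bit x86 without optimization should give listings of the same general shape as the ones below, though exact offsets and register choices depend on the compiler and its version.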

Assembly:

passByValue(i, c, p, f, tc)

    mov EAX, DWORD PTR [EBP - 16]
    mov DL, BYTE PTR [EBP - 17]
    mov ECX, DWORD PTR [EBP - 24]
    movss   XMM0, DWORD PTR [EBP - 28]
    mov ESI, DWORD PTR [EBP - 40]
    mov DWORD PTR [EBP - 48], ESI
    mov ESI, DWORD PTR [EBP - 36]
    mov DWORD PTR [EBP - 44], ESI
    lea ESI, DWORD PTR [EBP - 48]
    mov DWORD PTR [ESP], EAX
    movsx   EAX, DL
    mov DWORD PTR [ESP + 4], EAX
    mov DWORD PTR [ESP + 8], ECX
    movss   DWORD PTR [ESP + 12], XMM0
    mov EAX, DWORD PTR [ESI]
    mov DWORD PTR [ESP + 16], EAX
    mov EAX, DWORD PTR [ESI + 4]
    mov DWORD PTR [ESP + 20], EAX
    call    _Z11passByValueicPif9TestClass


passByReference(i, c, p, f, tc)

    lea EAX, DWORD PTR [EBP - 16]
    lea ECX, DWORD PTR [EBP - 17]
    lea ESI, DWORD PTR [EBP - 24]
    lea EDI, DWORD PTR [EBP - 28]
    lea EBX, DWORD PTR [EBP - 40]
    mov DWORD PTR [ESP], EAX
    mov DWORD PTR [ESP + 4], ECX
    mov DWORD PTR [ESP + 8], ESI
    mov DWORD PTR [ESP + 12], EDI
    mov DWORD PTR [ESP + 16], EBX
    call    _Z15passByReferenceRiRcRPiRfR9TestClass

passByPointer(&i, &c, &p, &f, &tc)

    lea EAX, DWORD PTR [EBP - 16]
    lea ECX, DWORD PTR [EBP - 17]
    lea ESI, DWORD PTR [EBP - 24]
    lea EDI, DWORD PTR [EBP - 28]
    lea EBX, DWORD PTR [EBP - 40]
    mov DWORD PTR [ESP], EAX
    mov DWORD PTR [ESP + 4], ECX
    mov DWORD PTR [ESP + 8], ESI
    mov DWORD PTR [ESP + 12], EDI
    mov DWORD PTR [ESP + 16], EBX
    call    _Z13passByPointerPiPcPS_PfP9TestClass

passArray(numbers)

    mov EAX, .L_ZZ4mainE7numbers
    mov DWORD PTR [EBP - 60], EAX
    mov EAX, .L_ZZ4mainE7numbers+4
    mov DWORD PTR [EBP - 56], EAX
    mov EAX, .L_ZZ4mainE7numbers+8
    mov DWORD PTR [EBP - 52], EAX
    lea EAX, DWORD PTR [EBP - 60]
    mov DWORD PTR [ESP], EAX
    call    _Z9passArrayPi

    // parameter access
    push    EAX
    mov EAX, DWORD PTR [ESP + 8]
    mov DWORD PTR [ESP], EAX
    pop EAX

I'm assuming I'm looking at the right assembly for the parameter passing, because there is a call at the end of each block.

But due to my very limited knowledge of assembly, I can't tell what's going on here. I learned about the C calling convention, so I'm assuming something is going on that has to do with preserving the caller-saved registers and then pushing the parameters onto the stack. Because of this, I'm expecting to see things loaded into registers and "push" everywhere, but I have no idea what's going on with the movs and leas. Also, I don't know what DWORD PTR is.

I've only learned about these registers: eax, ebx, ecx, edx, esi, edi, esp and ebp, so seeing something like XMM0 or DL confuses me as well. I guess it makes sense to see lea when passing by reference/pointer, because those use memory addresses, but I can't actually tell what is going on. When passing by value there seem to be many instructions, so this could have to do with copying the values into registers. I have no idea how arrays are passed and accessed as parameters.

If someone could explain the general idea of what's going on in each block of assembly, I would highly appreciate it.

Using CPU registers for passing arguments is faster than using memory, i.e. the stack. However, there is a limited number of registers in a CPU (especially in x86-compatible CPUs), so when a function has many parameters the stack is used instead of registers. In your case the code is 32-bit x86, where the default C calling convention passes every argument on the stack, so all 5 function arguments go there rather than into registers.

In principle, compilers can use push instructions to push arguments onto the stack before the actual call to the function, but many compilers (including GNU C++) use mov to store the arguments on the stack instead. This is convenient because it does not change the ESP register (top of the stack) in the part of the code that calls the function.

In the case of passByValue(i, c, p, f, tc), the values of the arguments are placed on the stack. You can see many mov instructions from a memory location to a register, and from the register to the appropriate location on the stack. The reason is that x86 assembly forbids moving directly from one memory location to another (the exception is movs, which moves values from one array, or string if you wish, to another).

In the case of passByReference(i, c, p, f, tc), you can see five lea instructions that copy the addresses of the arguments into CPU registers; the values of those registers are then moved onto the stack.

The case of passByPointer(&i, &c, &p, &f, &tc) is the same as passByReference(i, c, p, f, tc): the generated instructions are identical. Internally, at the assembly level, pass by reference uses pointers, while at the higher C++ level the programmer does not need to explicitly apply the & and * operators to references.
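
The point about references being pointers under the hood can be checked from C++ itself. This is a minimal sketch with hypothetical helper names (`viaReference`, `viaPointer`, `sameAddress`), not code from the question: both callees end up holding the address of the caller's variable.

```cpp
// Returns the address of the object the reference is bound to.
int* viaReference(int& r) { return &r; }

// Returns the pointer it was given, unchanged.
int* viaPointer(int* p) { return p; }

// Both calls hand the callee the address of the same local variable,
// which matches the identical lea/mov sequences in the two listings.
bool sameAddress() {
    int i = 1;
    return viaReference(i) == viaPointer(&i);
}
```

The only difference is at the source level: the reference version hides the `&` at the call site and the `*` inside the callee.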

After the parameters have been moved to the stack, call is issued. It pushes the return address (the value of the instruction pointer EIP for the next instruction) onto the stack before transferring execution to the subroutine. The stack slots chosen for the parameters account for this: once call has pushed the return address, the callee finds its first argument directly above it.

There's too much in your example above to dissect all of it. Instead I'll just go over passByValue, since that seems to be the most interesting. Afterwards, you should be able to figure out the rest.

First, some important points to keep in mind while studying the disassembly, so you don't get completely lost in the sea of code:

  • There are no instructions to directly copy data from one memory location to another memory location, e.g. mov [ebp - 44], [ebp - 36] is not a legal instruction. The data must first be loaded into an intermediate register and then stored from there into the memory destination.
  • The bracket operator [] in conjunction with a mov means accessing data at a computed memory address. This is analogous to dereferencing a pointer in C/C++.
  • When you see lea x, [y], that usually means: compute the address of y and save it into x. This is analogous to taking the address of a variable in C/C++.
  • Data and objects that need to be copied but are too big to fit into a register are copied onto the stack piecemeal. In other words, the code copies one native machine word at a time until all the bytes representing the object/data have been copied. Usually that means either 4 or 8 bytes on modern processors.
  • The compiler will typically interleave instructions to keep the processor pipeline busy and to minimize stalls. That is good for code efficiency but bad if you're trying to understand the disassembly.
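
The C/C++ analogies in the points above can be written out as a rough sketch. The helper names (`addressOf`, `loadFrom`, `copyThroughTemp`, `demo`) are made up for illustration, and the assembly in the comments is only the approximate instruction a compiler might emit, not a guarantee.

```cpp
// Like `lea eax, [x]`: compute an address without reading memory.
int* addressOf(int& x) { return &x; }

// Like `mov eax, [p]`: read the value stored at an address.
int loadFrom(const int* p) { return *p; }

// Like `mov eax, [src]` followed by `mov [dst], eax`:
// a memory-to-memory copy has to go through a temporary (a register).
void copyThroughTemp(const int* src, int* dst) {
    int tmp = *src;  // load into the temporary
    *dst = tmp;      // store from the temporary
}

// Small demonstration: copy a into b via the helpers and read b back.
int demo() {
    int a = 7;
    int b = 0;
    copyThroughTemp(addressOf(a), &b);
    return loadFrom(&b);
}
```

Here `demo()` returns 7, the value that travelled from `a` to `b` through the temporary.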

With the above in mind, here's the call to the passByValue function, rearranged a bit to make it more understandable:

.define arg1  esp
.define arg2  esp + 4
.define arg3  esp + 8
.define arg4  esp + 12
.define arg5.1  esp + 16
.define arg5.2  esp + 20


; copy first parameter
mov EAX, [EBP - 16]
mov [arg1], EAX

; copy second parameter
mov DL, [EBP - 17]
movsx   EAX, DL
mov [arg2], EAX

; copy third
mov ECX, [EBP - 24]
mov [arg3], ECX

; copy fourth
movss   XMM0, DWORD PTR [EBP - 28]
movss   DWORD PTR [arg4], XMM0

; intermediate copy of TestClass?
mov ESI, [EBP - 40]
mov [EBP - 48], ESI
mov ESI, [EBP - 36]
mov [EBP - 44], ESI

;copy fifth
lea ESI, [EBP - 48]
mov EAX, [ESI]
mov [arg5.1], EAX
mov EAX, [ESI + 4]
mov [arg5.2], EAX
call    passByValue(int, char, int*, float, TestClass)

In the code above, the function name is demangled and the instruction interleaving is undone to make it clear what is actually happening, but some parts still need explaining. First, the char is signed and is a single byte in size. The instructions here:

; copy second parameter
mov DL, [EBP - 17]
movsx   EAX, DL
mov [arg2], EAX

read a byte from [ebp - 17] (somewhere on the stack) and store it into the lowest byte of edx, which is dl. That byte is then copied into eax using a sign-extending move (movsx). The full 32-bit value in eax is finally copied onto the stack where passByValue can access it. See register layout if you need more detail.
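
To see what sign extension does to the byte, here is a small C++ sketch. The function names are hypothetical, and the fixed-width types stand in for the 8-bit and 32-bit registers; zero extension is shown next to it only for contrast.

```cpp
#include <cstdint>

// Sign extension (what movsx does): the top bit of the byte is
// replicated into the upper 24 bits of the 32-bit result.
std::int32_t signExtend(std::int8_t byte) {
    return static_cast<std::int32_t>(byte);
}

// Zero extension (what movzx would do): the upper 24 bits become zero.
std::uint32_t zeroExtend(std::uint8_t byte) {
    return static_cast<std::uint32_t>(byte);
}
```

For 'a' (0x61) both give 97, so the choice does not matter for this particular call. For a byte with the top bit set, such as 0xFF, sign extension yields -1 while zero extension yields 255.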

The fourth argument:

movss   XMM0, DWORD PTR [EBP - 28]
movss   DWORD PTR [arg4], XMM0

uses the SSE movss instruction to copy the floating-point value from the stack into the xmm0 register. In brief, SSE instructions let you perform the same operation on multiple pieces of data simultaneously, but here the compiler is using the register only as intermediate storage for copying a floating-point value on the stack.

The last argument:

; copy intermediate copy of TestClass?
mov ESI, [EBP - 40]
mov [EBP - 48], ESI
mov ESI, [EBP - 36]
mov [EBP - 44], ESI

corresponds to the TestClass. Apparently this class is 8 bytes in size, located on the stack from [ebp - 40] to [ebp - 33]. It is copied 4 bytes at a time because the object cannot fit into a single register.
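
The word-at-a-time copy can be mimicked in C++ as a rough sketch. `TwoInts` is a hypothetical stand-in for TestClass with public members so the result can be inspected, and the sketch assumes a 4-byte int, as on the 32-bit target in the question.

```cpp
#include <cstdint>
#include <cstring>

// Stand-in for the question's TestClass (assumed layout: two ints).
struct TwoInts {
    int i = 1;
    int j = 2;
};
static_assert(sizeof(TwoInts) == 8, "sketch assumes 4-byte int");

// Copies an 8-byte object as two 32-bit words, like the mov pairs above.
TwoInts copyWordByWord(const TwoInts& src) {
    const unsigned char* from = reinterpret_cast<const unsigned char*>(&src);
    TwoInts dst;
    unsigned char* to = reinterpret_cast<unsigned char*>(&dst);
    for (std::size_t offset = 0; offset < sizeof(TwoInts); offset += 4) {
        std::uint32_t word;                    // plays the role of ESI
        std::memcpy(&word, from + offset, 4);  // mov ESI, [src + offset]
        std::memcpy(to + offset, &word, 4);    // mov [dst + offset], ESI
    }
    return dst;
}
```

A plain `TwoInts dst = src;` does the same job in real code; the loop only spells out the two load/store pairs seen in the listing.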

Here's approximately what the stack looks like just before call passByValue:

lower addr    esp       =>  int:arg1            <--.
              esp + 4       char:arg2              |
              esp + 8       int*:arg3              |    copies passed
              esp + 12      float:arg4             |    to 'passByValue'
              esp + 16      TestClass:arg5.1       |
              esp + 20      TestClass:arg5.2    <--.
              ...
              ...
              ebp - 48      TestClass:arg5.1    <--   intermediate copy of 
              ebp - 44      TestClass:arg5.2    <--   TestClass?
              ebp - 40      original TestClass:arg5.1
              ebp - 36      original TestClass:arg5.2
              ...
              ebp - 28      original arg4     <--.
              ebp - 24      original arg3        |  original (local?) variables
              ebp - 17      original arg2        |  from calling function
              ebp - 16      original arg1     <--.
              ...
higher addr   ebp           prev frame

What you're looking for are ABI calling conventions. Different platforms have different conventions, e.g. Windows on x86-64 has different conventions than Unix/Linux on x86-64.

http://www.agner.org/optimize/ has a calling-conventions doc detailing the various ones for x86 / amd64.

You can write code in ASM that does whatever you want, but if you want to call other functions, and be called by them, then pass parameters / return values according to the ABI.

It could be useful to make an internal-use-only helper function that doesn't use the standard ABI, but instead uses values in whichever registers the calling function has already allocated them in. This is especially likely if you're writing the main program in something other than ASM, with just a small part in ASM. Then the asm part only needs to care about portability to systems with different ABIs where it is called from the main program, not for its own internals.
