简体   繁体   中英

C Code represented as Assembler Code - How to interpret?

I got this short C Code.

#include <stdint.h>
uint64_t multiply(uint32_t x, uint32_t y) {

uint64_t res;
res = x*y;
return res;
}

int main() {

uint32_t a = 3, b = 5, z;
z = multiply(a,b);
return 0;
}

There is also an Assembler Code for the given C code above. I don't understand everything of that assembler code. I commented each line and you will find my question in the comments for each line.

The Assembler Code is:

.text
multiply:
     pushl  %ebp  // stores the stack frame of the calling function on the stack
     movl   %esp, %ebp // takes the current stack pointer and uses it as the frame for the called function
     subl   $16, %esp // it leaves room on the stack, but why 16Bytes. sizeof(res) = 8Bytes
     movl   8(%ebp), %eax // I don't know quite what "8(%ebp) mean? It has to do something with res, because
     imull  12(%ebp), %eax // here is the multiplication done. And again "12(%ebp).
     movl   %eax, -8(%ebp) // Now, we got a negative number in front of. How to interpret this?
     movl   $0, -4(%ebp) // here as well
     movl   -8(%ebp), %eax // and here again.
     movl   -4(%ebp), %edx // also here
     leave
     ret
main:
     pushl  %ebp // stores the stack frame of the calling function on the stack
     movl   %esp, %ebp // // takes the current stack pointer and uses it as the frame for the called function
     andl   $-8, %esp // what happens here and why?
     subl   $24, %esp // here, it leaves room for local variables, but why 24 bytes? a, b, c: the size of each of them is 4 Bytes. So 3*4 = 12
     movl   $3, 20(%esp) // 3 gets pushed on the stack
     movl   $5, 16(%esp) // 5 also get pushed on the stack
     movl   16(%esp), %eax // what does 16(%esp) mean and what happened with z?
     movl   %eax, 4(%esp) // we got the here as well
     movl   20(%esp), %eax // and also here
     movl   %eax, (%esp) // what does happen in this line?
     call   multiply  // thats clear, the function multiply gets called
     movl   %eax, 12(%esp) // it looks like the same as two lines before, except it contains the number 12
     movl   $0, %eax // I suppose, this line is because of "return 0;"
     leave
     ret

Negative references relative to %ebp are for local variables on the stack.

 movl   8(%ebp), %eax // I don't know quite what "8(%ebp) mean? It has to do something with res, because`

%eax = x

 imull  12(%ebp), %eax // here is the multiplication done. And again "12(%ebp).

%eax = %eax * y

 movl   %eax, -8(%ebp) // Now, we got a negative number in front of. How to interpret this?

(u_int32_t)res = %eax // sets low 32 bits of res

 movl   $0, -4(%ebp) // here as well

clears upper 32 bits of res to extend 32-bit multiplication result to uint64_t

 movl   -8(%ebp), %eax // and here again.
 movl   -4(%ebp), %edx // also here

return ret; //64-bit results are returned as a pair of 32-bit registers %edx:%eax

As for the main, see x86 calling convention which may help making sense of what happens.

 andl   $-8, %esp // what happens here and why?

stack boundary is aligned by 8. I believe it's ABI requirement

 subl   $24, %esp // here, it leaves room for local variables, but why 24 bytes? a, b, c: the size of each of them is 4 Bytes. So 3*4 = 12

Multiples of 8 (probably due to alignment requirements)

 movl   $3, 20(%esp) // 3 gets pushed on the stack

a = 3

 movl   $5, 16(%esp) // 5 also get pushed on the stack

b = 5

 movl   16(%esp), %eax // what does 16(%esp) mean and what happened with z?

%eax = b

z is at 12(%esp) and is not used yet.

 movl   %eax, 4(%esp) // we got the here as well

put b on the stack (second argument to multiply())

 movl   20(%esp), %eax // and also here

%eax = a

 movl   %eax, (%esp) // what does happen in this line?

put a on the stack (first argument to multiply())

 call   multiply  // thats clear, the function multiply gets called

multiply returns 64-bit result in %edx:%eax

 movl   %eax, 12(%esp) // it looks like the same as two lines before, except it contains the number 12

z = (uint32_t) multiply()

 movl   $0, %eax // I suppose, this line is because of "return 0;"

yup. return 0;

Arguments are pushed onto the stack when the function is called. Inside the function, the stack pointer at that time is saved as the base pointer. (You got that much already.) The base pointer is used as a fixed location from which to reference arguments (which are above it, hence the positive offsets) and local variables (which are below it, hence the negative offsets).

The advantage of using a base pointer is that it is stable throughout the entire function, even when the stack pointer changes (due to function calls and new scopes).

So 8(%ebp) is one argument, and 12(%ebp) is the other.

The code is likely using more space on the stack than it needs to, because it is using temporary variables that could be optimized out of you had optimization turned on.

You might find this helpful: http://en.wikibooks.org/wiki/X86_Disassembly/Functions_and_Stack_Frames

I started typing this as a comment but it was getting too long to fit.

You can compile your example with -masm=intel so the assembly is more readable. Also, don't confuse the push and pop instructions with mov . push and pop always increments and decrements esp respectively before derefing the address whereas mov does not.

There are two ways to store values onto the stack. You can either push each item onto it one item at a time or you can allocate up-front the space required and then load each value onto the stackslot using mov + relative offset from either esp or ebp .

In your example, gcc chose the second method since that's usually faster because, unlike the first method, you're not constantly incrementing esp before saving the value onto the stack.

To address your other question in comment, x86 instruction set does not have a mov instruction for copying values from memory location a to another memory location b directly. It is not uncommon to see code like:

  mov   eax, [esp+16]
  mov   [esp+4], eax
  mov   eax, [esp+20]
  mov   [esp], eax
  call  multiply(unsigned int, unsigned int)
  mov   [esp+12], eax

Register eax is being used as an intermediate temporary variable to help copy data between the two stack locations. You can mentally translate the above as:

esp[4] = esp[16]; // argument 2
esp[0] = esp[20]; // argument 1
call multiply
esp[12] = eax;    // eax has return value

Here's what the stack approximately looks like right before the call to multiply :

lower addr    esp       =>  uint32_t:a_copy = 3 <--.  arg1 to 'multiply'
              esp + 4       uint32_t:b_copy = 5 <--.  arg2 to 'multiply'
    ^         esp + 8       ????
    ^         esp + 12      uint32_t:z = ?      <--.
    |         esp + 16      uint32_t:b = 5         |  local variables in 'main'
    |         esp + 20      uint32_t:a = 3      <--.
    |         ...
    |         ...
higher addr   ebp           previous frame

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM