Tracing a program in assembly

Question

I'm trying to understand what a C program looks like at the assembly level, so I ran gdb and used disassemble on main and get_input. The program is short so that I can follow it more easily. There are two lines that I don't understand. The first one, in main(), is:

0x00000000004005a3 <+4>: mov $0x0,%eax

Before it, we save the old value of rbp and copy the current value of rsp into rbp. What is the purpose of this mov instruction?

The other in get_input() is:

0x0000000000400581 <+4>: sub    $0x10,%rsp

Here too we start by saving the old value of rbp by pushing it onto the stack, then giving rbp the current value of rsp. Then 16 bytes are subtracted from rsp. I understand this is space being allocated, but why is it 16 bytes and not 8 bytes? I made the buffer only 8 bytes, so what is the purpose of the other 8 bytes?

#include <stdio.h>
void get_input()
{
    char buffer[8];

    gets(buffer);
    puts(buffer);
}

int main()
{
    get_input();
    return 0;
}

Dump of assembler code for function main:

   0x000000000040059f <+0>: push   %rbp
   0x00000000004005a0 <+1>: mov    %rsp,%rbp
   0x00000000004005a3 <+4>: mov    $0x0,%eax
   0x00000000004005a8 <+9>: callq  0x40057d <get_input>
   0x00000000004005ad <+14>:    mov    $0x0,%eax
   0x00000000004005b2 <+19>:    pop    %rbp
   0x00000000004005b3 <+20>:    retq   
End of assembler dump.

Dump of assembler code for function get_input:

   0x000000000040057d <+0>: push   %rbp
   0x000000000040057e <+1>: mov    %rsp,%rbp
   0x0000000000400581 <+4>: sub    $0x10,%rsp
   0x0000000000400585 <+8>: lea    -0x10(%rbp),%rax
   0x0000000000400589 <+12>:    mov    %rax,%rdi
   0x000000000040058c <+15>:    callq  0x400480 <gets@plt>
   0x0000000000400591 <+20>:    lea    -0x10(%rbp),%rax
   0x0000000000400595 <+24>:    mov    %rax,%rdi
   0x0000000000400598 <+27>:    callq  0x400450 <puts@plt>
   0x000000000040059d <+32>:    leaveq 
   0x000000000040059e <+33>:    retq

Answer 1

For main():

0x000000000040059f <+0>: push   %rbp

Push %RBP's value onto the stack.

0x00000000004005a0 <+1>: mov    %rsp,%rbp

Copy %RSP's value into %RBP (create a new stack frame).

0x00000000004005a3 <+4>: mov    $0x0,%eax

Move the immediate value 0x0 into %EAX. That is, it zeroes %EAX. As you're in 64-bit mode, a write to a 32-bit register also clears the upper half of the 64-bit register, so all of %RAX becomes zero.
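
The zero-extension rule can be modeled in plain C. This is only a sketch of the register semantics, not something the compiler generates; the function names write_eax and write_ax are made up for illustration.

```c
#include <stdint.h>

/* Model of a 32-bit register write (e.g. mov $imm32, %eax):
   the upper 32 bits of the 64-bit register are cleared. */
uint64_t write_eax(uint64_t rax, uint32_t value)
{
    (void)rax;               /* the old contents are discarded entirely */
    return (uint64_t)value;  /* zero-extended to 64 bits */
}

/* Model of a 16-bit register write (e.g. mov $imm16, %ax):
   only the low 16 bits change; the rest of the register is preserved. */
uint64_t write_ax(uint64_t rax, uint16_t value)
{
    return (rax & ~UINT64_C(0xFFFF)) | value;
}
```

The contrast is the point: writing %EAX wipes the whole of %RAX, while writing %AX (or %AL) leaves the remaining bits untouched.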

0x00000000004005a8 <+9>: callq  0x40057d <get_input>

Push the return address (the address of the next instruction; %RIP itself cannot be pushed directly) onto the stack, then jump to the function get_input().

0x00000000004005ad <+14>:    mov    $0x0,%eax

According to the AMD64 System V ABI, a function's return value is stored in %RAX (leaving aside floating-point values and large structures). The ABI also divides the registers into two groups: caller-saved and callee-saved. When you call a function, you can't expect caller-saved registers to keep their values; you must save them yourself on the stack if you need them afterwards. Likewise, a function that gets called must preserve any callee-saved registers it uses. The caller-saved registers are %RAX, %RDI, %RSI, %RDX, %RCX, %R8, %R9, %R10, and %R11. The callee-saved registers are %RBX, %RSP, %RBP, %R12, %R13, %R14, and %R15.

Now, as main() performs return 0, it must return that 0 in %RAX, right? However, two things should be taken into account. Firstly, in the AMD64 System V ABI, sizeof(int) == 4. %RAX is 8 bytes wide, but %EAX is 4 bytes wide, so %EAX is used for int-sized values such as main()'s return value. Secondly, %EAX is part of %RAX, and %RAX is caller-saved, so we can't rely on its value after the call to get_input(). So, we perform MOV $0x0, %EAX after the call in order to set the function's return value to zero.

0x00000000004005b2 <+19>:    pop    %rbp

Restore the %RBP of main()'s caller, that is, destroy main()'s stack frame.

0x00000000004005b3 <+20>:    retq

Return from main() with a return value of 0.

Then we have get_input():

0x000000000040057d <+0>: push   %rbp

Push %RBP's value onto the stack.

0x000000000040057e <+1>: mov    %rsp,%rbp

Copy %RSP's value into %RBP (create a new stack frame).

0x0000000000400581 <+4>: sub    $0x10,%rsp

Subtract 16 from %RSP (reserve 16 bytes of temporary storage for the current frame).

0x0000000000400585 <+8>: lea    -0x10(%rbp),%rax

Load the effective address -0x10(%RBP) into %RAX. That is, %RAX receives %RBP's value minus 16, so it now points to the first byte of the local storage, which is where buffer lives.

0x0000000000400589 <+12>:    mov    %rax,%rdi

According to the ABI, a function's first integer or pointer argument is passed in %RDI, the second in %RSI, and so on. In this case, %RAX's value (the address of buffer) is passed as the first argument to the function about to be called.

0x000000000040058c <+15>:    callq  0x400480 <gets@plt>

Call the function gets().

0x0000000000400591 <+20>:    lea    -0x10(%rbp),%rax

Same as the LEA above: load the address of buffer into %RAX again.

0x0000000000400595 <+24>:    mov    %rax,%rdi

Pass %RAX's value as the first argument.

0x0000000000400598 <+27>:    callq  0x400450 <puts@plt>

Call the function puts().

0x000000000040059d <+32>:    leaveq

Equivalent to MOV %RBP, %RSP followed by POP %RBP, that is, it destroys the stack frame.

0x000000000040059e <+33>:    retq

Return from get_input(). The function is void, so no return value is set.

Now...

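"MOV $0x0, %EAX: What is the purpose of that instruction?"

The second instance of that instruction is the important one, as it sets the return value of main(). The first one, before the call, is most likely there because get_input() is declared with an empty parameter list rather than (void). Without a prototype, the compiler has to treat the call as if the callee might be variadic, and for such calls the ABI requires %AL to hold an upper bound on the number of vector registers used for arguments (here, zero). Compilers typically zero all of %EAX to do that. You also appear to have optimizations disabled on your compiler.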
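
A small sketch for comparing the two declaration styles. Before C23, empty parentheses declare a function without a prototype, while (void) declares one that takes no arguments; the ABI's %AL rule for possibly-variadic calls concerns the former. The names no_proto, with_proto and call_both are hypothetical, and the exact instructions emitted depend on the compiler and its flags.

```c
/* No prototype (before C23): the parameter list is left unspecified. */
int no_proto();

/* Prototype: the function explicitly takes no arguments. */
int with_proto(void);

int no_proto()
{
    return 1;
}

int with_proto(void)
{
    return 2;
}

/* Calls one function of each kind so the two call sites can be compared. */
int call_both(void)
{
    return no_proto() + with_proto();
}
```

Compiling this without optimization and running disassemble on call_both in gdb should show whether your compiler zeroes %EAX before only the first call.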

"Then 16 bytes are subtracted from rsp. I understand this is space being allocated, but why is it 16 bytes and not 8 bytes? I made the buffer only 8 bytes, so what is the purpose of the other 8 bytes?"

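The ABI requires %RSP to be on a 16-byte boundary immediately before each CALL instruction. On entry to get_input(), the return address pushed by CALL leaves %RSP 8 bytes off that boundary, and PUSH %RBP brings it back onto one. The space reserved for locals must therefore be a multiple of 16, or the calls to gets() and puts() would be made with a misaligned stack. The 8-byte buffer is rounded up to 16 bytes, and the other 8 bytes are padding. BTW, you should get away from fixed-size buffers combined with gets(): it has no way of knowing how large the buffer is, and it was removed from the language in C11.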
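
As a sketch of what a gets()-free version might look like, here is one option built on fgets(), which takes the buffer size as an argument. The helper read_line is a made-up name, not a library function, and silently truncating long lines is just one possible policy.

```c
#include <stdio.h>
#include <string.h>

/* Read one line from `in` into `buf`, storing at most size - 1 characters.
   The trailing newline, if present, is removed.
   Returns 1 on success, 0 on end of file, read error, or size == 0. */
int read_line(char *buf, size_t size, FILE *in)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}

/* The question's function, rewritten so input cannot overflow buffer. */
void get_input(void)
{
    char buffer[8];

    if (read_line(buffer, sizeof buffer, stdin))
        puts(buffer);
}
```

With an 8-byte buffer, at most 7 characters of a line are kept; whatever remains of a longer line stays in the stream for the next read.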

Answer 2

The first instruction, mov $0x0,%eax, moves zero into EAX. The copy at <+14> sets main()'s return code; the copy at <+4>, before the call, does not affect the return value.

The second instruction, sub $0x10,%rsp, allocates stack space for the buffer and keeps the stack aligned for function calls. The calling convention requires 16-byte alignment, not 8.
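
The rounding itself is simple arithmetic. A minimal sketch, assuming the only requirement is that the local area be a multiple of the alignment (the compiler is free to reserve more); round_up is a hypothetical helper name.

```c
#include <stddef.h>

/* Round n up to the next multiple of align.
   align must be a power of two (16 for the stack alignment discussed here). */
size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}
```

For the 8-byte buffer in the question, round_up(8, 16) gives 16, which matches the sub $0x10,%rsp in the disassembly.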

solution1
3 ACCEPTED 2016-04-02 14:36:22

solution2
1 2016-04-02 14:53:47
