
What is the order of operations when creating a struct in memory (under the hood)?

Say I have a simple system in C:

#include <stddef.h>

typedef struct Point { 
  struct Point *a;
  struct Point *b;
  int x;
  int y;
} Point; 

int main() { 
  Point p1 = {NULL, NULL, 3, 5};
  return 0; 
}

Godbolt compiles to:

main:
  push rbp
  mov  rbp, rsp
  mov  QWORD PTR [rbp-32], 0
  mov  QWORD PTR [rbp-24], 0
  mov  DWORD PTR [rbp-16], 3
  mov  DWORD PTR [rbp-12], 5
  mov  eax, 0
  pop  rbp
  ret

A tiny step further and we have:

int main() { 
  Point v = {NULL, NULL, 3, 5};
  Point m = {NULL, NULL, 7, 9};
  Point s = {&v, &s, 11, 12};
  return 0; 
}

Compiled to:

main:
  push rbp                    ; save the base pointer to the stack.
  mov  rbp, rsp               ; put the previous stack pointer into the base pointer.
  mov  QWORD PTR [rbp-32], 0
  mov  QWORD PTR [rbp-24], 0
  mov  DWORD PTR [rbp-16], 3
  mov  DWORD PTR [rbp-12], 5
  mov  QWORD PTR [rbp-64], 0
  mov  QWORD PTR [rbp-56], 0
  mov  DWORD PTR [rbp-48], 7
  mov  DWORD PTR [rbp-44], 9
  mov  QWORD PTR [rbp-96], 0
  mov  QWORD PTR [rbp-88], 0
  mov  QWORD PTR [rbp-80], 0
  mov  DWORD PTR [rbp-80], 11
  mov  DWORD PTR [rbp-76], 12
  lea  rax, [rbp-32]
  mov  QWORD PTR [rbp-96], rax
  lea  rax, [rbp-96]
  mov  QWORD PTR [rbp-88], rax
  mov  eax, 0
  pop  rbp
  ret

I can't exactly tell what's going on yet, but this helps (a little). Could someone explain what is happening in the last example? I don't quite understand what the base pointer is, though I know what the stack pointer is. I am not sure what QWORD PTR [...] does, but it seems to say the operand is quad-word sized and is a pointer/address. But why is it picking those specific offsets from rbp? I don't understand why it chose them.

Then the second part is the lea rax, [rbp-32]. It looks like it's handling the part where I did {&v, &s}.

So my question is:

  1. What is the QWORD/DWORD PTR loading into? Is this loading into the heap, the stack, or something else?
  2. Why is it choosing to be an offset of rbp?
  3. Does the order of operations always go from the smallest object (most primitive object) to the most complex object? Or can you think of a case where the assembly code would first construct the complex object and then construct the more primitive objects?

I am wondering because I'm trying to wrap my head around how to create a tree in assembly. In functional programming or in JavaScript, you have a(b(c(), d(), e(f(g(), h()), ...))). The deepest functions get evaluated first, then a gets evaluated last, with the results passed in as arguments. But I'm having a hard time visualizing how this would look in assembly.

More specifically, I am trying to create like a simple key/value store in assembly, to get a deeper understanding of how "objects" are created at this low level. It's easy in JavaScript:

db[key] = value

But this is because value already exists somewhere in memory. The question I have is, should I be creating this directly in the key-value store up-front? Or do you always create it in a random free spot in memory (like the offsets from rbp) and then later move it to the correct position (or point things to the right places)? I keep thinking I should be creating the tree leaf node directly on the branch, like I am pasting a leaf on the branch (visually). But the leaf already exists! Where does it exist before it is on the branch!? Can it ever exist on the branch before it is constructed elsewhere? I am getting confused.

So, start with a leaf.

🍁

Paste it on a branch.

  🍁 
  /  
\ | |
 \|/
  |
  | 

Where is the leaf being created in the first place? That's what I was trying to see with the assembly example.

Basically I'm wondering how it looks to directly create something on the heap, rather than the stack.

Most compilers use the stack for local variables.

Space on the stack is usually managed by two pointers: The stack pointer; and a "base" pointer that points to the base of the "allocated" memory on the stack.

Also worth noting is that the stack on almost all systems grows downward, which is why there are negative offsets from the base pointer (register rbp in your generated code).

The amount of space reserved is calculated by the compiler, which adds code to initialize the two pointers either inside the function or before the function is called (it depends on the calling convention).

When the function returns the pointers are reset, which is a very simple way to "free" the memory for the local variables.


Somewhat illustrated, it looks like this:

base pointer ---> +---------------------+
                  | Space for variables |
                  | ...                 |
                  | ...                 |
                  | ...                 |
stack pointer --> +---------------------+
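
As a rough sketch of the idea that every call gets its own frame: the hypothetical helper record_frames below (not part of the question) recurses a few levels and records the address of one local per call. The addresses must all differ because the frames are alive at the same time. On a typical x86-64 build each deeper address is lower, but the code does not rely on that.

```c
#include <stdint.h>

#define DEPTH 4

/* Hypothetical helper: store the address of this call's local in out[depth],
   then recurse so the next call gets a frame of its own. */
static void record_frames(int depth, uintptr_t out[DEPTH]) {
    volatile int local = depth;          /* lives in this call's stack frame */
    out[depth] = (uintptr_t)&local;
    if (depth + 1 < DEPTH)
        record_frames(depth + 1, out);
    local = 0;                           /* keeps the local in use across the call */
}

/* Returns 1 if every recorded address is distinct, 0 otherwise. */
static int frames_are_distinct(void) {
    uintptr_t addr[DEPTH];
    record_frames(0, addr);
    for (int i = 0; i < DEPTH; i++)
        for (int j = i + 1; j < DEPTH; j++)
            if (addr[i] == addr[j])
                return 0;
    return 1;
}
```

Once record_frames returns, the recorded addresses no longer refer to live objects; they are kept only as integers for comparison.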

Basically I'm wondering how it looks to directly create something on the heap, rather than the stack.

You can't have a named variable "on the heap" in C. You can only have pointers to dynamically-allocated storage. (The pointer variable itself is either local or global, with automatic or static storage, but the value it holds can be a pointer returned by malloc.)

E.g. int *buffer = malloc(100*sizeof(*buffer)); inside a function: buffer is a local variable (automatic storage, which means stack space or just a register on "normal" C implementations on mainstream ISAs).

*buffer is the first int of that block of dynamic storage.
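
Applied to the Point struct from the question, a minimal sketch of building a node directly in dynamic storage could look like this. point_new is a hypothetical helper, not something from the question; it writes the fields straight into the block returned by malloc, so the node never exists anywhere else first.

```c
#include <stdlib.h>

typedef struct Point {
    struct Point *a;
    struct Point *b;
    int x;
    int y;
} Point;

/* Hypothetical helper: allocate a Point and initialize it in place.
   Returns NULL if malloc fails; the caller owns the result and must free() it. */
Point *point_new(Point *a, Point *b, int x, int y) {
    Point *p = malloc(sizeof *p);
    if (!p)
        return NULL;
    p->a = a;   /* each store goes directly to the malloc'ed block */
    p->b = b;
    p->x = x;
    p->y = y;
    return p;
}
```

For example, Point *leaf = point_new(NULL, NULL, 3, 5); followed by Point *branch = point_new(leaf, NULL, 11, 12); creates the leaf first and then stores its address in the branch. Only the pointer is copied into the branch, not the leaf itself.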


Some managed languages don't distinguish dynamic vs. automatic storage the way C does. E.g. in C# or Java you can always return a reference to a local variable. It's up to the compiler to do "escape analysis" to find out if a reference to a variable is visible outside the function, and if so to actually allocate it on "the heap"; otherwise it can optimize it away or just use the stack.

In C, returning a pointer to a local variable doesn't work; the object doesn't exist after leaving the function's scope. You can do it without compile errors (just warnings) but dereferencing the pointer is UB.

E.g.:

int *bad_return_local() {
    int buf[100];    // on the stack; destroyed when the function returns
    return buf;      // caller can't use this pointer to out-of-scope automatic storage
}

int *good_return_dynamic() {
    int *buf = malloc(100*sizeof(*buf));  // on "the heap"
    if (!buf) /* error: couldn't allocate memory */;
    return buf;      // caller must manually free() the return value at some point
}


int *return_static() {
    static int buf[100];   // static storage, e.g. in the BSS, same as global scope
    return buf;            // return the same pointer to the same storage every call
}

main:
  push rbp                    ; save the base pointer to the stack.
  mov  rbp, rsp               ; put the previous stack pointer into the base pointer.
  mov  QWORD PTR [rbp-32], 0  ; Write 0 (NULL) to v.a
  mov  QWORD PTR [rbp-24], 0  ; Write 0 (NULL) to v.b
  mov  DWORD PTR [rbp-16], 3  ; Write 3 to v.x
  mov  DWORD PTR [rbp-12], 5  ; Write 5 to v.y
  mov  QWORD PTR [rbp-64], 0  ; Write 0 (NULL) to m.a
  mov  QWORD PTR [rbp-56], 0  ; Write 0 (NULL) to m.b
  mov  DWORD PTR [rbp-48], 7  ; Write 7 to m.x
  mov  DWORD PTR [rbp-44], 9  ; Write 9 to m.y
  mov  QWORD PTR [rbp-96], 0  ; Write 0 (NULL) to s.a
  mov  QWORD PTR [rbp-88], 0  ; Write 0 (NULL) to s.b
  mov  QWORD PTR [rbp-80], 0  ; Zero s.x and s.y (one 8-byte store)
  mov  DWORD PTR [rbp-80], 11 ; Write 11 to s.x
  mov  DWORD PTR [rbp-76], 12 ; Write 12 to s.y
  lea  rax, [rbp-32]          ; Load effective address of v (same as &v.a) into rax
  mov  QWORD PTR [rbp-96], rax ; Write address of v into s.a
  lea  rax, [rbp-96]          ; Load effective address of s (same as &s.a) into rax
  mov  QWORD PTR [rbp-88], rax ; Write address of s into s.b
  mov  eax, 0                 
  pop  rbp
  ret

In a function (typically), parameters and local variables are organized into a stack frame (along with the saved address of the previous frame and the return address, i.e. the address of the next instruction to execute in the caller) and are referenced via an offset from a base (or frame) pointer. rbp stores the address of the stack frame, and you reference objects by offsetting from that address. Why not just offset from the stack pointer (rsp)? Depending on what you do in the function, the stack pointer can change (not so much in compiled code, more in hand-written assembly). The base or frame pointer gives you a stable, unchanging reference point for doing the offsets. So what

mov QWORD PTR [rbp-32], 0

means is "Write the value of the immediate operand 0, expanded to a QWORD (8 bytes), to the address computed from rbp-32". If rbp is 0xdeadbeef, then that means zero out the 8 bytes starting at 0xdeadbeef - 0x20, or 0xdeadbecf.
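
The particular offsets follow from the struct layout. Here is a sketch of the arithmetic, assuming a typical 64-bit ABI with 8-byte pointers and 4-byte int, and taking v's first byte at rbp-32 as in the annotated listing. member_rbp_offset is a hypothetical helper introduced only for illustration.

```c
#include <stddef.h>

typedef struct Point {
    struct Point *a;
    struct Point *b;
    int x;
    int y;
} Point;

/* Hypothetical helper: offset of a member relative to rbp, given the offset
   of the struct's first byte relative to rbp (e.g. -32 for v). */
static long member_rbp_offset(long struct_base, size_t member_offset) {
    return struct_base + (long)member_offset;
}

/* With 8-byte pointers the members sit at +0 (a), +8 (b), +16 (x), +20 (y),
   so for v: v.a -> rbp-32, v.b -> rbp-24, v.x -> rbp-16, v.y -> rbp-12. */
static long v_x_offset(void) { return member_rbp_offset(-32, offsetof(Point, x)); }
static long v_y_offset(void) { return member_rbp_offset(-32, offsetof(Point, y)); }
```

The same arithmetic with a base of -64 gives m's members and with -96 gives s's members. The compiler is free to pick other bases; these are just the ones this build happened to choose.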

There is some weirdness in the generated code - not sure why it's zeroing out s.x (and s.y, since that store is 8 bytes wide) before writing 11 to it. Also not sure why it's bothering to zero out s.a and s.b before copying in the addresses of v and s (the address of a struct object and the address of its first member are always the same). Turning on optimization may fix that.

This is how one compiler does it. Different compilers may do something different - for example, this is output from gcc (which on a Mac is really Apple's LLVM-based clang):

        .section        __TEXT,__text,regular,pure_instructions
        .build_version macos, 10, 14    sdk_version 10, 14
        .globl  _main                   ## -- Begin function main
        .p2align        4, 0x90
_main:                                  ## @main
        .cfi_startproc
## %bb.0:
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register %rbp
        xorl    %eax, %eax
        movl    $0, -4(%rbp)
        movq    l___const.main.v(%rip), %rcx
        movq    %rcx, -32(%rbp)
        movq    l___const.main.v+8(%rip), %rcx
        movq    %rcx, -24(%rbp)
        movq    l___const.main.v+16(%rip), %rcx
        movq    %rcx, -16(%rbp)
        movq    l___const.main.s(%rip), %rcx
        movq    %rcx, -56(%rbp)
        movq    l___const.main.s+8(%rip), %rcx
        movq    %rcx, -48(%rbp)
        movq    l___const.main.s+16(%rip), %rcx
        movq    %rcx, -40(%rbp)
        leaq    -32(%rbp), %rcx
        movq    %rcx, -80(%rbp)
        leaq    -56(%rbp), %rcx
        movq    %rcx, -72(%rbp)
        movl    $11, -64(%rbp)
        movl    $12, -60(%rbp)
        popq    %rbp
        retq
        .cfi_endproc
                                        ## -- End function
        .section        __TEXT,__const
        .p2align        3               ## @__const.main.v
l___const.main.v:
        .quad   0
        .quad   0
        .long   3                       ## 0x3
        .long   5                       ## 0x5

        .p2align        3               ## @__const.main.s
l___const.main.s:
        .quad   0
        .quad   0
        .long   7                       ## 0x7
        .long   9                       ## 0x9


.subsections_via_symbols

Different syntax, different approach, same end result.
