Will an executable access shared-libraries' global variable via GOT?

I was learning dynamic linking recently and gave it a try:

dynamic.c

int global_variable = 10;

int XOR(int a) {
        return global_variable;
}

test.c

#include <stdio.h>
extern int global_variable;
extern int XOR(int);

int main() {
        global_variable = 3;
        printf("%d\n", XOR(0x10));
}

The compiling commands are:

clang -shared -fPIC -o dynamic.so dynamic.c
clang -o test test.c dynamic.so

I was expecting the main function in the executable test to access global_variable via the GOT. On the contrary, global_variable is placed in test's data section, and XOR in dynamic.so accesses global_variable indirectly.

Could anyone tell me why the compiler didn't make test access global_variable via the GOT, but made the shared object do so?

Part of the point of a shared library is that one copy gets loaded into memory, and multiple processes can access that one copy. But every program has its own copy of each of the library's variables. If they were accessed relative to the library's GOT then those would instead be shared among the processes using the library, just like the functions are.

There are other possibilities, but it is clean and consistent for each executable to provide for itself all the variables it needs. That requires the library's functions to access all of the library's variables with static storage duration (not just external ones) indirectly, relative to the program. This is ordinary dynamic linking, just going in the opposite direction from what you usually think of.

I tried reproducing your problem with exactly the same code and compilation commands as the ones you provided, but it seems that both main and XOR use the GOT to access global_variable. I will answer by providing example output of the commands I used to inspect the data flow. If your outputs differ from mine, there is some other difference between our environments (I mean a big difference; if only addresses or values differ, that is OK). The best way to find that difference is for you to provide the commands you originally used as well as their output.

The first step is to check which address is accessed whenever global_variable is read or written. For that we can use the objdump -D -j.text test command to disassemble the code and look at the main function:

0000000000001150 <main>:
    1150:       55                      push   %rbp
    1151:       48 89 e5                mov    %rsp,%rbp
    1154:       48 8b 05 8d 2e 00 00    mov    0x2e8d(%rip),%rax        # 3fe8 <global_variable>
    115b:       c7 00 03 00 00 00       movl   $0x3,(%rax)
    1161:       bf 10 00 00 00          mov    $0x10,%edi
    1166:       e8 d5 fe ff ff          call   1040 <XOR@plt>
    116b:       89 c6                   mov    %eax,%esi
    116d:       48 8d 3d 90 0e 00 00    lea    0xe90(%rip),%rdi        # 2004 <_IO_stdin_used+0x4>
    1174:       b0 00                   mov    $0x0,%al
    1176:       e8 b5 fe ff ff          call   1030 <printf@plt>
    117b:       31 c0                   xor    %eax,%eax
    117d:       5d                      pop    %rbp
    117e:       c3                      ret    
    117f:       90                      nop

The numbers in the first column are not absolute addresses - instead they are offsets relative to the base address at which the executable will be loaded. For the sake of explanation I will refer to them as "offsets".

The assembly at offsets 0x1154 and 0x115b comes directly from the line global_variable = 3; in your code. To confirm that, you could compile the program with -g for debug symbols and invoke objdump with -S. This will display the source code above the corresponding assembly.

We will focus on what these two instructions are doing. The first instruction is a mov of 8 bytes from a location in memory to the rax register. The location is given relative to rip, which at that point holds the address of the next instruction (0x115b), plus a constant displacement of 0x2e8d. Objdump has already calculated the result for us: 0x115b + 0x2e8d = 0x3fe8. So this instruction takes the 8 bytes present in memory at offset 0x3fe8 and stores them in the rax register.
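The arithmetic that objdump performs here can be reproduced by hand. The following is only an illustrative sketch: the helper names decode_disp32 and rip_relative_target are made up for this example, and it assumes the encoding seen in the listing, where the last four bytes of the instruction are a little-endian signed displacement that is added to the address of the next instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian signed displacement from 4 raw bytes. */
static int32_t decode_disp32(const uint8_t *p)
{
    uint32_t u = (uint32_t)p[0]
               | ((uint32_t)p[1] << 8)
               | ((uint32_t)p[2] << 16)
               | ((uint32_t)p[3] << 24);
    return (int32_t)u; /* relies on the usual two's-complement conversion */
}

/*
 * Offset referenced by a RIP-relative instruction.
 * Assumption: the disp32 occupies the last four bytes of the instruction.
 */
static uint64_t rip_relative_target(uint64_t insn_offset,
                                    const uint8_t *insn, size_t insn_len)
{
    assert(insn != NULL && insn_len >= 4);
    uint64_t next_rip = insn_offset + insn_len; /* rip points past the instruction */
    int32_t disp = decode_disp32(insn + insn_len - 4);
    return next_rip + (uint64_t)(int64_t)disp;
}
```

For the bytes 48 8b 05 8d 2e 00 00 at offset 0x1154, this gives 0x1154 + 7 + 0x2e8d = 0x3fe8, which matches the comment objdump printed next to the instruction.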

The next instruction is again a mov; the suffix l tells us that the data size is 4 bytes this time. It stores a 4-byte integer with value 0x3 at the location pointed to by the current value of rax (not in rax itself! Parentheses around a register, such as those in (%rax), signify that the operand is not the register itself, but rather the memory its contents point to).

To summarize, we read a pointer to a 4-byte variable from the location at offset 0x3fe8 and then store an immediate value of 0x3 at the location specified by that pointer. Now the question is: where does the offset 0x3fe8 come from?
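In C terms, those two instructions behave roughly like a store through a pointer that was read from a table slot. The sketch below is a simplified model, not what the toolchain literally emits: fake_global_variable, fake_slot and store_through_slot are hypothetical names, and an ordinary pointer variable stands in for the 8-byte slot at offset 0x3fe8 that gets filled in at load time.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the variable defined in the shared object. */
static int fake_global_variable = 10;

/* Stand-in for the 8-byte slot at offset 0x3fe8: it holds the variable's address. */
static int *fake_slot = &fake_global_variable;

/* Model of: mov slot(%rip),%rax ; movl $value,(%rax) */
static void store_through_slot(int **slot, int value)
{
    assert(slot != NULL && *slot != NULL);
    int *target = *slot; /* first instruction: load the pointer from the slot */
    *target = value;     /* second instruction: store through that pointer */
}
```

Calling store_through_slot(&fake_slot, 3) changes fake_global_variable from 10 to 3 without the caller ever naming the variable directly, which is the point of the indirection.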

It actually comes from the GOT. To show the contents of the .got section we can use the objdump -s -j.got test command. -s means we want the raw contents of the section, without any disassembly. The output in my case is:

test:     file format elf64-x86-64

Contents of section .got:
 3fd0 00000000 00000000 00000000 00000000  ................
 3fe0 00000000 00000000 00000000 00000000  ................
 3ff0 00000000 00000000 00000000 00000000  ................

The whole section is zero in the file, as the GOT is populated with data only after the program is loaded into memory, but what is important is the address range. The dump starts at offset 0x3fd0 and its last row covers 0x3ff0 through 0x3fff, so .got spans 0x3fd0 to 0x3fff. This range includes the 0x3fe8 offset - which means the location of global_variable is indeed stored in the GOT.

Another way of finding this information is to use readelf -S test to show sections of the executable file and scroll down to the .got section:

[Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
(...lots of sections...)
[22] .got              PROGBITS         0000000000003fd0  00002fd0
       0000000000000030  0000000000000008  WA       0     0     8

Looking at the Address and Size columns, we can see that the section is loaded at offset 0x3fd0 in memory and its size is 0x30 - which corresponds to what objdump displayed. Note that in the readelf output, "Offset" is actually the offset into the file from which the program is loaded - not the offset in memory that we are interested in.
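The containment check can also be written down explicitly. This is a small sketch with made-up helper names (in_range and entry_index); the only inputs it assumes are the Address, Size and EntSize values that readelf printed for .got.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr lies in the half-open range [start, start + size). */
static bool in_range(uint64_t addr, uint64_t start, uint64_t size)
{
    assert(size <= UINT64_MAX - start); /* the range must not wrap around */
    return addr >= start && addr - start < size;
}

/* Zero-based index of the table entry that contains addr. */
static uint64_t entry_index(uint64_t addr, uint64_t start, uint64_t entsize)
{
    assert(entsize != 0 && addr >= start);
    return (addr - start) / entsize;
}
```

With the values above, in_range(0x3fe8, 0x3fd0, 0x30) holds, and entry_index(0x3fe8, 0x3fd0, 8) is 3, so the slot read by main is the fourth 8-byte entry of the section.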

Issuing the same commands on the dynamic.so library gives similar results:

00000000000010f0 <XOR>:
    10f0:       55                      push   %rbp
    10f1:       48 89 e5                mov    %rsp,%rbp
    10f4:       89 7d fc                mov    %edi,-0x4(%rbp)
    10f7:       48 8b 05 ea 2e 00 00    mov    0x2eea(%rip),%rax        # 3fe8 <global_variable@@Base-0x38>
    10fe:       8b 00                   mov    (%rax),%eax
    1100:       5d                      pop    %rbp
    1101:       c3                      ret

So we see that both main and XOR use the GOT to find the location of global_variable.

As for the location of global_variable itself, we need to run the program so that the GOT gets populated. For that we can use GDB. We can run our program in GDB by invoking it this way:

LD_LIBRARY_PATH="$LD_LIBRARY_PATH:." gdb ./test

The LD_LIBRARY_PATH environment variable tells the dynamic linker where to look for shared objects, so we extend it to include the current directory "." so that dynamic.so can be found.

After GDB loads our code, we can issue break main to set a breakpoint at main, and then run to start the program. Execution should pause at the beginning of the main function, giving us a view into our executable after it has been fully loaded into memory, with the GOT populated.

Running disassemble main in this state will show us the actual absolute addresses in memory:

Dump of assembler code for function main:
   0x0000555555555150 <+0>:     push   %rbp
   0x0000555555555151 <+1>:     mov    %rsp,%rbp
=> 0x0000555555555154 <+4>:     mov    0x2e8d(%rip),%rax        # 0x555555557fe8
   0x000055555555515b <+11>:    movl   $0x3,(%rax)
   0x0000555555555161 <+17>:    mov    $0x10,%edi
   0x0000555555555166 <+22>:    call   0x555555555040 <XOR@plt>
   0x000055555555516b <+27>:    mov    %eax,%esi
   0x000055555555516d <+29>:    lea    0xe90(%rip),%rdi        # 0x555555556004
   0x0000555555555174 <+36>:    mov    $0x0,%al
   0x0000555555555176 <+38>:    call   0x555555555030 <printf@plt>
   0x000055555555517b <+43>:    xor    %eax,%eax
   0x000055555555517d <+45>:    pop    %rbp
   0x000055555555517e <+46>:    ret    
End of assembler dump.
(gdb) 

Our 0x3fe8 offset has turned into the absolute address 0x555555557fe8. We can again check that this location belongs to the .got section by issuing maintenance info sections inside GDB, which prints a long list of sections and their memory mappings. For me .got is placed in this address range:

[21]     0x555555557fd0->0x555555558000 at 0x00002fd0: .got ALLOC LOAD DATA HAS_CONTENTS

This range contains 0x555555557fe8.
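The relationship between the offsets from objdump and the absolute addresses from GDB is a single addition of the load base. A minimal sketch, assuming the whole executable is shifted by one constant base (the function names load_base and runtime_address are invented for this illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Base address implied by one known (runtime address, offset) pair. */
static uint64_t load_base(uint64_t runtime_addr, uint64_t offset)
{
    assert(runtime_addr >= offset);
    return runtime_addr - offset;
}

/* Absolute address of any other offset once the base is known. */
static uint64_t runtime_address(uint64_t base, uint64_t offset)
{
    assert(offset <= UINT64_MAX - base); /* the sum must not overflow */
    return base + offset;
}
```

From the pair (0x555555557fe8, 0x3fe8) the base works out to 0x555555554000. Adding the offset of main (0x1150) gives 0x555555555150, and adding the start of .got (0x3fd0) gives 0x555555557fd0, both of which agree with the GDB output.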

To finally inspect the address of global_variable itself, we can examine the contents of that memory (hence the command name x) by issuing x/xag 0x555555557fe8. The xag arguments of the x command control the size and format of the data being inspected - for an explanation, invoke help x in GDB. On my machine the command returns:

0x555555557fe8: 0x7ffff7fc4020 <global_variable>

On your machine it may display only the address and the data, without the "<global_variable>" helper, which probably comes from an extension I have installed called pwndbg. That is fine, because the value at that address is all we need. We now know that global_variable is located in memory at address 0x7ffff7fc4020. Now we can issue info proc mappings in GDB to find out which address range this address belongs to. My output is pretty long, but among all the ranges listed there is one of interest to us:

0x7ffff7fc4000     0x7ffff7fc5000     0x1000     0x3000 /home/user/test_got/dynamic.so

The address is inside of that memory area, and GDB tells us that it comes from the dynamic.so library.

If any of the outputs of these commands are different for you (a change in a value is OK - I mean a fundamental difference, such as addresses not belonging to the expected address ranges), please provide more information about exactly what you did to come to the conclusion that global_variable is stored in the .data section - which commands you invoked and what output they produced.
