
Will an executable access a shared library's global variable via the GOT?

I was learning dynamic linking recently and gave it a try:

dynamic.c

int global_variable = 10;

int XOR(int a) {
        return global_variable;
}

test.c

#include <stdio.h>
extern int global_variable;
extern int XOR(int);

int main() {
        global_variable = 3;
        printf("%d\n", XOR(0x10));
}

The compile commands are:

clang -shared -fPIC -o dynamic.so dynamic.c
clang -o test test.c dynamic.so

I was expecting that in the executable test, the main function would access global_variable via the GOT. On the contrary, global_variable is placed in test's data section, and XOR in dynamic.so accesses global_variable indirectly.

Could anyone tell me why the compiler didn't make test access global_variable via the GOT, but made the shared object do so instead?

Part of the point of a shared library is that one copy gets loaded into memory, and multiple processes can access that one copy. But every program has its own copy of each of the library's variables. If they were accessed relative to the library's GOT, then those would instead be shared among the processes using the library, just like the functions are.

There are other possibilities, but it is clean and consistent for each executable to provide for itself all the variables it needs. That requires the library's functions to access all of the library's variables with static storage duration (not just external ones) indirectly, relative to the program. This is ordinary dynamic linking, just going in the opposite direction from what you usually think of.

I tried reproducing your problem with exactly the same code and compilation commands as the ones you provided, but it seems that both main and XOR use the GOT to access global_variable. I will answer by providing example output of the commands I used to inspect the data flow. If your outputs differ from mine, there is some other difference between our environments (I mean a big difference; if only addresses or values differ, that is OK). The best way to find that difference is for you to provide the commands you originally used as well as their output.

The first step is to check what address is accessed whenever a write to or a read from global_variable happens. For that we can use the objdump -D -j.text test command to disassemble the code and look at the main function:

0000000000001150 <main>:
    1150:       55                      push   %rbp
    1151:       48 89 e5                mov    %rsp,%rbp
    1154:       48 8b 05 8d 2e 00 00    mov    0x2e8d(%rip),%rax        # 3fe8 <global_variable>
    115b:       c7 00 03 00 00 00       movl   $0x3,(%rax)
    1161:       bf 10 00 00 00          mov    $0x10,%edi
    1166:       e8 d5 fe ff ff          call   1040 <XOR@plt>
    116b:       89 c6                   mov    %eax,%esi
    116d:       48 8d 3d 90 0e 00 00    lea    0xe90(%rip),%rdi        # 2004 <_IO_stdin_used+0x4>
    1174:       b0 00                   mov    $0x0,%al
    1176:       e8 b5 fe ff ff          call   1030 <printf@plt>
    117b:       31 c0                   xor    %eax,%eax
    117d:       5d                      pop    %rbp
    117e:       c3                      ret    
    117f:       90                      nop

The numbers in the first column are not absolute addresses. Instead, they are offsets relative to the base address at which the executable will be loaded. For the sake of explanation I will refer to them as "offsets".

The assembly at offsets 0x1154 and 0x115b comes directly from the line global_variable = 3; in your code. To confirm that, you could compile the program with -g for debug symbols and invoke objdump with -S. This will display the source code above the corresponding assembly.

We will focus on what these two instructions are doing. The first instruction is a mov of 8 bytes from a location in memory to the rax register. The location in memory is given relative to the current rip value, offset by the constant 0x2e8d. Objdump has already calculated the result for us, and it is equal to 0x3fe8 (rip points at the next instruction, so this is 0x115b + 0x2e8d). So this will take the 8 bytes present in memory at offset 0x3fe8 and store them in the rax register.
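The arithmetic that objdump performs here can be reproduced by hand. The following is a minimal C sketch and not part of objdump or any other tool; the name rip_relative_target is made up for illustration, and it assumes the caller supplies the instruction's offset, its encoded length, and the displacement read from the listing.

```c
#include <stdint.h>

/* Hypothetical helper: resolve a RIP-relative operand.
 * insn_offset: offset of the instruction itself (first objdump column)
 * insn_length: number of encoded bytes of that instruction
 * disp:        signed 32-bit displacement taken from the operand
 * When the operand is evaluated, rip holds the offset of the NEXT
 * instruction, so that is the base the displacement is added to. */
uint64_t rip_relative_target(uint64_t insn_offset, unsigned insn_length,
                             int32_t disp)
{
    uint64_t next_insn = insn_offset + insn_length;
    return next_insn + (uint64_t)(int64_t)disp;
}
```

With the values from the listing above, rip_relative_target(0x1154, 7, 0x2e8d) evaluates to 0x3fe8, which matches the comment objdump printed next to the instruction.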

The next instruction is again a mov; the suffix l tells us that the data size is 4 bytes this time. It stores a 4-byte integer with a value of 0x3 in the location pointed to by the current value of rax, not in rax itself. Parentheses around a register, such as those in (%rax), signify that the operand is not the register itself but the memory its contents point to.

To summarize, we read a pointer to a 4-byte variable from the location at offset 0x3fe8 and then store an immediate value of 0x3 at the location specified by that pointer. Now the question is: where does that offset of 0x3fe8 come from?
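In C terms, the two instructions behave like a store through a pointer that is itself kept in memory. The sketch below is only an analogy under that reading: store_through_slot and got_slot are illustrative names, and the code does not touch the real GOT.

```c
/* Illustrative stand-in for the two instructions; not real GOT access.
 * got_slot plays the role of the 8-byte entry at offset 0x3fe8. */
void store_through_slot(int **got_slot, int value)
{
    int *target = *got_slot; /* mov 0x2e8d(%rip),%rax : load the pointer    */
    *target = value;         /* movl $0x3,(%rax)      : 4-byte store via it */
}
```

Calling it with a slot that points at an int holding 10 and the value 3 leaves the int at 3 and the slot itself unchanged, which is the same effect global_variable = 3; has through the indirection.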

It actually comes from the GOT. To show the contents of the .got section we can use the objdump -s -j.got test command. -s means we want the actual raw contents of the section, without any disassembly. The output in my case is:

test:     file format elf64-x86-64

Contents of section .got:
 3fd0 00000000 00000000 00000000 00000000  ................
 3fe0 00000000 00000000 00000000 00000000  ................
 3ff0 00000000 00000000 00000000 00000000  ................

The whole section is set to zero, as the GOT is populated with data only after the program is loaded into memory, but what is important is the address range. We can see that .got starts at offset 0x3fd0 and that its last 16-byte row starts at 0x3ff0, so the section covers 0x3fd0 up to (but not including) 0x4000. This range includes the 0x3fe8 offset, which means the location of global_variable is indeed stored in the GOT.

Another way of finding this information is to use readelf -S test to show the sections of the executable file and scroll down to the .got section:

[Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
(...lots of sections...)
[22] .got              PROGBITS         0000000000003fd0  00002fd0
       0000000000000030  0000000000000008  WA       0     0     8

Looking at the Address and Size columns, we can see that the section is loaded at offset 0x3fd0 in memory and its size is 0x30, which corresponds to what objdump displayed. Note that in the readelf output, "Offset" is actually the offset into the file from which the program is loaded, not the offset in memory that we are interested in.
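The membership check done by eye above is a simple range test on the Address and Size values. A minimal sketch, assuming those two numbers are copied from the readelf output; section_contains and entry_index are hypothetical helper names, not readelf features.

```c
#include <stdint.h>

/* Returns 1 if addr lies in the half-open range [start, start + size). */
int section_contains(uint64_t start, uint64_t size, uint64_t addr)
{
    return addr >= start && addr - start < size;
}

/* Index of the table entry holding addr, given the entry size (EntSize).
 * Only meaningful when section_contains() is true for addr. */
uint64_t entry_index(uint64_t start, uint64_t entry_size, uint64_t addr)
{
    return (addr - start) / entry_size;
}
```

For the values shown, section_contains(0x3fd0, 0x30, 0x3fe8) is true and entry_index(0x3fd0, 8, 0x3fe8) is 3, so 0x3fe8 is the fourth of the six 8-byte entries in .got.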

By issuing the same commands on the dynamic.so library we get similar results:

00000000000010f0 <XOR>:
    10f0:       55                      push   %rbp
    10f1:       48 89 e5                mov    %rsp,%rbp
    10f4:       89 7d fc                mov    %edi,-0x4(%rbp)
    10f7:       48 8b 05 ea 2e 00 00    mov    0x2eea(%rip),%rax        # 3fe8 <global_variable@@Base-0x38>
    10fe:       8b 00                   mov    (%rax),%eax
    1100:       5d                      pop    %rbp
    1101:       c3                      ret

So we see that both main and XOR use the GOT to find the location of global_variable.

As for the location of global_variable itself, we need to run the program so that the GOT gets populated. For that we can use GDB. We can run our program in GDB by invoking it this way:

LD_LIBRARY_PATH="$LD_LIBRARY_PATH:." gdb ./test

The LD_LIBRARY_PATH environment variable tells the dynamic linker where to look for shared objects, so we extend it to include the current directory "." so that it can find dynamic.so.

After GDB loads our code, we may invoke break main to set up a breakpoint at main and then run to start the program. Execution should pause at the beginning of the main function, giving us a view into our executable after it has been fully loaded into memory, with the GOT populated.

Running disassemble main in this state will show us the actual absolute addresses in memory:

Dump of assembler code for function main:
   0x0000555555555150 <+0>:     push   %rbp
   0x0000555555555151 <+1>:     mov    %rsp,%rbp
=> 0x0000555555555154 <+4>:     mov    0x2e8d(%rip),%rax        # 0x555555557fe8
   0x000055555555515b <+11>:    movl   $0x3,(%rax)
   0x0000555555555161 <+17>:    mov    $0x10,%edi
   0x0000555555555166 <+22>:    call   0x555555555040 <XOR@plt>
   0x000055555555516b <+27>:    mov    %eax,%esi
   0x000055555555516d <+29>:    lea    0xe90(%rip),%rdi        # 0x555555556004
   0x0000555555555174 <+36>:    mov    $0x0,%al
   0x0000555555555176 <+38>:    call   0x555555555030 <printf@plt>
   0x000055555555517b <+43>:    xor    %eax,%eax
   0x000055555555517d <+45>:    pop    %rbp
   0x000055555555517e <+46>:    ret    
End of assembler dump.
(gdb) 

Our 0x3fe8 offset has turned into the absolute address 0x555555557fe8. We may again check that this location comes from the .got section by issuing maintenance info sections inside GDB, which prints a long list of sections and their memory mappings. For me .got is placed in this address range:

[21]     0x555555557fd0->0x555555558000 at 0x00002fd0: .got ALLOC LOAD DATA HAS_CONTENTS

This range contains 0x555555557fe8.
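The link between the offsets objdump printed and the addresses GDB prints is a single addition of the load base. The sketch below derives that base from the one offset/address pair shown above; load_base and runtime_address are illustrative names, and the resulting base 0x555555554000 is computed from this answer's numbers, so it may differ on another machine.

```c
#include <stdint.h>

/* Load base implied by one (runtime address, link-time offset) pair. */
uint64_t load_base(uint64_t runtime_addr, uint64_t offset)
{
    return runtime_addr - offset;
}

/* Runtime address of any other offset once the base is known. */
uint64_t runtime_address(uint64_t base, uint64_t offset)
{
    return base + offset;
}
```

Here load_base(0x555555557fe8, 0x3fe8) is 0x555555554000, and adding the offset of main (0x1150) or of .got (0x3fd0) to it gives 0x555555555150 and 0x555555557fd0, the same addresses GDB reports.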

To finally inspect the address of global_variable itself, we may examine the contents of that memory by issuing x/xag 0x555555557fe8. The xag arguments of the x command deal with the size, format and type of the data being inspected; for an explanation, invoke help x in GDB. On my machine the command returns:

0x555555557fe8: 0x7ffff7fc4020 <global_variable>

On your machine it may only display the address and the data, without the "<global_variable>" helper, which probably comes from an extension I have installed called pwndbg. That is OK, because the value at that address is all we need. We now know that global_variable is located in memory at address 0x7ffff7fc4020. Now we may issue info proc mappings in GDB to find out which address range this address belongs to. My output is pretty long, but among all the ranges listed there is one of interest to us:

0x7ffff7fc4000     0x7ffff7fc5000     0x1000     0x3000 /home/user/test_got/dynamic.so

The address is inside that memory area, and GDB tells us that it comes from the dynamic.so library.
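The same lookup can be done from inside a running process without GDB. On Linux, the kernel exposes a process's mappings in /proc/self/maps, so a program can scan that file for the range containing an address. The following is a rough, Linux-only sketch; mapping_for_address is a hypothetical helper name, and the parsing assumes the usual "start-end perms offset dev inode name" line layout with no spaces in the name.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, Linux-only: find the mapping of the CURRENT process
 * that contains addr by scanning /proc/self/maps. On success returns 1 and
 * copies the mapping's name into path (empty string if it has none);
 * returns 0 if no mapping contains addr or the file cannot be opened. */
int mapping_for_address(const void *addr, char *path, size_t path_len)
{
    unsigned long target = (unsigned long)(uintptr_t)addr;
    char line[1024];
    int found = 0;
    FILE *maps = fopen("/proc/self/maps", "r");

    if (maps == NULL)
        return 0;

    while (!found && fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;
        char name[256] = "";

        /* Assumed line layout: start-end perms offset dev inode [name] */
        if (sscanf(line, "%lx-%lx %*s %*s %*s %*s %255s",
                   &start, &end, name) < 2)
            continue;
        if (target >= start && target < end) {
            snprintf(path, path_len, "%s", name);
            found = 1;
        }
    }
    fclose(maps);
    return found;
}
```

Called from main in test.c as mapping_for_address(&global_variable, buf, sizeof buf), it should report which file backs the variable's address in that particular build, which is the same information read off the info proc mappings output above.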

In case any of the outputs of these commands are different for you (a change in a value is OK; I mean a fundamental difference, such as addresses not belonging to certain address ranges), please provide more information about what exactly you did to come to the conclusion that global_variable is stored in the .data section: what commands you invoked and what output they produced.
