Where do I find the assembly that creates a static variable in the .data section of my C program?
First time poster. 2nd year CS student.

I am exploring the creation of static variables in the .data section of the virtual address space, in the context of a C source -> GCC compilation -> Linux execution environment.

The C program is test.c:
int main()
{
    register int i = 0;
    register int sum = 0;
    static int staticVar[10] = {1,2,3,4,5,6,7,8,9,-1};
Loop:
    sum = sum + staticVar[i]; //optimized away
    i = i+1;
    if(i != 10)
        goto Loop;
    return 0;
}
Asking GDB to 'disass /m' reveals that there is no code for the creation of staticVar[]. Inspecting the .s file shows that the variable resides in the read/write .data segment of the virtual address space, having been placed there at the time of process creation (this process is what I'm interested in).

Examining the output of what I thought was 'readelf -A test.o', the object file contains assembly for what I assume is the creation of the array in the data segment. Here is the ELF output.

(Bonus if you can tell me what command generates this output. I cannot reproduce it using readelf. I picked up the command from a website and saved the output, but I can't remember how it was generated.)
[snip]
00000000 <staticVar.1359>:
   0:   01 00                   add    %eax,(%eax)
   2:   00 00                   add    %al,(%eax)
   4:   02 00                   add    (%eax),%al
   6:   00 00                   add    %al,(%eax)
   8:   03 00                   add    (%eax),%eax
   a:   00 00                   add    %al,(%eax)
   c:   04 00                   add    $0x0,%al
   e:   00 00                   add    %al,(%eax)
  10:   05 00 00 00 06          add    $0x6000000,%eax
  15:   00 00                   add    %al,(%eax)
  17:   00 07                   add    %al,(%edi)
  19:   00 00                   add    %al,(%eax)
  1b:   00 08                   add    %cl,(%eax)
  1d:   00 00                   add    %al,(%eax)
  1f:   00 09                   add    %cl,(%ecx)
  21:   00 00                   add    %al,(%eax)
  23:   00 ff                   add    %bh,%bh
  25:   ff                      (bad)
  26:   ff                      (bad)
  27:   ff                      .byte 0xff
[snip]
Assumptions (please correct): This assembly exists in the executable and is run by load_elf_binary(), or some part of the series of functions initiated by execve(). I have no AT&T syntax knowledge (only basic Intel), but even intuitively I don't see how these instructions can build an array. It looks like they are just adding register values together.

Bottom line: I would like to know as much as possible about the life cycle of this static array, especially where the "missing code" that builds it is and how I can look at it. Or better yet, how can I debug (step through) the loader process? I have tried setting a breakpoint before main at the __start_libc entry (or something like that), but could not identify anything promising in this area.

Links to additional info are great! Thanks for your time!
The initializers for staticVar are stored in the .data section of the executable. Using objdump (e.g. "How can I examine contents of a data section of an ELF file on Linux?") should reveal something like this for your file:
./test: file format elf64-x86-64
Contents of section .data:
00d2c0 00000000 00000000 00000000 00000000 ................
00d2d0 00000000 00000000 00000000 00000000 ................
00d2e0 01000000 02000000 03000000 04000000 ................
00d2f0 05000000 06000000 07000000 08000000 ................
00d300 09000000 ffffffff 00000000 00000000 ................
00d310 00000000 00000000 00000000 00000000 ................
That content from the executable is mapped directly into the address space of your process, so there is no need for any code to create the data. Code that operates on staticVar refers to the content directly using memory addresses; e.g. for the loop you posted, gcc -S gave me this:
18 .L5:
19 0013 90 nop
20 .L2:
21 0014 4863C3 movslq %ebx, %rax
22 0017 8B148500 movl staticVar.1707(,%rax,4), %edx
22 000000
23 001e 8B45F4 movl -12(%rbp), %eax
24 0021 01D0 addl %edx, %eax
25 0023 8945F4 movl %eax, -12(%rbp)
26 0026 83C301 addl $1, %ebx
27 0029 83FB0A cmpl $10, %ebx
28 002c 75E5 jne .L5
The lifetime of this static array is the lifetime of your process, similar to a global variable. In any case, there is no code that builds it. It's just some data in memory.

P.S.: You may need to add volatile to sum, like this: volatile int sum = 0;
Otherwise gcc would probably optimize it away, since the resulting value of sum is never used.
My knowledge of executable formats is limited to old formats that aren't used anymore, but I'm pretty sure the code you're looking for does not exist in the ELF executable itself.

Executable files define sections that are copied or mapped into your process's memory verbatim. Just as your executable doesn't include code to populate the instruction stream of the functions it defines, it does not include code to populate static non-executable data. To that end, remember that there isn't any fundamental difference between data symbols and executable symbols: as far as storage is concerned, they're both data.

Some languages, like C++, allow you to use dynamic initializers. Dynamic initializers run before the entry point of your program and populate data symbols with information that couldn't be inferred at compile time. In that case, yes, there will be code for it. However, statically initialized symbols don't need that; they can be copied or mapped straight into your process's address space.
Look at the data for staticVar more closely. Forget the instructions for a bit; what happens if you put all the bytes next to one another?
01 00 00 00
02 00 00 00
03 00 00 00
04 00 00 00
05 00 00 00
06 00 00 00
07 00 00 00
08 00 00 00
09 00 00 00
ff ff ff ff
This is the hexadecimal representation of the sequence of little-endian integers {1, 2, 3, 4, 5, 6, 7, 8, 9, -1}, laid out in groups of 4 bytes to make it easier to spot.
As zneak answered, there is no code in the object file that initializes the staticVar variable. The actual bytes of the .data section are stored directly in the test.o file. If you define a const variable, gcc will put it into the .rodata section by default.
You most probably obtained the disassembly with objdump --disassemble-all file.o. Maybe the disassembly of data confused you into thinking that it is real code. For normal use I would recommend objdump --disassemble file.o, which disassembles only the sections that contain actual machine code.

You can get detailed relevant information by running: objdump -xdst test.o
The life of staticVar:

1. The compiler places staticVar in the .data section, as it is a static initialized variable. The size of the section is just enough for the variable; the address is not determined yet. See objdump -xdst test.o.
2. The linker determines the address of the .data section (and others), as all the object files are known at that point. Other variables from other object files appear in the .data section as well. See objdump -xdst test.
3. When test is executed, the segments it contains are mapped directly (mmap()) into the address space of the process, so the value of staticVar is read directly from the test binary (possibly only when needed). Then the dynamic linker ld-linux maps shared libraries like libc. See cat /proc/$PID/maps or pmap $PID while the process with $PID is loaded in memory.
Now try changing staticVar to a local non-static variable (i.e. one stored on the stack by default):
int staticVar[10] = {1,2,3,4,5,6,7,8,9,-1};
By default gcc will generate the initialization code directly in the main() function. Just see objdump -xdst test.o. This is because stack variables are allocated (and so their addresses are determined) at run time.
Just to add that on bare-metal hardware (i.e. microcontrollers), where you don't have the benefit of a binary loader, you will see the code that zeroes the .bss section and copies the .data section from read-only flash into RAM.
That code will be something similar to http://repo.or.cz/w/cbaos.git/blob/HEAD:/kernel/init.c#l23 .