Unix Executable File 'Exec', hex dump shows C code, not Assembly

In OS X, I compiled a C program I had written with the command gcc -o binaryoutName inputfile and then ran a hex dump on the resulting binary "Exec" file. As I understand it, an Exec file is a 'UNIX Executable', which is the UNIX equivalent of an executable file.

When I ran the hex dump using the command xxd -b binary, it returned the ASCII content of the binary. However, this ASCII appeared to be the literal C code that I originally wrote in the .c file.

Hex dump extract:

0007c4a: 01101100 01110101 01110011 01101000 00000000 01011111  lush._
0007c50: 01100110 01101111 01110000 01100101 01101110 00000000  fopen.
0007c56: 01011111 01100110 01110000 01110010 01101001 01101110  _fprin
0007c5c: 01110100 01100110 00000000 01011111 01100111 01100101  tf._ge
0007c62: 01110100 01100011 01101000 01100001 01110010 00000000  tchar.
0007c68: 01011111 01100111 01100101 01110100 01100011 01110111  _getcw
0007c6e: 01100100 00000000 01011111 01100111 01100101 01110100  d._get
0007c74: 01100101 01101110 01110110 00000000 01011111 01101100  env._l
0007c7a: 01101111 01100011 01100001 01101100 01110100 01101001  ocalti
0007c80: 01101101 01100101 00000000 01011111 01101101 01100101  me._me
0007c86: 01101101 01100011 01110000 01111001 00000000 01011111  mcpy._
0007c8c: 01110000 01110010 01101001 01101110 01110100 01100110  printf
0007c92: 00000000 01011111 01110000 01110101 01110100 01100011  ._putc
0007c98: 01101000 01100001 01110010 00000000 01011111 01110011  har._s
0007c9e: 01100011 01100001 01101110 01100110 00000000 01011111  canf._
0007ca4: 01110011 01101100 01100101 01100101 01110000 00000000  sleep.

Note that the ASCII translation in the far-right column is extremely similar to the code inside the .c file I initially compiled. This is counter-intuitive, as I expected the hex dump to contain the binary form of the assembly code that the compiler would logically compile it to.

This question is at the very limit of my understanding of the compilation process, and I expect to have a few details wrong, for which I apologise.

My question: Why did the hex dump return ASCII for C code instead of assembly?

Thanks in advance.

I believe what you saw is the .strtab section (or a similar section) of your executable or object file, which does contain strings.

For example, for the following C program:

#include <stdio.h>

int main(void)
{
    printf("Hello world!\n");
}

Compiled with the following command:

gcc -Wall -g -std=c11 c00.c

If we hex dump it, we will find something like:

$ xxd a.out
...
00022e0: 0000 0000 0000 0000 0063 7274 7374 7566  .........crtstuf
00022f0: 662e 6300 5f5f 4a43 525f 4c49 5354 5f5f  f.c.__JCR_LIST__
0002300: 0064 6572 6567 6973 7465 725f 746d 5f63  .deregister_tm_c
0002310: 6c6f 6e65 7300 7265 6769 7374 6572 5f74  lones.register_t
0002320: 6d5f 636c 6f6e 6573 005f 5f64 6f5f 676c  m_clones.__do_gl
...

We can look up the section information with:

$ readelf -WS a.out 
...
  [34] .strtab           STRTAB          0000000000000000 0022e8 000235 00      0   0  1

Notice that the offset of .strtab is 0x0022e8, which matches what we saw in the output of xxd.

What you see is not C code but the symbol table of your executable or object file (symbols that have external linkage). An object file or executable is organised into different sections (Linux uses the ELF format, for example): symbol table, global variables, code, etc. The compiler generates the symbol table for purposes such as producing linkable files or easing debugging.

In an executable, these symbols are not mandatory, and you can easily remove them with the strip command. If you strip an object file, however, you will be unable to link it.

You can display the contents of the symbol table in a more readable form with commands such as nm.

Read the online manuals for the strip and nm commands, and about how compilers link object files...
