
How to get the minimum executable opcodes for a C program?

To get the opcodes, the author here does the following:

[bodo@bakawali testbed8]$ as testshell2.s -o testshell2.o
[bodo@bakawali testbed8]$ ld testshell2.o -o testshell2
[bodo@bakawali testbed8]$ objdump -d testshell2

and then he gets three sections (or mentions only these three):

  • <_start>

  • <starter>

  • <ender>

I have tried to get the hex opcodes the same way but cannot get ld to link correctly. Of course I can produce a .o and a prog file, for example with:

gcc main.o -o prog -g

however, when I run

objdump --prefix-addresses --show-raw-insn -Srl prog

to see the complete code with annotations and symbols, I get many additional sections, for example:

  • .init

  • .plt

  • .text (yes, I know, main is here) [many parts here: _start(), call_gmon_start(), __do_global_dtors_aux(), frame_dummy(), main(), __libc_csu_init(), __libc_csu_fini(), __do_global_ctors_aux()]

  • .fini

I assume these are additions introduced by gcc linking against the runtime libraries. I think I don't need all of these sections to call the opcodes from C code (the author uses only those three sections); however, my problem is that I don't know exactly which ones I can discard and which are necessary. I want to use it like this:

#include <unistd.h>

char code[] = "\x31\xed\x49\x89\x...x00\x00";

int main(int argc, char **argv)
{
/*creating a function pointer*/
int (*func)();
func = (int (*)()) code;
(int)(*func)();

return 0;
} 

so I have created this:

#include <unistd.h>
/*
 * 
 */
int main() {

    char *shell[2];

    shell[0] = "/bin/sh";
    shell[1] = NULL;
    execve(shell[0], shell, NULL);

    return 0;
}

and I disassembled it as described. I tried to use the opcodes from main() in .text, which gave me a segmentation fault, then main() plus _start() from .text, with the same result.

So, what should I choose from the sections above, or how can I generate a "prog" that is as minimal as the one with three sections?

char code[] = "\x31\xed\x49\x89\x...x00\x00";

This will not work.

Reason: The code definitely contains addresses, mainly the address of the function execve() and the address of the string constant "/bin/sh".

The executable using the "code[]" approach will not contain the string constant "/bin/sh" at all, and the address of the function execve() will be different (if the function is linked into the executable at all).

Therefore the "call" instruction that was meant to reach the "execve()" function will jump to some arbitrary location in the executable using the "code[]" approach.

Some theory about executables - just for your information:

There are two possibilities for executables:

  • Statically linked: These executables contain all necessary code. Therefore they do not access dynamic libraries like "libc.so".
  • Dynamically linked: These executables do not contain code that is frequently used. Such code is stored in files common to all executables: the dynamic libraries (e.g. "libc.so").

When the same C code is used, statically linked executables are much bigger than dynamically linked executables because all C functions used (e.g. "printf", "execve", ...) must be bundled into the executable.

When not using any of these library functions, statically linked executables are simpler and therefore easier to understand.

Statically linked executable behaviour

A statically linked executable is loaded into memory by the operating system (when it is started using execve()). The executable contains an entry point address. This address is stored in the file header of the executable. You can see it using "objdump -f ..." (the "-h" option lists the section headers instead).

The operating system performs a jump to that address, so program execution starts there. The address is typically that of the function "_start"; however, this can be changed using command line options when linking with "ld".

The code at "_start" will prepare the executable (e.g. initialize variables, calculate the values for "argc" and "argv", ...) and call the "main()" function. When "main()" returns, the "_start" function will pass the value returned by "main()" to the "_exit()" function.

Dynamically linked executable behaviour

Such executables contain two additional sections. The first section contains the file name of the dynamic linker (for example "/lib/ld-linux.so.1"). The operating system will then load the executable and the dynamic linker and jump to the entry point of the dynamic linker (and not to that of the executable).

The dynamic linker will read the second additional section: it contains information about the dynamic libraries (e.g. "libc.so") required by the executable. It will load all these libraries and initialize a lot of variables. Then it calls the initialization function ("_init()") of all libraries and of the executable.

Note that both the operating system and the dynamic linker ignore the function and section names! The address of the entry point is taken from the file header and the addresses of the "_init()" functions are taken from the additional section - the functions may be named differently!

When all this is done, the dynamic linker will jump to the entry point ("_start") of the executable.

About the "GOT", "PLT", ... sections:

These sections contain information about the addresses where the dynamic libraries have been loaded by the dynamic linker. The "PLT" section contains wrapper code consisting of jumps into the dynamic libraries. This means: the "PLT" section will contain a function "printf()" that actually does nothing but jump to the "printf()" function in "libc.so". This is done because directly calling a function in a dynamic library from C code would make linking much more difficult, so C code does not call functions in a dynamic library directly. Another advantage of this implementation is that "lazy linking" is possible.

Some words about Windows

Windows only knows dynamically linked executables. Windows XP even refused to load an executable that did not require any DLLs. The "dynamic linker" is integrated into the operating system and is not a separate file. There is also an equivalent of the "PLT" section. However, many compilers support calling DLL code "directly" from C code without going through the code in the PLT section first (theoretically this would also be possible under Linux). Lazy linking is not supported.

You should read this article: http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html . It explains in great detail everything you need to create a really tiny program.
