How to 'link' object file to executable/compiled binary?

Problem

I wish to inject an object file into an existing binary. As a concrete example, consider a source file Hello.c:

#include <stdlib.h>

int main(void)
{
    return EXIT_SUCCESS;
}

It can be compiled to an executable named Hello with gcc -std=gnu99 -Wall Hello.c -o Hello. Now consider Embed.c:

void func1(void)
{
}

An object file Embed.o can be created from this with gcc -c Embed.c. My question is how to generically insert Embed.o into Hello in such a way that the necessary relocations are performed and the appropriate ELF internal tables (e.g. symbol table, PLT, etc.) are patched properly.


Assumptions

It can be assumed that the object file to be embedded has its dependencies statically linked already. Any dynamic dependencies, such as the C runtime, can be assumed to be present in the target executable as well.


Current Attempts/Ideas

  • Use libbfd to copy sections from the object file into the binary. The progress I have made with this is that I can create a new object with the sections from the original binary and the sections from the object file. The problem is that since the object file is relocatable, its sections cannot be copied properly to the output without performing the relocations first.
  • Convert the binary back to an object file and relink with ld. So far I have tried using objcopy to perform the conversion: objcopy --input elf64-x86-64 --output elf64-x86-64 Hello Hello.o. Evidently this does not work as I intend, since ld -o Hello2 Embed.o Hello.o then results in ld: error: Hello.o: unsupported ELF file type 2. I guess this should be expected, though, since Hello is not an object file.
  • Find an existing tool which performs this sort of insertion?
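The "unsupported ELF file type 2" message in the second attempt refers to the e_type field of the ELF header, where 1 is ET_REL (relocatable object), 2 is ET_EXEC (executable) and 3 is ET_DYN (shared object). objcopy keeps that field, so Hello.o is still an executable as far as ld is concerned. Below is a minimal sketch that reads the field from an ELF image held in memory. The helper name elf_file_type and the enum names are hypothetical, and the code parses the header bytes by hand instead of using any particular library.

```c
#include <stddef.h>
#include <stdint.h>

/* e_type values defined by the ELF specification. */
enum { ELF_ET_REL = 1, ELF_ET_EXEC = 2, ELF_ET_DYN = 3 };

/* Returns the e_type field of an ELF image held in memory, or -1 if the
 * buffer is too short, lacks the ELF magic, or has an unknown byte order.
 * elf_file_type is a hypothetical helper name. */
int elf_file_type(const uint8_t *buf, size_t len)
{
    /* e_ident occupies bytes 0..15; e_type is the 16-bit field after it. */
    if (len < 18)
        return -1;
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return -1;

    switch (buf[5]) {                              /* e_ident[EI_DATA] */
    case 1:  return buf[16] | (buf[17] << 8);      /* ELFDATA2LSB */
    case 2:  return (buf[16] << 8) | buf[17];      /* ELFDATA2MSB */
    default: return -1;
    }
}
```

Passing the first 18 bytes of Hello and of Embed.o to this function should give ELF_ET_EXEC (or ELF_ET_DYN for a position-independent executable) and ELF_ET_REL respectively, which is the distinction ld is complaining about.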

Rationale (Optional Read)

I am making a static executable editor, where the vision is to allow the instrumentation of arbitrary user-defined routines into an existing binary. This will work in two steps:

  1. The injection of an object file (containing the user-defined routines) into the binary. This is a mandatory step and cannot be worked around by alternatives such as injecting a shared object instead.
  2. Performing static analysis on the new binary and using this to statically detour routines from the original code to the newly added code.

I have, for the most part, already completed the work necessary for step 2, but I am having trouble with the injection of the object file. The problem is definitely solvable, given that other tools use the same method of object injection (e.g. EEL).

If it were me, I'd look to build Embed.c into a shared object, libembed.so, like so:

gcc -Wall -shared -fPIC -o libembed.so Embed.c

That should create a shared object from Embed.c. With that, you can force your target binary to load this shared object by setting the environment variable LD_PRELOAD when running it (LD_PRELOAD is documented in the ld.so manual page):

LD_PRELOAD=/path/to/libembed.so ./Hello

The "trick" here will be to figure out how to do your instrumentation, especially considering it's a static executable. There, I can't help you, but this is one way to get code into a process's memory space. You'll probably want to do some sort of initialization in a constructor, which you can do with an attribute (if you're using gcc, at least):

void __attribute__ ((constructor)) my_init()
{
    // put code here!
}
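To make the constructor's effect observable, here is a minimal sketch of what a library source along these lines could look like. The flag embed_initialized and the accessor embed_is_initialized are hypothetical names added for illustration; only the constructor attribute itself comes from the answer above.

```c
/* Hypothetical flag, not in the original answer: records that the
 * constructor ran, so the effect can be observed from other code. */
int embed_initialized = 0;

/* GCC and Clang run functions marked with the constructor attribute when
 * the containing object is loaded, before the host program's main(). */
__attribute__((constructor))
static void my_init(void)
{
    embed_initialized = 1;
}

/* Hypothetical accessor: reports whether my_init() has already run. */
int embed_is_initialized(void)
{
    return embed_initialized;
}
```

Built with the gcc -Wall -shared -fPIC command shown earlier and preloaded through LD_PRELOAD, my_init would run as soon as the dynamic loader maps the library, so any code called later in the process sees the flag set.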

Assuming the source code for the first executable is available and it is compiled with a linker script that allocates space for later object file(s), there is a relatively simple solution. Since I am currently working on an ARM project, the examples below are compiled with the GNU ARM cross-compiler.

The primary source code file, hello.c,

#include <stdio.h>

int main ()
{

   return 0;
}

is built with a simple linker script that allocates space for an object to be embedded later:

SECTIONS
{
    .text :
    {
        KEEP (*(embed)) ;

        *(.text .text*) ;
    }
}

Like so:

arm-none-eabi-gcc -nostartfiles -Ttest.ld -o hello hello.c
readelf -s hello

Num:    Value  Size Type    Bind   Vis      Ndx Name
 0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
 1: 00000000     0 SECTION LOCAL  DEFAULT    1 
 2: 00000000     0 SECTION LOCAL  DEFAULT    2 
 3: 00000000     0 SECTION LOCAL  DEFAULT    3 
 4: 00000000     0 FILE    LOCAL  DEFAULT  ABS hello.c
 5: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $a
 6: 00000000     0 FILE    LOCAL  DEFAULT  ABS 
 7: 00000000    28 FUNC    GLOBAL DEFAULT    1 main

Now let's compile the object to be embedded, whose source is in embed.c:

void func1()
{
   /* Something useful here */
}

Recompile with the same linker script, this time inserting the new symbols:

arm-none-eabi-gcc -c embed.c
arm-none-eabi-gcc -nostartfiles -Ttest.ld -o new_hello hello embed.o

See the results:

readelf -s new_hello
Num:    Value  Size Type    Bind   Vis      Ndx Name
 0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
 1: 00000000     0 SECTION LOCAL  DEFAULT    1 
 2: 00000000     0 SECTION LOCAL  DEFAULT    2 
 3: 00000000     0 SECTION LOCAL  DEFAULT    3 
 4: 00000000     0 FILE    LOCAL  DEFAULT  ABS hello.c
 5: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $a
 6: 00000000     0 FILE    LOCAL  DEFAULT  ABS 
 7: 00000000     0 FILE    LOCAL  DEFAULT  ABS embed.c
 8: 0000001c     0 NOTYPE  LOCAL  DEFAULT    1 $a
 9: 00000000     0 FILE    LOCAL  DEFAULT  ABS 
10: 0000001c    20 FUNC    GLOBAL DEFAULT    1 func1
11: 00000000    28 FUNC    GLOBAL DEFAULT    1 main

You must make room for the relocatable code in the executable by extending the executable's text segment, just like a virus infection does. Then, after writing the relocatable code into that space, update the symbol table by adding symbols for everything in that relocatable object, and apply the necessary relocation computations. I've written code that does this pretty well with 32-bit ELF files.
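The relocation step can be sketched for the two most common i386 relocation types. This is not the answerer's code; it is an illustration that assumes REL-style entries (the addend A is stored in the field being patched), a little-endian target, and that the section has already been copied to the start of a buffer whose final load address is known. R_386_32 resolves to S + A and R_386_PC32 to S + A - P, where S is the symbol's final address and P is the address of the patched field. The function name apply_rel32 and its parameter layout are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Read/write a 32-bit little-endian word, independent of host byte order. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Apply one i386 REL-style relocation to code already copied into `image`.
 *   image_base:  virtual address at which image[0] will be loaded
 *   r_offset:    offset of the 4-byte field to patch, relative to image[0]
 *   sym_addr:    final virtual address of the referenced symbol (S)
 *   pc_relative: nonzero for R_386_PC32 (S + A - P), zero for R_386_32 (S + A)
 * Returns 0 on success, -1 if the field lies outside the image. */
int apply_rel32(uint8_t *image, size_t image_len, uint32_t image_base,
                uint32_t r_offset, uint32_t sym_addr, int pc_relative)
{
    if (image_len < 4 || r_offset > image_len - 4)
        return -1;

    uint32_t addend = load_le32(image + r_offset);   /* A: implicit addend */
    uint32_t place = image_base + r_offset;          /* P: address of field */
    uint32_t value = sym_addr + addend;              /* S + A */
    if (pc_relative)
        value -= place;                              /* S + A - P */

    store_le32(image + r_offset, value);
    return 0;
}
```

For example, a call instruction e8 fc ff ff ff placed at address 0x1000 and relocated against a symbol at 0x2000 ends up with the displacement 0x00000ffb, which is the distance from the end of the instruction (0x1005) to the target. A full injector would loop over every entry in the object's relocation sections and would need the equivalent formulas for each relocation type the compiler emits.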

The problem is that .o files are not fully linked yet, and most references are still symbolic. Binaries (shared libraries and executables) are one step closer to finally linked code.

Doing the linking step to a shared lib doesn't mean you must load it via the dynamic lib loader. The suggestion is rather that writing your own loader for a binary or shared lib might be simpler than writing one for a .o.

Another possibility would be to customize that linking process yourself: call the linker and have it link the code to be loaded at some fixed address. You might also look at how e.g. bootloaders are prepared, which also involves a basic linking step to do exactly this (fixing a piece of code to a known load address).

If you don't link to a fixed address and want to relocate at runtime, you will have to write a basic linker that takes the object file and relocates it to the destination address by doing the appropriate fixups.

I assume you already have it, seeing as this is your master's thesis, but this book: http://www.iecc.com/linker/ is the standard introduction to the subject.

Have you looked at the DyninstAPI? It appears support was recently added for linking a .o into a static executable.

From the release site:

Binary rewriter support for statically linked binaries on x86 and x86_64 platforms

You cannot do this in any practical way. The intended solution is to make that object into a shared lib and then call dlopen on it.
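For completeness, the dlopen route could look roughly like the sketch below. It assumes the object has been built into a shared library that exports func1 (as in the question's Embed.c); the wrapper name call_func1 is hypothetical. Note that this needs the host program to be dynamically linked, which is exactly the constraint the question is trying to avoid.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Load the shared object at `path`, look up `func1` and call it.
 * Returns 0 on success, -1 if loading or symbol lookup fails.
 * call_func1 is a hypothetical wrapper name. */
int call_func1(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    /* POSIX-documented idiom for turning dlsym's void * into a
     * function pointer without a direct cast. */
    void (*fn)(void);
    *(void **)(&fn) = dlsym(handle, "func1");
    if (fn == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlsym: %s\n", err ? err : "func1 resolved to NULL");
        dlclose(handle);
        return -1;
    }

    fn();
    dlclose(handle);
    return 0;
}
```

A caller would use it as call_func1("/path/to/libembed.so"). On older glibc versions the program has to be linked with -ldl.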
