Writing a custom loader in C and assembly for x64 on Linux

Question

I'd like to write my own loader for binary code on x64 Linux. Eventually I want to perform the linking step myself so that I can call code from .o object files. For now, though, I just want to call a function from an executable binary that has already been linked.

To create a function that should be callable from "outside", I started with the following piece of source code:

void foo(void)
{
  int a = 2;
  int b = 3;
  a + b;
}

int main(void)
{
  foo();
  return 0;
}

It is the foo() function that I want to call using my loader. Using the following chain of commands

gcc -o /tmp/main main.c
strip -s /tmp/main
objdump -D /tmp/main

I obtained the assembly code of the foo() function, which looks like this:

...
0000000000001125 <foo>:
    1125:   55                      push   %rbp
    1126:   48 89 e5                mov    %rsp,%rbp
    1129:   c7 45 fc 02 00 00 00    movl   $0x2,-0x4(%rbp)
    1130:   c7 45 f8 03 00 00 00    movl   $0x3,-0x8(%rbp)
    1137:   90                      nop
    1138:   5d                      pop    %rbp
    1139:   c3                      retq
...

That means that the foo() function starts at offset 0x1125 in main. I verified this using a hex editor.
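As a side note, objdump prints virtual addresses, which only coincide with file offsets when the containing segment is mapped that way (as appears to be the case here). The following is a minimal sketch of how an address could be translated to a file offset by walking the ELF program headers. The helper name vaddr_to_file_offset is hypothetical, and the code assumes a 64-bit ELF image that has already been read into memory on a Linux system providing <elf.h>.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: translate a virtual address (as printed by objdump)
 * into an offset within the ELF file image. Returns -1 if the image is not
 * a 64-bit ELF file or no PT_LOAD segment backs the address. */
int64_t vaddr_to_file_offset(const unsigned char *image, size_t size,
                             uint64_t vaddr)
{
    Elf64_Ehdr eh;
    if (size < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);            /* avoids unaligned access */
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;

    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        uint64_t pos = eh.e_phoff + (uint64_t)i * eh.e_phentsize;
        Elf64_Phdr ph;
        if (pos > size || size - pos < sizeof ph)
            return -1;                        /* header table is truncated */
        memcpy(&ph, image + pos, sizeof ph);
        /* Only the file-backed part of a loadable segment has an offset. */
        if (ph.p_type == PT_LOAD &&
            vaddr >= ph.p_vaddr && vaddr - ph.p_vaddr < ph.p_filesz)
            return (int64_t)(ph.p_offset + (vaddr - ph.p_vaddr));
    }
    return -1;
}
```

With the file contents in a buffer, a call such as vaddr_to_file_offset(buffer, filelen, 0x1125) would yield the offset to add to the buffer pointer, instead of hard-coding the value read from the disassembly.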

The following is my loader. There is no error handling yet and the code is very ugly. However, it should demonstrate what I want to achieve:

#include <stdio.h>
#include <stdlib.h>

typedef void(*voidFunc)(void);

int main(int argc, char* argv[])
{
  FILE *fileptr;
  char *buffer;
  long filelen;
  voidFunc mainFunc;

  fileptr = fopen(argv[1], "rb");  // Open the file in binary mode
  fseek(fileptr, 0, SEEK_END);          // Jump to the end of the file
  filelen = ftell(fileptr);             // Get the current byte offset in the file
  rewind(fileptr);                      // Jump back to the beginning of the file

  buffer = (char *)malloc((filelen+1)*sizeof(char)); // Enough memory for file + \0
  fread(buffer, filelen, 1, fileptr); // Read in the entire file
  fclose(fileptr); // Close the file

  mainFunc = ((voidFunc)(buffer + 0x1125));

  mainFunc();

  free(buffer);

  return 0;
}

Executing this program as objloader /tmp/main results in a segfault.

The mainFunc variable points to the correct place. I verified this using gdb.

Is it a problem that the opcode lives on the heap? I deliberately made the function I want to call as simple as possible (no side effects, no stack or registers required for function parameters, ...). But still, there is something I don't really get.

Can anyone please point me in the right direction here? Any hints on helpful literature in that regard are also highly appreciated!

Answer 1

In order to make the buffer memory region executable, you will have to use mmap. Try

#include <sys/mman.h>
...
buffer = (char *)mmap(NULL, filelen /* + 1? Not sure why. */, PROT_EXEC | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

That should give the memory region the permissions you want and have it work with the surrounding code. In fact, if you want to use mmap the way it was meant to be used, go for

int fd = open(argv[1], O_RDONLY);
struct stat myfilestats;
fstat(fd, &myfilestats);
buffer = (char*)mmap(NULL, myfilestats.st_size, PROT_EXEC, MAP_PRIVATE, fd, 0);
close(fd); /* fd is a descriptor from open(), so close() applies, not fclose() */
...
munmap(buffer, myfilestats.st_size);

Using MAP_ANONYMOUS leaves the memory region unassociated with any file descriptor, but the idea is that if the region represents a file, the file descriptor should be associated with it. When you do this, Linux can do all kinds of cool tricks, such as loading only the parts of the file that you actually end up accessing (lazy loading also keeps the program responsive when the file is large), and if multiple programs access the same file, they can all share the same physical memory.
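For the anonymous-mapping variant from the start of this answer, a slightly more cautious pattern is to map the region writable first, copy the code in, and only then switch it to executable with mprotect, so the memory is never writable and executable at the same time. The sketch below illustrates that idea. The helper name copy_to_exec_memory is hypothetical, and some hardened systems may still refuse to make anonymous memory executable.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS under strict -std= */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: copy `len` bytes of machine code into a fresh
 * anonymous mapping, then flip it from writable to executable.
 * Returns NULL on failure; release the result with munmap(ptr, len). */
void *copy_to_exec_memory(const void *code, size_t len)
{
    if (len == 0)
        return NULL;               /* mmap rejects zero-length mappings */

    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    memcpy(mem, code, len);

    /* Drop write permission and add execute permission in one step. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return NULL;
    }
    return mem;
}
```

On x86-64, the bytes b8 2a 00 00 00 c3 encode "mov $42,%eax; ret", so passing them to this helper and calling the result through an int (*)(void) pointer is a quick way to check that the region really is executable.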

Answer 2

This is the final version of my 'loader', which is based on Nicholas Pipiton's answer. Again: no error handling, simplified, not considering that real-world scenarios are much more difficult, etc.:

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

#include <stdlib.h>

typedef void(*voidFunc)(void);

int main(int argc, char* argv[])
{
  char* buffer;
  voidFunc mainFunc;
  struct stat myfilestats;
  int fd;

  fd = open(argv[1], O_RDONLY);
  fstat(fd, &myfilestats);
  buffer = mmap(NULL, myfilestats.st_size, PROT_EXEC, MAP_PRIVATE, fd, 0);
  close(fd);

  mainFunc = ((voidFunc)(buffer + 0x1125));

  mainFunc();

  munmap(buffer, myfilestats.st_size);

  return EXIT_SUCCESS;
}
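Since the version above skips error handling entirely, here is a hedged sketch of what the mapping step might look like with the basic checks added. The helper name map_file_exec is made up for illustration; it is one possible way to wrap the same open/fstat/mmap/close sequence, and it adds PROT_READ alongside PROT_EXEC so the mapping is explicitly readable.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a whole file as readable and executable.
 * Returns NULL on any failure; on success stores the mapped length in
 * *len_out so the caller can later call munmap(ptr, *len_out). */
void *map_file_exec(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);                 /* empty files cannot be mapped */
        return NULL;
    }

    void *mem = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_EXEC,
                     MAP_PRIVATE, fd, 0);
    close(fd);                     /* the mapping outlives the descriptor */
    if (mem == MAP_FAILED)         /* mmap signals failure with MAP_FAILED */
        return NULL;

    *len_out = (size_t)st.st_size;
    return mem;
}
```

The loader's main could then check argc, call map_file_exec(argv[1], &len), and bail out with a message if the result is NULL, before computing the function pointer from the returned base address.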