Linux: is it possible to share code between processes?

Question

I wonder if it's possible for a Linux process to call code located in the memory of another process?

Let's say we have a function f() in process A and we want process B to call it. What I thought about is using mmap with the MAP_SHARED and PROT_EXEC flags to map the memory containing the function code and pass the pointer to B, assuming that f() will not call any other function from the A binary. Will it ever work? If yes, then how do I determine the size of f() in memory?

=== EDIT ===

I know that shared libraries do exactly that, but I wonder if it's possible to dynamically share code between processes.

Answer 1

Yes, you can do that, but the first process must first have created the shared memory via mmap, backed by either a memory-mapped file or a shared memory object created with shm_open.

If you are sharing compiled code, then that's what shared libraries were created for. You can link against them in the ordinary way and the sharing will happen automatically, or you can load them manually using dlopen (e.g. for a plugin).

Update:

As the code has been generated by a compiler, you will have relocations to worry about. The compiler does not produce code that will Just Work anywhere. It expects that the .data section is in a certain place, and that the .bss section has been zeroed. The GOT will need to be populated. Any static constructors will have to be called.

In short, what you want is probably dlopen. This system allows you to open a shared library as if it were a file, and then extract function pointers by name. Each program that dlopens the library will share the code sections, thus saving memory, but each will have its own copy of the data section, so they do not interfere with each other.

Beware that you need to compile your library code with -fPIC or else you won't get any code sharing either (actually, the linkers and dynamic loaders for many architectures probably don't support libraries that aren't PIC anyway).

Answer 2

The standard approach is to put the code of f() in a shared library libfoo.so. Then you could either link to that library (e.g. by building program A with gcc -Wall a.c -lfoo -o a.bin), or load it dynamically (e.g. in program B) using dlopen(3) and then retrieve the address of f using dlsym.

When you compile a shared library you want to:

compile each source file foo1.c with gcc -Wall -fPIC -c foo1.c -o foo1.pic.o into position independent code, and likewise foo2.c into foo2.pic.o
link all of them into libfoo.so with gcc -Wall -shared foo*.pic.o -o libfoo.so; notice that you can link additional shared libraries into libfoo.so (e.g. by appending -lm to the linking command)

See also the Program Library Howto.

You could play insane tricks by mmap-ing some other /proc/1234/mem but that is not reasonable at all. Use shared libraries.

PS. You can dlopen a great many (hundreds of thousands of) shared object lib*.so files; you may want to dlclose them (but in practice you don't have to).

Answer 3

It would be possible to do so, but that's exactly what shared libraries are for.

Also, beware that you need to check that the address of the shared memory is the same for both processes, otherwise any references that are "absolute" (that is, a pointer to something in the shared code) will point to the wrong place. And as with shared libraries, the bitness of the code will have to be the same, and as with all shared memory, you need to make sure that you don't "mess up" the other process if you modify any of the shared memory.

Determining the size of a function ranges from "hard" to "nearly impossible", depending on the actual code generated and the level of information you have available. Debug symbols will have the size of a function, but beware that I have seen compilers generate code where two functions share the same "return" piece of code (that is, the compiler generates a jump to another function that has the same bit of code to return the result, because it saves a few bytes of code, and there was already going to be a jump anyway [e.g. there is an if/else that the compiler has to jump around]).

Answer 4

not directly
that's what shared libraries are for
relocations

Oh no! Anyways...

Here's the insane, unreasonable, not-good, purely academic demonstration of this capability. It was fun for me, I hope it's fun for you.

Overview

Program A will use shm_open to create a shared memory object, and mmap to map it into its memory space. Then it will copy some code from a function defined in A to the shared memory. Then program B will open up the shared memory, execute the function, and just for kicks, make a very simple modification to the code. Then A will execute the code to demonstrate that the change took effect.

Again, this is no recommendation for how to solve a problem, it's an academic demonstration.

// A.c
#include <stdio.h>
#include <string.h>

#include <unistd.h>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>

int foo(int y) {
  int x = 14;
  return x + y;
}

int main(int argc, char *argv[]) {
  const size_t mem_size = 0x1000;
  // create shared memory objects
  int shared_fd = shm_open("foobar2", O_RDWR | O_CREAT, 0777);
  ftruncate(shared_fd, mem_size);
  void *shared_mem =
      mmap(NULL, mem_size, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_SHARED, shared_fd, 0);
  // copy function to shared memory
  const size_t fn_size = 24;
  memcpy(shared_mem, &foo, fn_size);
  // wait
  getc(stdin);
  // execute the shared function
  int(*shared_foo)(int) = shared_mem;
  printf("shared_foo(3) = %d\n", shared_foo(3));
  // clean up
  shm_unlink("foobar2");
}

gcc A.c -lrt -o A

The constant fn_size was determined by looking at the output of objdump -dj .text A

...
000000000000088a <foo>:
 88a:   55                      push   %rbp
 88b:   48 89 e5                mov    %rsp,%rbp
 88e:   89 7d ec                mov    %edi,-0x14(%rbp)
 891:   c7 45 fc 0e 00 00 00    movl   $0xe,-0x4(%rbp)
 898:   8b 55 fc                mov    -0x4(%rbp),%edx
 89b:   8b 45 ec                mov    -0x14(%rbp),%eax
 89e:   01 d0                   add    %edx,%eax
 8a0:   5d                      pop    %rbp
 8a1:   c3                      retq   
...

That's 24 bytes: the listing runs from 0x88a through the retq at 0x8a1, and 0x8a2 - 0x88a = 0x18. I guess I could copy more than that and it would do the same thing. Anything shorter and I'll probably get an exception from the processor. Also, note that the value of x from foo (14, which is 0e 00 00 00 in little-endian) is located at foo + 10. This will be the constant x_offset in program B.

// B.c
#include <stdio.h>

#include <unistd.h>

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>

const int x_offset = 10;

int main(int argc, char *argv[]) {
  // open the shared memory object that A created
  int shared_fd = shm_open("foobar2", O_RDWR | O_CREAT, 0777);
  void *shared_mem = mmap(NULL, 0x1000, PROT_EXEC | PROT_WRITE, MAP_SHARED, shared_fd, 0);
  // call the code that A copied into the shared memory
  int (*shared_foo)(int) = shared_mem;
  int z = shared_foo(13);
  printf("result: %d\n", z);
  // overwrite the constant x inside the shared code
  int *x_p = (int*)((char*)shared_mem + x_offset);
  *x_p = 100;
  // no shm_unlink here: A removes "foobar2" when it finishes
}

Anyways, first I run A, then I run B. The output of B is:

result: 27

Then I go back to A and push enter, and I get:

shared_foo(3) = 103

Good enough for me.

/dev/shm/foobar2

To completely eliminate the mystique of all this, after running A you can do something like

xxd /dev/shm/foobar2 | vim -

Then, edit that constant 0e 00 00 00 just like before, then save the file with the ol'

:w !xxd -r > /dev/shm/foobar2

and push enter in A and see similar results as above.

4 answers

Answer 1
5 2013-02-27 14:15:36

Answer 2
4 2013-02-27 14:15:38

Answer 3
2 2013-02-27 14:38:41

Answer 4
0 2020-11-15 21:23:52
