
GCC -fPIC option

I have read about GCC's Options for Code Generation Conventions, but could not understand what "Generate position-independent code (PIC)" does. Please give an example that explains what it means.

Position-independent code means that the generated machine code does not depend on being located at a specific address in order to work.

E.g. jumps would be generated as relative rather than absolute.

Pseudo-assembly:

PIC: This would work whether the code was at address 100 or 1000

100: COMPARE REG1, REG2
101: JUMP_IF_EQUAL CURRENT+10
...
111: NOP

Non-PIC: This will only work if the code is at address 100

100: COMPARE REG1, REG2
101: JUMP_IF_EQUAL 111
...
111: NOP
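
The difference between the two listings can also be modelled in a few lines of C. This is only a toy sketch of the pseudo-assembly above, not real machine code; the function names are made up for illustration.

```c
#include <stdint.h>

/* Toy model of the pseudo-assembly above; not real machine code. */

/* PIC-style jump: the instruction stores only an offset, and the
   target is computed from the address of the jump itself. */
uint32_t relative_jump_target(uint32_t jump_address, int32_t offset)
{
    return jump_address + (uint32_t)offset;
}

/* Non-PIC-style jump: the instruction stores the final address, so the
   target is the same no matter where the code was actually loaded. */
uint32_t absolute_jump_target(uint32_t jump_address, uint32_t stored_target)
{
    (void)jump_address; /* the load position plays no role */
    return stored_target;
}
```

With the jump at address 101, both variants land on 111. If the same code sits at 1000 instead, the jump is at 1001: the relative version lands on 1011, where the NOP now is, while the absolute version still points at 111.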

EDIT: In response to comment.

If your code is compiled with -fPIC, it is suitable for inclusion in a library. A library must be relocatable from its preferred location in memory to another address, because another already-loaded library may occupy the address your library prefers.

I'll try to explain what has already been said in a simpler way.

Whenever a shared library is loaded, the loader (the OS code that loads any program you run) changes some addresses in the code depending on where the object was loaded.

In the above example, the "111" in the non-PIC code is written by the loader when the code is first loaded.

For non-shared objects, you may want it to be like that, because the compiler can make some optimizations on that code.

For a shared object, if another process wants to "link" to that code, it must map it at the same virtual address, or the "111" will make no sense. But that range of virtual address space may already be in use in the second process.

Code that is built into shared libraries should normally be position-independent code, so that the shared library can readily be loaded at (more or less) any address in memory. The -fPIC option ensures that GCC produces such code.
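
As a minimal sketch of what that looks like in practice, here is a small library source file with typical build commands in the leading comment. The file name counter.c, the library name libcounter.so and the function counter_next are hypothetical, chosen only for this example.

```c
/* counter.c: hypothetical source file for a small shared library.
 *
 * Typical build steps with GCC on Linux:
 *   gcc -fPIC -c counter.c -o counter.o
 *   gcc -shared counter.o -o libcounter.so
 */

/* State owned by the library. In PIC, code reaches it without
   embedding its absolute address in the instructions. */
static int call_count = 0;

/* Exported function: returns how many times it has been called. */
int counter_next(void)
{
    call_count += 1;
    return call_count;
}
```

The source itself does not change for PIC; only the compiler flag does. The first call to counter_next() returns 1, the second returns 2, wherever the library ends up in memory.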

Adding further...

Every process has the same virtual address space layout (if randomization of virtual addresses is disabled with a flag on Linux; for more details, see "Disable and re-enable address space layout randomization only for myself").

So if it is one executable with no shared linking (a hypothetical scenario), then we can always give the same virtual address to the same asm instruction without any harm.

But when we want to link a shared object to the executable, we cannot be sure of the start address assigned to the shared object, as it depends on the order in which the shared objects were linked. That being said, an asm instruction inside a .so can have a different virtual address depending on the process it is linked into.

So one process can give the .so a start address of 0x45678910 in its own virtual address space while another process, at the same time, gives it a start address of 0x12131415. If the code does not use relative addressing, the .so will not work at all.

So it always has to use relative addressing, hence the -fPIC option.

A reference to a function in a dynamic library is resolved when the library is loaded or at run time. Therefore, both the executable file and the dynamic library are loaded into memory when the program is run. The memory address at which a dynamic library is loaded cannot be determined in advance, because a fixed address might clash with another dynamic library requiring the same address.

There are two commonly used methods for dealing with this problem:

1. Relocation. All pointers and addresses in the code are modified, if necessary, to fit the actual load address. Relocation is done by the linker and the loader.

2. Position-independent code. All addresses in the code are relative to the current position. Shared objects in Unix-like systems use position-independent code by default. This is less efficient than relocation if the program runs for a long time, especially in 32-bit mode.


The name "position-independent code" actually implies the following:

  • The code section contains no absolute addresses that need relocation, but only self-relative addresses. Therefore, the code section can be loaded at an arbitrary memory address and shared between multiple processes.

  • The data section is not shared between multiple processes because it often contains writeable data. Therefore, the data section may contain pointers or addresses that need relocation.

  • All public functions and public data can be overridden in Linux. If a function in the main executable has the same name as a function in a shared object, then the version in main will take precedence, not only when called from main, but also when called from the shared object. Likewise, when a global variable in main has the same name as a global variable in the shared object, then the instance in main will be used, even when accessed from the shared object. This so-called symbol interposition is intended to mimic the behavior of static libraries.
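
The second point, that the data section may still hold addresses needing relocation, can be illustrated with a short C fragment. This is a sketch with hypothetical names, on the assumption that it is built into a shared object with -fPIC.

```c
/* Hypothetical fragment of a shared library's source. */

static int answer = 42;

/* This initialized pointer lives in the data section. Its value is the
   address of `answer`, which is only known once the library is loaded,
   so the dynamic loader has to fill it in (a data relocation), even
   when the file is compiled with -fPIC. */
int *answer_ptr = &answer;

/* When built with -fPIC, the code carries no absolute address of its
   own: it loads the pointer from the data section and dereferences it. */
int read_answer(void)
{
    return *answer_ptr;
}
```

Because each process has its own copy of the writable data, patching answer_ptr at load time does not prevent the code pages from being shared.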


To implement this "override" feature, a shared object has a table of pointers to its functions, called the procedure linkage table (PLT), and a table of pointers to its variables, called the global offset table (GOT).

All accesses to functions and public variables go through these tables.
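
The effect of going through a table can be modelled with an ordinary function pointer. This is a toy sketch only: real GOT/PLT entries are filled in by the dynamic linker, not by application code, and every name below is hypothetical.

```c
#include <stddef.h>

/* Toy model only: real GOT/PLT entries are filled in by the dynamic
   linker, not by application code. All names here are hypothetical. */

typedef int (*get_id_fn)(void);

/* The shared object's own definition. */
int library_get_id(void) { return 1; }

/* A definition that the main executable would supply under the
   same name. */
int main_get_id(void) { return 2; }

/* One-slot "table": the library always calls through this pointer
   instead of calling library_get_id() directly. */
static get_id_fn get_id_slot = library_get_id;

/* Mimics the dynamic linker binding the slot; a non-NULL override
   from the main executable takes precedence. */
void bind_get_id(get_id_fn override)
{
    get_id_slot = (override != NULL) ? override : library_get_id;
}

/* Code inside the library that uses the function through the table. */
int library_calls_get_id(void)
{
    return get_id_slot();
}
```

After bind_get_id(main_get_id), a call made from inside the library reaches the executable's version. The price is an extra indirection on every such call, which is the overhead the surrounding text refers to.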

P.S. Where dynamic linking cannot be avoided, there are various ways to avoid the time-consuming features of position-independent code.

You can read more in this article: http://www.agner.org/optimize/optimizing_cpp.pdf

A minor addition to the answers already posted: object files not compiled to be position independent are relocatable; they contain relocation table entries.

These entries allow the loader (the bit of code that loads a program into memory) to rewrite the absolute addresses to adjust for the actual load address in the virtual address space.
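
A rough sketch of that rewriting step, as a toy model in C: the "image" is just an array of words, and the relocation table is a list of indexes of the words that hold absolute addresses. Real loaders work on ELF relocation records, which carry more information; apply_relocations and its parameters are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model of load-time relocation; real loaders work on ELF
   relocation records, which carry more information than this. */

/* Add `load_delta` (actual load address minus the address the image
   was linked for) to every word listed in the relocation table. */
void apply_relocations(uint32_t *image, size_t image_words,
                       const size_t *reloc_offsets, size_t reloc_count,
                       uint32_t load_delta)
{
    for (size_t i = 0; i < reloc_count; i++) {
        size_t where = reloc_offsets[i];
        if (where < image_words) {   /* ignore out-of-range entries */
            image[where] += load_delta;
        }
    }
}
```

In terms of the earlier pseudo-assembly: if the image was linked for address 100 but loaded at 1000, the delta is 900 and a stored target of 111 becomes 1011. Words not listed in the table are left untouched.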

An operating system will try to share a single copy of a "shared object library" loaded into memory with all the programs that are linked to that same shared object library.

Since the code address space (unlike sections of the data space) need not be contiguous, and because most programs that link to a specific library have a fairly fixed library dependency tree, this succeeds most of the time. In those rare cases where there is a discrepancy, yes, it may be necessary to have two or more copies of a shared object library in memory.

Obviously, any attempt to randomize the load address of a library between programs and/or program instances (so as to reduce the possibility of creating an exploitable pattern) will make such cases common, not rare. So where a system has enabled this capability, one should make every attempt to compile all shared object libraries to be position independent.

Since calls into these libraries from the body of the main program will also be made relocatable, this makes it much less likely that a shared library will have to be copied.
