AArch64 relocation prefixes

I noticed a GNU asm relocation syntax for ARM 64-bit assembly. What are those pieces like #:abs_g0_nc: and :pg_hi21:? Where are they explained? Is there a pattern to them or are they made up on the go? Where can I learn more?

Introduction

ELF64 defines two types of relocation entries, called REL and RELA:

typedef struct
{
    Elf64_Addr r_offset;    /* Address of reference */
    Elf64_Xword r_info;     /* Symbol index and type of relocation */
} Elf64_Rel;

typedef struct
{
    Elf64_Addr r_offset;    /* Address of reference */
    Elf64_Xword r_info;     /* Symbol index and type of relocation */
    Elf64_Sxword r_addend;  /* Constant part of expression */
} Elf64_Rela;

The purpose of each relocation entry is to give the linker or loader (static or dynamic) four pieces of information:

  • The virtual address or the offset of the instruction to patch.
    This is given by r_offset.

  • The symbol accessed, from which its runtime address is obtained.
    This is given by the upper 32 bits of r_info, which hold the symbol table index.

  • A custom value called the addend.
    This value is used as an operand in the expression that computes the value written to patch the instruction.
    RELA entries have this value in r_addend, REL entries extract it from the relocation site.

  • The relocation type.
    This determines the expression used to calculate the value that patches the instruction.
    It is encoded in the lower 32 bits of r_info.
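The split of r_info described in the list above can be sketched in C as follows. The function names are hypothetical and chosen only for this illustration; many systems ship equivalent ELF64_R_SYM and ELF64_R_TYPE macros in <elf.h>.

```c
#include <stdint.h>

/* Hypothetical helpers, named for this sketch only. */

/* Upper 32 bits of r_info: index into the symbol table. */
uint32_t rela_sym_index(uint64_t r_info)
{
    return (uint32_t)(r_info >> 32);
}

/* Lower 32 bits of r_info: the relocation type. */
uint32_t rela_type(uint64_t r_info)
{
    return (uint32_t)(r_info & 0xffffffffu);
}

/* Inverse operation: pack a symbol index and a type into r_info. */
uint64_t rela_make_info(uint32_t sym, uint32_t type)
{
    return ((uint64_t)sym << 32) | type;
}
```

For instance, an r_info of 0x0000000500000107 refers to symbol table entry 5 and relocation type 0x107.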

Relocating

During the relocation phase the loader goes through all the relocation entries and writes to the location specified by each r_offset. The value stored is computed by a formula, selected by the lower part of r_info, from the addend (r_addend for RELA) and the symbol address (obtainable through the upper part of r_info).

Actually, the write part has been simplified. Unlike other architectures, where the immediate field of an instruction usually occupies bytes entirely separate from the ones that encode the operation, in ARM the immediate value is mixed with other encoding information.
So the loader needs to know what kind of instruction it is relocating, if it is an instruction at all (see note 1). Instead of having the loader disassemble the relocation site, the assembler sets the relocation type according to the instruction.

Each relocation type can relocate only one or two encoding-equivalent instructions.
In specific cases the relocation itself even changes the type of the instruction.

The value computed during the relocation is implicitly extended to 64 bits, signed or unsigned depending on the relocation type chosen.

AArch64 relocation

Since ARM is a RISC architecture with a fixed instruction size, loading a full-width (i.e. 64-bit) immediate into a register is non-trivial, as no instruction can have a full-width immediate field.

Relocation in AArch64 has to address this issue too. It is actually a twofold problem: first, find the real value that the programmer intended to use (this is the pure relocation part of the problem); second, find a way to put it into a register, since no instruction has a 64-bit immediate field.

The second issue is addressed by group relocations: each relocation type in a group computes one 16-bit part of the 64-bit value, so there can only be four relocation types in a group (ranging from G0 to G3).

This slicing into 16 bits matches the immediate field of movk (move keeping), movz (move zeroing) and movn (move negating logically).
Other instructions, like b, bl, adrp, adr and so on, have a relocation type specially suited for them.
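To make the 16-bit slicing concrete, here is a small C model of what movz, movk and movn do to a register, and of how four groups rebuild a 64-bit value. It is only a sketch of the register semantics: the emu_ names and load_u64 are made up for this illustration and are not part of any toolchain.

```c
#include <stdint.h>

/* Bits [16*g+15 : 16*g] of x, for g in 0..3 (the G0..G3 slices). */
uint16_t movw_group(uint64_t x, unsigned g)
{
    return (uint16_t)(x >> (16 * g));
}

/* movz: place imm in slice g, zero everything else. */
uint64_t emu_movz(uint16_t imm, unsigned g)
{
    return (uint64_t)imm << (16 * g);
}

/* movn: like movz, then invert the whole register. */
uint64_t emu_movn(uint16_t imm, unsigned g)
{
    return ~((uint64_t)imm << (16 * g));
}

/* movk: replace slice g of reg, keep the other 48 bits. */
uint64_t emu_movk(uint64_t reg, uint16_t imm, unsigned g)
{
    uint64_t mask = (uint64_t)0xFFFF << (16 * g);
    return (reg & ~mask) | ((uint64_t)imm << (16 * g));
}

/* One movz for the top slice followed by three movk. */
uint64_t load_u64(uint64_t x)
{
    uint64_t reg = emu_movz(movw_group(x, 3), 3);
    reg = emu_movk(reg, movw_group(x, 2), 2);
    reg = emu_movk(reg, movw_group(x, 1), 1);
    reg = emu_movk(reg, movw_group(x, 0), 0);
    return reg;
}
```

With x = 0x1122334455667788, the G2 slice is 0x3344 and load_u64 returns x unchanged.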

Whenever there is only one, and thus unambiguous, possible relocation type for a given instruction that references a symbol, the assembler can generate the corresponding entry without the programmer having to specify it explicitly.

Group relocations don't fit into this category: they exist to allow the programmer some flexibility, so they are generally stated explicitly. Within a group, a relocation type also specifies whether an overflow check must be performed.
A G0 relocation, used to load the lower 16 bits of a value, checks, unless explicitly suppressed, that the value fits in 16 bits (signed or unsigned, depending on the specific type used). The same is true for G1, which loads bits 31-16 and checks that the value fits in 32 bits.
As a consequence G3 is always non-checking, as every value fits in 64 bits.
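The overflow checks described above amount to simple range tests. The following C sketch shows one way to write them; uabs_fits and sabs_fits are hypothetical names, and the signed bound follows the -2^n ≤ X < 2^n convention used by the tables later in this answer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned check for group g (0..3): 0 <= x < 2^(16*(g+1)).
   For g = 3 the bound is 2^64, so every value passes. */
bool uabs_fits(uint64_t x, unsigned g)
{
    unsigned bits = 16 * (g + 1);
    return bits >= 64 || (x >> bits) == 0;
}

/* Signed check for group g (0..2): -2^(16*(g+1)) <= x < 2^(16*(g+1)).
   There is no signed G3; g >= 3 is treated as "always fits". */
bool sabs_fits(int64_t x, unsigned g)
{
    if (g >= 3)
        return true;
    int64_t limit = (int64_t)1 << (16 * (g + 1));
    return x >= -limit && x < limit;
}
```

So 0x10000 fails the unsigned G0 check but passes the G1 check, which is why a value of that size needs a checking G1 operator on top of a non-checking G0.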

Finally, relocation can be used to load integer values into a register. In fact, the address of a symbol is nothing more than an arbitrary integer constant.
Note that r_addend is 64 bits wide.


Note 1: If r_offset points to a site in a data section, the computed value is written as a 64-bit word at the location indicated.

Relocation operators

First of all, some references:

  • The ARM document that describes the relocation types for the ELF64 format ("ELF for the ARM 64-bit Architecture (AArch64)"), section 4.6.

  • A test AArch64 assembly file from the GAS test suite that, presumably, contains all the relocation operators available to GAS.

Conventions

Following the ARM document convention we have:

S is the runtime address of the symbol being relocated.
A is the addend for the relocation.
P is the address of the relocation site (derived from r_offset).
X is the result of a relocation operation, before any masking or bit-selection operation is applied.
Page(expr) is the page address of the expression expr, defined as expr & ~0xFFF, i.e. expr with the lower 12 bits cleared.
GOT is the address of the Global Offset Table.
GDAT(S+A) represents a 64-bit entry in the GOT for address S+A. The entry will be relocated at run time with relocation R_AARCH64_GLOB_DAT(S+A).
G(expr) is the address of the GOT entry for the expression expr.
Delta(S) resolves to the difference between the static link address of S and the execution address of S. If S is the null symbol (ELF symbol index 0), it resolves to the difference between the static link address of P and the execution address of P.
Indirect(expr) represents the result of calling expr as a function.
[msb:lsb] is a bit-mask operation representing the selection of bits in a value; bounds are inclusive.
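Two of these conventions, Page(expr) and the [msb:lsb] selection, are plain bit arithmetic and can be sketched in C like this. The function names page and bits are assumptions made for the example.

```c
#include <stdint.h>

/* Page(expr): expr with the lower 12 bits cleared. */
uint64_t page(uint64_t expr)
{
    return expr & ~(uint64_t)0xFFF;
}

/* X[msb:lsb]: bits msb down to lsb of x, both bounds inclusive,
   returned right-aligned. Assumes 63 >= msb >= lsb. */
uint64_t bits(uint64_t x, unsigned msb, unsigned lsb)
{
    unsigned width = msb - lsb + 1;
    uint64_t mask = width >= 64 ? ~(uint64_t)0
                                : (((uint64_t)1 << width) - 1);
    return (x >> lsb) & mask;
}
```

For example, page(0x412345) is 0x412000 and bits(0x412345, 11, 0) is 0x345, which are the two halves that an adrp and :lo12: pair would split an address into.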

Operators

The relocation names below omit the prefix R_AARCH64_ for the sake of compactness.

Expressions of the kind |X|≤2^16 are intended as -2^16 ≤ X < 2^16; note the strict inequality on the right.
This is an abuse of notation, forced by the constraints of formatting a table.

Group relocations

Operator    | Relocation name | Operation | Inst | Immediate | Check
------------+-----------------+-----------+------+-----------+----------
:abs_g0:    | MOVW_UABS_G0    | S + A     | movz | X[15:0]   | 0≤X<2^16
------------+-----------------+-----------+------+-----------+----------
:abs_g0_nc: | MOVW_UABS_G0_NC | S + A     | movk | X[15:0]   | 
------------+-----------------+-----------+------+-----------+----------
:abs_g1:    | MOVW_UABS_G1    | S + A     | movz | X[31:16]  | 0≤X<2^32
------------+-----------------+-----------+------+-----------+----------
:abs_g1_nc: | MOVW_UABS_G1_NC | S + A     | movk | X[31:16]  | 
------------+-----------------+-----------+------+-----------+----------
:abs_g2:    | MOVW_UABS_G2    | S + A     | movz | X[47:32]  | 0≤X<2^48
------------+-----------------+-----------+------+-----------+----------
:abs_g2_nc: | MOVW_UABS_G2_NC | S + A     | movk | X[47:32]  | 
------------+-----------------+-----------+------+-----------+----------
:abs_g3:    | MOVW_UABS_G3    | S + A     | movk | X[63:48]  | 
            |                 |           | movz |           |
------------+-----------------+-----------+------+-----------+----------
:abs_g0_s:  | MOVW_SABS_G0    | S + A     | movz | X[15:0]   | |X|≤2^16
            |                 |           | movn |           |
------------+-----------------+-----------+------+-----------+----------
:abs_g1_s:  | MOVW_SABS_G1    | S + A     | movz | X[31:16]  | |X|≤2^32
            |                 |           | movn |           |
------------+-----------------+-----------+------+-----------+----------
:abs_g2_s:  | MOVW_SABS_G2    | S + A     | movz | X[47:32]  | |X|≤2^48
            |                 |           | movn |           |
------------+-----------------+-----------+------+-----------+----------

The table shows the ABS version; the assembler can pick the PREL (PC-relative) or the GOTOFF (GOT-relative) version depending on the symbol referenced and the type of output format.

A typical use of these relocation operators is shown here. The signed column starts at G2 because there is no signed G3 operator:

Unsigned 64 bits                      Signed 48 bits
movz    x1,#:abs_g3:u64               movz  x1,#:abs_g2_s:u64
movk    x1,#:abs_g2_nc:u64            movk  x1,#:abs_g1_nc:u64
movk    x1,#:abs_g1_nc:u64            movk  x1,#:abs_g0_nc:u64
movk    x1,#:abs_g0_nc:u64

Usually only one checking operator is used, the one that sets the highest part.
That's why the checking versions relocate movz only, while the non-checking versions relocate movk (which sets only part of a register).
G3 relocates both because it is intrinsically non-checking, as no value can exceed 64 bits.
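As an illustration of what applying one of these unsigned group relocations does to the instruction word, here is a C sketch. It assumes the standard A64 layout of the move-wide instructions (imm16 in bits 20:5, the 16-bit shift selector in bits 22:21, Rd in bits 4:0); the helper names and the two base-opcode macros are made up for this example.

```c
#include <stdint.h>

#define MOVZ_X 0xD2800000u  /* movz x0, #0 (64-bit form) */
#define MOVK_X 0xF2800000u  /* movk x0, #0 (64-bit form) */

/* Overwrite the imm16 field (bits 20:5), keep all other bits. */
uint32_t movw_set_imm16(uint32_t insn, uint16_t imm16)
{
    return (insn & ~(0xFFFFu << 5)) | ((uint32_t)imm16 << 5);
}

/* Build a move-wide instruction for register rd and group g. */
uint32_t movw_encode(uint32_t base, unsigned rd, unsigned g, uint16_t imm16)
{
    uint32_t insn = base | ((uint32_t)(g & 3u) << 21) | (rd & 31u);
    return movw_set_imm16(insn, imm16);
}

/* MOVW_UABS_Gn(_NC): X = S + A, immediate = X[16*g+15 : 16*g].
   The overflow check, if any, is assumed to be done elsewhere. */
uint32_t apply_uabs(uint32_t insn, uint64_t s, int64_t a, unsigned g)
{
    uint64_t x = s + (uint64_t)a;
    return movw_set_imm16(insn, (uint16_t)(x >> (16 * g)));
}
```

For example, patching movk x1, #0, lsl #16 (0xF2A00001) with :abs_g1_nc: against a symbol at 0x12345678 stores 0x1234 in the immediate field.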

The signed versions end with _s and they are always checking.
There is no G3 version because when a 64-bit value is used the sign is fully specified in the value itself.
They are only ever used to set the highest part, as the sign is relevant only there.
They are always checking, as an overflow in a signed value makes the value meaningless.
These relocations change the type of the instruction to movn or movz based on the sign of the value, which effectively sign-extends the value.
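The movz/movn switch performed by the signed relocations can be sketched in C as below. The sketch assumes the A64 encoding in which bit 30 distinguishes movz (set) from movn (clear) and imm16 sits in bits 20:5; apply_sabs is a hypothetical name, and the range check is left out.

```c
#include <stdint.h>

/* MOVW_SABS_Gn on a movz/movn instruction word, with x = S + A.
   x >= 0: force movz, immediate = selected bits of x.
   x <  0: force movn, immediate = selected bits of ~x. */
uint32_t apply_sabs(uint32_t insn, int64_t x, unsigned g)
{
    uint16_t imm;

    if (x >= 0) {
        insn |= 1u << 30;                          /* movz */
        imm = (uint16_t)((uint64_t)x >> (16 * g));
    } else {
        insn &= ~(1u << 30);                       /* movn */
        imm = (uint16_t)(~(uint64_t)x >> (16 * g));
    }
    return (insn & ~(0xFFFFu << 5)) | ((uint32_t)imm << 5);
}
```

Starting from movz x0, #0 (0xD2800000), a value of -2 turns the instruction into movn x0, #1 (0x92800020), which leaves ~1 = -2 in x0, already sign-extended to 64 bits.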

Group relocations are also available in the PREL and GOTOFF versions mentioned above.

PC-relative 19-, 21- and 33-bit addresses

Operator    | Relocation name | Operation | Inst | Immediate | Check
------------+-----------------+-----------+------+-----------+----------
[implicit]  | LD_PREL_LO19    | S + A - P | ldr  | X[20:2]   | |X|≤2^20
------------+-----------------+-----------+------+-----------+----------
[implicit]  | ADR_PREL_LO21   | S + A - P | adr  | X[20:0]   | |X|≤2^20
------------+-----------------+-----------+------+-----------+----------
:pg_hi21:   | ADR_PREL_PG     | Page(S+A) | adrp | X[32:12]  | |X|≤2^32
            | _HI21           | - Page(P) |      |           |
------------+-----------------+-----------+------+-----------+----------
:pg_hi21_nc:| ADR_PREL_PG     | Page(S+A) | adrp | X[32:12]  | 
            | _HI21_NC        | - Page(P) |      |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | ADD_ABS_LO12_NC | S + A     | add  | X[11:0]   | 
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST8_ABS_LO12  | S + A     | ld   | X[11:0]   | 
            | _NC             |           | st   |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST16_ABS_LO12 | S + A     | ld   | X[11:1]   | 
            | _NC             |           | st   |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST32_ABS_LO12 | S + A     | ld   | X[11:2]   | 
            | _NC             |           | st   |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST64_ABS_LO12 | S + A     | ld   | X[11:3]   | 
            | _NC             |           | st   |           |
            |                 |           | prfm |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST128_ABS     | S + A     | ld   | X[11:4]   | 
            | _LO12_NC        |           | st   |           |
------------+-----------------+-----------+------+-----------+----------

The meaning of :lo12: changes depending on the size of the data the instruction is handling (e.g. ldrb uses LDST8_ABS_LO12_NC, ldrh uses LDST16_ABS_LO12_NC).

A GOT-relative version of these relocations also exists; the assembler will pick the right one.
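The usual adrp plus :lo12: pairing can be worked through in C as follows. This is a sketch under the assumption of the standard A64 field positions (adrp: immlo in bits 30:29 and immhi in bits 23:5; add and scaled ld/st: imm12 in bits 21:10); all function names are invented for the example.

```c
#include <stdint.h>

/* ADR_PREL_PG_HI21: X = Page(S + A) - Page(P). */
int64_t adrp_page_delta(uint64_t s, int64_t a, uint64_t p)
{
    uint64_t target = (s + (uint64_t)a) & ~(uint64_t)0xFFF;
    uint64_t site = p & ~(uint64_t)0xFFF;
    return (int64_t)(target - site);
}

/* Store X[32:12] into an adrp word: low 2 bits go to immlo,
   the remaining 19 bits to immhi. */
uint32_t adrp_patch(uint32_t insn, int64_t x)
{
    uint32_t imm21 = (uint32_t)(((uint64_t)x >> 12) & 0x1FFFFF);
    uint32_t immlo = imm21 & 3u;
    uint32_t immhi = imm21 >> 2;

    insn &= ~((3u << 29) | (0x7FFFFu << 5));
    return insn | (immlo << 29) | (immhi << 5);
}

/* :lo12: value: X[11:size_log2] of S + A, where size_log2 is
   0 for add and byte accesses, 1, 2, 3 or 4 for wider ld/st. */
uint32_t lo12(uint64_t s, int64_t a, unsigned size_log2)
{
    return (uint32_t)(((s + (uint64_t)a) & 0xFFF) >> size_log2);
}

/* Store a 12-bit immediate into bits 21:10 of an add or ld/st word. */
uint32_t imm12_patch(uint32_t insn, uint32_t imm12)
{
    return (insn & ~(0xFFFu << 10)) | ((imm12 & 0xFFFu) << 10);
}
```

With a symbol at 0x412345 referenced from 0x400010, the page delta is 0x12000, so adrp x0 (0x90000000) becomes 0xD0000080, and add x0, x0, #:lo12:sym receives the immediate 0x345.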

Control flow relocations

Operator    | Relocation name | Operation | Inst | Immediate | Check
------------+-----------------+-----------+------+-----------+----------
[implicit]  | TSTBR14         | S + A - P | tbz  | X[15:2]   | |X|≤2^15
            |                 |           | tbnz |           |  
------------+-----------------+-----------+------+-----------+----------
[implicit]  | CONDBR19        | S + A - P | b.*  | X[20:2]   | |X|≤2^20
------------+-----------------+-----------+------+-----------+----------
[implicit]  | JUMP26          | S + A - P | b    | X[27:2]   | |X|≤2^27
------------+-----------------+-----------+------+-----------+----------
[implicit]  | CALL26          | S + A - P | bl   | X[27:2]   | |X|≤2^27
------------+-----------------+-----------+------+-----------+----------
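The branch relocations in this table all follow the same recipe: compute S + A - P, drop the two low bits (instructions are 4-byte aligned) and store the rest in the branch's offset field. The C sketch below shows JUMP26/CALL26 and CONDBR19, assuming the standard A64 field positions (imm26 in bits 25:0 for b and bl, imm19 in bits 23:5 for b.cond); the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Range check for JUMP26 / CALL26: -2^27 <= X < 2^27. */
bool branch26_in_range(int64_t x)
{
    return x >= -((int64_t)1 << 27) && x < ((int64_t)1 << 27);
}

/* JUMP26 / CALL26: store X[27:2] in bits 25:0 of b or bl. */
uint32_t branch26_patch(uint32_t insn, uint64_t s, int64_t a, uint64_t p)
{
    int64_t x = (int64_t)(s + (uint64_t)a - p);
    uint32_t imm26 = (uint32_t)(((uint64_t)x >> 2) & 0x3FFFFFFu);
    return (insn & ~0x3FFFFFFu) | imm26;
}

/* CONDBR19: store X[20:2] in bits 23:5 of b.cond. */
uint32_t condbr19_patch(uint32_t insn, uint64_t s, int64_t a, uint64_t p)
{
    int64_t x = (int64_t)(s + (uint64_t)a - p);
    uint32_t imm19 = (uint32_t)(((uint64_t)x >> 2) & 0x7FFFFu);
    return (insn & ~(0x7FFFFu << 5)) | (imm19 << 5);
}
```

A bl at 0x1010 (0x94000000 before relocation) targeting a symbol at 0x1000 ends up as 0x97FFFFFC, i.e. a branch four instructions back.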

Epilogue

I couldn't find any official documentation.
The tables above have been reconstructed from the GAS test case and the ARM document explaining the types of relocation available for AArch64-compliant ELFs.

The tables don't show all the relocations present in the ARM document, as most of them are complementary versions, picked up by the assembler automatically.

A section with examples would be great, but I don't have an ARM GAS.
In the future I may extend this answer to include examples of assembly listings and relocation dumps.
