Is it possible to call a relative address with each instruction at most 3 bytes long, in 32-bit mode?
I'm working on an exercise in x86 assembly (using NASM) that has the niche requirement of limiting each instruction to a maximum of 3 bytes.
I'd like to call a label, but the normal way to do this (shown in the code example) always results in an instruction size of 5 bytes. I'm trying to find out if there's a series of instructions, 3 bytes or less each, that can accomplish this.
I've attempted to load the label address into a register and then call that register, but it seems like the address is then interpreted as an absolute address, instead of a relative one.
I looked around to see if there's a way to force call to interpret the address in the register as a relative address, but couldn't find anything. I have thought about simulating a call by pushing a return address to the stack and using jmp rel8, but am unsure how to get the absolute address of where I want to return to.
Here is the normal way to do what I want:
[BITS 32]
call func ; this results in a 5-byte call rel32 instruction
; series of instructions here that I would like to return to
func:
; some operations here
ret
I have tried things like this:
[BITS 32]
mov eax, func ; 5-byte mov r32, imm32
call eax ; 2-byte call r32
; this fails, seems to interpret func's relative address as an absolute
... ; series of instructions here that I would like to return to
func:
; some operations here
ret
I have a feeling there may be a way to do this using some sort of LEA magic, but I'm relatively new to assembly so I couldn't figure it out.
Any tips are appreciated!
In 32-bit x86, the only way to read your current instruction pointer is to execute a call instruction and read the return address from the stack. Unless you already have the address of a suitable gadget in a register, that call has to use an immediate relative offset, which makes it a 5-byte instruction.
(In 64-bit x86, you can also use lea rax, [rip], but that is a 7-byte instruction.)
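
For reference, the call/pop idiom this paragraph alludes to can be sketched as follows. This is a minimal sketch assuming a 32-bit Linux executable; the label names get_eip and func are illustrative, and the call rel32 at the top is exactly the 5-byte instruction the exercise forbids, so it only shows what any short-instruction workaround has to replace.

```nasm
[BITS 32]
global _start
section .text
_start:
    call get_eip                            ; E8 rel32, 5 bytes: pushes the address of get_eip
get_eip:
    pop ebx                                 ; 5B, 1 byte: EBX = run-time address of get_eip
    lea eax, [byte ebx + (func - get_eip)]  ; 8D 43 disp8, 3 bytes: EAX = run-time address of func
    call eax                                ; FF D0, 2 bytes: absolute indirect call
    xor eax, eax                            ; EAX = 0
    mov ebx, eax                            ; EBX = 0 (exit status)
    inc eax                                 ; EAX = 1 (exit system call)
    int 0x80                                ; CD 80, 2 bytes
func:
    ret
```

Everything after the first instruction fits the 3-byte limit; only obtaining EIP needs the long call. It should build with nasm -f elf32 followed by ld -m elf_i386.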
However, it might be possible to cheat here. If the code that calls your NASM binary always calls your code with something like call edi, then you can just calculate addresses from that register. It's a hack, but so is restricting yourself to 3-byte instructions.
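
A minimal sketch of that cheat, under the loud assumption that the loader really does enter the flat binary with call edi, so that EDI holds the run-time address of the first byte (the label name entry is hypothetical):

```nasm
[BITS 32]
entry:                                      ; assumed: reached via "call edi", so EDI = address of entry
    lea eax, [byte edi + (func - entry)]    ; 8D 47 disp8, 3 bytes: EAX = run-time address of func
    call eax                                ; FF D0, 2 bytes
    ret                                     ; C3: return to the loader
func:
    ret
```

The disp8 form only reaches targets up to 127 bytes past entry. For larger distances the offset would have to be built in a second register from byte-sized moves and added with a 2-byte add eax, edx.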
By the way, as a little trick, this is how you can load a 32-bit constant using only instructions of 3 bytes or fewer (every instruction here is actually 2 bytes), loading 0xDEADBEEF as an example:
mov al, 0xDE ; EAX = ######DE
mov ah, 0xAD ; EAX = ####ADDE
bswap eax ; EAX = DEAD####
mov ah, 0xBE ; EAX = DEADBE##
mov al, 0xEF ; EAX = DEADBEEF
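
If this pattern is needed repeatedly, it can be wrapped in a NASM macro. The sketch below uses a hypothetical macro name, load32_eax, and assumes its argument is a plain numeric constant: NASM does not allow shifts on relocatable label addresses, which is why the linker-script approach in another answer is needed for labels.

```nasm
[BITS 32]
; load32_eax is a hypothetical helper; %1 must be a numeric constant, not a label
%macro load32_eax 1
    mov al, ((%1) >> 24) & 0xFF     ; B0 ib: highest byte, moved into place by bswap
    mov ah, ((%1) >> 16) & 0xFF     ; B4 ib
    bswap eax                       ; 0F C8: the two bytes above become the upper half
    mov ah, ((%1) >> 8) & 0xFF      ; B4 ib
    mov al, (%1) & 0xFF             ; B0 ib: lowest byte
%endmacro

    load32_eax 0xDEADBEEF           ; EAX = 0xDEADBEEF
    load32_eax 0x08048020           ; EAX = 0x08048020
```

The same sequence works for EBX, ECX and EDX, which also have high-byte halves. ESI, EDI and EBP do not, so a value for them would be built in EAX first and copied with a 2-byte mov.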
There is no such thing as a relative indirect near CALL. You will have to find some other mechanism to do the call to the label func. One method I can think of is building the absolute address in a register and doing an absolute indirect call through the register:
It is unclear what the target of your code is. This assumes you are generating a 32-bit Linux program. I use a linker script to compute the individual bytes of the target label's address. The program uses those bytes to build the target address in EAX and then performs an indirect near call via EAX. A couple of methods of building the address are presented.
A linker script link.ld that breaks a label's address into individual bytes:
SECTIONS
{
. = 0x8048000;
func_b0 = func & 0x000000ff;
func_b1 = (func & 0x0000ff00) >> 8;
func_b2 = (func & 0x00ff0000) >> 16;
func_b3 = (func & 0xff000000) >> 24;
}
Assembly code file myprog.asm:
[BITS 32]
global func
extern func_b0, func_b1, func_b2, func_b3
_start:
; Method 1
mov al, func_b3 ; EAX = ######b3
mov ah, func_b2 ; EAX = ####b2b3
bswap eax ; EAX = b3b2####
mov ah, func_b1 ; EAX = b3b2b1##
mov al, func_b0 ; EAX = b3b2b1b0
call eax
; Method 2
mov ah, func_b3 ; EAX = ####b3##
mov al, func_b2 ; EAX = ####b3b2
shl eax, 16 ; EAX = b3b20000
mov ah, func_b1 ; EAX = b3b2b100
mov al, func_b0 ; EAX = b3b2b1b0
call eax
; series of instructions here that I would like to return to
xor eax, eax
mov ebx, eax ; EBX = 0 return value
inc eax ; EAX = 1 exit system call
int 0x80 ; Do exit system call
func:
; some operations here
ret
Assemble and link with:
nasm -f elf32 -F dwarf myprog.asm -o myprog.o
gcc -m32 -nostartfiles -g -Tlink.ld myprog.o -o myprog
If you run objdump -Mintel -Dx on the result, the information of interest would look similar to:
00000020 g       *ABS*  00000000 func_b0
00000004 g       *ABS*  00000000 func_b2
08048020 g       .text  00000000 func
00000080 g       *ABS*  00000000 func_b1
00000008 g       *ABS*  00000000 func_b3
...
08048000 <_start>:
 8048000: b0 08       mov al,0x8
 8048002: b4 04       mov ah,0x4
 8048004: 0f c8       bswap eax
 8048006: b4 80       mov ah,0x80
 8048008: b0 20       mov al,0x20
 804800a: ff d0       call eax
 804800c: b4 08       mov ah,0x8
 804800e: b0 04       mov al,0x4
 8048010: c1 e0 10    shl eax,0x10
 8048013: b4 80       mov ah,0x80
 8048015: b0 20       mov al,0x20
 8048017: ff d0       call eax
 8048019: 31 c0       xor eax,eax
 804801b: 89 c3       mov ebx,eax
 804801d: 40          inc eax
 804801e: cd 80       int 0x80
08048020 <func>:
 8048020: c3          ret
In 64-bit code, the 2-byte syscall instruction sets RCX to the RIP of the following instruction (which the kernel usually uses for sysret), so under most OSes you can make an invalid system call to get RCX = RIP. (E.g. by setting EAX to -1 with the 3-byte or eax,-1, so under Linux syscall will return with RAX = -ENOSYS.) Credit to @Myria for this idea.
Whether this method works depends on the OS: an OS can always return with iret after doing anything it wants to the registers, so it would be possible to design a kernel ABI where this doesn't work. But AFAIK it should work under any of the mainstream OSes. Again, this only applies in long mode. AMD CPUs support syscall in 32-bit mode, but it works differently there.
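
A sketch of how the syscall trick could be used, assuming x86-64 Linux (so it is outside the question's 32-bit setting and only illustrates this aside). The label names after and func are illustrative. Adding the offset goes through push/pop because lea rax, [rcx+disp8] would need a REX prefix and be 4 bytes long.

```nasm
[BITS 64]
global _start
section .text
_start:
    or eax, -1                  ; 83 C8 FF, 3 bytes: invalid system call number
    syscall                     ; 0F 05, 2 bytes: returns -ENOSYS in RAX, RCX = address of "after"
after:
    push byte (func - after)    ; 6A ib, 2 bytes: small offset, sign-extended to 64 bits
    pop rdx                     ; 5A, 1 byte: RDX = func - after
    add rcx, rdx                ; 48 01 D1, 3 bytes: RCX = run-time address of func
    call rcx                    ; FF D1, 2 bytes
    xor edi, edi                ; 31 FF, 2 bytes: exit status 0
    push byte 60                ; 6A 3C, 2 bytes: 60 is the x86-64 Linux exit system call number
    pop rax                     ; 58, 1 byte
    syscall                     ; exit(0)
func:
    ret
```

Note that syscall also overwrites R11 with RFLAGS, so neither RCX nor R11 can hold live values across it. It should build with nasm -f elf64 followed by ld.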
In 32-bit code, the only normal/sane way to read EIP is with a call instruction. So it's generally impossible to create position-independent code without using a 5-byte call rel32 to get your own address.
(Even self-modifying code would eventually execute a call rel32.)
Other answers show ways to jump to a given absolute address using only small instructions. But the target address isn't relative to the address of the machine code, except insofar as the absolute address of the machine code is also known, so you can calculate the jump distance at build time.
The same machine code would jump to the same address if loaded somewhere else, not to the same offset relative to its own address.
Perhaps that's all your exercise was asking for.
If not, since we've ruled out sane ways to write fully-PIC code, we need to consider insane ways.
Interrupts also push EIP onto the (kernel) stack, where an interrupt handler could access it.
If you're writing a kernel that can include interrupt handlers, you can include one that puts your current address into a register (for example EAX) by reading it from the stack with short instructions (like the 3-byte mov eax, [ebp+4], or whatever fits after setting up a stack frame).
Then your normal code can invoke that interrupt handler with int 0x81 or whatever vector you choose (a 2-byte instruction).
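
A sketch of such a handler and its caller. It assumes 32-bit protected mode, a handler already installed in the IDT at vector 0x81, and no error code on the stack (software interrupts push none); the label names are hypothetical. This is a fragment that assembles with nasm -f bin, not a standalone program, since it only makes sense inside a kernel.

```nasm
[BITS 32]
; get_eip_handler is a hypothetical handler, assumed to be installed at IDT vector 0x81
get_eip_handler:
    push ebp                            ; 55, 1 byte
    mov ebp, esp                        ; 89 E5, 2 bytes: [ebp] = old EBP, [ebp+4] = saved EIP
    mov eax, [ebp+4]                    ; 8B 45 04, 3 bytes: address of the instruction after the int
    pop ebp                             ; 5D, 1 byte
    iret                                ; CF, 1 byte

caller:
    int 0x81                            ; CD 81, 2 bytes
after_int:                              ; on return, EAX = run-time address of after_int
    add eax, byte (func - after_int)    ; 83 C0 ib, 3 bytes: EAX = run-time address of func
    call eax                            ; FF D0, 2 bytes
    jmp short $                         ; EB FE: placeholder, spin here
func:
    ret
```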
Setting up an interrupt-descriptor table should be possible if necessary: we can construct any value in a register using mov r8, imm8 (a byte-register load) and shifts, as shown in the other answers. Combining this with the 2- or 3-byte mov r/m32, r32 or the 3-byte mov r/m8, imm8, we can store anything to any absolute address we choose by constructing the address (and optionally the value) in a register. This setup makes it possible to run code that queries its own address with a compact "system call" instead of a call rel32.
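
A sketch of storing to an absolute address with short instructions only. The destination 0x00100000 and the value 0x12345678 are arbitrary examples, and the address is assumed to be writable in whatever environment runs this fragment.

```nasm
[BITS 32]
    ; Build the destination address 0x00100000 (assumed writable) in EBX
    mov bl, 0x00            ; B3 ib
    mov bh, 0x10            ; B7 ib
    bswap ebx               ; 0F CB: EBX = 0010####
    mov bh, 0x00            ; B7 ib
    mov bl, 0x00            ; B3 ib: EBX = 00100000

    ; Build the value 0x12345678 in EAX the same way
    mov al, 0x12
    mov ah, 0x34
    bswap eax               ; 0F C8: EAX = 1234####
    mov ah, 0x56
    mov al, 0x78            ; EAX = 12345678

    mov [ebx], eax          ; 89 03, 2 bytes: 32-bit store
    mov [ebx+4], eax        ; 89 43 04, 3 bytes: same store with a disp8
    mov byte [ebx], 0x99    ; C6 03 ib, 3 bytes: byte store (no displacement allowed)
```

The byte-immediate store only stays within 3 bytes when the addressing mode has no displacement, so stepping through memory byte by byte would use a 1-byte inc ebx between stores.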
Actually installing an IDT is possible with a 3-byte lidt (0F 01 /3 with a simple addressing mode that uses ModRM and no extra bytes). Or query the current location with sidt (same-length encoding).
iret is just the 1-byte 0xCF. I don't think any of the necessary system-setup instructions have a minimum length of more than 3 bytes.