Print newline with as little code as possible with NASM
I'm learning a bit of assembly for fun and I am probably too green to know the right terminology and find the answer myself.
I want to print a newline at the end of my program.
The code below works fine.
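section .data
newline db 10            ; the byte to print: ASCII line feed

section .text
_end:
    mov rax, 1           ; system call number 1 = sys_write
    mov rdi, 1           ; fd 1 = stdout
    mov rsi, newline     ; pointer to the byte in .data
    mov rdx, 1           ; length = 1 byte
    syscall
    mov rax, 60          ; system call number 60 = sys_exit
    mov rdi, 0           ; exit status 0
    syscall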
But I'm hoping to achieve the same result without defining the newline in .data. Is it possible to call sys_write directly with the byte you want, or must it always be done with a reference to some predefined data (which I assume is what mov rsi, newline is doing)?
In short, why can't I replace mov rsi, newline by mov rsi, 10?
You always need the data in memory to copy it to a file descriptor. There is no system-call equivalent of C stdio fputc that takes data by value instead of by pointer.
mov rsi, newline puts a pointer into a register (with a huge mov r64, imm64 instruction). sys_write doesn't special-case size=1 and treat its void *buf arg as a char value if it's not a valid pointer.
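To see what mov rsi, 10 would actually do, here is a minimal sketch (my own illustration, not part of the original answer) that passes 10 as the buffer argument and reports the resulting error as the exit status. It assumes x86-64 Linux and that address 0xa is not mapped in the process, which is the normal case; under that assumption write should fail with -EFAULT instead of printing anything.

```nasm
; efault.asm (hypothetical file name)
; build: nasm -felf64 efault.asm && ld -o efault efault.o
global _start
section .text
_start:
    mov eax, 1          ; __NR_write
    mov edi, 1          ; fd 1 = stdout
    mov esi, 10         ; NOT a newline: the kernel treats this as the address 0xa
    mov edx, 1          ; length = 1 byte
    syscall             ; on failure, rax holds a negative errno value
    neg eax             ; turn -errno into a positive number
    mov edi, eax        ; use it as the exit status
    mov eax, 60         ; __NR_exit
    syscall
```

Running it and checking echo $? should show the errno value (14 is EFAULT on Linux) rather than a blank line, if the assumption about the unmapped address holds.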
There aren't any other system calls that would do the trick. pwrite and writev are both more complicated (taking a file offset as well as a pointer, or taking an array of pointer+length to gather the data in kernel space).
There is a lot you can do to optimize this for code size, though. See https://codegolf.stackexchange.com/questions/132981/tips-for-golfing-in-x86-x64-machine-code
First, putting the newline character in static storage means you need to generate a static address in a register. Your options here are:
mov esi, imm32: only in Linux non-PIE executables, where static addresses are link-time constants known to be in the low 2GiB of virtual address space, and thus work as 32-bit zero-extended or sign-extended immediates.
lea rsi, [rel newline]: works everywhere, and is the only good option if you can't use the 5-byte mov-immediate.
mov rsi, imm64: this works even in PIE executables (e.g. if you link with gcc -nostdlib without -static, on a distro where PIE is the default), but only via a runtime relocation fixup, and the code size is terrible. Compilers never use this because it's not faster than LEA.
But we can avoid static addressing entirely: use push to put immediate data on the stack. This works even if we need zero-terminated strings, because push imm8 and push imm32 both sign-extend the immediate to 64-bit. Since ASCII uses the low half of the 0..255 range, this is equivalent to zero-extension.
Then we just need to copy RSP to RSI, because push leaves RSP pointing to the data that was pushed. mov rsi, rsp would be 3 bytes because it needs a REX prefix. If you were targeting 32-bit code or the x32 ABI (32-bit pointers in long mode) you could use the 2-byte mov esi, esp. But Linux puts the stack pointer at the top of user virtual address space, so on x86-64 that's 0x007ff..., right at the top of the low canonical range. So truncating a pointer to stack memory to 32 bits isn't an option; we'd get -EFAULT.
But we can copy a 64-bit register with 1-byte push + 1-byte pop. (Assuming neither register needs a REX prefix to access.)
default rel     ; We don't use any explicit addressing modes, but no reason to leave this out.
global _start
_start:
    push 10         ; \n
    push rsp
    pop  rsi        ; 2 bytes total vs. 3 for mov rsi,rsp
    push 1          ; __NR_write call number
    pop  rax        ; 3 bytes total, vs. 5 for mov eax, 1
    mov  edx, eax   ; length = call number by coincidence
    mov  edi, eax   ; fd = length = call number, also by coincidence
    syscall         ; write(1, "\n", 1)
    mov  al, 60     ; assuming write didn't return -errno, replace the low byte and keep the high zeros
    ;xor edi, edi   ; leave rdi = 1 from write
    syscall         ; _exit(1)
.size: db $ - _start
xor-zeroing is the most well-known x86 peephole optimization: it saves 3 bytes of code size, and is actually more efficient than mov edi, 0. But you only asked for the smallest code to print a newline, without specifying that it had to exit with status = 0, so we can save 2 bytes by leaving that out.
Since we're just making an _exit system call, we don't need to clean up the stack from the 10 we pushed.
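If you do need exit status 0, a variant with the xor edi, edi enabled might look like the sketch below. By my count it is 19 bytes: the same instructions as the version above plus the 2-byte xor. Treat that figure as something to confirm with objdump rather than a guarantee.

```nasm
; newline0.asm (hypothetical file name)
; build: nasm -felf64 newline0.asm && ld -o newline0 newline0.o
global _start
section .text
_start:
    push 10             ; \n on the stack
    push rsp
    pop rsi             ; rsi = pointer to the newline
    push 1
    pop rax             ; __NR_write
    mov edx, eax        ; length = 1
    mov edi, eax        ; fd = 1
    syscall             ; write(1, "\n", 1)
    mov al, 60          ; __NR_exit, still assuming write returned a small non-negative value
    xor edi, edi        ; exit status 0 (the 2 extra bytes)
    syscall             ; _exit(0)
```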
BTW, this will crash if the write returns an error (e.g. stdout redirected to /dev/full, or closed with ./newline >&-, or whatever other condition). That would leave RAX = -something, so mov al, 60 would give us RAX = 0xffff...3c. Then we'd get -ENOSYS from the invalid call number, fall off the end of _start, and decode whatever is next as instructions. (Probably zero bytes, which decode with [rax] as an addressing mode. Then we'd fault with a SIGSEGV.)
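One way to avoid that crash, at the cost of one extra byte, is to set the whole of RAX instead of only its low byte. The sketch below is my own variation and not part of the original answer; it uses push 60 / pop rax (3 bytes) in place of mov al, 60 (2 bytes), so the exit call number is correct whatever write returned.

```nasm
; newline_safe.asm (hypothetical file name)
; build: nasm -felf64 newline_safe.asm && ld -o newline_safe newline_safe.o
global _start
section .text
_start:
    push 10             ; \n on the stack
    push rsp
    pop rsi             ; rsi = pointer to the newline
    push 1
    pop rax             ; __NR_write
    mov edx, eax        ; length = 1
    mov edi, eax        ; fd = 1
    syscall             ; may leave -errno in rax, e.g. with stdout closed
    push 60
    pop rax             ; full 64-bit rax = __NR_exit, independent of the write result
    syscall             ; _exit(1): rdi is still 1 from the write call
```

With this change the program should exit cleanly even when run as ./newline_safe >&-, although it still does not report the write failure.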
objdump -d -Mintel disassembly of that code, after building with nasm -felf64 and linking with ld:
0000000000401000 <_start>:
401000: 6a 0a push 0xa
401002: 54 push rsp
401003: 5e pop rsi
401004: 6a 01 push 0x1
401006: 58 pop rax
401007: 89 c2 mov edx,eax
401009: 89 c7 mov edi,eax
40100b: 0f 05 syscall
40100d: b0 3c mov al,0x3c
40100f: 0f 05 syscall
0000000000401011 <_start.size>:
401011: 11 .byte 0x11
So the total code size is 0x11 = 17 bytes, vs. your version with 39 bytes of code + 1 byte of static data. Your first 3 mov instructions alone are 5, 5, and 10 bytes long. (Or 7 bytes long for mov rax,1 if you use YASM, which doesn't optimize it to mov eax,1.)
Running it:
$ strace ./newline
execve("./newline", ["./newline"], 0x7ffd4e98d3f0 /* 54 vars */) = 0
write(1, "\n", 1
) = 1
exit(1) = ?
+++ exited with 1 +++
If you already have a pointer to some nearby static data in a register, you could do something like a 4-byte lea rsi, [rdx + newline-foo] (REX.W + opcode + modrm + disp8), assuming the newline-foo offset fits in a sign-extended disp8 and that RDX holds the address of foo.
Then you can have newline: db 10 in static storage after all. (Put it in .rodata or .data, depending on which section you already had a pointer to.)
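A minimal sketch of that base-plus-offset idea follows. The label foo and its contents are hypothetical, and the first lea only stands in for earlier code that would already have left the address of foo in RDX. In a real program that pointer would come for free, which is the point of the trick.

```nasm
; relptr.asm (hypothetical file name)
; build: nasm -felf64 relptr.asm && ld -o relptr relptr.o
global _start
section .rodata
foo:     db "some other data"        ; made-up neighbouring data, 15 bytes
newline: db 10

section .text
_start:
    lea rdx, [rel foo]               ; stand-in: pretend RDX already held &foo
    lea rsi, [rdx + (newline - foo)] ; 4 bytes if the offset (15 here) fits in a disp8
    push 1
    pop rax                          ; __NR_write
    mov edi, eax                     ; fd = 1
    mov edx, eax                     ; length = 1 (RDX is no longer needed as a pointer)
    syscall                          ; write(1, newline, 1)
    push 60
    pop rax                          ; __NR_exit
    xor edi, edi                     ; exit status 0
    syscall
```

Because both labels live in the same section, newline - foo is an assemble-time constant, so no relocation should be needed for the second lea.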
It expects the address of the string in the rsi register, not a character or string.
mov rsi, newline loads the address of newline into rsi.