
Constraining r10 register in gcc inline x86_64 assembly

I'm having a go at writing a very lightweight libc replacement library so that I can better understand the kernel-application interface. The first task is clearly getting some system call wrappers in place. I've successfully got the 1 to 3 argument wrappers working, but I'm struggling with a 4 argument variant. Here's my starting point:

long _syscall4(long type, long a1, long a2, long a3, long a4)
{
    long ret;
    asm
    (
        "syscall"
        : "=a"(ret)    // return value
        : "a"(type), "D"(a1), "S"(a2), "d"(a3), "r10"(a4)
        : "c", "r11", "memory"  // some syscalls read or write memory
                   // the syscall insn instruction itself destroys RCX and R11
    );
    return ret;
}

(Editor's note: this is safe and usable, and a good example, once r10 is handled the way the answer describes. MUSL libc has some similar macros.)

The compiler gives me the following error:

error: matching constraint references invalid operand number

My _syscall3 function works fine but doesn't use r10 or have a clobber list.

(Editor's note: it wouldn't be safe to have no clobber list: you need to tell the compiler that RCX and R11 are overwritten, and, with a "memory" clobber, that memory must be in sync before the system call, which may read or write memory. If you wanted to write specific wrappers for specific system calls, you could selectively omit the "memory" clobber, or use dummy memory operands based on which parameters are pointers for that system call.

If this _syscall4 function can't inline, the register and "memory" clobbers won't in practice cause any problem, but you should make these wrappers able to inline; inlining this system call will take less code at the call site than calling a non-inline wrapper function.)

There are no constraints for the registers %r8 .. %r15. However, more recent GCC versions (as in gcc-4.x) should accept:

register long r10 asm("r10") = a4;

then use the input constraint "r"(r10) for your asm statement.
https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html


Note that forcing the register chosen for an "r" constraint in Extended asm is the only behaviour that GCC guarantees for register-asm locals. Things like register void *rsp asm("rsp"); and void *stack_pointer = rsp; do sometimes work, but are not guaranteed and are no longer recommended.


You're going to want your syscall wrapper asm statement to be volatile and have a "memory" clobber, unless you write specific wrappers for specific system calls that know which args are pointers and use a dummy memory input or output instead (as per How can I indicate that the memory *pointed* to by an inline ASM argument may be used?).

It needs to be volatile because calling write(1, buf, 16) twice should print the buffer twice, not just CSE (common-subexpression eliminate) the return value! System calls are in general not pure functions of their inputs, so you need volatile.

(Some specific system call wrappers like getpid could be non-volatile, because they do return the same thing every time, unless you also use fork. But getpid is more efficient if done through the VDSO so it doesn't have to enter the kernel in the first place if you're on Linux, so if you're making a custom wrapper for getpid and clock_gettime you probably don't want syscall in the first place. See The Definitive Guide to Linux System Calls.)

The "memory" clobber is needed because a pointer in a register does not imply that the pointed-to memory is also an input or output. Stores to a buffer that is only read by a write system call must not be optimized away as dead stores. Or for munmap, the compiler had better have finished any loads/stores before the memory is unmapped. Some system calls don't take any pointer inputs and don't need "memory", but a generic wrapper has to make worst-case assumptions.
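
As a rough illustration of the alternative, a wrapper written only for write could drop the "memory" clobber and instead name the buffer as a dummy memory input. This is a sketch assuming x86-64 Linux and GCC or Clang; write_nomemclobber is a made-up name, and the arbitrary-length array cast is the idiom shown in the GCC Extended Asm documentation.

```c
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_write */

/* Hypothetical write(2)-only wrapper: no "memory" clobber needed. */
static inline long write_nomemclobber(int fd, const void *buf, size_t len)
{
    long ret;
    __asm__ volatile(
        "syscall"
        : "=a"(ret)
        : "a"((long)SYS_write), "D"((long)fd), "S"(buf), "d"(len),
          /* Dummy input: tells the compiler the bytes behind buf are read,
             so earlier stores to the buffer are not treated as dead. */
          "m"(*(const char (*)[])buf)
        : "rcx", "r11");       /* syscall itself destroys rcx and r11 */
    return ret;
}
```

The asm statement stays volatile, since two identical calls still have to produce two writes. The dummy operand only covers what this one system call reads; a generic wrapper can't know that and still needs the "memory" clobber.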

register ... asm("r10") does not in general require asm volatile or "memory" clobbers, but a syscall wrapper does.

Presumably because no instruction has a specific requirement for the r10 register, the gcc folks didn't create a constraint for it (given that the constraints are primarily for the machine descriptions). If you insist on inline asm, I don't think you can do better than using a generic "r" (or "m") constraint and moving the value into r10 yourself (and adding r10 to the clobber list).
