What ensures reads/writes of operands occur at the desired time with extended ASM?
According to GCC's Extended ASM and Assembler Template documentation, instructions must be in the same ASM block to stay consecutive. I'm having trouble understanding what provides the scheduling or timing of reads and writes to the operands in a block with multiple statements.
As an example, EBX or RBX needs to be preserved when using CPUID because, according to the ABI, the caller owns it. There are some open questions with respect to the use of EBX and RBX, so we want to preserve it unconditionally (it's a requirement). So three instructions need to be encoded into a single ASM block to ensure that they stay consecutive (re: the assembler template discussed in the first paragraph):
unsigned int __FUNC = 1, __SUBFUNC = 0;
unsigned int __EAX, __EBX, __ECX, __EDX;
__asm__ __volatile__ (
"push %%ebx;"
"cpuid;"
"pop %%ebx"
: "=a"(__EAX), "=b"(__EBX), "=c"(__ECX), "=d"(__EDX)
: "a"(__FUNC), "c"(__SUBFUNC)
);
If the expression representing the operands is interpreted at the wrong point in time, then __EBX will be the saved EBX (and not CPUID's EBX), which will likely be a pointer to the Global Offset Table (GOT) if PIC is enabled.
Where, exactly, does the expression specify that the store of CPUID's %EBX into __EBX should happen (1) after the PUSH %EBX; (2) after the CPUID; but (3) before the POP %EBX?
In your question you present some code that does a push and pop of ebx. The idea of saving ebx when you compile with gcc using -fPIC (position-independent code) is correct. In that situation it is up to our function not to clobber ebx upon return. Unfortunately, the way you have defined the constraints, you explicitly use ebx. Generally the compiler will complain (error: inconsistent operand constraints in an 'asm') if you are compiling PIC code and you specify =b as an output constraint. It is unusual that it doesn't produce a diagnostic for you.
To get around this problem you can let the compiler choose a register for you. Instead of pushing and popping, we simply exchange %ebx with an unused register chosen by the compiler and restore it by exchanging it back afterwards. Since we don't want the compiler to assign that scratch register to one of our input registers, which the first exchange would then clobber, we specify the early-clobber modifier, ending up with a constraint of =&r (instead of =b in the OP's code). More on modifiers can be found in the GCC documentation. Your code (for 32-bit) would look something like:
unsigned int __FUNC = 1, __SUBFUNC = 0;
unsigned int __EAX, __EBX, __ECX, __EDX;
__asm__ __volatile__ (
"xchgl\t%%ebx, %k1\n\t" \
"cpuid\n\t" \
"xchgl\t%%ebx, %k1\n\t"
: "=a"(__EAX), "=&r"(__EBX), "=c"(__ECX), "=d"(__EDX)
: "a"(__FUNC), "c"(__SUBFUNC));
If you intend to compile for X86_64 (64-bit) you'll need to save the entire contents of %rbx. The code above will not quite work. You'd have to use something like:
uint32_t __FUNC = 1, __SUBFUNC = 0;
uint32_t __EAX, __ECX, __EDX;
uint64_t __BX; /* Big enough to hold a 64 bit value */
__asm__ __volatile__ (
"xchgq\t%%rbx, %q1\n\t" \
"cpuid\n\t" \
"xchgq\t%%rbx, %q1\n\t"
: "=a"(__EAX), "=&r"(__BX), "=c"(__ECX), "=d"(__EDX)
: "a"(__FUNC), "c"(__SUBFUNC));
You could code this up using conditional compilation to deal with both X86_64 and i386:
uint32_t __FUNC = 1, __SUBFUNC = 0;
uint32_t __EAX, __ECX, __EDX;
uintptr_t __BX; /* Register-width scratch: 32 bits on i386, 64 bits on x86_64 */
#if defined(__i386__)
__asm__ __volatile__ (
"xchgl\t%%ebx, %k1\n\t" \
"cpuid\n\t" \
"xchgl\t%%ebx, %k1\n\t"
: "=a"(__EAX), "=&r"(__BX), "=c"(__ECX), "=d"(__EDX)
: "a"(__FUNC), "c"(__SUBFUNC));
#elif defined(__x86_64__)
__asm__ __volatile__ (
"xchgq\t%%rbx, %q1\n\t" \
"cpuid\n\t" \
"xchgq\t%%rbx, %q1\n\t"
: "=a"(__EAX), "=&r"(__BX), "=c"(__ECX), "=d"(__EDX)
: "a"(__FUNC), "c"(__SUBFUNC));
#else
#error "Unknown architecture."
#endif
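The same exchange technique can be wrapped in a function so that callers never see the scratch variable. This is only a sketch under assumptions: the names `cpuid_regs` and `cpuid_query` are hypothetical and do not come from the answer, and the scratch variable is declared per architecture so that it always matches the register width.

```c
#include <stdint.h>

/* Hypothetical result type, introduced for illustration only. */
struct cpuid_regs { uint32_t eax, ebx, ecx, edx; };

/* Hypothetical wrapper: runs cpuid for the given leaf/subleaf while
   leaving EBX/RBX exactly as the compiler expects to find it. */
static struct cpuid_regs cpuid_query(uint32_t leaf, uint32_t subleaf)
{
    struct cpuid_regs r;
#if defined(__i386__)
    uint32_t bx;   /* 32-bit scratch register chosen by the compiler */
    __asm__ __volatile__ (
        "xchgl\t%%ebx, %k1\n\t"   /* park the caller's EBX in the scratch */
        "cpuid\n\t"
        "xchgl\t%%ebx, %k1"       /* restore EBX, scratch now holds cpuid's EBX */
        : "=a"(r.eax), "=&r"(bx), "=c"(r.ecx), "=d"(r.edx)
        : "a"(leaf), "c"(subleaf));
    r.ebx = bx;
#elif defined(__x86_64__)
    uint64_t bx;   /* 64-bit scratch so the whole of RBX is preserved */
    __asm__ __volatile__ (
        "xchgq\t%%rbx, %q1\n\t"
        "cpuid\n\t"
        "xchgq\t%%rbx, %q1"
        : "=a"(r.eax), "=&r"(bx), "=c"(r.ecx), "=d"(r.edx)
        : "a"(leaf), "c"(subleaf));
    r.ebx = (uint32_t)bx;   /* cpuid only produces a 32-bit EBX */
#else
#error "Unknown architecture."
#endif
    return r;
}
```

Calling `cpuid_query(0, 0)` should return the highest supported standard leaf in `eax` and the vendor string spread across `ebx`, `edx` and `ecx`.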
GCC has a __cpuid macro defined in cpuid.h. It defines the macro so that it only saves the ebx and rbx registers when required. The GCC 4.8.1 definition of the macro in cpuid.h gives an idea of how they handle cpuid.
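If hand-written asm is not a requirement, the helpers in cpuid.h can be used directly. The sketch below relies on `__get_cpuid`, which that header provides alongside `__cpuid`; the function name `cpu_vendor` is hypothetical and only serves the illustration.

```c
#include <cpuid.h>
#include <string.h>

/* Hypothetical helper: copies the 12-byte CPU vendor string into out
   and NUL-terminates it. Returns 1 on success, 0 if the leaf is
   unsupported. */
static int cpu_vendor(char out[13])
{
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 when the requested leaf is not available. */
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;

    /* Leaf 0 returns the vendor string in EBX, EDX, ECX order. */
    memcpy(out + 0, &ebx, 4);
    memcpy(out + 4, &edx, 4);
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
    return 1;
}
```

The header takes care of whatever EBX/RBX handling the compiler and target need, so the caller does not have to write any constraints.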
The astute reader may ask: what stops the compiler from choosing ebx or rbx as the scratch register for the exchange? The compiler knows about ebx and rbx in the context of PIC and will not allow them to be used as a scratch register. This is based on my personal observations over the years and on reviewing the assembler (.s) files generated from C code. I can't say for certain how more ancient versions of gcc handled it, so it could be a problem.
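To see what the early-clobber modifier buys in isolation, here is a small x86-only sketch. The function name `copy_then_add` is hypothetical. The output is written by the first instruction while an input is still needed by the second, which is exactly the situation `&` exists for: without it, GCC may assume all inputs are consumed before any output is written and could place `out` in the same register as `b`.

```c
/* Hypothetical helper for illustration; x86 only, AT&T syntax. */
static unsigned int copy_then_add(unsigned int a, unsigned int b)
{
    unsigned int out;
    __asm__ (
        "movl %1, %0\n\t"   /* out = a: the output is written here...     */
        "addl %2, %0"       /* ...while input b is still read afterwards  */
        : "=&r"(out)        /* '&' keeps out away from the input registers */
        : "r"(a), "r"(b)
        : "cc");            /* addl modifies the flags */
    return out;
}
```

The constraints, not the position of the C expressions, are what tell the compiler when values are live: inputs are loaded before the block starts, outputs are read back after it ends, and `&` marks an output that is written before the inputs are finished with.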
I think you understand, but to be clear, the "consecutive" rule means that this:
asm ("a");
asm ("b");
asm ("c");
... might get other instructions interposed, so if that's not desirable then it must be rewritten like this:
asm ("a\n"
"b\n"
"c");
... and now it will be inserted as a whole.
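A concrete x86-only instance of the single-block form might look like the sketch below. The function name `add_then_double` is hypothetical. Both instructions live in one template, so the compiler emits them back to back and cannot schedule anything between them.

```c
/* Hypothetical helper for illustration; x86 only, AT&T syntax. */
static unsigned int add_then_double(unsigned int x, unsigned int y)
{
    unsigned int r = x;
    __asm__ (
        "addl %1, %0\n\t"   /* r += y */
        "addl %0, %0"       /* r += r, relies on the previous instruction */
        : "+r"(r)           /* read and written by the block */
        : "r"(y)
        : "cc");            /* addl modifies the flags */
    return r;
}
```

Split across two `asm` statements, the second instruction would have no guarantee that the register it doubles still holds the result of the first.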
As for the cpuid snippet, we have two problems:
1. The cpuid instruction will overwrite ebx, and hence clobber the data that PIC code must keep there.
2. We want to extract the value that cpuid places in ebx while never returning to compiled code with the "wrong" ebx value.
One possible solution would be this:
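unsigned int __FUNC = 1, __SUBFUNC = 0;
unsigned int __EAX, __EBX, __ECX, __EDX;
/* First block: copy cpuid's EBX result into ECX before restoring EBX. */
__asm__ __volatile__ (
"push %%ebx;"
"cpuid;"
"mov %%ebx, %%ecx;"
"pop %%ebx"
: "=a"(__EAX), "=c"(__EBX), "=d"(__EDX)
: "a"(__FUNC), "c"(__SUBFUNC)
);
/* Second block: run cpuid again to collect ECX; EBX is saved and restored as before. */
__asm__ __volatile__ (
"push %%ebx;"
"cpuid;"
"pop %%ebx"
: "=a"(__EAX), "=c"(__ECX), "=d"(__EDX)
: "a"(__FUNC), "c"(__SUBFUNC)
);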
There's no need to mark ebx as clobbered, as you're putting it back how you found it.
(I don't do much Intel programming, so I may have some of the assembler-specific details off there, but this is how asm works.)