
How do I negate a 64-bit integer stored in a 32-bit register pair?

I've stored a 64-bit integer in the EDX:EAX register pair. How can I correctly negate the number?

For example: 123456789123 → -123456789123.

Ask a compiler for ideas: compile int64_t neg(int64_t a) { return -a; } in 32-bit mode. Of course, different ways of asking the compiler will have the starting value in memory, in the compiler's choice of registers, or already in EDX:EAX. See all three ways on the Godbolt compiler explorer, with asm output from gcc, clang, and MSVC (aka CL).

There are of course lots of ways to accomplish this, but any possible sequence will need some kind of carry from low to high at some point, so there's no efficient way to avoid SBB or ADC.
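
To see why the carry is unavoidable, here is a minimal C sketch of the two's-complement identity -x = ~x + 1 applied to two 32-bit halves. The function name negate_not_add_adc and the eax/edx/cf variables are hypothetical stand-ins for the registers and the carry flag, and the instructions named in the comments are just one possible sequence, not compiler output.

```c
#include <stdint.h>

/* Hypothetical model: negate hi:lo via NOT on both halves, then +1 with carry. */
uint64_t negate_not_add_adc(uint32_t lo, uint32_t hi) {
    uint32_t eax = ~lo;             /* not eax */
    uint32_t edx = ~hi;             /* not edx */
    eax += 1u;                      /* add eax, 1 */
    uint32_t cf = (eax == 0);       /* the add carried out only if it wrapped to 0 */
    edx += cf;                      /* adc edx, 0: the high half needs that carry */
    return ((uint64_t)edx << 32) | eax;
}
```

Dropping the adc step would give a wrong high half whenever the low half is zero, which is the low-to-high dependency described above.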


If the value starts in memory, or you want to keep the original value in registers, xor-zero the destination and use SUB/SBB. The SysV x86-32 ABI passes args on the stack and returns 64-bit integers in EDX:EAX. This is what clang3.9.1 -m32 -O3 does for neg_value_from_mem:

    ; optimal for data coming from memory: just subtract from zero
    xor     eax, eax
    xor     edx, edx
    sub     eax, dword ptr [esp + 4]
    sbb     edx, dword ptr [esp + 8]
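
As a rough C model of that subtract-from-zero sequence (a sketch only: negate_sub_sbb and the cf variable are hypothetical names, with cf standing in for the carry flag that SUB produces and SBB consumes):

```c
#include <stdint.h>

/* Hypothetical model of: xor eax,eax / xor edx,edx / sub eax,lo / sbb edx,hi */
uint64_t negate_sub_sbb(uint32_t lo, uint32_t hi) {
    uint32_t eax = 0, edx = 0;      /* xor-zero both destination halves */
    uint32_t cf = (eax < lo);       /* SUB sets CF when it borrows, i.e. 0 < lo */
    eax -= lo;                      /* sub eax, lo */
    edx = edx - hi - cf;            /* sbb edx, hi: subtract with borrow */
    return ((uint64_t)edx << 32) | eax;
}
```

Because the inputs are only read, the original value survives in lo and hi, which mirrors the case where the source stays in memory or in other registers.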

If you have the values in registers and don't need the result in-place, you can use NEG to set a register to 0 - itself, setting CF iff the input is non-zero, i.e. the same way SUB would. Note that xor-zeroing is cheap, and not part of the latency critical path, so this is definitely better than gcc's 3-instruction sequence (below).

    ;; partially in-place: input in ecx:eax
    xor     edx, edx
    neg     eax         ; eax = 0-eax, setting flags appropriately
    sbb     edx, ecx    ;; result in edx:eax

Clang does this even for the in-place case, even though that costs an extra mov ecx,edx. That's optimal for latency on modern CPUs that have zero-latency mov reg,reg (Intel IvB+ and AMD Zen), but not for number of fused-domain uops (front-end throughput) or code size.


gcc's sequence is interesting and not totally obvious. It saves an instruction vs. clang for the in-place case, but it's worse otherwise.

    ; gcc's in-place sequence, only good for in-place use
    neg     eax
    adc     edx, 0
    neg     edx
       ; disadvantage: higher latency for the upper half than subtract-from-zero
       ; advantage: result in edx:eax with no extra registers used
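
Since the neg/adc/neg sequence is not totally obvious, a small C model may help: it computes the high half as -(hi + borrow), which equals 0 - hi - borrow. This is a sketch under the assumption that cf mirrors the carry flag; the name negate_neg_adc_neg is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of gcc's in-place sequence: neg eax / adc edx,0 / neg edx */
uint64_t negate_neg_adc_neg(uint32_t eax, uint32_t edx) {
    uint32_t cf = (eax != 0);       /* NEG sets CF iff its operand was non-zero */
    eax = 0u - eax;                 /* neg eax */
    edx += cf;                      /* adc edx, 0: fold the borrow in before negating */
    edx = 0u - edx;                 /* neg edx: -(hi + cf) == 0 - hi - cf */
    return ((uint64_t)edx << 32) | eax;
}
```

The high half goes through two dependent operations (adc, then neg) after the flag is produced, which is where the extra latency noted in the comments comes from.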

Unfortunately, gcc and MSVC both always use this neg/adc/neg sequence, even when xor-zero + sub/sbb would be better.


For a more complete picture of what compilers do, have a look at their output for these functions (on Godbolt):

    #include <stdint.h>

    int64_t neg_value_from_mem(int64_t a) {
        return -a;
    }

    int64_t neg_value_in_regs(int64_t a) {
        // The OR makes the compiler load+OR first
        // but it can choose regs to set up for the negate
        int64_t reg = a | 0x1111111111LL;
        // clang chooses mov reg,mem   / or reg,imm8 when possible,
        // otherwise     mov reg,imm32 / or reg,mem.  Nice :)
        return -reg;
    }

    int64_t foo();
    int64_t neg_value_in_place(int64_t a) {
        // foo's return value will be in edx:eax
        return -foo();
    }

