
How to set carry flag in specific bit in GPR without shifts / rotations?

I'm writing a program in NASM for the Intel 80386 processor, and I need to copy the value of the carry flag into a specific bit of a GPR (general-purpose register) without changing the other bits in the register.

Is there any way to do so without using shifts or rotations of any kind?

One branch-free way of doing this is to fill a scratch register with the carry flag, mask out the bit you want, then OR it into your target register.

Using EAX as scratch (assuming 32-bit values are being manipulated):

sbb eax, eax
and eax, 1 << 16  ; Adjust bitshift in constant for the desired bit. Multiple bits can be used.

If the carry flag is clear, sbb computes eax = eax - eax = 0. If the carry flag is set, sbb computes eax = eax - (eax + 1) = -1, so all bits are set. The and then keeps only the desired bit(s).

After that, you need to copy the bit(s) into the target: clear them there first, then OR in the scratch value. If the bit is in a known initial state, this can be simplified. Using EBX as target:

and ebx, ~(1 << 16)  ; Same mask as before. It doesn't need to be a single shifted bit; any constant works.
or  ebx, eax
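
As a way to check the reasoning, the four instructions can be modelled in C. This is only a sketch: `set_bit_from_carry` is a hypothetical helper name, the carry flag is passed in as a plain 0/1 argument, and the shift that builds the mask stands in for the constant that NASM evaluates at assembly time (no shift instruction is emitted in the assembly version).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of the sbb / and / and / or sequence.
   cf is the carry flag (0 or 1), bit is the target bit index (0..31). */
uint32_t set_bit_from_carry(uint32_t target, unsigned cf, unsigned bit)
{
    assert(cf <= 1 && bit < 32);
    uint32_t mask    = UINT32_C(1) << bit;  /* assemble-time constant in NASM */
    uint32_t scratch = (uint32_t)0 - cf;    /* sbb eax, eax: 0 or 0xFFFFFFFF */
    scratch &= mask;                        /* and eax, 1 << bit */
    target  &= ~mask;                       /* and ebx, ~(1 << bit) */
    target  |= scratch;                     /* or  ebx, eax */
    return target;
}
```

Calling it with `cf = 1` sets the chosen bit and with `cf = 0` clears it; all other bits of `target` are returned unchanged.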

Depending on what previously happened to the scratch register (EAX here), it might be worth looking at optimization references such as https://www.agner.org/optimize/. Some processors recognize that the new value of EAX doesn't depend on the old value; others treat it as having a (false) dependency on that old value.

Having looked at it, the document "Optimizing subroutines in assembly language" mentions the above sbb trick in the section "Replacing conditional jumps with bit-manipulation instructions".

Using EAX (the accumulator) as the scratch register results in smaller code, because and eax, imm32 has a shorter encoding than the same instruction with any other register.

Predictable Case

If the value of CF is highly predictable, use a conditional jump, with code like this:

    ... operation that sets CF ...
    jnc   nc           ; skip setting bit if CF is clear
    or    eax, 1       ; set bit in eax
    jmp   end
nc: and   eax, ~1      ; clear bit in eax
end:

Unpredictable Case

If branchless code is desired (e.g. because the value of CF is hard to predict, or because this is a cryptographic application), consider using a conditional move. Note that cmovcc was introduced with the Pentium Pro (P6), so it is not available on an actual 80386.

Computing CF is on the Critical Path

Assuming we would like to set the least significant bit of eax to the value of CF after performing some operation:

    mov    ecx, eax    ; make a copy of eax
    or     ecx, 1      ; set bit 0 in the copy
    and    eax, ~1     ; clear bit 0 in the original
    ... operation that sets CF ...
    cmovc  eax, ecx    ; set eax to ecx if CF was set

This code has more instructions than the one in Thomas Jager's answer, but its critical path latency is shorter, assuming that computing CF is on the critical path but computing the prior value of eax is not. Whether it's actually better depends a lot on the circumstances.

Nevertheless, the simple variant

    and     eax, ~1    ; clear bit 0 in eax
    ... operation that sets CF ...
    adc     eax, 0     ; add carry flag to eax

is probably best for this specific case (setting the least significant bit to CF), as it saves two µops while having the same critical path latency.
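
To see that the two bit-0 variants above produce the same result, here is a small C model of each. It is a sketch under assumptions: the function names are hypothetical, CF is passed in as 0 or 1, and the C conditional only mirrors what cmovc computes (a C compiler is free to emit a branch for it).

```c
#include <stdint.h>

/* Hypothetical model of: mov ecx, eax / or ecx, 1 / and eax, ~1 / cmovc eax, ecx */
uint32_t bit0_via_cmov(uint32_t eax, unsigned cf)
{
    uint32_t ecx = eax | 1u;   /* copy with bit 0 set */
    eax &= ~1u;                /* original with bit 0 cleared */
    return cf ? ecx : eax;     /* cmovc picks the copy when CF is set */
}

/* Hypothetical model of: and eax, ~1 / adc eax, 0 */
uint32_t bit0_via_adc(uint32_t eax, unsigned cf)
{
    eax &= ~1u;                /* bit 0 is now 0 ... */
    return eax + 0u + cf;      /* ... so adding CF cannot carry into bit 1 */
}
```

The adc form only works for the least significant bit: because bit 0 has just been cleared, adding the carry can never propagate into the higher bits.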

Computing EAX is on the Critical Path

If, on the other hand, computing eax is on the critical path but computing CF is not, Thomas Jager's solution is good, but it needs to be tweaked to clear the least significant bit of eax beforehand, so that the bit ends up cleared if CF was clear.
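
A rough C model of that tweaked sequence, split the way the dependencies fall, might look like the following. The names are hypothetical, and ecx is assumed as the scratch register here because eax is now the target. The mask depends only on CF, so it can be ready early; only the final clear and OR sit on the eax path.

```c
#include <stdint.h>

/* Hypothetical model of: sbb ecx, ecx / and ecx, 1
   Depends only on the carry flag (cf is 0 or 1), not on eax. */
uint32_t mask_from_carry(unsigned cf)
{
    return ((uint32_t)0 - cf) & 1u;
}

/* Hypothetical model of: and eax, ~1 / or eax, ecx
   The only two operations that wait for eax. */
uint32_t merge_bit0(uint32_t eax, uint32_t mask)
{
    return (eax & ~1u) | mask;
}
```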
